Team:UESTC-Software/Overview.html

From 2014.igem.org

UESTC-Software

Overview

Just as “Plants vs. Zombies” has become one of the most popular games, the CRISPR/Cas system, which was learned, borrowed and modified from the natural game “Bacteria vs. Phages”, has become the hottest technology for genome editing. The CRISPR craze swept across the scientific community last year, and synthetic biology and iGEM were no exception. Twelve 2013 iGEM teams worked on CRISPR in their projects, with chassis ranging from E. coli and yeast to plant and mammalian cells. More than a dozen parts related to CRISPR/Cas were submitted last year. Some of them are in the DNA Kit of Parts for 2014 iGEM, shipped worldwide from iGEM Headquarters. sgRNA design tools are an important part of CRISPR/Cas technology and have attracted bioinformaticians all over the world.

However, all available tools neglect the purpose of a given experiment, pay no attention to BioBrick standards, and do not support the Synthetic Biology Open Language (SBOL), the standard for synthetic biology data exchange. Thus, an sgRNA design tool for genome editing in synthetic biology is desirable. In this project, we present CRISPR-X, an sgRNA design tool that fully supports SBOL and BioBrick standards, with a dynamic algorithm based on the intended function, the chassis, and the newest experimental data. It adopts both a client/server (C/S) and a browser/server (B/S) framework. The server-side program is implemented in C, and the front ends are implemented in Python (browser) or Java (Android and desktop), making it cross-platform.

The Brief Introduction And Functioning Of The CRISPR-Cas Systems

The development of tools for targeted genome editing and regulation of gene expression has significantly expanded our ability to elucidate the mechanisms of interesting biological phenomena and to engineer desirable biological systems. Clustered, regularly interspaced short palindromic repeats (CRISPR), in combination with CRISPR-associated nuclease 9 (Cas9), were recently demonstrated to be versatile tools for genome engineering [1]. Figure 1 shows the functioning of the type II CRISPR-Cas systems in bacteria. Phase 1: in the immunization phase, the CRISPR system stores the molecular signature of a previous infection by integrating fragments of invading phage or plasmid DNA into the CRISPR locus as ‘spacers’. Phase 2: in the immunity phase, the bacterium uses this stored information to defend against invading pathogens by transcribing the locus and processing the resulting transcript to produce CRISPR RNAs (crRNAs) that guide effector nucleases to locate and cleave nucleic acids complementary to the spacer. First, tracrRNAs (trans-activating crRNAs) hybridize to repeat regions of the pre-crRNA. Second, endogenous RNase III cleaves the hybridized crRNA-tracrRNA, and a second event removes the 5′ end of the spacer, yielding mature crRNAs that remain associated with the tracrRNA and Cas9. The complex cleaves complementary ‘protospacer’ sequences only if a PAM (protospacer-adjacent motif) sequence is present [2].

Figure 1. Functioning of the type II CRISPR-Cas systems in bacteria.[2]

Potential applications for CRISPR-Cas9

In bacteria and archaea, CRISPR-Cas systems provide immunity by incorporating fragments of invading phage and plasmid DNA into CRISPR loci and using the corresponding CRISPR RNAs (crRNAs) to guide the degradation of homologous sequences [3]. Figure 2 shows the potential applications for CRISPR-Cas9. The diverse potential applications of Cas9 range from targeted genome editing (via simplex and multiplex double-strand breaks and nicks) to targeted genome regulation (via tethering of epigenetic effector domains to either the Cas9 or sgRNA, and via competition with endogenous DNA binding factors) and possibly programmable genome reorganization and visualization. Cas9 might also be engineered to function as an RNA-guided recombinase, and via RNA tethers it could serve as a scaffold for the assembly of multiprotein and nucleic acid complexes [2].

Figure 2. The potential applications for CRISPR-Cas9.[2]

Background and significance for CRISPR-X

Although the CRISPR/Cas9/sgRNA system efficiently cleaves intracellular DNA at desired target sites, major concerns remain about potential “off-target” cleavage that may occur throughout the whole genome. To improve CRISPR-Cas9 specificity for targeted genome editing and transcriptional control, we describe a bioinformatics tool, “CRISPR-X”, which designs CRISPR sgRNAs with a high cutting rate and minimized off-target effects. The software consists of programs that search for CRISPR target sites (protospacers) with user-defined parameters, and that predict and evaluate potential genome-wide Cas9 off-target cleavage sites.
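
The target-site search described above can be illustrated with a minimal Python sketch. It is not the CRISPR-X implementation (the server core is written in C); the function names, the default guide length of 20 nt and the default "NGG" PAM are assumptions made for illustration. It reports every protospacer of a chosen length that is immediately followed by a PAM, on both strands.

```python
import re

# IUPAC codes needed for the PAM patterns mentioned on this page.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "M": "[AC]", "W": "[AT]", "N": "[ACGT]",
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, guide_len=20, pam="NGG"):
    """Hypothetical helper: list (strand, start, protospacer, pam) tuples.

    A site is reported when guide_len unambiguous bases are directly
    followed by a sequence matching the PAM pattern. For the "-" strand,
    start is an offset into the reverse complement of seq.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (guide_len, pam_regex))
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits
```

For example, `find_protospacers(sequence, guide_len=17)` would search for truncated 17-nt protospacers, and `pam="NRG"` would accept both NGG and NAG PAMs.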

Comparison With Other CRISPR sgRNA Design Tools

Many online or stand-alone tools have been developed to design CRISPR target sites or predict off-target sites. The online tools “Cas9 Design” (http://cas9.cbi.pku.edu.cn/index.jsp)[4] and “CRISPR/Cas9 gRNA finder” (http://spot.colorado.edu/~slin/cas9.html) can be used to design single or paired sgRNAs, but do not find off-targets. Cas-OFFinder (http://www.rgenome.net/cas-offinder/portable) is a web and stand-alone tool that very rapidly finds off-targets for an individual CRISPR sgRNA, but does not find candidate sgRNAs[5]. Another stand-alone tool is CasOT. It can be used to find candidate sites in an input sequence and to print out potential off-target sites as well. It attempts to 'score' the effect of an off-target by reporting whether it lies inside a coding exon[6]. ZiFiT (http://zifit.partners.org/ZiFiT/ChoiceMenu.aspx)[7] is only available for the genomes of 9 species. Other online tools, such as “CRISPR-P” (http://cbi.hzau.edu.cn/cgi-bin/CRISPR), “Optimized CRISPR Design” (http://crispr.mit.edu/) and E-CRISP (http://www.e-crisp.org/E-CRISP/)[8], can identify all off-target sequences (preceding either NAG or NGG PAMs) across the genome. These tools can automatically rank each possible sgRNA according to its total predicted off-target cleavage; the top-ranked sgRNAs are those likely to have the greatest on-target and the least off-target cleavage. Although these online tools are powerful and easy to use, they only support a protospacer length (not including "NGG") of 20 nt. They do not support truncated 17-nt or 18-nt protospacers, which can reduce undesired mutagenesis at some off-target sites by 5,000-fold or more[9], and this limits their application. A detailed comparison of the different CRISPR/Cas9 design tools is given in Table 1.
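
Ranking sgRNAs by predicted off-target load can be sketched as follows. This is a deliberately simplified illustration, not the scoring used by CRISPR-X or by any of the tools above: it assumes that an off-target is any genomic site differing from the guide by at least one and at most `max_mismatches` bases, and it ignores mismatch position and PAM type. All function names are hypothetical.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def count_off_targets(guide, genome_sites, max_mismatches=4):
    """Count sites within max_mismatches of the guide.

    Exact matches are skipped on the assumption that they are the
    intended on-target site (a simplification).
    """
    return sum(
        1 for site in genome_sites
        if 0 < mismatches(guide, site) <= max_mismatches
    )


def rank_guides(guides, genome_sites, max_mismatches=4):
    """Order guides from fewest to most predicted off-target sites."""
    return sorted(
        guides,
        key=lambda g: count_off_targets(g, genome_sites, max_mismatches),
    )
```

Here `genome_sites` stands for the list of PAM-adjacent sequences found in the genome, each of the same length as the guide.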

Table 1. Comparison of different CRISPR/Cas9 design tools.

| CRISPR/Cas9 design tool | Searches CRISPR target sites? | Length of protospacer (nt), not including "NGG" | Length of input sequence (bp) | Kinds of PAM (NGG, NRG, NGMTT, NAGAAW) | Evaluates off-targets? | Maximum number of mismatches | Checks whether off-target site is located in a gene? | Number of species' genomes | Scoring considers sgRNA efficacy and cutting rate? | Website |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CRISPR-X | Yes | 17-20 | No limit | All 4 | Yes | 4 | Yes | ≥5 | Yes | https://2014.igem.org/Team:UESTC-Software/ |
| ZiFiT | Yes | No limit | 1000 | Only NGG | Yes | 3 | No | 9 | No | http://zifit.partners.org/ZiFiT/ChoiceMenu.aspx |
| Optimized CRISPR Design | Yes | 20 | 23-500 | Only NAG and NGG | Yes | 4 | Yes | 15 | No | http://crispr.mit.edu/ |
| E-CRISP | Yes | 20 | No limit | Only NAG and NGG | Yes | ? | Yes | 20 | Yes | http://www.e-crisp.org/E-CRISP/ |
| Cas-OFFinder | No | 20, 18, 24 | 20, 18, 24 | All 4 | Yes | 10 | No | 16 | No | http://www.rgenome.net/cas-offinder/portable |
| CRISPR-P | Yes | 20 | 23-5000 | Only NAG and NGG | Yes | 4 | Yes | 28 | No | http://cbi.hzau.edu.cn/cgi-bin/CRISPR |
| CasOT | Yes | 18-30 | <1000 | Only NAG and NGG | Yes | 6 | Yes | Any | No | http://eendb.zfgenetics.org/casot |
| Cas9 Design | Yes | 20 | No limit | Only NGG | No | No | No | No | No | http://cas9.cbi.pku.edu.cn/index.jsp |
| CRISPR/Cas9 gRNA finder | Yes | 13-32 | No limit | Only NGG | No | No | No | No | No | http://spot.colorado.edu/~slin/cas9.html |

Relationship With Synthetic Biology And iGEM Standard Parts

Genome editing is an important part of synthetic biology. It is valuable for studying gene function, as well as for gene correction and cell replacement therapy. CRISPR/Cas is a rising genome editing technology that greatly improves scientists' ability to modify and edit genome sequences. Our software designs CRISPR sgRNAs with minimized off-target effects and a high cutting rate. A part is compatible with an assembly standard as long as its sequence contains no recognition sites for the restriction enzymes used by that standard. Our software therefore provides an RFC filter option covering RFC[10], RFC[12], RFC[21], RFC[23] and RFC[25]. This ensures that each sgRNA meets the requirements of the chosen assembly standard.
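
The compatibility rule above amounts to a restriction-site check, sketched below in Python. This is an illustration rather than the CRISPR-X filter itself: the function names are hypothetical, only RFC[10] and RFC[21] are included, and the site lists are taken from the published standards, not from the CRISPR-X source code.

```python
# Restriction sites that must be absent from a part, per assembly standard.
# All sites listed here are palindromic, so scanning one strand is enough.
RFC_FORBIDDEN_SITES = {
    "RFC10": {
        "EcoRI": "GAATTC",
        "XbaI": "TCTAGA",
        "SpeI": "ACTAGT",
        "PstI": "CTGCAG",
        "NotI": "GCGGCCGC",
    },
    "RFC21": {
        "EcoRI": "GAATTC",
        "BglII": "AGATCT",
        "BamHI": "GGATCC",
        "XhoI": "CTCGAG",
    },
}


def forbidden_sites(seq, standard="RFC10"):
    """Return the sorted names of forbidden enzymes whose site occurs in seq."""
    seq = seq.upper()
    sites = RFC_FORBIDDEN_SITES[standard]
    return sorted(name for name, site in sites.items() if site in seq)


def passes_rfc(seq, standard="RFC10"):
    """True if seq contains none of the sites forbidden by the standard."""
    return not forbidden_sites(seq, standard)
```

A candidate sgRNA sequence would be kept only if `passes_rfc(candidate, standard)` is true for the standard the user selected.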

Best Practices

We followed best practices throughout project development. From the start, we have hosted and managed the project on GitHub, so it has a clear and detailed development history there. We also use modern continuous integration (CI) tools. Specifically, Travis CI automatically builds and tests the CRISPR-X server core; the build history is available at https://travis-ci.org/igemsoftware/UESTC-Software_2014/builds. Most importantly, we use Coveralls.io to evaluate our test coverage automatically, which helps us design better test cases. The test coverage log can be found at https://coveralls.io/r/uestc-igem-2014/CRISPR-X. To make our software more robust, we also ran many tests; the test report can be found at https://2014.igem.org/Team:UESTC-Software/Testing.html

Future work

  1. More model organisms will be supported.
  2. We will help users share results with their collaborators by email.
  3. In the longer term, we will integrate functional design, helping users through the whole CRISPR experiment.
  4. We will make CRISPR-X support plugins, so that other iGEMers can extend it easily.

References

  • [1] Xie, S., Shen, B., Zhang, C., Huang, X., & Zhang, Y. (2014). sgRNAcas9: A software package for designing CRISPR sgRNA and evaluating potential off-target cleavage sites. PLoS ONE, 9(6), e100448.
  • [2] Mali, P., Esvelt, K. M., & Church, G. M. (2013). Cas9 as a versatile tool for engineering biology. Nature Methods, 10(10), 957-963.
  • [3] Terns, M. P., & Terns, R. M. (2011). CRISPR-based adaptive immune systems. Current Opinion in Microbiology, 14(3), 321-327.
  • [4] Ma, M., Ye, A. Y., Zheng, W., & Kong, L. (2013). A guide RNA sequence design platform for the CRISPR/Cas9 system for model organism genomes. BioMed Research International, 2013.
  • [5] Bae, S., Park, J., & Kim, J. S. (2014). Cas-OFFinder: A fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics [Epub ahead of print].
  • [6] Xiao, A., Cheng, Z., Kong, L., Zhu, Z., Lin, S., et al. (2014). CasOT: A genome-wide Cas9/gRNA off-target searching tool. Bioinformatics [Epub ahead of print].
  • [7] Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., ... & Zhang, F. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology, 31(9), 827-832.
  • [8] Heigwer, F., Kerr, G., & Boutros, M. (2014). E-CRISP: Fast CRISPR target site identification. Nature Methods, 11(2), 122-123.
  • [9] Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M., & Joung, J. K. (2014). Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology, 32(3), 279-284.