Team:SUSTC-Shenzhen/gRNA Design
From 2014.igem.org
Latest revision as of 03:38, 18 October 2014
gRNA Design
Not Only a Part of Modelling
(Here we take HIV-1 as an example)
We used a method derived from the one described in the paper from Feng Zhang's lab[http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA].
Conserved Sequence Analysis
We first tried to extract all conserved regions from the NIH HIV-1 Reference Genome using BioEdit. In this step, we found around 10 candidate regions for the next step. All screening here is done on a per-strain basis because of the high mutability of the HIV-1 virus.
Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp
Position (bp) | A % | G % | C % | T % | Empty % | Non Empty % | A (Corrected) | G (Corrected) | C (Corrected) | T (Corrected) |
---|---|---|---|---|---|---|---|---|---|---|
730 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
731 | 0 | 55.88 | 0 | 0.59 | 43.53 | 56.47 | 0.00% | 98.96% | 0.00% | 1.04% |
732 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
733 | 0 | 54.71 | 0 | 1.18 | 43.53 | 55.89 | 0.00% | 97.89% | 0.00% | 2.11% |
734 | 0 | 0 | 0 | 58.24 | 41.76 | 58.24 | 0.00% | 0.00% | 0.00% | 100.00% |
735 | 56.47 | 0.59 | 0.59 | 0.59 | 41.76 | 58.24 | 96.96% | 1.01% | 1.01% | 1.01% |
736 | 0 | 1.18 | 57.06 | 0 | 41.76 | 58.24 | 0.00% | 2.03% | 97.97% | 0.00% |
737 | 1.18 | 57.06 | 0 | 0.59 | 41.18 | 58.83 | 2.01% | 96.99% | 0.00% | 1.00% |
738 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
739 | 0.59 | 0 | 58.82 | 0 | 40 | 59.41 | 0.99% | 0.00% | 99.01% | 0.00% |
740 | 0 | 0 | 0 | 0 | 100 | 0 | | | | |
741 | 0 | 0 | 0 | 0 | 100 | 0 | | | | |
742 | 0.59 | 0 | 1.18 | 58.24 | 40 | 60.01 | 0.98% | 0.00% | 1.97% | 97.05% |
743 | 0 | 0 | 60 | 0 | 40 | 60 | 0.00% | 0.00% | 100.00% | 0.00% |
744 | 0 | 1.18 | 58.82 | 0 | 40 | 60 | 0.00% | 1.97% | 98.03% | 0.00% |
745 | 0 | 58.82 | 1.18 | 0 | 40 | 60 | 0.00% | 98.03% | 1.97% | 0.00% |
746 | 0.59 | 0 | 59.41 | 0 | 40 | 60 | 0.98% | 0.00% | 99.02% | 0.00% |
747 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
748 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
749 | 0 | 58.82 | 0.59 | 0.59 | 40 | 60 | 0.00% | 98.03% | 0.98% | 0.98% |
750 | 0.59 | 0.59 | 58.24 | 0.59 | 40 | 60.01 | 0.98% | 0.98% | 97.05% | 0.98% |
751 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
752 | 59.41 | 0.59 | 0 | 0 | 40 | 60 | 99.02% | 0.98% | 0.00% | 0.00% |
Table 1. Base-wise Statistics of One Designed Sequence
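As Table 1 shows, roughly 56 to 60% of the aligned HIV-1 strains carry a base at each position of this region (the "Non Empty %" column; positions 740 and 741 are empty in every strain). Among those strains, the dominant base accounts for about 97% or more at every position, so the sequence is highly conserved within that subset.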
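The "Corrected" columns in Table 1 appear to be each base's raw percentage divided by the "Non Empty %" value for that position. The following is a minimal JavaScript sketch under that assumption; the function name `correctedPercentages` is hypothetical and is not taken from the team's own tooling.

```javascript
// Hypothetical helper: renormalise raw base percentages over the strains
// that actually have a base at this alignment position.
function correctedPercentages(row) {
  const nonEmpty = row.A + row.G + row.C + row.T;
  // Every strain has a gap here (as at positions 740 and 741): nothing to correct.
  if (nonEmpty === 0) return null;
  const corrected = {};
  for (const base of ["A", "G", "C", "T"]) {
    corrected[base] = ((100 * row[base]) / nonEmpty).toFixed(2) + "%";
  }
  return corrected;
}

// Position 731 from Table 1: G = 55.88, T = 0.59.
const pos731 = correctedPercentages({ A: 0, G: 55.88, C: 0, T: 0.59 });
console.log(pos731.G, pos731.T); // 98.96% 1.04%
```

For position 731 this reproduces the 98.96% and 1.04% listed in the table, which supports the assumed definition.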
Strip out sequences without PAM
Candidate sequences that are not followed by a PAM (protospacer adjacent motif) cannot be recognized by the Cas9 protein and are thus of no use here. We therefore used a small JavaScript script to strip them out.
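The team's script is not shown on this page. As an illustration only, a filter of this kind could look like the sketch below. It assumes the commonly used SpCas9 PAM (NGG) directly 3' of a 20-nt protospacer, which matches the 23-nt sequences in the gRNA table; the names `hasPam` and `stripWithoutPam` are hypothetical.

```javascript
// Assumption: SpCas9 with an NGG PAM directly 3' of a 20-nt protospacer,
// so each candidate is a 23-nt string whose last two bases are GG.
const PAM_PATTERN = /^[ACGT]{20}[ACGT]GG$/;

// True if the 23-nt candidate ends in an NGG PAM.
function hasPam(candidate) {
  return PAM_PATTERN.test(candidate.toUpperCase());
}

// Keep only the candidates that carry a PAM.
function stripWithoutPam(candidates) {
  return candidates.filter(hasPam);
}

const candidates = [
  "GTGTGGAAAATCTCTAGCAGTGG", // from the gRNA table, ends in TGG
  "TCTAGCAGTGGCGCCCGAACAGG", // from the gRNA table, ends in AGG
  "GTGTGGAAAATCTCTAGCAGTGA", // made-up negative example, no NGG
];
console.log(stripWithoutPam(candidates)); // keeps the first two entries
```

A fuller version would also scan the reverse-complement strand; that is left out here to keep the sketch short.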
Select gRNA sequences with the best theoretical quality
HIV-1 Quasi-Conservative gRNAs (Useful)
Sequence | Rating (Zhang) | Rank (Church) | Free Energy (Approx.) | Reference set |
---|---|---|---|---|
GTGTGGAAAATCTCTAGCAGTGG | 71 | - | -1.4 | HIV1_REF_2010 |
TCTAGCAGTGGCGCCCGAACAGG | 97 | - | -1.3 | HIV1_REF_2010 |
In this step, we used the tools from the Feng Zhang and George Church labs to analyze off-target activity. We also ran BLAST ourselves to verify the results.
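We selected the gRNA sequences with the best theoretical quality using the empirical formula from Feng Zhang's lab, which scores a single potential off-target site: $$\prod_{e\in\mathcal{M}}\left(1-W[e]\right)\times\frac{1}{\frac{19-\bar{d}}{19}\times 4+1}\times\frac{1}{n_{mm}^{2}}$$ $$W=[0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583]$$ Here ℳ is the set of mismatched positions between the guide and the site, W[e] is the experimentally determined weight of a mismatch at position e, d̄ is the mean pairwise distance between mismatches, and n_mm is the total number of mismatches.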
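To make the three factors of the formula concrete, here is a minimal JavaScript sketch that scores one candidate off-target site using the 20 position weights listed with the formula. It is not the Zhang lab's implementation. It makes three assumptions: index 0 of the weight list is the PAM-distal end of the guide, the mean distance is taken between consecutive mismatches, and the distance factor is skipped when there is only one mismatch. The function name `offTargetScore` is hypothetical.

```javascript
// Position weights copied from the formula above.
// Assumption: index 0 is the PAM-distal (5') end of the 20-nt guide.
const WEIGHTS = [0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445,
                 0.508, 0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583];

// Score one potential off-target site against a guide (both 20 nt, PAM excluded).
function offTargetScore(guide, site) {
  if (guide.length !== 20 || site.length !== 20) {
    throw new Error("guide and site must both be 20 nt long");
  }
  const mismatches = [];
  for (let i = 0; i < 20; i++) {
    if (guide[i] !== site[i]) mismatches.push(i);
  }
  const n = mismatches.length;
  if (n === 0) return 1; // identical site: no penalty

  // Factor 1: product of (1 - W[e]) over the mismatch positions e.
  let score = mismatches.reduce((acc, e) => acc * (1 - WEIGHTS[e]), 1);

  // Factor 2: distance penalty. Assumption: the mean distance is the mean gap
  // between consecutive mismatches, and the factor is skipped for one mismatch.
  if (n > 1) {
    const meanDist = (mismatches[n - 1] - mismatches[0]) / (n - 1);
    score *= 1 / (((19 - meanDist) / 19) * 4 + 1);
  }

  // Factor 3: 1 / n_mm^2.
  return score / (n * n);
}

const guide = "GTGTGGAAAATCTCTAGCAG"; // first 20 nt of a gRNA from the table
const site = "GTGTGGAAAATCTCTAGCAT";  // made-up site, one mismatch at the last position
console.log(offTargetScore(guide, site).toFixed(3)); // 0.417
```

A single mismatch at the PAM-proximal end keeps only the first factor, 1 - 0.583, which is why the example prints 0.417.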
References
- Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., ... & Zhang, F. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology, 31(9), 827-832.
- Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., ... & Church, G. M. (2013). RNA-guided human genome engineering via Cas9. Science, 339(6121), 823-826.