Team:SUSTC-Shenzhen/gRNA Design
From 2014.igem.org
gRNA Design
Not Only a Part of Modelling
(Here we take HIV-1 as an example)
Our method is derived from the one described in the paper from Feng Zhang's lab [http://www.nature.com/nbt/journal/v31/n9/abs/nbt.2647.html ZhangFgRNA].
Conserved Sequence Analysis
We first tried to extract all conserved regions from the NIH HIV-1 Reference Genome using BioEdit. This step yielded around 10 candidate regions for the next stage. All screening here is done on a per-strain basis because of the high mutability of HIV-1.
Supplementary Table 1 - Base Percentage of HIV-1 Aligned Genome 730bp-752bp

Position | A % | G % | C % | T % | Empty % | Non Empty % | A (Corrected) | G (Corrected) | C (Corrected) | T (Corrected) |
---|---|---|---|---|---|---|---|---|---|---|
730 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
731 | 0 | 55.88 | 0 | 0.59 | 43.53 | 56.47 | 0.00% | 98.96% | 0.00% | 1.04% |
732 | 0 | 0 | 0 | 56.47 | 43.53 | 56.47 | 0.00% | 0.00% | 0.00% | 100.00% |
733 | 0 | 54.71 | 0 | 1.18 | 43.53 | 55.89 | 0.00% | 97.89% | 0.00% | 2.11% |
734 | 0 | 0 | 0 | 58.24 | 41.76 | 58.24 | 0.00% | 0.00% | 0.00% | 100.00% |
735 | 56.47 | 0.59 | 0.59 | 0.59 | 41.76 | 58.24 | 96.96% | 1.01% | 1.01% | 1.01% |
736 | 0 | 1.18 | 57.06 | 0 | 41.76 | 58.24 | 0.00% | 2.03% | 97.97% | 0.00% |
737 | 1.18 | 57.06 | 0 | 0.59 | 41.18 | 58.83 | 2.01% | 96.99% | 0.00% | 1.00% |
738 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
739 | 0.59 | 0 | 58.82 | 0 | 40 | 59.41 | 0.99% | 0.00% | 99.01% | 0.00% |
740 | 0 | 0 | 0 | 0 | 100 | 0 | ||||
741 | 0 | 0 | 0 | 0 | 100 | 0 | ||||
742 | 0.59 | 0 | 1.18 | 58.24 | 40 | 60.01 | 0.98% | 0.00% | 1.97% | 97.05% |
743 | 0 | 0 | 60 | 0 | 40 | 60 | 0.00% | 0.00% | 100.00% | 0.00% |
744 | 0 | 1.18 | 58.82 | 0 | 40 | 60 | 0.00% | 1.97% | 98.03% | 0.00% |
745 | 0 | 58.82 | 1.18 | 0 | 40 | 60 | 0.00% | 98.03% | 1.97% | 0.00% |
746 | 0.59 | 0 | 59.41 | 0 | 40 | 60 | 0.98% | 0.00% | 99.02% | 0.00% |
747 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
748 | 0.59 | 59.41 | 0 | 0 | 40 | 60 | 0.98% | 99.02% | 0.00% | 0.00% |
749 | 0 | 58.82 | 0.59 | 0.59 | 40 | 60 | 0.00% | 98.03% | 0.98% | 0.98% |
750 | 0.59 | 0.59 | 58.24 | 0.59 | 40 | 60.01 | 0.98% | 0.98% | 97.05% | 0.98% |
751 | 60 | 0 | 0 | 0 | 40 | 60 | 100.00% | 0.00% | 0.00% | 0.00% |
752 | 59.41 | 0.59 | 0 | 0 | 40 | 60 | 99.02% | 0.98% | 0.00% | 0.00% |
Table 1. Base-wise Statistics of One Designed Sequence
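As Table 1 shows, this sequence is highly conserved within the strains that carry it: at each aligned position, a little over half of the HIV-1 strains (56-60%, the Non Empty % column) have a base, and among those strains the dominant base accounts for roughly 97-100%. Positions 740 and 741 are empty in every strain.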
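The "Corrected" columns of Supplementary Table 1 appear to be each raw base percentage divided by the share of strains that have any base at that position. That reading is inferred from the numbers, not stated in the text. A minimal Python sketch under that assumption (the function name is ours):

```python
def corrected_percentages(a, g, c, t):
    """Rescale raw A/G/C/T percentages to the non-empty strains only.

    Assumption: "Corrected" = raw % / (A% + G% + C% + T%) * 100.
    Returns None for a column that is empty in every strain.
    """
    non_empty = a + g + c + t
    if non_empty == 0:
        return None  # e.g. positions 740 and 741 in the table
    raw = {"A": a, "G": g, "C": c, "T": t}
    return {base: round(100.0 * value / non_empty, 2) for base, value in raw.items()}


if __name__ == "__main__":
    # Row 731 of Supplementary Table 1: A=0, G=55.88, C=0, T=0.59
    print(corrected_percentages(0, 55.88, 0, 0.59))
    # Row 740 is empty in all strains
    print(corrected_percentages(0, 0, 0, 0))
```

For row 731 this gives about 98.96% G and 1.04% T, which matches the table.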
Strip out sequences without PAM
Sequences without a PAM (protospacer adjacent motif) cannot be bound by the Cas9 protein and are therefore of no use here, so we used a small JavaScript script to strip them out.
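The original JavaScript script is not reproduced on this page. The filtering step can be sketched in Python as follows, assuming each candidate is written 5'->3' as a 20-nt protospacer followed by an SpCas9-style NGG PAM (both gRNAs in the results table end in NGG, but the PAM definition is our assumption). The function names are hypothetical.

```python
# Assumption: a 20-nt protospacer followed by a 3-nt "NGG" PAM,
# written 5'->3' on one strand (23 nt in total).
PROTOSPACER_LEN = 20
VALID_BASES = set("ACGT")


def has_ngg_pam(candidate: str) -> bool:
    """True if the candidate is 23 nt of A/C/G/T and ends in N-G-G."""
    seq = candidate.upper()
    return (
        len(seq) == PROTOSPACER_LEN + 3
        and set(seq) <= VALID_BASES
        and seq.endswith("GG")
    )


def strip_without_pam(candidates):
    """Keep only the candidates that carry an NGG PAM, preserving order."""
    return [seq for seq in candidates if has_ngg_pam(seq)]


if __name__ == "__main__":
    pool = [
        "GTGTGGAAAATCTCTAGCAGTGG",  # from the results table, ends in TGG
        "TCTAGCAGTGGCGCCCGAACAGG",  # from the results table, ends in AGG
        "GTGTGGAAAATCTCTAGCAGTGA",  # hypothetical variant with the PAM broken
    ]
    print(strip_without_pam(pool))
```

This sketch checks only the strand it is given. Candidates on the reverse strand would need to be reverse-complemented first.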
Select gRNA sequences with the best theoretical quality
HIV-1 Quasi-Conservative gRNAs (Useful)

Sequence | Rating (Zhang) | Rank (Church) | Free Energy (Approx.) | |
---|---|---|---|---|
GTGTGGAAAATCTCTAGCAGTGG | 71 | - | -1.4 | HIV1_REF_2010 |
TCTAGCAGTGGCGCCCGAACAGG | 97 | - | -1.3 | |
In this step, we used the tools from the Feng Zhang and George Church labs to analyze off-target activity. We also ran BLAST ourselves to verify the results.
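We selected the gRNA sequences with the best theoretical quality using the empirical formula from Feng Zhang's lab, which scores a candidate off-target site as
$$\prod_{e\in\mathcal{M}}\left(1-W[e]\right)\times\frac{1}{\frac{19-\bar{d}}{19}\times 4+1}\times\frac{1}{n_{mm}^{2}}$$
$$W=[0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583]$$
Here \(\mathcal{M}\) is the set of mismatched positions between the gRNA and the off-target site, \(W[e]\) is the experimentally determined weight of position \(e\), \(\bar{d}\) is the mean pairwise distance between mismatches, and \(n_{mm}\) is the total number of mismatches.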
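To make the formula concrete, here is a minimal Python sketch that scores one 20-nt guide against one 20-nt candidate off-target site. It is not the code behind the Zhang lab tool. The weight vector is the one listed with the formula (named `W` here, matching the `W[e]` term). Three points are our assumptions: positions are indexed 5'->3' so that index 19 is next to the PAM, the mean distance is taken between adjacent mismatches, and the distance and count terms are set to 1 when they would otherwise be undefined (fewer than two mismatches, or none).

```python
# Position weights from the formula; index 0 is assumed to be the
# PAM-distal (5') end and index 19 the PAM-proximal (3') end.
W = [0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079,
     0.445, 0.508, 0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583]


def hit_score(guide: str, off_target: str) -> float:
    """Score one off-target site (1.0 = behaves like a perfect match)."""
    if len(guide) != 20 or len(off_target) != 20:
        raise ValueError("both sequences must be 20 nt (PAM excluded)")

    mismatches = [i for i, (a, b) in enumerate(zip(guide.upper(), off_target.upper()))
                  if a != b]
    n_mm = len(mismatches)

    # Product over mismatched positions of (1 - W[e]).
    weight_term = 1.0
    for pos in mismatches:
        weight_term *= 1.0 - W[pos]

    # Distance term; assumed to be 1 when fewer than two mismatches exist.
    if n_mm < 2:
        distance_term = 1.0
    else:
        gaps = [b - a for a, b in zip(mismatches, mismatches[1:])]
        d_bar = sum(gaps) / len(gaps)
        distance_term = 1.0 / ((19 - d_bar) / 19 * 4 + 1)

    # Mismatch-count term; assumed to be 1 for a perfect match.
    count_term = 1.0 if n_mm == 0 else 1.0 / n_mm ** 2

    return weight_term * distance_term * count_term


if __name__ == "__main__":
    guide = "GTGTGGAAAATCTCTAGCAG"  # first 20 nt of the first gRNA in the table
    print(hit_score(guide, guide))             # perfect match
    print(hit_score(guide, guide[:-1] + "T"))  # one PAM-proximal mismatch
```

A lower score means the site is less likely to be cut, so a good guide is one whose off-target sites all score low.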
References
- Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., ... & Zhang, F. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature biotechnology, 31(9), 827-832.
- Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., ... & Church, G. M. (2013). RNA-guided human genome engineering via Cas9. Science, 339(6121), 823-826.