Team:UESTC-Software/Modeling.html

From 2014.igem.org

(Difference between revisions)
(Created page with "{{CSS/Main}} <html> <head> <title>UESTC-Software</title> <meta http-equiv=Content-Type content="text/html;charset=utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=Edge,c...")
 
(22 intermediate revisions not shown)
Line 21: Line 21:
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="question" id="p1">1.Overview</div>
<div class="question" id="p1">1.Overview</div>
-
<p>   Modeling is a powerful tool in synthetic biology and engineering. In our project, we aim to design a bioinformatics tool “CRISPR-X”, which is a software developed for design of CRISPR sgRNA with minimized off-target effects and high cutting rate. </p>
+
<p>Modeling is a powerful tool in synthetic biology and engineering. In our project, we aim to build a bioinformatics tool, “CRISPR-X”, software for designing CRISPR sgRNAs with minimized off-target effects and a high cutting rate.</p>
-
             <p>First of all, we find the protospacer-adjacent motif (PAM) based on user-specified gene region. Then, we find sgRNA corresponding to the PAM. Next, we find that whether there is a potential off-target binding sites for the sgRNA over the entire gene region, and evaluate the specificity and efficacy of the sgRNA. Finally, we provide a secondary structure and the restriction enzyme cutting sites for the sgRNA.</p>
+
            <p>The CRISPR-associated protein Cas9 can be programmed with a single guide RNA (sgRNA) to generate site-specific DNA breaks, but few rules are known that govern the on-target efficacy of this system [1,2]. Related reports suggest that gRNAs are most effective with a GC content between 40% and 80% [1]. In addition, a guanine at position 20 of the target site appears to improve the cutting rate [1]. Therefore, we use an efficacy score to characterize the activity of the sgRNA.</p>
 +
            <p>sgRNA sequences of 17-20 nt achieve similar levels of on-target gene editing, and target specificity improves by up to 10,000-fold when a truncated sgRNA (17 or 18 nt) is used [3]. We therefore allow sgRNA lengths from 17 nt to 20 nt.</p>
 +
             <p>First, we find the protospacer-adjacent motifs (PAMs) in the user-specified gene region. Then, we find the sgRNA corresponding to each PAM. Next, we check whether the sgRNA has potential off-target binding sites over the entire gene region, and evaluate the specificity and efficacy of the sgRNA. Finally, we provide the secondary structure and the restriction enzyme cutting sites for the sgRNA.</p>
</div>
</div>
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="question" id="p2">2.Parameters</div>
<div class="question" id="p2">2.Parameters</div>
<table border="1px" cellspacing="0px" style="border-collapse:collapse;word-break:break-word;border-color: #c7d3af;
<table border="1px" cellspacing="0px" style="border-collapse:collapse;word-break:break-word;border-color: #c7d3af;
-
color: #999;">
+
color: #999;line-height: 1.1em;">
<tbody>
<tbody>
<tr>
<tr>
Line 33: Line 35:
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">d 0</td> <td class="pc2">Average distance for all the mismatch nucleotides to the PAM of any off-target site</td> <td class="pc3">0-19</td> <td class="pc4">nt</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">d0</span></td> <td class="pc2">Average distance from the mismatched nucleotides to the PAM for a given off-target site</td> <td class="pc3">0-19</td> <td class="pc4">nt</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">i</td> <td class="pc2">Continuous variables for the number of mismatch nucleotide</td> <td class="pc3">1-Nmm</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">i</span></td> <td class="pc2">Index over the mismatched nucleotides</td> <td class="pc3">1-<span class="serif">Nmm</span></td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">j</td> <td class="pc2">Continuous variables for the total number of off-target sites exclude the perfect-hit off-target sites</td> <td class="pc3">1-(Nfg-Nph)</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">j</span></td> <td class="pc2">Index over the off-target sites, excluding the perfect-hit off-target sites</td> <td class="pc3">1-(<span class="serif">Nfg</span>-<span class="serif">Nph</span>)</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">M</td> <td class="pc2">Weight matrix</td> <td class="pc3">[0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583]</td> <td class="pc4"></td> <td class="pc5">Reference: DNA targeting specificity of RNA-guided Cas9 nucleases, Hsu et al, 2013</td>
+
<td class="pc1"><span class="serif">M</span></td> <td class="pc2">Weight matrix</td> <td class="pc3">[0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583]</td> <td class="pc4"></td> <td class="pc5">Reference: DNA targeting specificity of RNA-guided Cas9 nucleases, Hsu et al, 2013</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">n</td> <td class="pc2">Mismatch position</td> <td class="pc3">1-20</td> <td class="pc4"></td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">n</span></td> <td class="pc2">Mismatch position</td> <td class="pc3">1-20</td> <td class="pc4"></td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Nfg</td> <td class="pc2">The total number of off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">Nfg</span></td> <td class="pc2">The total number of off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Nmm</td> <td class="pc2">The number of mismatch nucleotide for the not perfect-hit off-target sites</td> <td class="pc3">1-4</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">Nmm</span></td> <td class="pc2">The number of mismatched nucleotides in an off-target site that is not a perfect hit</td> <td class="pc3">1-4</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Nph</td> <td class="pc2">Perfect-hit off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, we allow the maximum value of Nph is 4, when Nph≥4, Sguide=0</td>
+
<td class="pc1"><span class="serif">Nph</span></td> <td class="pc2">The number of perfect-hit off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, the maximum allowed value of <span class="serif">Nph</span> is 4; when <span class="serif">Nph</span>≥4, <span class="serif">Sguide</span>=0</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">r 1</td> <td class="pc2">The proportion of specificity score in the total score</td> <td class="pc3">0-1</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, it’s default value is 0.65</td>
+
<td class="pc1"><span class="serif">r1</span></td> <td class="pc2">The proportion of the specificity score in the total score</td> <td class="pc3">0-1</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, its default value is 0.65</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">r 2</td> <td class="pc2">The proportion of efficacy score in the total score</td> <td class="pc3">0-1</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, it’s default value is 0.35</td>
+
<td class="pc1"><span class="serif">r2</span></td> <td class="pc2">The proportion of the efficacy score in the total score</td> <td class="pc3">0-1</td> <td class="pc4">1</td> <td class="pc5">In our scoring algorithm, its default value is 0.35</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">S1</td> <td class="pc2">The score of the first step</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">S1</span></td> <td class="pc2">The score of the first step</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">S20</td> <td class="pc2">The subtracted score for the 20th nucleotide is not a guanine</td> <td class="pc3">=35</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">S20</span></td> <td class="pc2">The score subtracted when the 20th nucleotide is not a guanine</td> <td class="pc3">=35</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Seff</td> <td class="pc2">The efficacy score</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Represent the level of efficacy for the sgRNA</td>
+
<td class="pc1"><span class="serif">Seff</span></td> <td class="pc2">The efficacy score</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Represents the level of efficacy of the sgRNA</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Sgc</td> <td class="pc2">The subtracted score for different  GC ratio</td> <td class="pc3">0,35,65</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">Sgc</span></td> <td class="pc2">The score subtracted according to the GC ratio</td> <td class="pc3">0,35,65</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Sguide</td> <td class="pc2">The total score of the sgRNA</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Composed of Seff and Sspe, marking the overall properties(specificity and efficacy) of the sgRNA</td>
+
<td class="pc1"><span class="serif">Sguide</span></td> <td class="pc2">The total score of the sgRNA</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Composed of <span class="serif">Seff</span> and <span class="serif">Sspe</span>; reflects the overall properties (specificity and efficacy) of the sgRNA</td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Smm</td> <td class="pc2">The subtracted score of the mismatch nucleotide for the not perfect-hit off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">Smm</span></td> <td class="pc2">The score subtracted for the mismatched nucleotides of the off-target sites that are not perfect hits</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Sph</td> <td class="pc2">The subtracted score of the perfect-hit off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
+
<td class="pc1"><span class="serif">Sph</span></td> <td class="pc2">The subtracted score of the perfect-hit off-target sites</td> <td class="pc3">≥0</td> <td class="pc4">1</td> <td class="pc5"></td>
</tr>
</tr>
<tr>
<tr>
-
<td class="pc1">Sspe</td> <td class="pc2">The specificity score</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Represent the level of specificity for the sgRNA</td>
+
<td class="pc1"><span class="serif">Sspe</span></td> <td class="pc2">The specificity score</td> <td class="pc3">0-100</td> <td class="pc4">1</td> <td class="pc5">Represents the level of specificity of the sgRNA</td>
</tr>
</tr>
</tbody>
</tbody>
Line 91: Line 93:
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="question"  id="p3">3.Scoring algorithm</div>
<div class="question"  id="p3">3.Scoring algorithm</div>
-
<p><ul>Judged conditions:
+
<p><ul style="width: 100%;
-
<li><pre>①Bad GC ratio (< 40% or > 80%) : Sgc = 65;
+
word-break: break-word;">Judged conditions:
-
Not so good GC ratio (40% - 50% or 50% - 80%): Sgc = 35;
+
<li><pre style="padding: 0;background: none;border: none;color: #777;line-height: inherit;">①Bad GC ratio (< 40% or > 80%) : <span class="serif">Sgc</span> = 65;
-
Good GC ratio (51%-69%): Sgc = 0.[1]
+
Not so good GC ratio (40% - 50% or 70% - 80%): <span class="serif">Sgc</span> = 35;
 +
Good GC ratio (51%-69%): <span class="serif">Sgc</span> = 0.[1]
</pre></li>
</pre></li>
-
<li>②The 20th nucleotide is not G: S20 = 35;[1]</li>
+
<li>②The 20th nucleotide is not G: <span class="serif">S20</span> = 35;[1]</li>
-
<li>③If the sgRNA designed perfectly hit another sites, the penalty Sph = 25;if perfectly hit more than or equal to 4 loci, the total score Sguide is 0.
+
<li>③If the designed sgRNA perfectly matches another site, the penalty is <span class="serif">Sph</span> = 25 per site; if it perfectly matches 4 or more loci, the total score <span class="serif">Sguide</span> is 0.
</li></ul></p>
</li></ul></p>
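<p>Conditions ① and ② can be sketched in Python as follows. This is an illustrative sketch rather than the CRISPR-X source code: the function names are hypothetical, the GC ratio is taken as a percentage of the guide length, the listed ranges leave gaps (between 50% and 51%, and between 69% and 70%) that are assigned here to the "good" band by assumption, and the "20th nucleotide" is assumed to be the last base of the guide (the one next to the PAM).</p>

```python
def gc_percent(guide: str) -> float:
    """GC content of the guide as a percentage of its length."""
    if not guide:
        raise ValueError("guide sequence must not be empty")
    guide = guide.upper()
    return 100.0 * sum(base in "GC" for base in guide) / len(guide)


def gc_penalty(guide: str) -> int:
    """Sgc: 65 for a bad GC ratio, 35 for a not-so-good one, 0 for a good one."""
    gc = gc_percent(guide)
    if gc < 40 or gc > 80:
        return 65
    # Assumption: 40-50% and 70-80% are inclusive; everything strictly
    # between 50% and 70% counts as "good".
    if gc <= 50 or gc >= 70:
        return 35
    return 0


def g20_penalty(guide: str) -> int:
    """S20: 35 when the last base of the guide is not G, else 0.

    Assumption: position 20 is the PAM-proximal (last) base of the guide.
    """
    return 0 if guide.upper().endswith("G") else 35
```

<p>For example, a 20 nt guide with 13 G/C bases (65%) that ends in G receives no penalty from either condition, while an all-A/T guide receives both.</p>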
-
<p><ul>Steps:
+
<p><ul style="width:100%;">Steps:
-
<li>(1) Firstly, find out the number of off – target sequence Nfg, if Nfg = 0, output Sspe= r1*100;
+
<li>(1) First, count the off-target sequences, <span class="serif">Nfg</span>. If <span class="serif">Nfg</span> = 0, output <span class="serif">Sspe</span> = <span class="serif">r1</span>*100;
-
Otherwise, detect the third condition. If there is a sgRNA designed perfectly hit another site, regard the number of it as Nph, and then the score of the first step: S1 = Sph * Nph (Nph is 4 or less). When S1 is equal to or less than 75, perform step (2), otherwise the output Sguide = 0;<br/>
+
Otherwise, check the third condition. If the designed sgRNA perfectly matches other sites, let <span class="serif">Nph</span> be the number of such sites; the score of the first step is then <span class="serif">S1</span> = <span class="serif">Sph</span> * <span class="serif">Nph</span>. When <span class="serif">S1</span> is less than or equal to 75 (that is, <span class="serif">Nph</span> ≤ 3), perform step (2); otherwise output <span class="serif">Sguide</span> = 0;<br/>
-
If there is no sgRAN designed perfectly hit other sites, the score of the first step S1 = 0. This illustrates that there is no nucleotide which are matched between sgRNA and the place missed. Then perform step(2)
+
If the designed sgRNA does not perfectly match any other site, the score of the first step is <span class="serif">S1</span> = 0, meaning that every off-target site contains at least one mismatch with the sgRNA. Then perform step (2).
 +
<a href="https://static.igem.org/mediawiki/2014/5/54/2014-UESTC-Software-Ac1.jpg" target="_blank"><img src="https://static.igem.org/mediawiki/2014/5/54/2014-UESTC-Software-Ac1.jpg"></a>
</li>
</li>
-
<li>(2) When performing step (2), remove the Nph which is perfectly hit first.
+
<li>(2) In step (2), first exclude the <span class="serif">Nph</span> perfect-hit sites.
-
For Nfg-Nph which does not perfectly hit, please combine the weight ratio which obtained in the literature:
+
For the remaining <span class="serif">Nfg</span>-<span class="serif">Nph</span> sites, which are not perfect hits, apply the position weights obtained from the literature:
-
M=[0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583];[2]
+
<span class="serif">M</span>=[ 0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445, 0.508, 0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583];[4]
-
Using the formula:<img src="https://static.igem.org/mediawiki/2014/c/c5/2014-UESTC-Software-M1.png" style="position: relative;top: 20px;"><br/>
+
Using the formula:<img src="https://static.igem.org/mediawiki/2014/c/c5/2014-UESTC-Software-M1.png" style="position: relative;top: 4px;"><br/><a href="https://static.igem.org/mediawiki/2014/6/61/2014-UESTC-Software-Ac2.jpg" target="_blank"><img src="https://static.igem.org/mediawiki/2014/6/61/2014-UESTC-Software-Ac2.jpg"></a>
</li>
</li>
</ul></p>
</ul></p>
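<p>The decision made in step (1) can be sketched as a small Python function. This is a minimal sketch under stated assumptions, not the CRISPR-X implementation: the function name and the returned status labels are hypothetical, and only the constants <span class="serif">Sph</span> = 25 and the threshold 75 come from the description above.</p>

```python
S_PH = 25       # penalty per perfect-hit off-target site (condition 3)
S1_LIMIT = 75   # step (2) is performed only when S1 <= 75


def first_step(n_fg: int, n_ph: int):
    """Return (status, S1) for an sgRNA with n_fg off-target sites,
    of which n_ph are perfect hits.

    The status labels are hypothetical names used only in this sketch:
      "no_off_targets": Nfg = 0, so Sspe = r1 * 100 is output directly
      "step2":          S1 <= 75, continue with the mismatch scoring
      "rejected":       S1 > 75, the total score Sguide is 0
    """
    if n_fg < 0 or n_ph < 0 or n_ph > n_fg:
        raise ValueError("expected 0 <= n_ph <= n_fg")
    if n_fg == 0:
        return ("no_off_targets", 0)
    s1 = S_PH * n_ph
    if s1 <= S1_LIMIT:
        return ("step2", s1)
    return ("rejected", 0)
```

<p>With these constants, three perfect hits still reach step (2) with <span class="serif">S1</span> = 75, while a fourth perfect hit pushes <span class="serif">S1</span> to 100 and the guide is rejected.</p>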
-
<p>Assuming that specific score: efficacy score = r1: r2 (the default is r1:r2 = 0.65:0.35), and then use formula specificity scores Sspe =<img src="https://static.igem.org/mediawiki/2014/3/37/2014-UESTC-Software-M2.png" style="position: relative;top: 8px;">(when 100-<img src="https://static.igem.org/mediawiki/2014/7/7f/2014-UESTC-Software-M3.png" style="position: relative;top: 8px;">, Sspe=0), efficacy score Seff = r2 * (100 - (Sgc + S20)), the total score: <img src="https://static.igem.org/mediawiki/2014/c/c0/2014-UESTC-Software-M4.png" style="position: relative;top: 4px;">;  
+
<p>Assuming that the ratio of the specificity score to the efficacy score is <span class="serif">r1</span>:<span class="serif">r2</span> (the default is <span class="serif">r1</span>:<span class="serif">r2</span> = 0.65:0.35), the specificity score is <span class="serif">Sspe</span> =<img src="https://static.igem.org/mediawiki/2014/3/37/2014-UESTC-Software-M2.png">(when 100-<img src="https://static.igem.org/mediawiki/2014/7/7f/2014-UESTC-Software-M3.png">, <span class="serif">Sspe</span>=0), the efficacy score is <span class="serif">Seff</span> = <span class="serif">r2</span> * (100 - (<span class="serif">Sgc</span> + <span class="serif">S20</span>)), and the total score is <img src="https://static.igem.org/mediawiki/2014/c/c0/2014-UESTC-Software-M4.png">;
-
Finally according to Sguide score, arranging the sgRNA from high to low, outputting sgRNA, total score Sguide, specificity scores Sspe, efficacy score Seff, the chromosome and its site connected to sgRNA, the GC ratio.
+
Finally, the sgRNAs are ranked by <span class="serif">Sguide</span> from high to low, and for each sgRNA we output its sequence, the total score <span class="serif">Sguide</span>, the specificity score <span class="serif">Sspe</span>, the efficacy score <span class="serif">Seff</span>, the chromosome and the site on it to which the sgRNA binds, and the GC ratio.
</p>
</p>
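<p>As a worked example of the efficacy formula, the Python sketch below computes <span class="serif">Seff</span> = <span class="serif">r2</span> * (100 - (<span class="serif">Sgc</span> + <span class="serif">S20</span>)) and combines it with the specificity score for the simplest case, <span class="serif">Nfg</span> = 0, where <span class="serif">Sspe</span> = <span class="serif">r1</span>*100. The function names are hypothetical, and taking the total score as the sum <span class="serif">Sspe</span> + <span class="serif">Seff</span> is an assumption based on the parameter table ("composed of Seff and Sspe"); the exact total-score formula is given only as an image.</p>

```python
def efficacy_score(s_gc: float, s_20: float, r2: float = 0.35) -> float:
    """Seff = r2 * (100 - (Sgc + S20)), as stated in the text."""
    return r2 * (100 - (s_gc + s_20))


def total_score_without_off_targets(s_gc: float, s_20: float,
                                    r1: float = 0.65,
                                    r2: float = 0.35) -> float:
    """Total score for an sgRNA with no off-target sites (Nfg = 0).

    Sspe = r1 * 100 comes from step (1). Adding Sspe and Seff is an
    assumption of this sketch, not a formula confirmed by the source.
    """
    s_spe = r1 * 100
    s_eff = efficacy_score(s_gc, s_20, r2)
    return s_spe + s_eff


if __name__ == "__main__":
    # Good GC ratio and a G at position 20: no efficacy penalties.
    print(total_score_without_off_targets(0, 0))
    # Not-so-good GC ratio (35) and no G at position 20 (35).
    print(total_score_without_off_targets(35, 35))
```

<p>Under the default weights, the largest possible penalties (<span class="serif">Sgc</span> = 65 and <span class="serif">S20</span> = 35) sum to 100, so the efficacy score bottoms out at 0 rather than going negative.</p>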
</div>
</div>
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="parts" style="padding: 20px 50px 20px 100px;">
<div class="question" id="p4">4.Algorithm illustration</div>
<div class="question" id="p4">4.Algorithm illustration</div>
-
<p>In literature [2], The algorithm used to score single off-targets is:</p>
+
<p>In reference [4], the algorithm used to score single off-targets is:</p>
<img src="https://static.igem.org/mediawiki/2014/e/e1/2014-UESTC-Software-M5.png">
<img src="https://static.igem.org/mediawiki/2014/e/e1/2014-UESTC-Software-M5.png">
             <p>This algorithm is adopted by CRISPR-P. Its inadequacies are: (a) Even when off-target sites are present, the subtracted score is sometimes still 0 (which seems unreasonable under certain circumstances, and makes such sgRNAs indistinguishable from sgRNAs that have no off-target sites). (b) It uses the W function, which cannot be expressed in terms of elementary functions, so it takes additional time to compute.</p>
             <p>This algorithm is adopted by CRISPR-P. Its inadequacies are: (a) Even when off-target sites are present, the subtracted score is sometimes still 0 (which seems unreasonable under certain circumstances, and makes such sgRNAs indistinguishable from sgRNAs that have no off-target sites). (b) It uses the W function, which cannot be expressed in terms of elementary functions, so it takes additional time to compute.</p>
Line 146: Line 150:
             </tr>
             </tr>
             </table>
             </table>
-
</div>
+
-
<div class="parts" style="padding: 20px 50px 20px 100px;">
+
-
<div class="question" id="p5">5.Algorithm validation</div>
+
-
<p>In order to confirm our algorithm is consistent with the experimental results, we use the experimental data on the MLE Cleavage with the different mismatches, and we compare our scoring results to corresponding to the experimental data , in addition find the correlation coefficient of them.</p>
+
-
            <p>First, we use aggregate data from single-mismatch guide RNAs for 15 EMX1 targets in literature [2](it’s relation figure is figure 2C, heatmap for relative SpCas9 cleavage efficiency for each possible RNA:DNA base pair).</p>
+
-
            <a href="https://static.igem.org/mediawiki/2014/f/f6/2014-UESTC-Software-F2c.png"  target="_blank"><img src="https://static.igem.org/mediawiki/2014/f/f6/2014-UESTC-Software-F2c.png"></a>
+
-
            <p>Aggregate data from single-mismatch guide RNAs for 15 EMX1 targets [2]</p>
+
-
            <a href="https://static.igem.org/mediawiki/2014/a/af/2014-UESTC-Software-F2.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/a/af/2014-UESTC-Software-F2.png"></a>
+
-
            <p>Heatmap for relative SpCas9 cleavage efficiency for each possible RNA:DNA base pair[2].
+
-
(a)We use this set of data to determine the relationship between our software score with the MLE Cleavage for single mismatch position. MATLAB program is shown below:</p>
+
-
<pre style='color:#d1d1d1;background:#000000;font-size: 14px;padding:20px 0;'>
+
-
data<span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>load</span><span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'E:\matlab\work\igem_data.mat'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#d2cd86; '>[</span>m n<span style='color:#d2cd86; '>]</span><span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>size</span><span style='color:#d2cd86; '>(</span>data<span style='color:#d2cd86; '>.</span>dataigem<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>for</span> i<span style='color:#d2cd86; '>=</span>1<span style='color:#d2cd86; '>:</span>n
+
-
    ave<span style='color:#d2cd86; '>(</span>i<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>mean</span><span style='color:#d2cd86; '>(</span>data<span style='color:#d2cd86; '>.</span>dataigem<span style='color:#d2cd86; '>(</span><span style='color:#d2cd86; '>:</span><span style='color:#d2cd86; '>,</span>i<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>end</span>
+
-
M<span style='color:#d2cd86; '>=</span><span style='color:#d2cd86; '>[</span>0<span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.014</span><span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.395</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.317</span><span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.389</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.079</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.445</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.508</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.613</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.851</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.732</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.828</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.615</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.804</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.685</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.583</span><span style='color:#d2cd86; '>]</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>for</span> j<span style='color:#d2cd86; '>=</span>2<span style='color:#d2cd86; '>:</span>20
+
-
S<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>-</span>1<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>=</span><span style='color:#d2cd86; '>(</span>4<span style='color:#d2cd86; '>*</span><span style='color:#bb7977; '>exp</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>-</span>M<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>/</span><span style='color:#d2cd86; '>(</span><span style='color:#d2cd86; '>(</span>4<span style='color:#d2cd86; '>*</span>j<span style='color:#d2cd86; '>+</span>19<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>/</span>19<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>end</span>
+
-
x<span style='color:#d2cd86; '>=</span>19<span style='color:#d2cd86; '>:</span><span style='color:#d2cd86; '>-</span>1<span style='color:#d2cd86; '>:</span>1<span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>subplot</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>,</span>1<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>stem</span><span style='color:#d2cd86; '>(</span>x<span style='color:#d2cd86; '>,</span>ave<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
title<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The relaition between the single mismatch location and cleavage activity '</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
xlabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'location/nt'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>ylabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'cleavage activity'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>subplot</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>stem</span><span style='color:#d2cd86; '>(</span>x<span style='color:#d2cd86; '>,</span>S<span style='color:#d2cd86; '>,</span><span style='color:#b060b0; '>'g'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
title<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The figure of the single mismatch location and mismatch score  '</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
xlabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'location/nt'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>ylabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'score'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>figure</span><span style='color:#d2cd86; '>(</span>2<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>plot</span><span style='color:#d2cd86; '>(</span>x<span style='color:#d2cd86; '>,</span>ave<span style='color:#d2cd86; '>,</span><span style='color:#b060b0; '>'b'</span><span style='color:#d2cd86; '>,</span>x<span style='color:#d2cd86; '>,</span>S<span style='color:#d2cd86; '>,</span><span style='color:#b060b0; '>'r'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>legend</span><span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'mismatch cleavage activity'</span><span style='color:#d2cd86; '>,</span><span style='color:#b060b0; '>'mismatch score'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
title<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The contrast figure of the single mismatch cleavage activity and mismatch score  '</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
xlabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'location/nt'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>ylabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'amplitude'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
B<span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>corrcoef</span><span style='color:#d2cd86; '>(</span>ave<span style='color:#d2cd86; '>,</span>S<span style='color:#d2cd86; '>)</span><span style='color:#9999a9; '>%find the correlation coefficient</span>
+
-
</pre>
+
-
<p>The result and figures are:</p>
+
-
<a href="https://static.igem.org/mediawiki/2014/0/02/2014-UESTC-Software-F3.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/0/02/2014-UESTC-Software-F3.png"></a><a href="https://2014.igem.org/File:2014-UESTC-Software-F4.png" target="_blank"><img src="https://2014.igem.org/File:2014-UESTC-Software-F4.png"></a><a href="https://static.igem.org/mediawiki/2014/6/64/2014-UESTC-Software-F5.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/6/64/2014-UESTC-Software-F5.png"></a>
+
-
<p>(b) We use the similar way to determine the relationship between our software score with the MLE Cleavage for two concatenated mismatches. MATLAB program is shown below:</p>
+
-
<pre style='color:#d1d1d1;background:#000000;font-size: 14px;padding:20px 0;'>
+
-
data<span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>load</span><span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'E:\matlab\work\data2misc.mat'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#d2cd86; '>[</span>m n<span style='color:#d2cd86; '>]</span><span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>size</span><span style='color:#d2cd86; '>(</span>data<span style='color:#d2cd86; '>.</span>data2misc<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>for</span> i<span style='color:#d2cd86; '>=</span>1<span style='color:#d2cd86; '>:</span>n
+
-
    ave<span style='color:#d2cd86; '>(</span>i<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>mean</span><span style='color:#d2cd86; '>(</span>data<span style='color:#d2cd86; '>.</span>data2misc<span style='color:#d2cd86; '>(</span><span style='color:#d2cd86; '>:</span><span style='color:#d2cd86; '>,</span>i<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>end</span>
+
-
N<span style='color:#d2cd86; '>=</span><span style='color:#d2cd86; '>[</span>19 20<span style='color:#d2cd86; '>;</span>17 18<span style='color:#d2cd86; '>;</span>15 16<span style='color:#d2cd86; '>;</span>13 14<span style='color:#d2cd86; '>;</span>11 12<span style='color:#d2cd86; '>;</span>9 10<span style='color:#d2cd86; '>;</span>7 8<span style='color:#d2cd86; '>;</span>5 6<span style='color:#d2cd86; '>;</span>3 4<span style='color:#d2cd86; '>]</span><span style='color:#d2cd86; '>;</span><span style='color:#9999a9; '>%Mismatch position</span>
+
-
M<span style='color:#d2cd86; '>=</span><span style='color:#d2cd86; '>[</span>0<span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.014</span><span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.395</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.317</span><span style='color:#d2cd86; '>,</span>0<span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.389</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.079</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.445</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.508</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.613</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.851</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.732</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.828</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.615</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.804</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.685</span><span style='color:#d2cd86; '>,</span><span style='color:#009f00; '>0.583</span><span style='color:#d2cd86; '>]</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>for</span> j<span style='color:#d2cd86; '>=</span>1<span style='color:#d2cd86; '>:</span>n
+
-
    d0<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>mean</span><span style='color:#d2cd86; '>(</span>N<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>,</span><span style='color:#d2cd86; '>:</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
    S<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>=</span><span style='color:#d2cd86; '>(</span><span style='color:#bb7977; '>exp</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>-</span>M<span style='color:#d2cd86; '>(</span>N<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>,</span>1<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>+</span><span style='color:#bb7977; '>exp</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>-</span>M<span style='color:#d2cd86; '>(</span>N<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>/</span><span style='color:#d2cd86; '>(</span><span style='color:#d2cd86; '>(</span>4<span style='color:#d2cd86; '>*</span>d0<span style='color:#d2cd86; '>(</span>j<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>+</span>19<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>/</span>19<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#e66170; font-weight:bold; '>end</span>
+
-
x<span style='color:#d2cd86; '>=</span>1<span style='color:#d2cd86; '>:</span>n<span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>subplot</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>,</span>1<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>stem</span><span style='color:#d2cd86; '>(</span>x<span style='color:#d2cd86; '>,</span>ave<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
title<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The two concatenated mismatches cleavage activity '</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
xlabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The serial number'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>ylabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'cleavage activity'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>subplot</span><span style='color:#d2cd86; '>(</span>1<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>,</span>2<span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
<span style='color:#bb7977; '>stem</span><span style='color:#d2cd86; '>(</span>x<span style='color:#d2cd86; '>,</span>S<span style='color:#d2cd86; '>,</span><span style='color:#b060b0; '>'g'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
title<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The two concatenated mismatches score  '</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
xlabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'The serial number'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>ylabel<span style='color:#d2cd86; '>(</span><span style='color:#b060b0; '>'score'</span><span style='color:#d2cd86; '>)</span><span style='color:#d2cd86; '>;</span>
+
-
B<span style='color:#d2cd86; '>=</span><span style='color:#bb7977; '>corrcoef</span><span style='color:#d2cd86; '>(</span>ave<span style='color:#d2cd86; '>,</span>S<span style='color:#d2cd86; '>)</span> <span style='color:#9999a9; '>%find the correlation coefficient</span>
+
-
</pre>
+
-
<p>The result and figures are:</p>
+
-
<a href="https://static.igem.org/mediawiki/2014/8/80/2014-UESTC-Software-F6.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/8/80/2014-UESTC-Software-F6.png"></a><a href="https://static.igem.org/mediawiki/2014/8/84/2014-UESTC-Software-F7.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/8/84/2014-UESTC-Software-F7.png"></a>
+
-
<p>(c) We use the same approach as in (b), changing only N in the MATLAB program, to determine the relationship between our software score and the MLE cleavage for <span class="green">two interspaced mismatches</span>. The MATLAB results and figures are:</p>
+
-
<a href="https://static.igem.org/mediawiki/2014/b/bc/2014-UESTC-Software-F8.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/b/bc/2014-UESTC-Software-F8.png"></a><a href="https://static.igem.org/mediawiki/2014/c/cf/2014-UESTC-Software-F9.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/c/cf/2014-UESTC-Software-F9.png"></a>
+
-
<p>(d) We use the same approach as in (b), changing N and S(j) in the MATLAB program according to the Smm formula, to determine the relationship between our software score and the MLE cleavage for <span class="green">three concatenated mismatches</span>. The MATLAB results and figures are:</p>
+
-
<a href="https://static.igem.org/mediawiki/2014/5/55/2014-UESTC-Software-F10.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/5/55/2014-UESTC-Software-F10.png"></a><a href="https://static.igem.org/mediawiki/2014/5/52/2014-UESTC-Software-F11.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/5/52/2014-UESTC-Software-F11.png"></a>
+
-
<p>(e) We use the same approach as in (b), changing N and S(j) in the MATLAB program according to the Smm formula, to determine the relationship between our software score and the MLE cleavage for three interspaced mismatches. The MATLAB results and figures are:</p>
+
-
<a href="https://static.igem.org/mediawiki/2014/3/32/2014-UESTC-Software-F12.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/3/32/2014-UESTC-Software-F12.png"></a><a href="https://static.igem.org/mediawiki/2014/c/c2/2014-UESTC-Software-F13.png" target="_blank"><img src="https://static.igem.org/mediawiki/2014/c/c2/2014-UESTC-Software-F13.png"></a>
+
-
<p>In summary, the correlation coefficients for the five conditions above (single mismatch, two concatenated mismatches, two interspaced mismatches, three concatenated mismatches and three interspaced mismatches) are 0.8840, 0.8902, 0.7688, 0.8566 and 0.6092, respectively. All five exceed 0.6, and three exceed 0.85. To some extent, this result supports the validity and applicability of our scoring algorithm.</p>
+
<p><ul><b>Reference:</b>
<p><ul><b>Reference:</b>
-
<li>[1] <i>Genetic Screens in Human Cells Using the CRISPR-Cas9 System, Wang et al., 2014</i></li>
+
<li>[1] Wang, T., Wei, J. J., Sabatini, D. M., & Lander, E. S. (2014). Genetic screens in human cells using the CRISPR-Cas9 system. Science, 343(6166), 80-84.</li>
-
<li>[2] <i>DNA targeting specificity of RNA-guided Cas9 nucleases, Hsu et al, 2013</i></p></li>
+
<li>[2] Gagnon, J. A., Valen, E., Thyme, S. B., Huang, P., Ahkmetova, L., Pauli, A., ... & Schier, A. F. (2014). Efficient mutagenesis by Cas9 protein-mediated oligonucleotide insertion and large-scale assessment of single-guide RNAs. PLoS ONE, 9(5), e98186.</li>
 +
<li>[3] Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M., & Joung, J. K. (2014). Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology, 32(3), 279-284.</li>
 +
<li>[4] Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., ... & Zhang, F. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature biotechnology, 31(9), 827-832.</li>
 +
</ul></p>
</ul>
</ul>
</div>
</div>
Line 229: Line 167:
<div id="go3" class="go" style="width: 120px;text-align: left;">3.Scoring algorithm</div>
<div id="go3" class="go" style="width: 120px;text-align: left;">3.Scoring algorithm</div>
<div id="go4" class="go" style="width: 120px;text-align: left;">4.Algorithm illustration</div>
<div id="go4" class="go" style="width: 120px;text-align: left;">4.Algorithm illustration</div>
-
<div id="go5" class="go" style="width: 120px;text-align: left;">5.Algorithm validation</div>
 
</div>
</div>
Line 236: Line 173:
<div class="navbar">
<div class="navbar">
<ul class="leftUl UL">
<ul class="leftUl UL">
-
<li id="nav_home"></li>
+
<li id="nav_home"><a href="Home.html" style="width: inherit;height: 100%;position: fixed;"></a></li>
<li id="nav_project" class="menu">
<li id="nav_project" class="menu">
-
<ul style="bottom:290px;">
+
<ul style="bottom:257px;">
<li><a href="Overview.html">Overview</a></li>
<li><a href="Overview.html">Overview</a></li>
<li><a href="Modeling.html">Modeling</a></li>
<li><a href="Modeling.html">Modeling</a></li>
<li><a href="Demo.html">Demo</a></li>
<li><a href="Demo.html">Demo</a></li>
-
<li><a href="Lab.html">Lab</a></li>
+
<li><a href="Validation.html">Validation</a></li>
-
<li><a href="FutureWorks.html">Future works</a></li>
+
<li><a href="Download.html">Download</a></li>
<li><a href="Download.html">Download</a></li>
-
<li><a href="Safety&Ethics.html">Safety&Ethics</a></li>
+
<li><a href="safety.html">Safety</a></li>
</ul>
</ul>
</li>
</li>
-
<li id="nav_team"><a href="Team.html"></a></li>
+
<li id="nav_team"><a href="team.html" style="width: inherit;height: 100%;position: fixed;"></a></li>
</ul>
</ul>
<ul class="rightUl UL">
<ul class="rightUl UL">
<li id="nav_notebk" class="menu">
<li id="nav_notebk" class="menu">
<ul style="bottom:90px;">
<ul style="bottom:90px;">
-
<li><a href="Journal.html">Journal</a></li>
+
<li><a href="journal.html">Journal</a></li>
-
<li><a href="HumanPractice.html">Human practice</a></li>
+
<li><a href="HumanPractice.html">Human Practice</a></li>
</ul>
</ul>
</li>
</li>
<li id="nav_reqrmt" class="menu">
<li id="nav_reqrmt" class="menu">
-
<ul style="bottom:130px;">
+
<ul style="bottom:93px;">
-
<li><a href="Requirements.html">Requirements</a></li>
+
<li><a href="Requirement.html">Requirements</a></li>
-
<li><a href="MedalFulfillment.html">Medal fulfillment</a></li>
+
<li><a href="Medal.html">Medal Fulfillment</a></li>
-
<li><a href=">Award&Prize.html">Award&Prize</a></li>
+
</ul>
</ul>
</li>
</li>
<li id="nav_doc" class="menu">
<li id="nav_doc" class="menu">
<ul style="bottom:170px;">
<ul style="bottom:170px;">
-
<li><a href="APIDoc.html">API documentation</a></li>
+
<li><a href="API.html">API Documentation</a></li>
-
<li><a href="TestingDoc.html">Testing documentation</a></li>
+
<li><a href="Testing.html">Testing Documentation</a></li>
<li><a href="Installation.html">Installation</a></li>
<li><a href="Installation.html">Installation</a></li>
<li><a href="UserGuide.html">User Guide</a></li>
<li><a href="UserGuide.html">User Guide</a></li>
Line 276: Line 211:
<img src="https://static.igem.org/mediawiki/2014/c/c5/2014-UESTC-Software-Top.png" id="top" onclick="javascript:$('#body').animate({scrollTop:0},700)">
<img src="https://static.igem.org/mediawiki/2014/c/c5/2014-UESTC-Software-Top.png" id="top" onclick="javascript:$('#body').animate({scrollTop:0},700)">
<div id="logoRay" style="width:86px;height:86px;border-radius:1000px;position:fixed;bottom:-20px;z-index: 999999;"></div>
<div id="logoRay" style="width:86px;height:86px;border-radius:1000px;position:fixed;bottom:-20px;z-index: 999999;"></div>
-
<img src="https://static.igem.org/mediawiki/2014/e/e5/2014_UESTC_Software_logo.gif" id="logo" style="border-radius: 90px;width:100px;height:100px;position:fixed;bottom:-20px;z-index: 999999;"/>
+
<a href="https://2014.igem.org/Team:UESTC-Software"><img src="https://static.igem.org/mediawiki/2014/e/e5/2014_UESTC_Software_logo.gif" id="logo" style="border-radius: 90px;width:100px;height:100px;position:fixed;bottom:-20px;z-index: 999999;"/></a>
 +
<a class="iGEM" href="https://2014.igem.org/Main_Page" style="position: fixed;top: 0;right: 0;"><img src="https://static.igem.org/mediawiki/2014/8/8d/2014-UESTC-Software-Igem.png"></a>
</div>
</div>
<script type="text/javascript">
<script type="text/javascript">
Line 341: Line 277:
         $("#go4").click(function(){   
         $("#go4").click(function(){   
         var top =$('#body').scrollTop() +$('#p4').position().top;
         var top =$('#body').scrollTop() +$('#p4').position().top;
-
            $('#body').animate({scrollTop:top+"px"},700);
 
-
        });
 
-
        $("#go5").click(function(){ 
 
-
        var top =$('#body').scrollTop() +$('#p5').position().top;
 
             $('#body').animate({scrollTop:top+"px"},700);  
             $('#body').animate({scrollTop:top+"px"},700);  
         });
         });

Latest revision as of 02:52, 18 October 2014

UESTC-Software

Models and Algorithms

1.Overview

Modeling is a powerful tool in synthetic biology and engineering. In our project, we aim to build “CRISPR-X”, a bioinformatics tool for designing CRISPR sgRNAs with minimized off-target effects and a high cutting rate.

The CRISPR-associated protein Cas9 can be programmed with a single guide RNA (sgRNA) to generate site-specific DNA breaks, but few rules are known that govern the on-target efficacy of this system [1,2]. Related reports suggest that gRNAs are most effective with a GC content between 40% and 80% [1]. In addition, a guanine at position 20 of the target site appears to improve the cutting rate [1]. We therefore use an efficacy score to characterize the activity of the sgRNA.

sgRNA sequences of 17-20 nt achieve similar levels of on-target gene editing, and truncated sgRNAs (17 or 18 nt) can improve target specificity by up to 10,000-fold [3]. We therefore allow the sgRNA length to vary from 17 nt to 20 nt.

First, we find the protospacer-adjacent motifs (PAMs) in the user-specified gene region. Then, we find the sgRNA corresponding to each PAM. Next, we check whether the sgRNA has potential off-target binding sites over the entire gene region, and evaluate the specificity and efficacy of the sgRNA. Finally, we report the secondary structure of the sgRNA and its restriction enzyme cutting sites.
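
The first two steps of this pipeline can be sketched as follows. This is only an illustration, not the CRISPR-X implementation: it assumes the NGG PAM commonly used with Cas9 from Streptococcus pyogenes, scans the forward strand only, and the function name `find_candidate_sgrnas` and the example sequence are hypothetical.

```python
import re

def find_candidate_sgrnas(region, guide_len=20):
    """Return (start, guide, pam) for each NGG PAM on the forward strand.

    ASSUMPTIONS: NGG PAM, forward strand only, guide placed directly
    upstream of the PAM. guide_len may be set anywhere from 17 to 20.
    """
    region = region.upper()
    candidates = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", region):
        pam_start = match.start()
        if pam_start >= guide_len:  # enough sequence upstream of the PAM
            start = pam_start - guide_len
            candidates.append((start, region[start:pam_start], match.group(1)))
    return candidates

# Hypothetical 31-nt region, for illustration only.
region = "TTGACCTGAAGCTTACGATCGATCGGTACGG"
for start, guide, pam in find_candidate_sgrnas(region):
    print(start, guide, pam)
```

A complete search would also scan the reverse complement and then pass each candidate on to the off-target and scoring steps.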

2.Parameters
Parameter | Description | Range | Unit | Remark
d0 | Average distance from all the mismatched nucleotides to the PAM, for any off-target site | 0-19 | nt |
i | Index over the mismatched nucleotides | 1-Nmm | 1 |
j | Index over the off-target sites, excluding the perfect-hit off-target sites | 1-(Nfg-Nph) | 1 |
M | Weight matrix | [0,0,0.014,0,0,0.395,0.317,0,0.389,0.079,0.445,0.508,0.613,0.851,0.732,0.828,0.615,0.804,0.685,0.583] | | Reference: DNA targeting specificity of RNA-guided Cas9 nucleases, Hsu et al., 2013
n | Mismatch position | 1-20 | |
Nfg | Total number of off-target sites | ≥0 | 1 |
Nmm | Number of mismatched nucleotides in an off-target site that is not a perfect hit | 1-4 | 1 |
Nph | Number of perfect-hit off-target sites | ≥0 | 1 | In our scoring algorithm, Nph is capped at 4; when Nph ≥ 4, Sguide = 0
r1 | Proportion of the specificity score in the total score | 0-1 | 1 | In our scoring algorithm, its default value is 0.65
r2 | Proportion of the efficacy score in the total score | 0-1 | 1 | In our scoring algorithm, its default value is 0.35
S1 | Score of the first step | ≥0 | 1 |
S20 | Penalty applied when the 20th nucleotide is not a guanine | 35 | 1 |
Seff | Efficacy score | 0-100 | 1 | Represents the level of efficacy of the sgRNA
Sgc | Penalty for the GC ratio | 0, 35, 65 | 1 |
Sguide | Total score of the sgRNA | 0-100 | 1 | Composed of Seff and Sspe; reflects the overall properties (specificity and efficacy) of the sgRNA
Smm | Penalty for the mismatched nucleotides of an off-target site that is not a perfect hit | ≥0 | 1 |
Sph | Penalty for a perfect-hit off-target site | ≥0 | 1 |
Sspe | Specificity score | 0-100 | 1 | Represents the level of specificity of the sgRNA
3.Scoring algorithm

    Scoring conditions:
  • ① Bad GC ratio (< 40% or > 80%): Sgc = 65;
    Moderate GC ratio (40%-50% or 70%-80%): Sgc = 35;
    Good GC ratio (51%-69%): Sgc = 0. [1]
    
  • ② The 20th nucleotide is not G: S20 = 35. [1]
  • ③ If the designed sgRNA perfectly matches another site, the penalty is Sph = 25 for each such site; if it perfectly matches 4 or more loci, the total score Sguide is 0.
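
The three conditions above can be written as small helper functions. This is a minimal sketch rather than the CRISPR-X source: the function names are hypothetical, the 20th nucleotide is assumed to be the last base of the guide, and GC ratios that fall between the listed bands (50%-51% and 69%-70%) are assumed to count as good.

```python
def gc_penalty(guide):
    """Sgc: 65 for a bad GC ratio, 35 for a moderate one, 0 for a good one."""
    guide = guide.upper()
    gc = 100.0 * sum(base in "GC" for base in guide) / len(guide)
    if gc < 40 or gc > 80:
        return 65
    if gc <= 50 or gc >= 70:
        return 35
    # ASSUMPTION: everything strictly between 50% and 70% counts as good.
    return 0

def g20_penalty(guide):
    """S20: 35 unless the last nucleotide (assumed position 20) is G."""
    return 0 if guide.upper().endswith("G") else 35

def perfect_hit_penalty(n_ph, s_ph=25):
    """S1 = Sph * Nph, or None when Nph >= 4 (the total score is then 0)."""
    if n_ph >= 4:
        return None
    return s_ph * n_ph

guide = "GCGCGCGCGCATATATATAG"  # hypothetical guide, 55% GC, ends in G
print(gc_penalty(guide), g20_penalty(guide), perfect_hit_penalty(2))
```

Returning None for four or more perfect hits keeps that case distinct from an ordinary penalty, so the caller can set Sguide to 0 directly.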

    Steps:
  • (1) First, find the number of off-target sequences, Nfg. If Nfg = 0, output Sspe = r1*100. Otherwise, check the third condition. If the designed sgRNA perfectly matches other sites, let Nph be the number of such sites; the score of the first step is then S1 = Sph * Nph (Nph is 4 or less). When S1 is equal to or less than 75, perform step (2); otherwise output Sguide = 0.
    If the designed sgRNA does not perfectly match any other site, the score of the first step is S1 = 0. In this case every off-target site contains at least one mismatch with the sgRNA. Then perform step (2).
  • (2) In step (2), first set aside the Nph perfect-hit sites. For the remaining Nfg-Nph sites, which are not perfect hits, apply the position weights obtained from the literature: M=[ 0, 0, 0.014, 0, 0, 0.395, 0.317, 0, 0.389, 0.079, 0.445, 0.508, 0.613, 0.851, 0.732, 0.828, 0.615, 0.804, 0.685, 0.583] [4], using the formula:

Let the ratio of specificity score to efficacy score be r1:r2 (the default is r1:r2 = 0.65:0.35). The specificity score Sspe is obtained by subtracting the penalties from 100 and weighting the result by r1 (when the result of the subtraction is negative, Sspe = 0). The efficacy score is Seff = r2 * (100 - (Sgc + S20)). The total score Sguide is composed of Sspe and Seff. Finally, the sgRNAs are sorted by Sguide from high to low, and for each sgRNA we output its sequence, the total score Sguide, the specificity score Sspe, the efficacy score Seff, the chromosome and position of each site it matches, and its GC ratio.
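
The way the pieces fit together can be sketched as below. Only Seff = r2 * (100 - (Sgc + S20)), S1 = Sph * Nph and the cutoff at S1 > 75 are taken from the text. The exact form of Sspe and the addition Sguide = Sspe + Seff are assumptions chosen to agree with the stated special case Sspe = r1*100 when there are no off-target sites. The per-site penalties Smm are taken as given inputs, and the function name is hypothetical.

```python
def guide_score(s_gc, s_20, n_ph, mismatch_penalties,
                r1=0.65, r2=0.35, s_ph=25):
    """Return (Sspe, Seff, Sguide) for one sgRNA."""
    # Efficacy score, as stated in the text.
    s_eff = r2 * (100 - (s_gc + s_20))
    # Step (1): penalty for the perfect-hit off-target sites.
    s1 = s_ph * n_ph
    if s1 > 75:
        # Four or more perfect hits: the total score is 0.
        return 0.0, s_eff, 0.0
    # Step (2), ASSUMED form: subtract S1 and the summed Smm penalties
    # from 100, clamp at 0, then weight by r1.
    remainder = 100 - s1 - sum(mismatch_penalties)
    s_spe = r1 * max(0.0, remainder)
    # ASSUMED: the total score is the sum of the two weighted parts.
    return s_spe, s_eff, s_spe + s_eff

# No off-target sites, good GC ratio, G at position 20.
print(guide_score(0, 0, 0, []))
# One perfect hit plus two mismatched off-target sites (example penalties).
print(guide_score(35, 35, 1, [2.177, 1.187]))
```

With the default weights, a guide with no penalties reaches the maximum total of 100, which matches the 0-100 range given for Sguide in the parameter table.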

4.Algorithm illustration

In reference [4], the algorithm used to score a single off-target site is:

This algorithm is adopted by CRISPR-P. Its shortcomings are: (a) even when off-target sites are present, the subtracted score is sometimes still 0, which seems unreasonable in some cases and makes such sgRNAs hard to tell apart from sgRNAs that have no off-target sites; (b) it uses the W function, which cannot be expressed by elementary functions, so the calculation takes additional time.

Our algorithm avoids both shortcomings. It is:

First, we replace the W function with a sum of exponentials. Second, whenever off-target sites exist, our result carries a nonzero subtracted score, and we use rounding to further guarantee this. The following table contrasts the two scores.

Off-target sequence Mismatches CRISPR-P Score OUR Software Score
GTTTCTCCGTAATCGCGTCA 4 0.8 0.989
GTTCTTCCACAATTCCGTTA 4 0 0.391
TTTCTTCCAGAATCGTGACT 4 0 0.426
GAAAAATTCCTCTTATTTCA 2 3.9 2.177
GAACAACTCCTCTTATTACA 2 2.4 1.187
GAAGAACTACGCTTATGACA 4 0 0.402
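
As a quick check of shortcoming (a), the rows of the table above can be scanned for off-target sites whose CRISPR-P score is 0 while our score is not. The snippet is only an illustration; the values are copied from the table and the variable names are hypothetical.

```python
# (off-target sequence, mismatches, CRISPR-P score, our software score)
rows = [
    ("GTTTCTCCGTAATCGCGTCA", 4, 0.8, 0.989),
    ("GTTCTTCCACAATTCCGTTA", 4, 0.0, 0.391),
    ("TTTCTTCCAGAATCGTGACT", 4, 0.0, 0.426),
    ("GAAAAATTCCTCTTATTTCA", 2, 3.9, 2.177),
    ("GAACAACTCCTCTTATTACA", 2, 2.4, 1.187),
    ("GAAGAACTACGCTTATGACA", 4, 0.0, 0.402),
]

# Sites that CRISPR-P scores as 0 although an off-target site exists.
zero_in_crispr_p = [row for row in rows if row[2] == 0 and row[3] > 0]

print(len(zero_in_crispr_p), "of", len(rows), "sites score 0 in CRISPR-P only")
for seq, n_mm, _, ours in zero_in_crispr_p:
    print(seq, n_mm, ours)
```

In this table, all of the sites that receive a zero score from CRISPR-P have four mismatches.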

    References:
  • [1] Wang, T., Wei, J. J., Sabatini, D. M., & Lander, E. S. (2014). Genetic screens in human cells using the CRISPR-Cas9 system. Science, 343(6166), 80-84.
  • [2] Gagnon, J. A., Valen, E., Thyme, S. B., Huang, P., Ahkmetova, L., Pauli, A., ... & Schier, A. F. (2014). Efficient mutagenesis by Cas9 protein-mediated oligonucleotide insertion and large-scale assessment of single-guide RNAs. PLoS ONE, 9(5), e98186.
  • [3] Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M., & Joung, J. K. (2014). Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology, 32(3), 279-284.
  • [4] Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F. A., Konermann, S., Agarwala, V., ... & Zhang, F. (2013). DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology, 31(9), 827-832.
