Team:Tsinghua-A/Modeling

From 2014.igem.org


Revision as of 21:17, 17 October 2014

1

Algorithms

Overview

To lower the repetition rate of codons, we use intelligent optimization algorithms. We exploit codon degeneracy, substituting alternative nucleotides that leave the encoded amino acid unchanged, to reduce base repetition in the TALE DNA sequence (see Figure 1). The optimized TALE sequence will then be tested in the wet lab to verify our conjecture that a TALE DNA sequence with a lower repetition rate works more efficiently.

Figure 1. RNA codon Table

For instance, we may change UUC into UUU to make one of the repeats differ from the others while the encoded amino acid (phenylalanine, Phe) stays the same.
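
As a minimal sketch of this idea, the snippet below looks up the synonymous alternatives of a codon. The table is only a small excerpt of the standard genetic code, and the names `SYNONYMOUS_CODONS` and `synonymous_alternatives` are hypothetical, chosen for illustration.

```python
# Excerpt of the standard RNA codon table (only three amino acids shown).
SYNONYMOUS_CODONS = {
    "Phe": ["UUU", "UUC"],
    "Leu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "Gly": ["GGU", "GGC", "GGA", "GGG"],
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {c: aa for aa, codons in SYNONYMOUS_CODONS.items() for c in codons}


def synonymous_alternatives(codon):
    """Return the other codons that encode the same amino acid."""
    amino_acid = CODON_TO_AA[codon]
    return [c for c in SYNONYMOUS_CODONS[amino_acid] if c != codon]


print(synonymous_alternatives("UUC"))  # ['UUU']
```
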

In this way, we use two intelligent optimization algorithms to optimize the TALE DNA sequence.

Genetic Algorithm (GA)
-Introduction of Genetic Algorithm

A genetic algorithm (GA) is a search heuristic that mimics the process of natural selection. It is routinely used to generate useful solutions to optimization and search problems.[1]

GAs have been applied in many fields. In bioinformatics, for example, GAs have been used to optimize DNA sequences so that they have better physicochemical properties for PCR.

-Our strategy
Initialization
Create a population of hundreds of TALE sequences.
Mutation
Each sequence changes one base at random, subject to three constraints: 1. The amino acid sequence remains unchanged. 2. To prevent the TALE sequence from being cut, no recognition site of a restriction enzyme used in our experiments may be created. 3. No mutation may occur in the overhang of each repeat.
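
The three constraints can be sketched as a rejection loop: propose a single-base change and keep it only if every check passes. This is an illustrative sketch, not our production code. The site `GAATTC`, the protected index set, and the names `translate` and `mutate` are assumptions made for the example, and the check only looks at one strand.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon (length must be a multiple of 3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def mutate(dna, forbidden_sites, protected, rng, max_tries=100):
    """Change one base so that the protein is unchanged, no forbidden site
    appears, and no protected (overhang) position is touched.
    Returns the input unchanged if no valid mutation is found."""
    protein = translate(dna)
    for _ in range(max_tries):
        pos = rng.randrange(len(dna))
        if pos in protected:                      # constraint 3: overhang
            continue
        new_base = rng.choice([b for b in BASES if b != dna[pos]])
        candidate = dna[:pos] + new_base + dna[pos + 1:]
        if translate(candidate) != protein:       # constraint 1: same amino acids
            continue
        if any(site in candidate for site in forbidden_sites):  # constraint 2
            continue
        return candidate
    return dna


rng = random.Random(0)
repeat = "TTCGGTCTGTTCGGTCTG"                      # toy sequence, not a real TALE repeat
mutated = mutate(repeat, ["GAATTC"], protected={0, 1, 2}, rng=rng)
print(repeat, "->", mutated)
```

For non-palindromic recognition sites, the reverse complement would also have to be checked.
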
Crossover
In GA, crossover simulates the exchange of chromosome segments that takes place during synapsis. We randomly choose a point at an amino acid (codon) boundary, and the two sequences exchange the parts that lie between this point and one of the ends.


Figure 2. The crossover step simulates the natural synapsis of chromosomes.
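
A single-point crossover restricted to codon boundaries might look like the sketch below. It assumes both parents have the same length and encode the same protein, so both children encode it as well. The function name `crossover` is hypothetical.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length sequences at a random codon boundary."""
    if len(parent_a) != len(parent_b) or len(parent_a) % 3 != 0:
        raise ValueError("parents must have equal length, a multiple of 3")
    n_codons = len(parent_a) // 3
    point = 3 * rng.randrange(1, n_codons)   # cut between codons, never at an end
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


rng = random.Random(0)
a = "TTTGGTCTG"   # Phe-Gly-Leu
b = "TTCGGCCTT"   # Phe-Gly-Leu with different codons
print(crossover(a, b, rng))
```

Because the junction joins material from two parents, a restriction site could in principle appear there, so the same site check used for mutation would need to be repeated on the children.
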

Fitness Scoring
We take two main factors into consideration:
1. Repetition rate of the sequence
To judge the repetition, we compare the mutated sequence with the original one and compute the Hamming distance between them. If a restriction enzyme site occurs in the mutated TALE sequence, a penalty is applied.
2. Codon usage
When a sequence contains too many rare codons, that is, codons whose corresponding tRNAs are extremely scarce in E. coli, the whole sequence can hardly be expressed. To avoid rare codons, each occurrence of a rare codon incurs a penalty.
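
A scoring function along these lines could be sketched as follows. The penalty weights, the example restriction site, and the rare-codon set are illustrative assumptions, not the values used in our runs.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def fitness(candidate, original, restriction_sites, rare_codons,
            site_penalty=50, rare_penalty=5):
    """Reward divergence from the original; penalize sites and rare codons.
    site_penalty and rare_penalty are hypothetical weights."""
    score = hamming_distance(candidate, original)
    # Penalty for every restriction site found in the candidate.
    score -= site_penalty * sum(candidate.count(s) for s in restriction_sites)
    # Penalty for every rare codon in the reading frame.
    codons = [candidate[i:i + 3] for i in range(0, len(candidate), 3)]
    score -= rare_penalty * sum(c in rare_codons for c in codons)
    return score


RARE = {"AGG", "AGA", "ATA", "CTA"}   # example set of codons treated as rare
print(fitness("TTTGGCCTG", "TTCGGTCTG", ["GAATTC"], RARE))  # 2
```
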
Selection
In each generation we sort the 200 sequences by score and eliminate the lower-scoring half, so the remaining sequences have better fitness. As mutation and crossover are repeated, the average score of each generation increases (see Results).
We repeat the process for 600 generations.
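
The generation loop can be sketched as truncation selection: sort, keep the better half, and refill the population from the survivors. How the population is refilled is not specified above, so producing one mutated child per crossover is an assumption. The toy bit-string problem only stands in for TALE sequences.

```python
import random


def evolve(population, score, mutate, crossover, generations, rng):
    """Keep the top half each generation and refill with mutated offspring."""
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        survivors = ranked[: size // 2]          # eliminate the lower-scoring half
        children = []
        while len(survivors) + len(children) < size:
            a, b = rng.sample(survivors, 2)
            child, _other = crossover(a, b, rng)
            children.append(mutate(child, rng))
        population = survivors + children
    return max(population, key=score)


# Toy stand-ins: maximize the number of '1' characters in a bit string.
def toy_mutate(s, rng):
    i = rng.randrange(len(s))
    return s[:i] + rng.choice("01") + s[i + 1:]


def toy_crossover(a, b, rng):
    p = rng.randrange(1, len(a))
    return a[:p] + b[p:], b[:p] + a[p:]


rng = random.Random(1)
start = ["".join(rng.choice("01") for _ in range(12)) for _ in range(200)]
best = evolve(start, lambda s: s.count("1"), toy_mutate, toy_crossover, 600, rng)
print(best)
```

Since the survivors are carried over unchanged, the best score in the population never decreases from one generation to the next.
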

Simulated Annealing (SA)
-Introduction of Simulated Annealing
Simulated annealing is a simple and general algorithm for finding global minima. It operates by simulating the cooling of a (usually fictitious) physical system whose possible energies correspond to the values of the objective function being minimized. The analogy works because physical systems occupy only the states with lowest energy as the temperature is lowered to absolute zero.

-Our Strategy
Mutation
We randomly choose 5 points on the sequence to mutate. At each point, the procedure is the same as the mutation step in our genetic algorithm.
Scoring
The same as fitness scoring in genetic algorithm.
Parameters
The initial temperature T is 10000.0. After each generation, the temperature is reduced to 99% of its previous value. If the new generation has a higher score, it is accepted; otherwise it is accepted with probability P(A).

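P(A) = 0.001 * e^(-(t* - t)/T)
Here t* and t denote the scores of the two generations being compared (new and former), and T is the current temperature.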
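
A minimal sketch of the annealing loop with these parameters is given below. It assumes the acceptance probability takes the form 0.001 * exp(-drop / T), where drop is how much the score falls. The number of steps, the tracking of the best state, and the toy bit-string problem are assumptions for illustration.

```python
import math
import random


def accept(current_score, candidate_score, temperature, rng):
    """Always accept a higher score; otherwise accept with a small probability."""
    if candidate_score > current_score:
        return True
    drop = current_score - candidate_score       # >= 0 here
    p = 0.001 * math.exp(-drop / temperature)    # assumed form of P(A)
    return rng.random() < p


def anneal(start, score, neighbour, rng, t_start=10000.0, cooling=0.99, steps=1000):
    """Simulated annealing that maximizes score; steps is a hypothetical budget."""
    current = best = start
    temperature = t_start
    for _ in range(steps):
        candidate = neighbour(current, rng)
        if accept(score(current), score(candidate), temperature, rng):
            current = candidate
        if score(current) > score(best):
            best = current
        temperature *= cooling                   # cool to 99% of the previous value
    return best


def flip_five(bits, rng):
    """Toy neighbour: change 5 randomly chosen positions of a bit string."""
    chars = list(bits)
    for i in rng.sample(range(len(chars)), 5):
        chars[i] = "1" if chars[i] == "0" else "0"
    return "".join(chars)


rng = random.Random(0)
print(anneal("0" * 20, lambda s: s.count("1"), flip_five, rng))
```
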

References
[1] Wikipedia, "Genetic algorithm", http://en.wikipedia.org/wiki/Genetic_algorithm



2

Results



3

Simulation

System Model

The model of the reporter system is shown in the picture above.
We use the following formula to simulate the lac repressor system.


We use the Michaelis-Menten equation to simulate the process.



Simulation Results


Figure 999. Simulation of the IPTG-lac repressor system.

We infer that TALE will behave similarly to IPTG. When the TALE protein binds to the plasmid, the expression of lacI is repressed, so the concentration of RFP increases. If no TALE protein is expressed, the concentration of RFP remains very low.
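
The qualitative argument can be illustrated with a toy two-step repression cascade, integrated with the forward Euler method. This is not the model used for the figures. The Michaelis-Menten-type repression terms, all parameter values, and the name `simulate_rfp` are assumptions made only to show the direction of the effect.

```python
def simulate_rfp(tale, hours=50.0, dt=0.01,
                 alpha_laci=10.0, alpha_rfp=10.0,
                 k_tale=1.0, k_laci=1.0, decay=1.0):
    """Toy cascade: TALE represses lacI, and LacI represses RFP.
    All parameters are hypothetical and in arbitrary units."""
    laci = rfp = 0.0
    for _ in range(int(hours / dt)):
        # Production falls off in a saturating way with the repressor level.
        d_laci = alpha_laci * k_tale / (k_tale + tale) - decay * laci
        d_rfp = alpha_rfp * k_laci / (k_laci + laci) - decay * rfp
        laci += dt * d_laci
        rfp += dt * d_rfp
    return laci, rfp


for tale_level in (0.0, 1.0, 9.0):
    laci, rfp = simulate_rfp(tale_level)
    print(f"TALE={tale_level:4.1f}  LacI={laci:6.3f}  RFP={rfp:6.3f}")
```

Under these assumed parameters, the steady-state RFP level rises as the TALE level rises, which matches the inference above.
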


Figure 1000. Simulated cases in which TALE is and is not successfully expressed.

From the simulation, we conclude that RFP can serve as a signal of whether our TALE has been successfully expressed. Moreover, the RFP concentration, observed as fluorescence intensity, is positively correlated with the quantity of TALE protein.

References
[1] Uri Alon, An Introduction to Systems Biology, Chapman & Hall/CRC, p. 244