Team:Tsinghua-A/Modeling

From 2014.igem.org

Revision as of 19:35, 17 October 2014 by Eamon (Talk | contribs)

1

Algorithms

Overview
To lower the repetition rate of codons, we use intelligent optimization algorithms. We exploit codon degeneracy, substituting alternative nucleotides that leave the amino acid unchanged, to reduce the base-level repetition of the TALE DNA sequence (see Figure 1). The optimized TALE sequence will then be tested in the wet lab to verify our conjecture that a TALE DNA sequence with a lower repetition rate works at higher efficiency.

Figure 1. RNA codon Table

For instance, we may change UUC to UUU to make one of the repeats differ from the others while the encoded amino acid (phenylalanine, Phe) stays the same.
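As a rough illustration of this degeneracy, the following Python sketch builds the standard RNA codon table and lists the synonymous alternatives for a given codon. The names `CODON_TABLE` and `synonymous_codons` are hypothetical and chosen only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon)


print(synonymous_codons("UUC"))  # ['UUU'], both encode phenylalanine
```

Amino acids such as methionine (AUG) have no synonymous codon, so positions encoding them cannot be varied.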

In this way, we use two intelligent optimization algorithms to optimize the TALE DNA sequence.

Genetic Algorithm(GA)
-Introduction of Genetic Algorithm
Genetic Algorithm is a search heuristic that mimics the process of natural selection. The heuristic is routinely used to generate useful solutions to optimization and search problems.[1]
GA has applications in many fields. In bioinformatics, for example, GA has been used to optimize DNA sequences so that they have better physicochemical properties for PCR.

-Our strategy
Initialization
Create a population of hundreds of TALE sequences.
Mutation
Each sequence changes one base at random, subject to three constraints:
1. The amino acid sequence remains unchanged.
2. To prevent the TALE sequence from being cut, no recognition site for the restriction enzymes used in our experiments may be created.
3. No mutation may take place in the overhang of each repeat.
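The mutation step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not our production code: the forbidden sites are assumed to be the BsaI and Esp3I recognition sequences and their reverse complements, the overhang is represented by a hypothetical set of protected positions, and the function names are illustrative.

```python
import random
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: BsaI (GGTCTC) and Esp3I (CGTCTC) sites plus their reverse complements.
FORBIDDEN_SITES = ("GGTCTC", "GAGACC", "CGTCTC", "GAGACG")


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def site_count(seq):
    """Count occurrences of forbidden restriction sites."""
    return sum(seq.count(site) for site in FORBIDDEN_SITES)


def mutate(seq, protected, rng):
    """Try one random single-base change; return `seq` unchanged if it breaks a rule."""
    pos = rng.randrange(len(seq))
    if pos in protected:  # rule 3: leave overhang positions untouched
        return seq
    new_base = rng.choice([b for b in BASES if b != seq[pos]])
    candidate = seq[:pos] + new_base + seq[pos + 1:]
    start = pos - pos % 3  # first base of the affected codon
    if CODON_TABLE[candidate[start:start + 3]] != CODON_TABLE[seq[start:start + 3]]:
        return seq  # rule 1: the amino acid must not change
    if site_count(candidate) > site_count(seq):
        return seq  # rule 2: no new restriction site
    return candidate
```

Rejected attempts simply return the input, so calling `mutate` repeatedly only ever accumulates changes that respect all three rules.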
Crossover
In GA, crossover simulates the exchange of genetic material that takes place during synapsis. We randomly choose a point at an amino acid (codon) boundary, and two sequences exchange the parts that lie between that point and one of the ends. (A figure should go here.)
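A single-point crossover of this kind might look like the sketch below. It assumes both parents have the same length and encode the same protein in the same reading frame, so cutting at a codon boundary keeps the amino acid sequence intact. The function name `crossover` is illustrative.

```python
import random


def crossover(a, b, rng):
    """Swap the tails of two equal-length sequences at a random codon boundary."""
    if len(a) != len(b) or len(a) % 3 != 0 or len(a) < 6:
        raise ValueError("parents must be in-frame, equal in length, and at least 2 codons long")
    point = 3 * rng.randrange(1, len(a) // 3)  # never cut inside a codon
    return a[:point] + b[point:], b[:point] + a[point:]


child1, child2 = crossover("TTTCTGGCT", "TTCCTCGCA", random.Random(1))
```

Because the cut always falls between codons, each codon of a child comes whole from one parent or the other.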
Fitness Scoring
We take two main factors into consideration:
1.Repetition rate of the sequence
To measure repetition, we compare the mutated sequence with the original one and count the Hamming distance between them. If a restriction enzyme recognition site occurs in the mutated TALE sequence, a penalty is applied.
2.Codon Usage
When the sequence contains too many rare codons, that is, codons whose corresponding tRNAs are extremely scarce in E. coli, the whole sequence can hardly be expressed. To avoid rare codons, each occurrence of a rare codon incurs a penalty.
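The two factors can be combined into a single score along the following lines. This is only a sketch: the penalty weights, the list of rare codons, and the list of restriction sites are placeholder assumptions for illustration, not the values we used.

```python
# Assumption: an illustrative set of codons that are rare in E. coli.
RARE_CODONS = {"AGA", "AGG", "ATA", "CTA", "CGA", "CCC"}
# Assumption: BsaI and Esp3I recognition sites plus their reverse complements.
RESTRICTION_SITES = ("GGTCTC", "GAGACC", "CGTCTC", "GAGACG")


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fitness(candidate, original, site_penalty=10.0, rare_penalty=1.0):
    """Reward divergence from the original; penalize restriction sites and rare codons."""
    score = float(hamming(candidate, original))
    score -= site_penalty * sum(candidate.count(s) for s in RESTRICTION_SITES)
    codons = [candidate[i:i + 3] for i in range(0, len(candidate) - 2, 3)]
    score -= rare_penalty * sum(c in RARE_CODONS for c in codons)
    return score
```

With these placeholder weights, a single restriction site outweighs ten base differences, which strongly discourages any candidate that could be cut during assembly.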
Selection
In each generation we sort the 200 sequences by score and eliminate the lower-scoring half, so the remaining sequences have better fitness. As the process of mutation and crossover is repeated, the average score of each generation increases. (See Results.)
We repeat the process for 600 generations.
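The selection loop can be sketched generically as below. How the population is refilled after the lower half is removed is not specified above, so this sketch assumes that survivors are kept and the open slots are filled with varied copies of randomly chosen survivors. The callables `score` and `vary` are hypothetical stand-ins for the fitness scoring and the mutation/crossover steps.

```python
import random


def select_top_half(population, score):
    """Keep the better-scoring half of the population."""
    ranked = sorted(population, key=score, reverse=True)
    return ranked[: len(ranked) // 2]


def evolve(population, score, vary, generations, rng):
    """Repeat selection and variation for a fixed number of generations."""
    for _ in range(generations):
        survivors = select_top_half(population, score)
        n_new = len(population) - len(survivors)
        offspring = [vary(rng.choice(survivors), rng) for _ in range(n_new)]
        population = survivors + offspring
    return population


# Toy usage: individuals are integers and the score is the value itself.
rng = random.Random(0)
start = [rng.randrange(100) for _ in range(200)]
final = evolve(start, score=lambda x: x,
               vary=lambda x, r: x + r.choice([-1, 1]),
               generations=600, rng=rng)
```

Because the survivors are carried over unchanged, the best score in the population never decreases from one generation to the next.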

Simulated Annealing(SA)
-Introduction of Simulated Annealing
Simulated annealing is a simple and general algorithm for finding global minima. It operates by simulating the cooling of a (usually fictitious) physical system whose possible energies correspond to the values of the objective function being minimized. The analogy works because physical systems occupy only the states with lowest energy as the temperature is lowered to absolute zero.

-Our Strategy
Mutation
We randomly choose 5 points on the sequence to mutate. At each point, the process is the same as the mutation step in our genetic algorithm.
Scoring
The same as fitness scoring in genetic algorithm.
Parameters
The initial temperature T is 10000.0. After each generation, the temperature is reduced to 99% of its former value. If the new generation has a higher score, it is accepted. Otherwise it is accepted with probability P(A):
P(A) = 0.001 * e^(-(t* - t)/T)
where t* is the score of the new generation and t is the score of the previous generation.
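The cooling schedule and acceptance rule can be sketched as follows. The sketch assumes that the exponent is the negated score deficit of the candidate relative to the current sequence, divided by T, so that worse candidates are accepted less often. This sign convention is an assumption, as are the function names and the `steps` parameter.

```python
import math
import random


def temperatures(steps, t0=10000.0, rate=0.99):
    """Yield the temperature for each step, cooling geometrically by `rate`."""
    t = t0
    for _ in range(steps):
        yield t
        t *= rate


def acceptance_probability(new_score, old_score, temperature, scale=0.001):
    """Always accept a better score; accept a worse one with a small probability."""
    if new_score > old_score:
        return 1.0
    deficit = old_score - new_score  # >= 0, assumed sign convention
    return min(1.0, scale * math.exp(-deficit / temperature))


def accept(new_score, old_score, temperature, rng):
    """Draw the accept/reject decision for one candidate."""
    return rng.random() < acceptance_probability(new_score, old_score, temperature)
```

Under this reading, the factor 0.001 caps the chance of accepting any non-improving candidate at 0.1%, so the search behaves almost like pure hill climbing with occasional escapes.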

References
[1] Wikipedia, "Genetic algorithm", http://en.wikipedia.org/wiki/Genetic_algorithm



2

Results



3

TALE Assembly

The TALE assembly strategy uses the Golden Gate cloning method, which is based on the ability of type IIS enzymes to cleave outside of their recognition site. When type IIS recognition sites are placed at the far 5’ and 3’ ends of any DNA fragment in inverse orientation, they are removed in the cleavage process, allowing two DNA fragments flanked by compatible sequence overhangs, termed fusion sites, to be ligated seamlessly. Since type IIS fusion sites can be designed to have different sequences, directional assembly of multiple DNA fragments is feasible. Using this strategy, DNA fragments can be assembled from undigested input plasmids in a one-pot reaction with high efficiency.
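To make the idea of cutting outside the recognition site concrete, the sketch below locates forward-oriented BsaI sites on the top strand and reads off the 4-nt fusion site each one would expose. It relies on the known BsaI cut pattern GGTCTC(N1)/(N5), handles only the forward orientation for brevity, and uses a hypothetical function name.

```python
BSAI_SITE = "GGTCTC"


def bsai_fusion_sites(seq):
    """Return the 4-nt overhangs produced by forward-oriented BsaI sites in `seq`."""
    overhangs = []
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1  # top strand is cut 1 nt past the site
        if cut + 4 <= len(seq):
            overhangs.append(seq[cut:cut + 4])
        start = seq.find(BSAI_SITE, start + 1)
    return overhangs


print(bsai_fusion_sites("AAGGTCTCAACCTGGG"))  # ['ACCT']
```

Since the overhang lies outside the recognition sequence, its 4 nucleotides can be chosen freely, which is what allows each junction in a multi-fragment assembly to be unique.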

We chose the native TALE AvrBs3 as the scaffold for customized assembly of TALE constructs. The central DNA-binding domain of AvrBs3 is formed by 17.5 tandemly arranged 34-amino-acid repeats, with the last half repeat showing similarity to only the first 20 amino acids of a full repeat. To reduce the risk of recombination between the 17.5 highly homologous repeat sequences, as described in the hypothesis part, we codon-optimized AvrBs3 on the basis of codon usage.

In a single Golden Gate cloning reaction, cloning efficiency is significantly reduced when 17 repeat modules are assembled. Therefore, we split the assembly into two successive steps. In the first cloning step, 10 repeats were assembled in one vector. The preassembly vectors confer spectinomycin resistance and encode a lacZ-α fragment for blue/white selection. A recognition site for the type IIS enzyme BsaI was positioned on each side of the lacZ-α fragment. Similarly, repeats 11 to 17 and the NG last repeat were ligated and inserted into another vector. After preassembly of the 10 repeats, the 7 repeats, and the last repeat using BsaI, the intermediate blocks were released with Esp3I and cloned into the final assembly vector (modified pTAL1). As explained in the backbone part, we constructed one backbone with a constitutive promoter, which is expressed under normal conditions, and another with a tetracycline-inducible promoter, which is expressed in the presence of tetracycline. Modified pTAL1 confers ampicillin resistance (AmpR) and allows plasmid replication in E. coli. The vector pTAL1 also contains all elements of the final TALE expression construct, including the TALE N- and C-terminal arms, the replication origin, etc., with a lacZ-α fragment between the left and right arms.

For the wild-type control plasmid, we used the modified modules provided by our lab. To test our hypothesis, the designer TALE repeats were synthesized as oligonucleotides according to the output of the optimization algorithm. We preserved the fusion sites for the Golden Gate reaction and split each repeat into 2 parts. After annealing in a thermal cycler, the linear repeat fragments can be added to the Golden Gate reaction, ligated into the primary vector, and carried through further ligation to complete the final construct.

Reference
http://www.ncbi.nlm.nih.gov/nuccore/NG_034463.1



4

TALE Expression

Background – pTAL1 vector
Golden Gate is an effective way to assemble TALEs (transcription activator-like effectors), and various eukaryotic expression systems have been established, but few exist for prokaryotes. We therefore set out to construct an efficient expression system in Escherichia coli so that we could test our idea. From a survey of the literature, we found that most groups either construct stable cell lines via homology arms such as attL1 and attL2 or simply introduce an exogenous TALE[1]. Given the principle of Golden Gate assembly, we only need to rebuild the final vector pTAL1 (Figure 1) to solve this problem. The vector pTAL1 contains the TALE N-terminus, the TALE C-terminus, lacZ for blue/white screening, and the attL1 and attL2 homology arms. However, it lacks elements required for expression in prokaryotes, such as a promoter, an RBS, and a terminator. This is the starting point for establishing a TALE expression system in a prokaryote.

Figure 1. The original vector pTAL1

Constitutive pTAL
We chose 3A assembly (http://parts.igem.org/Help:Assembly/3A_Assembly) to construct our expression system. First, we designed forward and reverse primers with extensions containing EcoRI, XbaI, SpeI, and PstI restriction sites to obtain PCR products of pTAL. Then, through simple restriction digestion and ligation, we ligated pTAL with the terminator, promoter, and RBS one by one. Finally, we obtained our constitutive TALE expression vector (Figure 2). We also submitted this expression vector as K1311003 (http://parts.igem.org/Part:BBa_K1311003:Design) on parts.igem.org.

Regulative pTAL
As in the constitutive construction, we used the pTAL already ligated with a terminator as the starting point for the regulative pTAL. After browsing the iGEM parts website (http://parts.igem.org/Main_Page), we found no regulatory part that could be applied directly, so we combined existing parts into the composite regulatory part we needed. Starting from part C0040 (http://parts.igem.org/Part:BBa_C0040), we added a promoter and RBS (K081005, http://parts.igem.org/Part:BBa_K081005), a terminator (B0015, http://parts.igem.org/Part:BBa_B0015), pTet (TetR-repressible promoter, R0040, http://parts.igem.org/Part:BBa_R0040), and an RBS one by one via 3A assembly. We then inserted this large fragment upstream of the pTAL ligated with the terminator (Figure 2). In the end, we successfully built the regulative pTAL and contributed 3 parts of our own this year: the regulative pTAL (K1311004, http://parts.igem.org/Part:BBa_K1311004:Design), K1311005 (http://parts.igem.org/Part:BBa_K1311005), and K1311006 (http://parts.igem.org/Part:BBa_K1311006).




5

Reporter System

We constructed a reporter system to test the reliability and efficiency of our ‘Marvelous TALE’. In this section, we test the TALE’s DNA-binding ability and report it with a common reporter gene, RFP. We place the TALE’s DNA-binding target sequence inside the expression cassette that controls the reporter gene, so that TALE binding can disrupt expression from that cassette. We use iGEM standard parts to build our reporter system.

We designed a standard iGEM part, BBa_K1311007, to complete all these tasks. This part contains:
Promoter (J23102) - TALE binding site (repeated three times) - RBS - LacI coding sequence - Terminator (B0015) - LacI-regulated promoter (R0010) - RBS (B0034) - mRFP1 (E1010) - Terminator (B0015)
This part converts the binding ability of the TALE protein to its target DNA sequence into a more easily measured parameter, the fluorescence intensity of RFP. When the TALE protein is expressed, it binds to its target, which may interrupt the transcription of LacI. The lack of repressor may then lead to the expression of RFP, so stronger fluorescence intensity indicates better binding ability of the TALE protein. In this part, the TALE recognition site is chosen to be the 18bp sequence (ACCTCATCAGGAACATGTT).
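The intended logic of the circuit can be summarized with an idealized Boolean model. This is a deliberate simplification that ignores expression levels, leakiness, and timing, and the function name `rfp_expressed` is hypothetical. The `iptg` argument models the IPTG positive control, which inactivates LacI.

```python
def rfp_expressed(tale_bound, iptg=False):
    """Idealized on/off model of the reporter part."""
    laci_made = not tale_bound             # a bound TALE blocks LacI transcription
    laci_active = laci_made and not iptg   # IPTG inactivates the LacI repressor
    return not laci_active                 # RFP is on only when LacI is not repressing


for tale_bound in (False, True):
    print(tale_bound, rfp_expressed(tale_bound))
```

In this model the part stays dark without a functional TALE and turns red either when the TALE binds or when IPTG is added.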

Our Circuit Design

Validation
We replaced the normal RBS of the LacI coding sequence with an RBS region containing three tandem TALE binding site sequences. We therefore have to validate that the LacI protein is still expressed normally and can still inhibit the expression of RFP.
We transformed this plasmid into E. coli DH5α by electroporation and by chemical transformation. We spread the plates and cultured them at 37 °C for more than 15 hours until small colonies were visible. At this stage, the colonies might look red because of the delay in LacI expression. We picked colonies into 10 mL tubes and added 5 mL LB broth with chloramphenicol (Chl).
After six hours of shaking at 37 °C and 220 rpm, we took the tubes out and made two-fold, four-fold, and eight-fold dilutions. (40 μL of 0.1 M IPTG was added to the positive control group.) After shaking for another 12 h, the fluorescence of the bacteria was measured with a microplate reader and OD600 was measured with a spectrophotometer.

Picture of our tubes

(The two red ones on the back are the positive control groups, others are not red)