Team:HZAU-China/Modeling
From 2014.igem.org
Modeling
Overview
Our project aims to engineer cells that process information and adapt to their environment in a predictable way, so we need quantitative predictions of gene behavior. Mathematical modeling is a powerful tool here: it not only helps us predict and analyze the observed phenomena, but also guides the design of genetic circuits. Our models are divided into several parts. First, we introduce the biological processes in our design and translate them into mathematical language. Then we make several comparisons to explain why we designed the circuits this way. Next, we present the simulation results and experimental evidence from the wet lab, together with a parameter analysis. Finally, we discuss the design principle of the rewirable circuit using ODE sets with matrices. We hope that other researchers can gain insight from the mathematical model and that the rewirable circuit becomes widely used.
Biological Processes
In this part, we list the important biological processes, explain how the related molecules work, and describe them with equations. Because the timescales of many processes are well separated, in most cases we use a quasi-steady-state approximation (QSSA) to reduce the dimension of the system. However, when we are interested in the transient behavior rather than the final steady state, we may use other approximations such as the prefactor method (Bennett et al., 2007).
2.1 Transcription and translation
According to the Central Dogma, DNA is transcribed into RNA and RNA is translated into protein. These processes can be regulated by molecules such as transcription factors or non-coding RNAs, and the interactions among them can be treated as chemical reactions. We use a single chemical reaction to depict the interaction between the transcription factor $TF$ and the inducible promoter $Pro$, where $Pro'$ denotes the promoter bound by $n$ molecules of $TF$. For simplicity, we do not separate the polymerisation process from the binding process: \begin{equation} n\,TF+Pro\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}}Pro'. \end{equation}
The equilibrium dissociation constant $K_D$ for the reaction can be calculated. \begin{equation} K_D=\frac{k_{-1}}{k_1}=\frac{[TF]^n\cdot [Pro]}{[Pro']}. \end{equation}
We assume that the DNA copy number is a constant $N$, distinct from the Hill coefficient $n$, \begin{equation} [Pro]+[Pro']=N. \end{equation} Then the proportion of $Pro'$ depends on the concentration of $TF$, and this proportion can also be regarded as the probability of a binding event: \begin{equation} P(\text{binding})=\frac{[Pro']}{N}=\frac{[TF]^n}{K_D+[TF]^n}. \end{equation}
If the transcription factor is an activator, the gene is transcribed at the maximal transcription rate $\beta_1$ when the promoter is bound by the transcription factor. If the transcription factor is a repressor, the gene is transcribed at the maximal transcription rate $\beta_1$ when the promoter is free from the transcription factor.
In the deterministic model, we use a Hill function to describe the production rate, which is $\beta_1$ multiplied by the probability of the promoter state that permits transcription.
For transcription activation, the maximal transcription rate occurs when the activator binds to the promoter, so the Hill function is \begin{equation} f([TF])=\beta_1\frac{[TF]^n}{K^n+[TF]^n}. \end{equation}
For transcription repression, the maximal transcription rate occurs when the repressor does not bind to the promoter, so the Hill function is \begin{equation} f([TF])=\beta_1\frac{K^n}{K^n+[TF]^n}. \end{equation} $K$ is the activation coefficient or the repression coefficient, respectively, and equals $\sqrt[n]{K_D}$.
In addition, many genes have a non-zero minimal expression level, the basal expression level, which can be described by adding a term $\beta_0$. Genes driven by a constitutive promoter cannot be described by a Hill function, so we simply set their production rate to a constant.
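As a minimal sketch (not taken from the original model code), the two Hill-type production rates with a basal term can be written as plain Python functions. The function names and the numeric values of $\beta_0$, $\beta_1$, $K$ and $n$ are illustrative assumptions, not fitted parameters.

```python
def hill_activation(tf, beta1, K, n, beta0=0.0):
    """Production rate for an activated promoter:
    beta0 + beta1 * TF^n / (K^n + TF^n)."""
    return beta0 + beta1 * tf**n / (K**n + tf**n)


def hill_repression(tf, beta1, K, n, beta0=0.0):
    """Production rate for a repressed promoter:
    beta0 + beta1 * K^n / (K^n + TF^n)."""
    return beta0 + beta1 * K**n / (K**n + tf**n)


# Illustrative (assumed) parameter values, arbitrary units.
BETA0, BETA1, K_COEFF, N_HILL = 0.1, 10.0, 2.0, 2

for tf in (0.0, K_COEFF, 100.0):
    act = hill_activation(tf, BETA1, K_COEFF, N_HILL, BETA0)
    rep = hill_repression(tf, BETA1, K_COEFF, N_HILL, BETA0)
    print(f"[TF]={tf:6.1f}  activation={act:.3f}  repression={rep:.3f}")
```

At $[TF]=K$ both functions return $\beta_0+\beta_1/2$, which is a quick way to check that $K$ is the half-maximal concentration.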
The translation and degradation processes in our model are treated as first-order reactions. For translation, the protein production rate is proportional to the corresponding mRNA concentration, $K_{tl}\cdot mRNA$. $K_{tl}$ is determined by the efficiency and concentration of ribosomes, the RBS sequence and the concentration of amino acids in the cell, which can be considered constant under the experimental conditions. For degradation, the rate is proportional to the concentration of the degraded species, with rate constant $K_{R}$ for mRNA and $K_{P}$ for protein. This gives the following equations, \begin{equation} \begin{aligned} \frac{dmRNA_{Y}}{dt}&=\beta_0+\beta_1\frac{[TF]^n}{K^n+[TF]^n}-K_{R}\cdot mRNA_{Y} \text{ (activation)}\\ \frac{dmRNA_{Y}}{dt}&=\beta_0+\beta_1\frac{K^n}{K^n+[TF]^n}-K_{R}\cdot mRNA_{Y} \text{ (repression)}\\ \frac{dY}{dt}&=K_{tl}\cdot mRNA_{Y}-K_{P}\cdot Y. \end{aligned} \end{equation}
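The activation case of these equations can be integrated numerically to see how the mRNA and protein levels approach their steady states. The following is a rough sketch using a forward Euler scheme. The function name, step size and all parameter values are assumptions chosen for illustration only.

```python
def simulate_expression(tf, t_end=300.0, dt=0.01,
                        beta0=0.1, beta1=10.0, K=2.0, n=2,
                        k_r=0.2, k_tl=1.0, k_p=0.05):
    """Forward-Euler integration of the activated-gene model.

    Returns (mRNA, Y) at time t_end, starting from zero concentrations.
    All parameter values are illustrative assumptions.
    """
    mrna, y = 0.0, 0.0
    # The transcription factor level is held constant, so the
    # Hill-type production term does not change over time.
    production = beta0 + beta1 * tf**n / (K**n + tf**n)
    for _ in range(int(round(t_end / dt))):
        d_mrna = production - k_r * mrna   # dmRNA_Y/dt
        d_y = k_tl * mrna - k_p * y        # dY/dt
        mrna += dt * d_mrna
        y += dt * d_y
    return mrna, y


mrna_end, y_end = simulate_expression(tf=2.0)
# Analytical steady states for comparison:
# mRNA* = production / K_R and Y* = K_tl * mRNA* / K_P.
production = 0.1 + 10.0 * 2.0**2 / (2.0**2 + 2.0**2)
print(mrna_end, production / 0.2)
print(y_end, 1.0 * (production / 0.2) / 0.05)
```

A fixed-step Euler scheme is used only to keep the sketch dependency-free. For stiff parameter sets an adaptive ODE solver would be a safer choice.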
2.2 RNA interaction
Different riboregulators use different mechanisms to control gene expression. Here we use a classical engineered riboregulator designed by Isaacs and colleagues in 2004, which controls gene expression at the post-transcriptional level. More details can be found in the input module section.
We describe the RNA interaction by the reaction \begin{equation} taRNA+crRNA\underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}}RNA\ duplex, \end{equation} which means \begin{equation} \frac{d[RNA\ duplex]}{dt}=k_2\cdot [taRNA]\cdot [crRNA]-k_{-2}\cdot [RNA\ duplex]. \end{equation} This reaction is much faster than gene expression, so the equilibrium state is reached quickly, \begin{equation} \begin{split} \frac{d[RNA\ duplex]}{dt}&=0 \\ [RNA\ duplex]_{st}&=\frac{k_2}{k_{-2}}\cdot [taRNA]\cdot [crRNA]. \end{split} \end{equation}
The total mRNA of the recombinase exists in two forms, $crRNA$ and $RNA\ duplex$, which gives another equation: \begin{equation} [crRNA]+[RNA\ duplex]=[mRNA_{Cre}]. \end{equation} Then we have \begin{equation} [RNA\ duplex]_{st}=[mRNA_{Cre}]\cdot\frac{[taRNA]}{K_m+[taRNA]}, \end{equation} where $K_m=\frac{k_{-2}}{k_2}$. The protein production rate therefore depends not only on the concentration of its mRNA but also on the concentration of $taRNA$. This translation process with post-transcriptional control can be described by \begin{equation} \frac{d[Cre]}{dt}=K_{tl}\cdot [mRNA_{Cre}]\cdot\frac{[taRNA]}{K_m+[taRNA]}-K_{P}\cdot [Cre]. \end{equation}
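The step from the equilibrium condition to the saturating expression for $[RNA\ duplex]_{st}$ can be written out explicitly. The following worked sketch uses only the symbols defined above and treats $[taRNA]$ as the free taRNA concentration, which is an assumption of this sketch. Substituting $[crRNA]=[mRNA_{Cre}]-[RNA\ duplex]_{st}$ into the equilibrium condition and multiplying by $K_m$ gives \begin{equation} \begin{aligned} K_m\cdot[RNA\ duplex]_{st}&=[taRNA]\cdot\left([mRNA_{Cre}]-[RNA\ duplex]_{st}\right)\\ [RNA\ duplex]_{st}\cdot\left(K_m+[taRNA]\right)&=[taRNA]\cdot[mRNA_{Cre}], \end{aligned} \end{equation} and dividing both sides by $K_m+[taRNA]$ recovers the expression for $[RNA\ duplex]_{st}$ given above.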
2.3 Processes related to AHL
In this section, we describe the 3OC6HSL synthesis, degradation and regulation. 3OC6HSL, a kind of AHL, is enzymatically synthesised by LuxI proteins from some substrates. Here, we use AHL to refer to 3OC6HSL.
For simplicity, we assume that the substrates are abundant, so that the 3OC6HSL synthesis rate is proportional to the LuxI protein concentration. The degradation of 3OC6HSL is divided into two parts. With the help of the AHL-degrading enzyme AHL-lactonase (AiiA), AHL is degraded at a high rate, since the complex of AiiA and AHL is degraded more easily. AHL alone is also degraded, at a lower rate.
In addition, AHL regulates gene expression by binding the protein LuxR. The complex activates the luxpR promoter and represses the luxpL promoter.
Although the binding events involving AHL could be described by their own differential equations, we apply the QSSA to them when we focus on gene expression, because these binding events are much faster than gene expression. The change of AHL and of the AHL-LuxR complex over time is then described by the differential equations \begin{equation} \begin{split} &\frac{d[AHL]}{dt}=k_3\cdot [LuxI]-k_4k_5\cdot [AHL] \cdot [AiiA]-k_6\cdot [AHL]-k_7\cdot [LuxR]^2\cdot [AHL]^2\\ &\frac{d[AHL\text{-}LuxR\ complex]}{dt}=k_7\cdot [LuxR]^2\cdot [AHL]^2-K_{P}\cdot [AHL\text{-}LuxR\ complex] \end{split} \end{equation}
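To make the role of each term concrete, the right-hand side of the two AHL equations can be evaluated directly. This is only a sketch: the function name `ahl_rhs`, the lumped constant `k4k5` and every numeric value are hypothetical and chosen for illustration.

```python
def ahl_rhs(ahl, ahl_luxr, luxi, aiia, luxr,
            k3=1.0, k4k5=0.5, k6=0.01, k7=0.001, k_p=0.05):
    """Right-hand side of the AHL and AHL-LuxR complex equations.

    Returns (d[AHL]/dt, d[AHL-LuxR complex]/dt).
    Parameter values are assumed, not measured.
    """
    synthesis = k3 * luxi                  # LuxI-dependent synthesis
    enzymatic = k4k5 * ahl * aiia          # AiiA-assisted degradation
    spontaneous = k6 * ahl                 # slower AHL degradation
    binding = k7 * luxr**2 * ahl**2        # complex formation with LuxR
    d_ahl = synthesis - enzymatic - spontaneous - binding
    d_complex = binding - k_p * ahl_luxr
    return d_ahl, d_complex


# Same state with and without AiiA: AiiA lowers the net AHL accumulation rate.
print(ahl_rhs(ahl=1.0, ahl_luxr=0.0, luxi=1.0, aiia=0.0, luxr=1.0))
print(ahl_rhs(ahl=1.0, ahl_luxr=0.0, luxi=1.0, aiia=1.0, luxr=1.0))
```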
2.4 DNA recombination
Ringrose and his colleagues have developed mathematical models to describe the kinetics of Cre and Flp recombination. The site-specific recombination process can be divided into four main steps: DNA binding, synapsis, recombination and dissociation.
The detailed mathematical representation can be found in previous research (Ringrose et al., 1998), so we do not list it here. We want to explain why Cre-mediated inversion with the mutated Cre/loxP system we used prefers the forward reaction. The difference between a mutated loxP site and the wild-type loxP site only influences the DNA binding and dissociation steps, so we focus on these two. A Cre monomer binds to one half of a loxP site, and an asymmetrical homodimer is then formed when a second Cre molecule binds to the other half of the site: \begin{equation} Cre+lox\underset{k_{-8}}{\overset{k_8}{\rightleftharpoons}}Cre\cdot lox,\qquad Cre\cdot lox+Cre\underset{k_{-9}}{\overset{k_9}{\rightleftharpoons}}2Cre\cdot lox. \end{equation}
Because the concentration of the intermediate product $Cre\cdot lox$ is very low, and $k_9\gg k_{-8}>k_{-9}$, we can apply a steady-state treatment to it, \begin{equation} \frac{d[Cre\cdot lox]}{dt}=k_8\cdot [lox]\cdot [Cre]-k_{-8}[Cre\cdot lox]-k_9\cdot [Cre\cdot lox]\cdot [Cre]+k_{-9}\cdot[2Cre\cdot lox]=0. \end{equation}
The second dissociation process is the slowest of these four reactions. Hence, we ignore the last term $k_{-9}\cdot[2Cre\cdot lox]$. Then we get the steady state of the intermediate product $Cre\cdot lox$, \begin{equation} [Cre\cdot lox]_{st}=\frac{k_8\cdot [lox]\cdot [Cre]}{k_{-8}+k_9\cdot [Cre]}. \end{equation} So the reaction rate for $2Cre\cdot lox$ is \begin{equation} r_c(2Cre\cdot lox)=\frac{k_9}{k_{-9}}\cdot[Cre]\cdot\frac{k_8\cdot [lox]\cdot [Cre]}{k_{-8}+k_9\cdot [Cre]}=\frac{k_8k_9\cdot [lox]\cdot [Cre]^2}{k_{-8}k_{-9}+k_9k_{-9}\cdot [Cre]}. \end{equation}
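The steady-state expression and the resulting rate can be checked numerically. The sketch below uses hypothetical rate constants that respect $k_9\gg k_{-8}>k_{-9}$. The function names and values are assumptions made for illustration.

```python
def cre_lox_steady_state(lox, cre, k8, k_m8, k9):
    """Steady-state [Cre.lox] after dropping the k_{-9}[2Cre.lox] term."""
    return k8 * lox * cre / (k_m8 + k9 * cre)


def dimer_rate(lox, cre, k8, k_m8, k9, k_m9):
    """r_c(2Cre.lox) in its closed form."""
    return k8 * k9 * lox * cre**2 / (k_m8 * k_m9 + k9 * k_m9 * cre)


# Assumed values with k9 >> k_{-8} > k_{-9}; arbitrary units.
lox, cre = 1.0, 2.0
k8, k_m8, k9, k_m9 = 1.0, 0.5, 10.0, 0.01

st = cre_lox_steady_state(lox, cre, k8, k_m8, k9)

# Residual of the truncated balance equation; it should vanish.
residual = k8 * lox * cre - k_m8 * st - k9 * st * cre
print("residual:", residual)

# The closed form should equal (k9 / k_{-9}) * [Cre] * [Cre.lox]_st.
print(dimer_rate(lox, cre, k8, k_m8, k9, k_m9), k9 / k_m9 * cre * st)
```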
Comparison Between Different Designs
In this part, we want to demonstrate some advantages of our design by making quantitative comparisons. These advantages include safety, energy efficiency and stability.
3.1 The post-transcriptional control ensures lower leakage
First, we do not want the function of our processing modules to be altered by environmental noise, so we design a coherent feedforward loop to filter noise. Here we use $x$ to represent a general input signal. As mentioned before, the dynamics of the input module with post-transcriptional control can be described by \begin{equation} \begin{aligned} \frac{d[mRNA_{Cre}]}{dt}&=\beta_0+\beta_1\frac{[x]^n}{K^n+[x]^n}-K_{R}\cdot [mRNA_{Cre}]\\ \frac{d[taRNA]}{dt}&=\beta_0+\beta_1\frac{[x]^n}{K^n+[x]^n}-K_{R}\cdot [taRNA]\\ \frac{d[Cre]}{dt}&=K_{tl}\cdot [mRNA_{Cre}]\cdot\frac{[taRNA]}{K_m+[taRNA]}-K_{P}\cdot [Cre]. \end{aligned} \end{equation}
If there is no post-transcriptional control, this process can be described by \begin{equation} \begin{aligned} \frac{d[mRNA_{Cre}]}{dt}&=\beta_0+\beta_1\frac{[x]^n}{K^n+[x]^n}-K_{R}\cdot [mRNA_{Cre}]\\ \frac{d[Cre]}{dt}&=K_{tl}\cdot [mRNA_{Cre}]-K_{P}\cdot [Cre]. \end{aligned} \end{equation}
We compare the expression dynamics of Cre in the two cases at different levels of the input signal $x$. The comparison reveals that the riboregulator ensures lower leakage without noticeably affecting expression at a high input level. We also designed experiments to validate the model, and the results are consistent with it.
Figure 1. Dynamics of the input module.
Figure 2. Experimental result.
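The leakage comparison can be illustrated qualitatively from the steady states of the two models above, obtained by setting all time derivatives to zero. This is a sketch under assumed parameter values, not the simulation behind Figure 1. The names `production` and `cre_steady_state` are hypothetical.

```python
def production(x, beta0, beta1, K, n):
    """Basal plus Hill-activated transcription rate for input x."""
    return beta0 + beta1 * x**n / (K**n + x**n)


def cre_steady_state(x, riboregulated,
                     beta0=0.05, beta1=10.0, K=1.0, n=2,
                     k_r=0.2, k_tl=1.0, k_p=0.05, k_m=2.0):
    """Steady-state [Cre] with or without post-transcriptional control.

    All parameter values are illustrative assumptions.
    """
    mrna = production(x, beta0, beta1, K, n) / k_r
    if riboregulated:
        # taRNA obeys the same equation as mRNA_Cre, so its
        # steady state equals that of mRNA_Cre.
        tarna = mrna
        gate = tarna / (k_m + tarna)
    else:
        gate = 1.0
    return k_tl * mrna * gate / k_p


for x in (0.0, 100.0):
    with_ribo = cre_steady_state(x, True)
    without = cre_steady_state(x, False)
    print(f"x={x:6.1f}  with={with_ribo:9.3f}  without={without:9.3f}  "
          f"ratio={with_ribo / without:.3f}")
```

With these assumed values, the riboregulated circuit keeps only a small fraction of the leaky expression at $x=0$, while retaining most of the expression at a saturating input.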
3.2 The mutated Cre/loxP system determines the inversion direction
Secondly, once the engineered cells receive a definite signal, they should process information as designed and not the other way around, so we must fix the direction of the DNA inversion. In other words, the site-specific recombination should behave as a unidirectional reaction. To this end, we choose Cre recombinase and a pair of mutant lox sites, lox66 and lox71, to rearrange the DNA sequence. Here, we explain why the forward reaction rate is higher than the reverse one. A mutant site has a lower affinity for Cre. The binding rate mainly depends on free diffusion, but the dissociation rate is high when the binding strength is weak. According to the equation \begin{equation} r_c(2Cre\cdot lox)=\frac{k_8k_9\cdot [lox]\cdot [Cre]^2}{k_{-8}k_{-9}+k_9k_{-9}\cdot [Cre]}, \end{equation} the second dissociation event is more significant than the first one: $k_{-9}$ multiplies every term of the denominator, whereas $k_{-8}$ appears in only one. A single-mutant lox site such as lox66 or lox71 can therefore still form the dimer efficiently as long as the mutation mainly affects the first dissociation step. Here we assume that lox66 and lox71 have a higher $k_{-8}$, whereas the double-mutant site lox72 has a higher dissociation rate at both steps. For simplicity, we assume that each affected dissociation rate is $\epsilon$ ($\epsilon>1$) times the original one.
After the binding events, the two lox-bound dimers associate to form a tetramer, and recombination proceeds via a Holliday junction intermediate. Therefore, we can compare $r_c(2Cre\cdot loxP)\cdot r_c(2Cre\cdot lox72)$ with $r_c(2Cre\cdot lox66)\cdot r_c(2Cre\cdot lox71)$ to see which kind of synapsis forms more easily. \begin{equation} \begin{split} r_c(2Cre\cdot loxP)\cdot r_c(2Cre\cdot lox72)&=\frac{k_8^2k_9^2\cdot[lox]^2\cdot [Cre]^4}{({\epsilon}^2k_{-8}k_{-9}+\epsilon k_9k_{-9}[Cre])(k_{-8}k_{-9}+k_9k_{-9}[Cre])}\\ r_c(2Cre\cdot lox66)\cdot r_c(2Cre\cdot lox71)&=\frac{k_8^2k_9^2\cdot[lox]^2\cdot [Cre]^4}{({\epsilon}k_{-8}k_{-9}+k_9k_{-9}[Cre])^2} \end{split} \end{equation}
For $\epsilon>1$, \begin{equation} \begin{aligned} \frac{r_c(2Cre\cdot loxP)\cdot r_c(2Cre\cdot lox72)}{r_c(2Cre\cdot lox66)\cdot r_c(2Cre\cdot lox71)}&=\frac{({\epsilon}k_{-8}k_{-9}+k_9k_{-9}[Cre])^2}{({\epsilon}^2k_{-8}k_{-9}+\epsilon k_9k_{-9}[Cre])(k_{-8}k_{-9}+k_9k_{-9}[Cre])}\\ &=1-\frac{(\epsilon-1)(k_9k_{-9}[Cre])^2+\epsilon(\epsilon-1)(k_{-8}k_9k_{-9}^2[Cre])}{({\epsilon}^2k_{-8}k_{-9}+\epsilon k_9k_{-9}[Cre])(k_{-8}k_{-9}+k_9k_{-9}[Cre])}\\ &<1. \end{aligned} \end{equation}
Under the condition that $Cre\cdot lox$ gains a second Cre monomer much more readily than it loses its Cre monomer (that is, $k_9\cdot[Cre]\gg k_{-8}$ for every site involved), this ratio approximately equals $\frac{1}{\epsilon}$, because \begin{equation} \begin{aligned} r_c(2Cre\cdot lox)&=\frac{k_8k_9\cdot [lox]\cdot [Cre]^2}{k_{-8}k_{-9}+k_9k_{-9}\cdot [Cre]}\\ &\approx\frac{k_8\cdot [lox]\cdot [Cre]}{k_{-9}}. \end{aligned} \end{equation} In this limit each rate scales as $1/k_{-9}$, and only lox72 carries the factor $\epsilon$ in $k_{-9}$.
Hence, $r_c(2Cre\cdot loxP)\cdot r_c(2Cre\cdot lox72)<r_c(2Cre\cdot lox66)\cdot r_c(2Cre\cdot lox71)$, which means that the synapsis between lox66 and lox71 forms more easily.
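The inequality and the $\frac{1}{\epsilon}$ limit can be checked numerically. The following sketch assumes, as in the text, that lox66 and lox71 each have $k_{-8}$ scaled by $\epsilon$ and that lox72 has both $k_{-8}$ and $k_{-9}$ scaled by $\epsilon$. The rate constants and function names are hypothetical.

```python
def dimer_rate(lox, cre, k8, k_m8, k9, k_m9):
    """r_c(2Cre.lox) from the steady-state treatment."""
    return k8 * k9 * lox * cre**2 / (k_m8 * k_m9 + k9 * k_m9 * cre)


def synapsis_ratio(cre, eps, k8=1.0, k_m8=0.5, k9=10.0, k_m9=0.01, lox=1.0):
    """Ratio of r_c(loxP) * r_c(lox72) to r_c(lox66) * r_c(lox71).

    Rate constants are assumed values in arbitrary units.
    """
    wild_type = dimer_rate(lox, cre, k8, k_m8, k9, k_m9)
    double_mutant = dimer_rate(lox, cre, k8, eps * k_m8, k9, eps * k_m9)
    # lox66 and lox71 are treated identically: only k_{-8} is scaled.
    single_mutant = dimer_rate(lox, cre, k8, eps * k_m8, k9, k_m9)
    return (wild_type * double_mutant) / (single_mutant * single_mutant)


eps = 5.0
for cre in (0.01, 1.0, 100.0):
    print(f"[Cre]={cre:7.2f}  ratio={synapsis_ratio(cre, eps):.4f}  "
          f"1/eps={1.0 / eps:.4f}")
```

The ratio stays below 1 for every Cre level tried and approaches $\frac{1}{\epsilon}$ once $k_9\cdot[Cre]$ dominates $\epsilon k_{-8}$.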
This explanation is based on the kinetic model proposed in 1998 (Ringrose et al., 1998). However, the real system may involve additional processes. We therefore offer another explanation based on kinetic proofreading, a principle that is widely employed to achieve high precision in diverse molecular recognition systems. Inspired by the DNA binding process described in previous studies (Tlusty et al., 2004; Alon, 2006), we infer that the recombinase Cre undergoes a modification after binding to one half of a lox site, and can then recruit another Cre protein to bind to the other half. Such a modification helps single-mutant lox sites like lox66 and lox71 move on to the next state, but prevents incorrect binding to the double-mutant lox site lox72. The modification makes the second binding event more likely and the corresponding dissociation event less likely, so it can also explain why $k_9>k_8$ and $k_{-8}>k_{-9}$.
In summary, different mechanisms can explain why Cre-mediated inversion with the mutated Cre/loxP system we used prefers the forward reaction. Whichever theory we use to explain the phenomenon, we emphasize that the two binding processes at one lox site are not identical: the first binding process facilitates the second one. Therefore, both the steady-state treatment of the first binding and dissociation and kinetic proofreading through a modification are reasonable.
3.3 The time-sharing process module reduces crosstalk and resource cost
The most distinctive feature of our processing module is that it improves the utilization of resources. In essence, the processing module we designed is a time-sharing system. In computer science, a time-sharing system shares computing resources among many users. In our biological setting, the users are the various environments. The time-sharing processing module allows the engineered cells to interact with multiple environments, so that cells can perform different functions at different times.
We can compare the time-sharing system with a simultaneous processing system. When functions do not need to run at the same time, the latter wastes resources maintaining or inhibiting the unnecessary functions while the cell is running the others. Such a burden may even cause growth defects, since competition for resources can affect normal pathways in the cell. Moreover, the more regulators we put into the cell, the more likely crosstalk becomes, which results in a faulty response to the signals.
References
Bennett, M. R., Volfson, D., Tsimring, L., & Hasty, J. (2007). Transient dynamics of genetic regulatory networks. Biophysical journal, 92(10), 3501-3512.
Ringrose, L., Lounnas, V., Ehrlich, L., Buchholz, F., Wade, R., & Stewart, A. F. (1998). Comparative kinetic analysis of FLP and cre recombinases: mathematical models for DNA binding and recombination. Journal of molecular biology, 284(2), 363-384.
Tlusty, T., Bar-Ziv, R., & Libchaber, A. (2004). High-fidelity DNA sensing by protein binding fluctuations. Physical review letters, 93(25), 258103.
Alon, U. (2006). An introduction to systems biology: design principles of biological circuits. CRC press.
Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry, 81(25), 2340-2361.
Elowitz, M. B., & Leibler, S. (2000). A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767), 335-338.
Strelkowa, N., & Barahona, M. (2011). Transient dynamics around unstable periodic orbits in the generalized repressilator model. Chaos: An Interdisciplinary Journal of Nonlinear Science, 21(2), 023104.
Cao, Y., Gillespie, D. T., & Petzold, L. R. (2006). Efficient step size selection for the tau-leaping simulation method. The Journal of chemical physics, 124(4), 044109.
Ma, W., Trusina, A., El-Samad, H., Lim, W. A., & Tang, C. (2009). Defining network topologies that can achieve biochemical adaptation. Cell, 138(4), 760-773.