Team:UT-Tokyo/Counter/Project/Modeling

From 2014.igem.org

<img src = "Sub_overview.png" class = "contTitle" />

Modeling is an attempt to describe, in a precise way, an understanding of the elements of a system of interest, their states, and their interactions with other elements.

The purpose of our modeling team is to peel back the layer of appearance of the device to reveal it's underlying nature. We tried to improve the device, cooperating with the experiment team. To achieve our goal, we have developed three fundamental themes. These three themes divide the modeling part into three parts. At the beginning, we confirmed whether our circuit realizes a reaction:this for part 1. When it comes to extending the counter to a device which can count a lot of numbers, there is a possibility that the device skips the next state due to the leak expression of sigma factor. So we examined what influences the leak of sigma factor have on the state-transit device:this for part 2. Thirdly, as an implementation of our genetic circuit, we designed new construct freeing the degree of induce-time and checked how the circuit works:this for part 3. Finally, we discussed what would be appropriate modeling, frequent issue to attack, in order to find the best strategy of modeling and wrote how we constructed our model:this Guide for modeling.

In Part1(deterministic model, stochastic model), we considered whether the device outputs much higher fluorescence when 2nd-induced than that of 1st-induced. In addition, we checked the key function of our device, reset function. Two approaches was made.

・Deterministic model:In this model,intermolecular interactions such as DNA-protein interactions and protein-ligand interactions are described as differential equations and concentrations of product (multimolecular complex) can be calculated by those of reactants. This model is intuitive, simple and hence popular to estimate the result of experiment.

・Stochastic model:Due to a cell's small volume (down to a few femtoliters for bacterial cells) and typically low bimolecular concentrations, the absolute number of molecules of a given signaling species in a cell may be quite small (typically on the order of 100 to 103). At such low numbers of reacting molecules, the reactions are heavily influenced by chance events, such as the chance encounter of reactant molecules. So we also used the stochastic model. The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). We used Gillespie algorithm to solve CME. Other method to solve CME is mentioned in Part4.

In Part2(Leaky expression analysis), taking the influences of the leak of sigma factor into consideration, we designed model which can change the induce time. In the end, we concluded that the leak of sigma-factor can be ignored when the induce time is much longer than the time of degradation of sigma-factor.

In Part3(Improvement of counter), the current sigma-Re-counter depends much on pulse length, so we devised genetic circuits that would not be affected by the pulse length of the arabinose induction. We modeled this construct to test if it can be realized.

<img src = "Sub_our_main_construct.png" class = "contTitle" />

First construction we came up with about the counter is described as following.

<img src = "Nakashima_sigmaconst.png" class = "figure" />

These generators have following functions:

・production of cis-repressed mRNA of a sigma factor by constitutive promoter.

・production of taRNA by the 1st arabinose induction and activating of cis-repressed mRNA by taRNA.

・positive feedback loops formed by sigma factors.

・production of cis-repressed mRNA of GFP by sigma factors.

・activating mRNA of GFP by the 2nd arabinose induction.

・degradation of sigma factors by induction of IPTG.

The counter consists of these functions. Our first work is confirming theoretically that these functions cooperate with each other, GFP expresses significantly only by 2nd arabinose induction and the counter is reset by IPTG.

In this simulation, we used two types of models, the deterministic model and the stochastic model. First, we simulated the counter by the deterministic model, which is popular and intuitive. Second, we simulated it by the stochastic model, which takes paucity and fluctuation of reactants into account.

Formulation of models and results of simulations are described below.

Contents

Deterministic Model

Formulation of the Model

First of all, we constructed the deterministic model to estimate the behavior of the counter. In this model, biochemical reactions are described as differential equations and concentration of reaction product can be calculated by those of reactants. This model is intuitive, simple and hence popular to estimate the result of experiment. We were able to get some parameters for modeling of counter from previous works.[→ see parameter section]

We had simplified the construction of the mathematical model before described time evolution in which concentrations of mRNAs and proteins change as differential equations. First, we regarded that the reaction between taRNA(trans-activating RNA) and crRNA(cis-repressor RNA) in riboregulator is much faster than that of transcription or translation and equilibrium reaction. This diminution of parameters enable us to use the equilibrium constant as a parameter and prevent us from overfitting when we adapt this model to raw data.

equation (1):

<img src = "Ono_%281%29.png" height="50px" class = "math" />

So, the concentrations of mRNAs and the coupling of taRNA and crRNA are represented as stated above . Subscript means the gene encoded by the mRNA. We regarded that the affinities between crRNA and taRNA on different mRNAs were equal. The dissociation constant of equilibrium reaction was therefore shown as following.

equation (2):

<img src = "Ono_%282%29.png" height="100px" class = "math" />

Using dissociation constant, concentrations of reaction products such as [mcrcr-σ ] could be described as function of those of taRNA and mRNA of σ and GFP. We put X, A and B as the total quantity of taRNA, sigma and GFP, respectively.

equation (3):

<img src = "Ono_%283%29.png" height="20px" class = "math" />

equation (4):

<img src = "Ono_%284%29.png" height="20px" class = "math" />

equation (5):

<img src = "Ono_%285%29.png" height="20px" class = "math" />

Using these equations((3)-(5)) and equilibrium constant, concentrations of free taRNA, free mRNAs and taRNA-mRNA complexes are described as following. These are all of simplifications.

equation (6):

<img src = "Ono_%286%29.png" height="45px" class = "math" />

equation (7):

<img src = "Ono_%287%29.png" height="45px" class = "math" />

equation (8):

<img src = "Ono_%288%29.png" height="45px" class = "math" />

equation (9):

<img src = "Ono_%289%29.png" height=45px" class = "math" />

equation (10):

<img src = "Ono_%2810%29.png" height="44px" class = "math" />

Finally, we built up differential equations about concentrations of reaction products including mRNA of sigma which has no riboregulator. (It makes positive feedback loop.) This model assumed relationship between promoter and the amount of transcriptional product increasing per unit time. The amount is in proportion to the number of promoter if the promoter expressed constitutively and can be approximated by Hill equation if the inducer controlled its promoter. It was also assumed that the degradation of proteins and RNAs followed first order kinetics. Some of used parameters were cited from references.[1]~[9]

We aimed to determine parameters about sigma through experiment and used provisional parameter determined in reference to other promotor.

equation (11):

<img src = "Ono_%2811%29.png" height="76px" class = "math" />

equation (12):

<img src = "Ono_%2812%29.png" height="43px" class = "math" />

equation (13):

<img src = "Ono_%2813%29.png" height="45px" class = "math" />

The transcriptional activity of mRNA encoding sigma factor initiated by positive feedback loop and that of mRNA encoding anti-sigma induced by IPTG are described as following.

equation (14):

<img src = "Ono_%2814%29.png" height="46px" class = "math" />

equation (15):

<img src = "Ono_%2815%29.png" height="43px" class = "math" />

In our project, IPTG induction was aimed at enough production of anti-sigma to reset the counter and the sensitivity of Plac(lactose promoter) was not our main interest. Therefore, we used simple equation,(15) to describe how lactose promoter behave. Plac depend on the concentration of IPTG but we regarded it as a fixed number in this modeling.

Taking into account that translation coincide with transcription in prokaryotes, the linearity between the concentration of transcriptional product and the change rate of the amount of translational product was assumed. this linearity does not depend on the kind of translational product. We also assumed that anti-sigma combine with sigma and form inert matter irreversibly, and the reaction velocity of that is proportional to product of these.

equation (16):

<img src = "Ono_%2816%29.png" height="44px" class = "math" />

equation (17):

<img src = "Ono_%2817%29.png" height="44px" class = "math" />

equation (18):

<img src = "Ono_%2818%29.png" height="44px" class = "math" />

Using above-mentioned differential equations, we simulated behavior of the counter by Euler's method.

Parameter

Here, we explain how we determined the parameters of the deterministic model. PoPS (polymerase per second) of PConst is 0.03[3], so its promoter activity is 0.03/(6.0× 1023× 1.0× 10-15)[M/sec] = 0.051[nM/sec]. The switch point and the hill coefficient of PBAD are written in [4]. PoPS of PBAD is 5/60[5] , so its RPU (relative promoter unit) is (5/60)/(0.03) = 2.78. We set the RPU of Plac as 2 when IPTG is induced. We don't consider the leak expression from Plac.

The average half life of mRNA is 2-5 min[1], so we set the degradation rate of mRNA as 0.010[/sec]. The half life of GFP is infinite[9], so we set the degradation of GFP as 0.0[/sec]. The degradation rate of sigma factor[20] is fast. We set it as 0.001[/sec]. The degradation rate of anti-sigma is unknown, We set it as 6.0× 10-6, the average degradation rate of protein[11]. The equilibrium constant of the equations (1)is 80.0[nM][2]. The reaction rate of the association of sigma and anti-sigma is unknown. We assumed this reaction is fast and we set it as 10.0[/M sec]. The copy number of the plasmid is 100~300[7][8] , so we set as 200. When the number of lacZ mRNA is 62, the protein synthesis rate is 20[/sec][11], and we set the translational rate as 20/62 = 0.32, assuming it does not change from the kind of mRNA.

The summary of the parameters of this model is given in Table 1.


<img src = "Ono_table1.png" class = "figure" /> <legend>Table 1</legend>

Simulations


<img src = "Ono_2count_result.png" class = "math" /> <legend>Fig.1</legend>

arabinose induction time: 5000-5120[sec], 15000-15120[sec], IPTG induction time: t > 22500[sec]

The unit of vertical axis is [nM], and that of the horizontal axis is [sec]. We can see GFP expression only after 2nd-induction and rapid degradation of sigma after IPTG induction. These mean that the counter shows collect behavior theoretically.

Sensitivity analysis

Sensitivity analysis is an examination of output value changing when a certain parameter changes.

By conducting sensitivity analysis, we could know how each parameter affects the system. Results of some specific parameters are shown below. The parameters we changed are shown in Horizontal axis. The vertical axis is the ratio of GFP that is expressed in the first induction to GFP expressed over the entire experiment. When we changed one parameter, the other parameters were fixed.

<img src = "H_ara_sweep.png" height="200px" class = "math" /> <legend>Fig.2 Horizontal axis : pulse length of the arabinose induction</legend> <img src = "H_Ksigma_sweep.png" height="200px" class = "math" /> <legend>Fig.3 Horizontal axis : Kσ</legend> <img src = "H_Vsigma_sweep.png" height="200px" class = "math" /> <legend>Fig.4 Horizontal axis :Vσ</legend>

Comparing Fig.2 with Fig.3, Kσ seems to have less influence on the system than the pulse length of the arabinose induction. From Fig.4, Vσ shouldn't be so large or so small to minimize the leak_GFP/GFP.

Stochastic Model

Formulation of the Model

If there are a lot of molecules, modeling usually uses ordinary differential equations, but some in vivo reactions involve only a few molecules. For example, transcription involves the cell's genomic DNA which is one copy or plasmids which are about 200 copies [7][8] in a cell of Escherichia coli. The average size of a cell of E. coli is about 1.0× 10-15[L][15], so the concentration of DNA is about 1.7[nM] and the concentration of plasmids is about 200 times of it. This is obviously weak. Reactions like this are well affected by fluctuations due to the reactants's limited copy numbers. So, we need to take this fluctuations into our modeling which is derived from stochastic methods. We also introduce delay effect.

First we explain about the Gillespie algorithm which is often used in stochastic simulations. In the Gillespie algorithm, we treated not the concentration of molecules but the number of them. Reactions are also viewed as discrete, essentially instantaneous physical events. What we have to determine when using the Gillespie algorithm is (A) when the next reaction is going to occur and (B) which type of the reaction it will be. Looking more closely at the Gillespie algorithm by the next set of reaction formulas:

equation (19):

<img src = "Ono_%2819%29.png" height="48px" class = "math" />

Let n1, n2, and n3 denote the respective copy number of the components X1, X2, and X3. Notice that they are all integer. First we have to determine how easily each reactions could happen. It depends on the number of components copied. In stochastic simulations, we often determine the parameter called stochastic rate constant, which is often written as "c. We assume that each possible combinations of reactant molecules have the same probability c per unit time to react. In other words, c× dt gives the probability that a particular combination of reactant molecules will react in a short time interval [t,t+dt). We call the stochastic rate constant of a reaction j, cj. Considering the all combinations of reactant molecules, the probability that the reaction 0 occur in [t,t+dt) is c× n1× n2. We now define the propensity function as the function of which product with dt gives the probability that a particular reaction will occur in the next infinitesimal time dt, which is often written as "a. Later on, the propensity function of a reaction j is aj. Following the equation:

equation (20):

<img src = "Ono_%2820%29.png" height="15px" class = "math" />

Notice that cj is invariant parameter, but aj changes as the state changes. In the same way, a1 = c1× n3.

First we answer the question (A) when is the next reaction going to occur? Now, to simplify the situation we assume the situation that only the reaction 0 occurs. Set the time as 0, and define P(t) as the probability that the reaction 0 doesn't occur in [0,t). Then from the definition of a,we obtain the equation; P(t+dt) = P(t)(1-a× dt). (Because the probability that the reaction 0 doesn't occur in [0,t+dt) is the product of the probability that the reaction 0 doesn't occur in [0,t) with the probability that the reaction 0 doesn't occur in [t,t+dt).) Using P(t+dt) = P(t) + dP(t)/dt, we get :

equation (21):

<img src = "Ono_%2821%29.png" height="50px" class = "math" />

Because the probability that the reaction0 doesn't occur in a 0 second interval is zero; P(0)=1. Solving the above ordinary differential equation we get :

equation (22):

<img src = "Ono_%2822%29.png" height="24px" class = "math" />

If r1 is a uniform number from [0,1], the time of the next reaction should be determined by solving P(t) = r1. Using (22), we get t = -a0/log r1.

Now we suppose there is N types of reactions. Let a1,a2,…,aN denote the respective propensity function of reaction 1,2,…,N. From previous method;

equation (23):

<img src = "Ono_%2823%29.png" height="70px" class = "math" />

Let dt be so small that we can ignore the term of higher than two orders of dt. The equation(23) becomes:

equation (24):

<img src = "Ono_%2824%29.png" height="70px" class = "math" />

Solving (4) (a = Σ_{j=1}^{N} aj)

equation (25):

<img src = "Ono_%2825%29.png" height="23px" class = "math" />

Setting τ as the time of the next reaction, we get:

equation (26):

<img src = "Ono_%2826%29.png" height="45px" class = "math" />

Second we answer the question (B) what types of the reaction will it be? We determined the time of the next reaction, so what we have left to do is to determine what kind of reaction occurs. Some people may feel queer, but in the Gillespie algorithm, first the time of next reaction will be determined, and second the kind of reaction will be determined. It is natural to determine that the probability that the reaction j occurs is aj/a. If r2 is a uniform number from [0,1], j is the only number that meets below in equations:

equation (27):

<img src = "Ono_%2827%29.png" height="45px" class = "math" />

In the case a0 ≧ a× r2, the reaction that occurred is reaction 0.

Now we can run the Gillespie algorithm by following the next steps.(tMAX is the finish time of the simulation.)
1.Initialize the system at t = 0 with initial numbers of molecules for each spices, n0,… ,ns
2.For each j = 0,1,…,r, calculate aj(n) based on the current state n using (20)
3.Calculate the exit rate a(n) = Σ_{j=0}^{r} a</sub>j</sub>(n).
4.Compute a sample tau of the time until the next time using (26)
5.Update the time t = t +τ
6.Compute a sample j of the reaction index using (27)
7.Update the state n according to the reaction j.
8.If t < tMAX, return to Step 2

Stochastic rate constant can be determined by the parameters we used in the deterministic model (if we modeled the reaction in the deterministic model) . If there are a lot of reactant molecules, stochastic simulations have to show similar results as those of deterministic simulations. For this reason, stochastic rate constant, c, can be calculated from the chemical reaction rate constant, k. See [10] if you want to know the deriving process. Here we just write the result.

For a unimolecular reaction, c numerically equals to k, whereas for a bimolecular reaction, c equals to k/NAV if the species of the reactant molecules are different, or 2k/NAV if they are the same.V is the volume of the system and NA is the Avogadro's constant.

However, these results should not be taken to imply that the mathematical forms of the propensity functions are just heuristic extrapolations. The propensity functions are grounded in molecular physics, and the formulas of deterministic chemical kinetics are approximate consequences of the formulas of stochastic chemical kinetics, not the other way around.

The Gillespie algorithm is so clear and useful that it is often used. However, this algorithm is not suitable for describing transcriptions and translations because they are very slow and complex reactions involving many kinds of reactant molecules. If we treat transcription from plasmids as one reaction, assuming the copy number of plasmids as 200, then the propensity function a equals to the stochastic rate constant multiplied by 200 (200× c). So it will take about one of a two hundred times of an average transcription time to finish one transcription. Of course, in the time scale of average transcription time it is not a big problem, but this may not be good for simulating, like in our project, the system that uses the time for transcriptions and translations cannot be shortened. We introduce time-delay into the Gillespie algorithm based on [12]~[14]. The mathematical correctness of this algorithm is proved in [14]. Time-delay means treating reactions as following:

<img src = "Ono_%2828%29.png" height="48px" class = "math" />

Furthermore, transcriptions and translations are too complex to list up all of the reactions step by step. Therefore it is better to treat them as time-delay than reaction formulas.

Now we begin to model our project, sigma Re-counter. In our model, there are five types of reactions: transcription, translation, an association and disassociation of crRNA and taRNA, an association and disassociation of sigma and anti-sigma, and degradation. We introduce time-delay into only transcription and translation. Then, we explain how we treat these three reactions in general.

We explain transcription's model of mRNA[11]. When RNA polymerase binds to the promoter region, first they take the RNAP・promoter close complex. At this state, the complex can dissociate. But with a certain probability, the close complex turn to the open complex which doesn't dissociate. After the RNA polymerase and the promoter region take the open complex, the elongation of mRNA starts. In this state, the promoter region is cleared and RBS(Ribosome binding site) is synthesized, so we model as reaction3'. It is difficult to model the elongation of mRNA by elementary reaction formula because it is a complex reaction, so we model by time-delay. Then the reaction formula of transcription can be described as following's reactions:

<img src = "Ono_%2829%29.png" height="137px" class = "math" />

combining reaction3' and reaction3, we get:

<img src = "Ono_%2830%29.png" height="20px" class = "math" />

In the above reaction formula, DNA denotes the promoter region, and mRNA denotes RBS.

In our project, we have to model transcription of taRNA. Because taRNA functions only when it is completely transcribed, we model as following reactions:

<img src = "Ono_%2834%29.png" height="107px" class = "math" />

We refer to the translational model [8]. Similarly to the transcriptional model we model as following;

<img src = "Ono_%2831%29.png" height="106px" class = "math" />

combining reaction2' and reaction2, we get:

<img src = "Ono_%2832%29.png" height="20px" class = "math" />

We model the association and disassociation of crRNA and taRNA as a reversible reaction:

<img src = "Ono_%2833%29.png" height="45px" class = "math" />

We know that anti-sigma directly affect sigma[20], but a large part of their relation is still unknown. From their direct relation, We model as following reactions:

<img src = "Ono_%2835%29.png" height="45px" class = "math" />

We model the degradation as the following reaction:

<img src = "Ono_%2836%29.png" height="18px" class = "math" />

We can conclude that reaction formulas of our model are as follows:


<img src = "Ono_table2.png" class = "figure" />

Parameter

We got the parameters mainly from [11]. The summary of the parameters of this model is given in the Table 2 (α = 1/NA V, NA is Avogadro constant, and V is the volume of E. coli.):


<img src = "Ono_table3.png" class = "figure" /> <legend>Table 2</legend>

We estimated τ from the fact that average translational speed is 20 aa/sec, and average transcriptional speed is 40 bp/sec.[18][19] τ1 = 40.0,τ2 = 40.0,τ3 = 40.0, τ4 = 40.0,τ5 = 40.0,τ6 = 40.0, τ7 = 50.0,τ8 = 40.0,τ9 = 40.0.

Simulations

Before executing every step of the Gillespie algorithm, we set the number of RNAP molecules and ribosomes as 35 and 350[11].

<img src = "Ono_p_GFP.png" height="200px" class = "math" /> <legend>Fig.5</legend> <img src = "Ono_p_sigma.png" height="200px" class = "math" /> <legend>Fig.6</legend>

Expression of GFP and sigma when arabinose is added at t = 5000-5120, 15000-15120. Figure 5 is the time series of GFP, and figure 6 is the time series of sigma. From figure 6, we can confirm that the system is reset.

<img src = "Sub_leaky_expression_analysis.png" class = "contTitle" />

As written in the Project part, we can imagine an expansion of the sigma-recounter, which can count a lot of numbers. One of the most important problems arising from the expansion is that a single induction may inadvertently increase the count number 2 or more times. We will explain this by considering the 3 counter. Below is the construct of the 3 counter.

<img src = "Ono_3count_construct.png" class = "figure" />

When arabinose is induced for the first time, sigma A will be expressed and its positive feedback loop will be put in motion. On the second induction, sigma B will be expressed, and third, GFP will be expressed.

There is a problem that occurs in the first induction. Not only sigma A but also a little amount of sigma B will be expressed in the first induction, as a portion of sigma A precociously stimulates the next step in the counting sequence. The sigma B positive feedback loop circuit exemplifies that little expression. Because of these two characteristics, sigma B has possibility of being highly expressed before the second induction; in other words, it counts 2. We did not have to model this in our main construct, because GFP does not have its positive feedback loop circuit.

Therefore we modeled if the counting can accurately be conducted. We also analyzed parameter sensitivity on the pulse length of arabinose induction.

<img src = "Ono_3count_result.png" class = "math" /> <legend>Fig.7 The model of 3 counter. Expression of sigmaA, sigmaB, GFP when arabinose is added at t= 10000/7, 30000/7, 50000/7. The pulse length is t = 60.)The unit of vertical axis is [nM], and that of the horizontal axis is [sec].</legend>

Figure 7 shows that the first count can precisely be conducted even if a small amount of sigma B is expressed.

<img src = "Ono_ara_sweep.png" class = "math" /> <legend>Fig.8 Parameter sensitivity on the pulse length of arabinose induction. Vertical axis is the ratio of GFP that is expressed in the second induction to GFP that is expressed over the entire experiment. Horizontal axis is the pulse length of arabinose induction.</legend>

Figure 8 shows how strong the leaky expression is during the second induction event. There seems robust relation between the pulse length and whether it succeeded in counting. There seems to be three patterns depending on the pulse length. This simulation result can be explained by assuming that the positive feedback loop of sigma is put into motion when the concentration of sigma exceeds a threshold. When the pulse length of arabinose is too short and sigma A is expressed less than the threshold value, the sigma A concentration will head to zero after the induction. Therefore, in any induction event, the count number remains unchanged, and very few GFP is expressed. This is the t = 1 ~ 42 pattern in Figure 8. When the pulse length is correct, sigma A is expressed more than the threshold value of sigma A while sigma B was expressed less than the threshold value of sigma B. This time, the count is accurately conducted and GFP is expressed significantly in the third induction. This is the t = 43~117 pattern of Figure 8.When the pulse length is too long, sigma B is expressed more than the threshold value. This time, the first induction made 2 counts, which is the situation we were concerned about.

Thus, the pulse length plays an important role in 3 counter.

In fact, this behavior can be explained by an easy dynamical system analysis.

<img src = "Sub_improvement_of_counter.png" class = "contTitle" />

As we have seen in the previous sections, current sigma-recounter depends much on the pulse length of the arabinose induction; when the pulse length is too long, it would count 2 or more (if there is). We will show below the 2 counter's figure that the pulse length is hypothesized to be too long.

<img src = "Ono_implementation_failure.png" class = "math" /> <legend>Fig.9The model of 2 counter when the pulse length is long. Unit of vertical axis is [nM], and that of horizontal axis is [sec]. induction time: 20000-40000[sec], 60000-80000[sec]</legend>

However, with a little improvement, our counter would not count more than 1 by a single pulse, as long as the pulse length is long enough for it to count. You want to make a construct that is not affected by the pulse length. Shown below is such an improved construct.

<img src = "Ono_implementation_construct.png" class = "figure" />

X and Y are substances that both are required to activate PX&Y promoter.

Before arabinose induction, PTet and PConst express Y and crRBS-sigma. After arabinose induction, PBAD becomes activated and TetR and X are expressed. X binds to Y and the transcription of taRNA from PX&Y occur, which leads to counting. At that time, expression of Y is repressed by TetR and the amount of Y, which has short half life, decreases rapidly. Thus, PX&Y is again repressed, the amount of taRNA decreases, and the counter never counts more than 1. You might be afraid that PX&Y also begins transcription of taRNA when the induction ends; however, supposed degradation of X is faster than that of TetR, it will not occur. When the induction ends, X first degrades while still a lot of TetR remain and Y is not abundant. Since PTet has a sigmoidal transcriptional response, the production rate of Y will change little even if the concentration of TetR decrease a little. When TetR degrades so much that it finishes repression of Y, most of X have already decomposed, and PX&Y will not be activated to begin transcription of taRNA.

We modeled this construct to test if it can be realized. We did not model resetting this time, either.

<img src = "Ono_implementation_result.png" class = "math" />

<legend>Fig.10 The improved 2 counter. The induction was modeled to be conducted just the same as the non-improved version. Units of vertical axis and horizontal axis are the same, too.</legend>

Although the pulse length is long, counts are accurately conducted. Thus, theoretically, the counter works independently of the pulse length is suggested to be available. Only thing we have to do is to research for the substances that satisfy these conditions!

<img src = "Sub_guideformodeling.png" class = "contTitle" />

What is modeling

The model should be sufficiently detailed and precise so that it can in principle be used to simulate the behavior of the system on a computer.

In the context of molecular cell biology, a model may describe the mechanisms involved in transcription, translation, gene regulation, cellular signaling, DNA damage and repair processes, homeostatic processes, the cell cycle, or apoptosis. Indeed any biochemical mechanism of interest can, in principle, be modeled. At a higher level, modeling may be used to describe the functioning of a tissue, organ, or even an entire organism. At still higher levels, models can be used to describe the behavior and time evolution of populations of individual organisms.

The first issue to confront when embarking on a modeling project is to describe on exactly which features to include in the model, and in particular, the level of detail model is intended to capture. Interacting with other cells and/or its environment, the cell realizes four key functions: growth, proliferation, apoptosis, and differentiation. The processes that realize these functions of a cell can be further organized into three processes levels: gene regulation, signaltransduction and metabolism. So, a model of an entire organism is unlikely to describe the detailed functioning of every individual cell, but a model of a cell is likely to include a variety of very detailed description of key cellular processes. Even then, however, a model of a cell is unlikely to contain details of every single gene and protein.

Indeed, really accurate modeling of the process would require a model far more detailed and complex than most biologists would be comfortable with, using molecular dynamic simulations that explicitly manage the position and momentum of every molecule in the system.

The "art" of building a good model is to capture the essential features of the biology without burdening the model with non-essential details. Every model is to some extent a simplification of the biology, but models are valuable because they take ideas that might have been expressed verbally or diagrammatically and make them more explicit, so that they can begin to be understood in a quantitative rather than purely qualitative way.

Aims of modeling

The features of a model depend very much on the aims of the modeling exercise. We therefore need to consider why people model and what they hope to achieve by so doing. Often the most basic aim is to make clear the current state of knowledge regarding a particular system, by attempting to be precise about the elements involved and the interactions between them. Doing this can be a particularly effective way of highlighting gaps in understanding. In addition, having a detailed model of a system allows people to test that their understanding of a system is correct, by seeing if the implications of their models are consistent with observed experimental data. In practice, this model validation stage is central to the systems biology approach. However this work will often represent only the initial stage of the modeling process. Once people have a model they are happy with, they often want to use their models predictively, by conducting "virtual experiments" that might be difficult, time-consuming, or impossible to do in the lab. Such experiments may uncover important indirect relationships between the model components that would be hard to predict otherwise. An additional goal of modern biological modeling is to pool a number of small models of well-understood mechanisms into a large model in order to investigate the effect of interactions between the model components. Models can also be extremely useful for informing the design and analysis of complex biological experiments.

In summary, modeling and computer simulation are becoming increasingly important in post-genomic biology for integrating knowledge and experimental data and making testable predictions about the behavior of complex biological systems.

Stochastic Approaches

The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). The analytical nature of the early stochastic approaches was highly complicated and, in some cases, intractable so that they received little attention in the biochemical community. Later, the situation changed with the increasing computational power of modern computers. And finally Gillespie presented an ground-breaking algorithm for numerically generating sample trajectories of the abundances of chemical species in chemical reaction networks. The so-called "stochastic simulation algorithm," or "Gillespie algorithm," can easily be implemented in any programming or scripting language that has a pseudorandom number generator. Several software packages implementing the algorithm have been developed. Differerent stochastic approaches and their interrelationships are depicted in Figure.

<img src="Chart.png" class="figure" />

For large biochemical systems, with many species and reactions, stochastic simulations (based on the original Gillespie algorithm) become computationally demanding. Recent years have seen a large interest in improving the efficiency/speed of stochastic simulations by modification/approximation of the original Gillespie algorithm. These improvements include the "next reaction" method of Gibson and Bruck, the "τ-leap" method and its various improvements and generalizations and the "maximal time step method", which combines the next reaction and the τ-leap methods.

While stochastic simulations are a practical way to realize the CME, analytical approximations offer more insights into the influence of noise on cell function. Formally, the CME is a continuous-time discrete-state Markov process. For gaining intuitive insight and a quick characterization of fluctuations in biochemical networks, the CME is usually approximated analytically in different ways, including the frequently used chemical Langevin equation (CLE), the linear noise approximation (LNA), and the two-moment approximation (2MA).

The traditional Langevin approach is based on the assumption that the time-rate of abundance (copy number or concentration) or the flux of a component can be decomposed into a deterministic flux and a Langevin noise term, which is a Gaussian (white noise) process with zero mean and amplitude determined by the system dynamics. This separation of noise from system dynamics may be a reasonable assumption for external noise that arises from the interaction of the system with the other systems (such as the environment), but cannot be assumed for the internal noise that arises from within the system. Internal noise is not something that can be isolated from the system, because it results from the discrete nature of the underlying molecular events. Any noise term in the model must be derived from the system dynamics and cannot be presupposed in an ad hoc manner. However, the CLE does not suffer from above criticism because Gillespie derived it from the CME description. The CLE allows much faster simulations compared to the exact stochastic simulation algorithm (SSA) and its variants. The CLE is a stochastic different equation (dealing directly with random variables rather than moments) and has no direct way of representing the mean and (co)variance and the coupling between the two. That does not imply the CLE, like the LNA, which has the same mean as the solution of deterministic model, ignores the coupling.

Markov processes form the basis of the vast majority of stochastic models of dynamical systems. At the center of a stochastic analysis is the Chapman-Kolmogorov equation (CKE), which describes the evolution of a Markov process over time. From the CKE stem three equations of practical importance: the master equation for jump Markov processes, the Fokker-Planck equation for continuous Markov processes, and the differential Chapman-Kolmogorov equation (dCKE) for processes made up both the continuous and jump parts.

Stochastic Formulation and Markov Process

Since the occurrence of reactions involves discrete and random events at the microscopic level, it is impossible to deterministically predict the progress of reactions in terms of the macroscopic variables (observables).

Our goal is to determine how the process N(t) of copy numbers evolves in time. Starting at time t=0 from some initial state N(0), every sample path of the process remains in state N(0) for a random amount of time W1 until the occurrence of a reaction takes process to a new state N(W1); it remains in state N(W1) for another random amount of time W2 until the occurrence of another reaction takes the process to a new state N(W1+W2), and so on. In other words, the time-dependent copy number N(t) is a jump process.

The stochastic process N(t) is characterized by a collection of state probabilities and transition probabilities. The state probability P(n,t)=Pr[N(t)=n] is the probability that the process N(t) is the state n at a time t. The transition probability Pr[N(t0+t)=n|N(t0)=m] is the conditional probability that process N(t) has moved from state m to state n during the time interval [t0,t0+t]. The analysis of a stochastic process becomes greatly simplified when the above transition probability depends on (i) the starting state m but not on the states before time t0 and (ii) the interval length t but not on the. Property (i) is the well-known Markov process. The process holding property (ii) is said to be homogeneous process.

References

[1] Uri Alon『An introduction to Systems Biology: Design Principles od Biological Circuits』

[2] Isaacs, Farren J., et al. "Engineered riboregulators enable post-transcriptional control of gene expression." Nature biotechnology 22.7 (2004): 841-847.

[3] Kelly, Jason R., et al. "Measuring the activity of BioBrick promoters using an in vivo reference standard." Journal of biological engineering 3.1 (2009): 4.

[4] Part:BBa I13453 <http://parts.igem.org/Part:BBa_I13453> ( We finally accessed on 2014/8/20)

[5]Megerle, Judith A., et al. "Timing and dynamics of single cell gene expression in the arabinose utilization system." Biophysical journal 95.4 (2008): 2103-2115.

[6] iGEM Kyoto 2010 <https://2010.igem.org/Team:Kyoto/Project/Goal_A>

[7] pSB1A2 <http://parts.igem.org/Part:pSB1A2> ( We finally accessed on 2014/8/20)

[8] pSB1C3 <http://parts.igem.org/Part:pSB1C3> ( We finally accessed on 2014/8/20)

[9] Andersen, Jens Bo, et al. "New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria." Applied and environmental microbiology 64.6 (1998): 2240-2246.

[10] Gillespie, Daniel T. "A general method for numerically simulating the stochastic time evolution of coupled chemical reactions." Journal of computational physics 22.4 (1976): 403-434.

[11] Kierzek, Andrzej M., Jolanta Zaim, and Piotr Zielenkiewicz. "The effect of transcription and translation initiation frequencies on the stochastic fluctuations in prokaryotic gene expression." Journal of Biological Chemistry 276.11 (2001): 8165-8172.

[12]Roussel, Marc R., and Rui Zhu. "Validation of an algorithm for delay stochastic simulation of transcription and translation in prokaryotic gene expression." Physical biology 3.4 (2006): 274.

[13] Ribeiro, Andre S. "Stochastic and delayed stochastic models of gene expression and regulation." Mathematical biosciences 223.1 (2010): 1-11.

[14] Schlicht, Robert, and Gerhard Winkler. "A delay stochastic process with applications in molecular biology." Journal of mathematical biology 57.5 (2008): 613-648.

[15] Santillán, Moisés, and Michael C. Mackey. "Influence of Catabolite Repression and Inducer Exclusion on the Bistable Behavior of the< i> lac</i> Operon." Biophysical journal 86.3 (2004): 1282-1292.

[16] D.J.Wilkinson.Stochastic Modelling for Systems Biology.Mathematical & Computational Biology.Chapman & Hall/CRC, London, Apr. 2006. ISBN 1584885408

[17] Mukhtar Ullah & Olaf Wolkenhauer Stochastic Approaches for Systems Biology.

[18] Bionumbers <http://bionumbers.hms.harvard.edu/bionumber.aspx?&id=100059&ver=30&trm=translation> ( We finally accessed on 2014/8/20)

[19] Bionumbers <http://bionumbers.hms.harvard.edu/bionumber.aspx?&id=100059&ver=30&trm=translation> ( We finally accessed on 2014/8/20)

[20]Staron, Anna. Phylogenetic and functional analyses of stress-responsive bacterial transmembrane signal transducing systems. Diss. lmu, 2012.