Team:UT-Tokyo/Counter/Project/Modeling

From 2014.igem.org

(Difference between revisions)
Line 92: Line 92:
<p>Horizontal axis : K<sub>&sigma;</sub></p>
<p>Horizontal axis : K<sub>&sigma;</sub></p>
<img src = "https://static.igem.org/mediawiki/2014/4/4e/H_Ksigma_sweep.png" height="200px" class = "math" />  
<img src = "https://static.igem.org/mediawiki/2014/4/4e/H_Ksigma_sweep.png" height="200px" class = "math" />  
-
<p>THorizontal axis :V<sub>&sigma;</sub></p>
+
<p>Horizontal axis :V<sub>&sigma;</sub></p>
<img src = "https://static.igem.org/mediawiki/2014/0/09/H_Vsigma_sweep.png" height="200px" class = "math" />  
<img src = "https://static.igem.org/mediawiki/2014/0/09/H_Vsigma_sweep.png" height="200px" class = "math" />  
<h3>Stochastic Model</h3>
<h3>Stochastic Model</h3>

Revision as of 23:26, 17 October 2014

<img src = "Sub_overview.png" class = "contTitle" />

Modeling is an attempt to describe, in a precise way, an understanding of the elements of a system of interest, their states, and their interactions with other elements.

The purpose of our modeling team is to peel back the layer of appearance of the device to reveal it's underlying nature. We tried to improve the device, cooperating with the experiment team. To achieve our goal, we have developed three fundamental themes. These three themes divide the modeling part into three parts. At the beginning, we confirmed whether our circuit realizes a reaction:this for part 1. Next, we adjusted the parts and the conditions, for the device to reproduce a satisfactory value suitable for naming the device as a counter:this for part 2. Finally, we discussed what would be appropriate modeling, frequent issue to attack, in order to find the best strategy of modeling and wrote how we constructed our model: this for part 3.

In Part1(Deterministic Model,Stochastic Model), we approached the problem in two ways.

・Deterministic model:In this model,chemical reactions are described as differential equations and concentration of reaction products can be calculated by those of reactants. This model is intuitive, simple and hence popular to estimate the results of experiment.

・Stochastic model:The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). We used Gillespie Algorithm to solve CME.

In Part2(Result), changing measured values of gene copy numbers, strength of PConst, sequence of taRNA and etc. in silico, we estimated in which combination of values the counter outputs a sufficient amount of data.

In Part3(Guide for Modeling), what is modeling, aims of modeling and different stochastic approaches and their interrelationShips

<img src = "Sub_our_main_construct.png" class = "contTitle" />

First construction we came up with about the counter is described as following.

these generaters have following functions

・production of cis-repressed mRNA of a sigma factor by constitutive promoter.

・production of taRNA by 1st-induced arabinose and activating of cis-repressed mRNA by taRNA.

・positive feedback loops formed by sigma factors.

・production of cis-repressed mRNA of GFP by sigma factors.

・activating mRNA of GFP by 2nd-induced.

・degradation of sigma factors by induction of IPTG.

The counter consists of these functions. Our first work is confirming theoretically that these functions cooperate with each other, GFP expesses significantly only by 2nd-induced arabinose and the counter is reset by IPTG.

In this simulation, we used two types of model, deterministic model and stochastic model. First, we simulated the counter by deterministic model which is popular and intutive. Next, we simulated by stochastic model which takes into account paucity and fluctuation of reactants.

Formulation of models and results of simulations are described below.

Contents

Deterministic Model

Formulation of the Model

First of all, we constructed the deterministic model to estimate the behavior of the counter. In this model, biochemical reactions are described as differential equations and concentration of reaction product can be calculated by those of reactants. This model is intuitive, simple and hence popular to estimate the result of experiment. We were able to get some parameters for modeling of counter from previous works.[→ see parameter section]

We had simplified the construction of mathematical model before described time evolution in which concentrations of mRNAs and proteins change as differential equations. First, we regarded that the reaction between taRNA(transactivating RNA) and crRNA(cis-repressor RNA) in riboregulator is much faster than that of transcription or translation and equilibrium reaction. This diminution of parameters enable us to use the equilibrium constant as a parameter and prevent us from overfitting when we adapt this model to raw data.

equation (1):

<img src = "Ono_%281%29.png" height="50px" class = "math" />

So, the concentrations of mRNAs and the coupling of taRNA and crRNA are represented as stated above . Subscript means the gene encoded by the mRNA. We regarded that the affinities between crRNA and taRNA on different mRNAs were equal. The dissociation constant of equilibrium reaction was therefore shown as following.

equation (2):

<img src = "Ono_%282%29.png" height="100px" class = "math" />

Using dissociation constant, concentrations of reaction products such as [mcrcr-σ ] could be discribed as function of those of taRNA and mRNA of σ and GFP. We put X, A and B as the total quantity of taRNA, sigma and GFP, respectively.

equation (3):

<img src = "Ono_%283%29.png" height="20px" class = "math" />

equation (4):

<img src = "Ono_%284%29.png" height="20px" class = "math" />

equation (5):

<img src = "Ono_%285%29.png" height="20px" class = "math" />

Using these equations((3)-(5)) and equilibrium constant, concentrarions of free taRNA, free mRNAs and taRNA-mRNA complexes are described as following. These are all of simplifications.

equation (6):

<img src = "Ono_%286%29.png" height="45px" class = "math" />

equation (7):

<img src = "Ono_%287%29.png" height="45px" class = "math" />

equation (8):

<img src = "Ono_%288%29.png" height="45px" class = "math" />

equation (9):

<img src = "Ono_%289%29.png" height=45px" class = "math" />

equation (10):

<img src = "Ono_%2810%29.png" height="44px" class = "math" />

Finally, we built up differential equations about concentrations of reaction products including mRNA of sigma which has no riboregulator. (It makes positive feedback loop.) This model assumed relationship between promoter and the amount of transcriptional product increasing per unit time. The amount is in proportion to the number of promoter if the promoter expressed constitutively and can be approximated by Hill equation if the inducer controled its promoter. It was also assumed that the degradation of proteins and RNAs followed first order kinetics. Some of used parameters were cited from references.[1]~[9]

We aimed to determine parameters about sigma through experiment and used provisional parameter determined in reference to other promotor.

equation (11):

<img src = "Ono_%2811%29.png" height="76px" class = "math" />

equation (12):

<img src = "Ono_%2812%29.png" height="43px" class = "math" />

equation (13):

<img src = "Ono_%2813%29.png" height="45px" class = "math" />

The transctiptional activity of mRNA encoding sigma factor initiated by positive feedback loop and that of mRNA encoding anti-sigma induced by IPTG are described as following.

equation (14):

<img src = "Ono_%2814%29.png" height="46px" class = "math" />

equation (15):

<img src = "Ono_%2815%29.png" height="43px" class = "math" />

In our project, IPTG induction was aimed at enough production of anti-sigma to reset the counter and the sensitivity of Plac(lactose promoter) was not our main interest. Therefore, we used simple equation,(15) to describe how lactose promoter behave. Plac depend on the concentration of IPTG but we regarded it as a fixed number in this modeling.

Taking into account that translation coincide with transcription in prokaryotes, the linearity between the concentration of transcriptional product and the change rate of the amount of translational product was assumed. this linearity does not depend on the kind of translational product. We also assumed that anti-sigma combine with sigma and form inert matter irreversibly, and the reaction velocity of that is proportional to product of these.

equation (16):

<img src = "Ono_%2816%29.png" height="44px" class = "math" />

equation (17):

<img src = "Ono_%2817%29.png" height="44px" class = "math" />

equation (18):

<img src = "Ono_%2818%29.png" height="44px" class = "math" />

Using above-mentioned differential equations, we simulated behavior of the counter by Euler's method.

Parameter

Here, we explain how we determined the parameters of the deterministic model. PoPS (promoter per second) of PConst is 0.03[3], so its promoter activity is 0.03/(6.0× 1023× 1.0× 10-15)[M/sec] = 0.051[nM/sec]. The switch point and hill coefficients of PBAD is written in [4]. PoPS of PBAD is 5/60[5] , so its RPU (relative promoter unit) is (5/60)/(0.03) = 2.78. We set the RPU of Plac as 2 when induced. We don't consider the leak expression from Plac.

The average half life of mRNA is 2-5 min[1], so we set the degradation rate of mRNA as 0.010[/sec]. The half life of GFP is infinite[9], so we set the degradation of GFP as 0.0[sec]. The degradation rate of sigma factor(reference) is fast. So we set as 0.001[/sec]. The degradation rate of anti-sigma is unknown, so we set as 6.0× 10-6, the average degradation rate of protein(reference). The equilibrium constant of the equations (1)is 80.0[nM][2]. The reaction rate of the association of sigma and anti-sigma is unknown. We assumed this reaction is fast so we set as 10.0[/M sec]. The number of plasmids copied is 100~300[7][8] , so we set as 200. When the number of mRNA is 62, the protein synthesis rate is 20[/sec][11], so we set the translational rate as 20/62 = 0.32.

The summary of the parameters of this model is given in Table 1.

Table 1:


<img src = "Ono_table1.png" class = "figure" />

Simulations


Figure 1:

<img src = "Ono_2count_result.png" class = "figure" />

arabinose induction time: 5000-5120[sec], 15000-15120[sec], IPTG induction time: t > 22500[sec]

The unit of vertical axis is [nM], and that of the horizontal axis is [sec]. We can see GFP expression only after 2nd-induction and rapid degradation of sigma after IPTG induction. These mean that the counter shows collect behavior theoretically.

The sensitivity analysis

By conducting sensitivity analysis, we could know how each parameter affect the system.Results of some specific parameters are shown below. The parameters we changed were shown in Horizontal axis.When we changed the parameter to various values, other parameters were fixed.

Horizontal axis : pulse length of arabinose inducing time

<img src = "H_ara_sweep.png" height="200px" class = "math" />

Horizontal axis : Kσ

<img src = "H_Ksigma_sweep.png" height="200px" class = "math" />

Horizontal axis :Vσ

<img src = "H_Vsigma_sweep.png" height="200px" class = "math" />

Stochastic Model

Formulation of the Model

If there are a lot of molecules, modeling usually uses ordinary differential equations, but some in vivo reactions involve only a few molecules. For example, transcription involves the cell's genomic DNA which is one copy or plasmids which are about 200 copies [7][8] in a cell of Escherichia coli. The average size of a cell of E. coli is about 1.0× 10-15[L][15], so the concentration of DNA is about 1.7[nM] and the concentration of plasmids is about 200 times of it. This is obviously weak. Reactions like this are well affected by fluctuations due to the reactants's limited copy numbers. So, we need to take this fluctuations into our modeling which is derived from stochastic methods. We also introduce delay effect.

First we explain about the Gillespie algorithm which is often used in stochastic simulations. In the Gillespie algorithm, we treated not the concentration of molecules but the number of them. Reactions are also viewed as discrete, essentially instantaneous physical events. What we have to determine when using the Gillespie algorithm is (A) when the next reaction is going to occur and (B) which type of the reaction it will be. Looking more closely at the Gillespie algorithm by the next set of reaction formulas:

equation (19):

<img src = "Ono_%2819%29.png" height="48px" class = "math" />

Let n1, n2, and n3 denote the respective copy number of the components X1, X2, and X3. Notice that they are all integer. First we have to determine how easily each reactions could happen. It depends on the number of components copied. In stochastic simulations, we often determine the parameter called stochastic rate constant, which is often written as "c. We assume that each possible combinations of reactant molecules have the same probability c per unit time to react. In other words, c× dt gives the probability that a particular combination of reactant molecules will react in a short time interval [t,t+dt). We call the stochastic rate constant of a reaction j, cj. Considering the all combinations of reactant molecules, the probability that the reaction 0 occur in [t,t+dt) is c× n1× n2. We now define the propensity function as the function of which product with dt gives the probability that a particular reaction will occur in the next infinitesimal time dt, which is often written as "a. Later on, the propensity function of a reaction j is aj. Following the equation:

equation (20):

<img src = "Ono_%2820%29.png" height="15px" class = "math" />

Notice that cj is invariant parameter, but aj changes as the state changes. In the same way, a1 = c1× n3.

First we answer the question (A) when is the next reaction going to occur? Now, to simplify the situation we assume the situation that only the reaction 0 occurs. Set the time as 0, and define P(t) as the probability that the reaction 0 doesn't occur in [0,t). Then from the definition of a,we obtain the equation; P(t+dt) = P(t)(1-a× dt). (Because the probability that the reaction 0 doesn't occur in [0,t+dt) is the product of the probability that the reaction 0 doesn't occur in [0,t) with the probability that the reaction 0 doesn't occur in [t,t+dt).) Using P(t+dt) = P(t) + dP(t)/dt, we get :

equation (21):

<img src = "Ono_%2821%29.png" height="50px" class = "math" />

Because the probability that the reaction0 doesn't occur in a 0 second interval is zero; P(0)=1. Solving the above ordinary differential eqaution we get :

equation (22):

<img src = "Ono_%2822%29.png" height="24px" class = "math" />

If r1 is a uniform number from [0,1], the time of the next reaction should be determined by solving P(t) = r1. Using (22), we get t = -a0/log r1.

Now we suppose there is N types of reactions. Let a1,a2,…,aN denote the respective propensity function of reaction 1,2,…,N. From previous method;

equation (23):

<img src = "Ono_%2823%29.png" height="70px" class = "math" />

Let dt be so small that we can ignore the term of higher than two orders of dt. The equation(23) becomes:

equation (24):

<img src = "Ono_%2824%29.png" height="70px" class = "math" />

Solving (4) (a = Σ_{j=1}^{N} aj)

equation (25):

<img src = "Ono_%2825%29.png" height="23px" class = "math" />

Setting τ as the time of the next reaction, we get:

equation (26):

<img src = "Ono_%2826%29.png" height="45px" class = "math" />

Second we answer the question (B) what types of the reaction will it be? We determined the time of the next reaction, so what we have left to do is to determine what kind of reaction occurs. Some people may feel queer, but in the Gillespie algorithm, first the time of next reaction will be determined, and second the kind of reaction will be determined. It is natural to determine that the probability that the reaction j occurs is aj/a. If r2 is a uniform number from [0,1], j is the only number that meets below in equations:

equation (27):

<img src = "Ono_%2827%29.png" height="45px" class = "math" />

In the case a0 ≧ a× r2, the reaction that occurred is reaction 0.

Now we can run the Gillespie algorithm by following the next steps.(tMAX is the finish time of the simulation.)
1.Initialize the system at t = 0 with initial numbers of molecules for each spices, n0,… ,ns
2.For each j = 0,1,…,r, calculate aj(n) based on the current state n using (20)
3.Calculate the exit rate a(n) = Σ_{j=0}^{r} a</sub>j</sub>(n).
4.Compute a sample tau of the time until the next time using (26)
5.Update the time t = t +τ
6.Compute a sample j of the reaction index using (27)
7.Update the state n according to the reaction j.
8.If t < tMAX, return to Step 2

Stochastic rate constant can be determined by the parameters we used in the deterministic model (if we modeled the reaction in the deterministic model) . If there are a lot of reactant molecules, stochastic simulations have to show similar results as those of determinisitic simulations. For this reason, stochastic rate constant, c, can be calculated from the chemical reaction rate constant, k. See [10] if you want to know the deriving process. Here we just write the result.

For a unimolecular reaction, c numerically equals to k, whereas for a bimolecular reaction, c equals to k/NAV if the species of the reactant molecules are different, or 2k/NAV if they are the same.V is the volume of the system and NA is the Avogadro's constant.

However, these results should not be taken to imply that the mathematical forms of the propensity functions are just heuristic extrapolations. The propensity functions are grounded in molecular physics, and the formulas of deterministic chemical kinetics are approximate consequences of the formulas of stochastic chemical kinetics, not the other way around.

The Gillespie algorithm is so clear and useful that it is often used. However, this algorithm is not suitable for describing transcriptions and translations because they are very slow and complex reactions involving many kinds of reactant molecules. If we treat transcription from plasmids as one reaction, assuming the copy number of plasmids as 200, then the propensity function a equals to the stochastic rate constant multiplied by 200 (200× c). So it will take about one of a two hundred times of an average transcription time to finish one transcription. Of course, in the time scale of average transcription time it is not a big problem, but this may not be good for simulating, like in our project, the system that uses the time for transcriptions and translations cannot be shortened. We introduce time-delay into the Gillespie algorithm based on [12]~[14]. The mathematical correctness of this algorithm is proved in [14]. Time-delay means treating reactions as following:

<img src = "Ono_%2828%29.png" height="48px" class = "math" />

Furthermore, transcriptions and translations are too complex to list up all of the reactions step by step. Therefore it is better to treat them as time-delay than reaction formulas.

Now we begin to model our project, sigma Re-counter. In our model, there are five types of reactions: transcription, translation, an association and disassociation of crRNA and taRNA, an association and disassociation of sigma and anti-sigma, and degradation. We introduce time-delay into only transcription and translation. Then, we explain how we treat these three reactions in general.

We explain transcription's model of mRNA[11]. When RNA polymerase binds to the promoter region, first they take the RNAP・promoter close complex. At this state, the complex can dissociate. But with a certain probability, the close complex turn to the open complex which doesn't dissociate. After the RNA polymerase and the promoter region take the open complex, the elongation of mRNA starts. In this state, the promoter region is cleared and RBS(Ribosome binding site) is synthesized, so we model as reaction3'. It is difficult to model the elongation of mRNA by elementary reaction formula because it is a complex reaction, so we model by time-delay. Then the reaction formula of transcription can be described as following's reactions:

<img src = "Ono_%2829%29.png" height="137px" class = "math" />

combining reaction3' and reaction3, we get:

<img src = "Ono_%2830%29.png" height="20px" class = "math" />

In the above reaction formula, DNA denotes the promoter region, and mRNA denotes RBS.

In our project, we have to model transcription of taRNA. Because taRNA functions only when it is completely transcribed, we model as following reactions:

<img src = "Ono_%2834%29.png" height="107px" class = "math" />

We refer to the translational model [8]. Similary to the transcrptional model we model as following;

<img src = "Ono_%2831%29.png" height="106px" class = "math" />

combining reaction2' and reaction2, we get:

<img src = "Ono_%2832%29.png" height="20px" class = "math" />

We model the association and disassociation of crRNA and taRNA as a reversible reaction:

<img src = "Ono_%2833%29.png" height="45px" class = "math" />

We know that anti-sigma directly affect sigma(reference), but a large part of their relation is still unknown. From their direct relation, We model as following reactions:

<img src = "Ono_%2835%29.png" height="45px" class = "math" />

We model the degradation as the following formulas:

<img src = "Ono_%2836%29.png" height="18px" class = "math" />

We can conclude that reaction formulas of our model are as follows:


<img src = "Ono_table2.png" class = "figure" />

Parameter

We got the parameters mainly from [11]. The summary of the parameters of this model is given in the Table 2 (α = 1/NA V, NA is Avogadro constant, and V is the volume of E. coli.):

Table 2:


<img src = "Ono_table3.png" class = "figure" />

We estimated τ from the fact that average translational speed is 20aa/sec, and average transcriptional speed is 40bp/sec. τ1 = 40.0,τ2 = 40.0,τ3 = 40.0, τ4 = 40.0,τ5 = 40.0,τ6 = 40.0, τ7 = 50.0,τ8 = 40.0,τ9 = 40.0.

Simulations

Figure 2:

<img src = "Ono_p_GFP.png" height="200px" class = "math" />

Figure 3:

<img src = "Ono_p_sigma.png" height="200px" class = "math" />

From the figure 3, we can confirm that reset function was correctly conducted.

<img src = "Sub_leaky_expression_analysis.png" class = "contTitle" />

In this section we have discussed the improved models of the σ Recounter.

First, we modeled the 3 sigma-recounter, the expansion of the 2 counter. Below is the construct of the 3 recounter.

<img src = "Ono_3count_construct.png" class = "figure" />

Figure 4:

<img src = "Ono_3count_result.png" class = "figure" />

When arabinose is induced for the first time, sigma 11 will be expressed and its positive feedback loop will be formed. On the second induction, sigma 20 will be expressed, and third, GFP will be expressed. The reaction formulas were established just like as above-mentioned deterministic model.

The unit of vertical axis is [nM], and that of the horizontal axis is [sec]. The result of the modeling of the 3 recounter:

Figure 5:

<img src = "Ono_ara_sweep.png" class = "figure" />

Although there seems to be a little leaky expression, the count is precisely conducted. Here we did not model resetting, because it is obvious from its orthogonality that resetting will be precisely conducted if the pulse length of IPTG is long enough.

<img src = "Sub_improvement_of_counter.png" class = "contTitle" />

Second, we devised genetic circuits that would not be affected by the pulse length of the arabinose induction. The current σ Re-counter depends much on pulse length; when the pulse length is too long, it would count 2 or more (if there is).

Figure 6:

<img src = "Ono_implementation_failure.png" class = "figure" />

induction time: 20000-40000[sec], 60000-80000[sec]

If the induction is too long, there will be no difference in the first induction and the second induction; that is, it has no function of counting.

However, improving this construct a little, our counter would not count more than 1 by a single pulse, as long as the pulse length is long enough (longer than τ0) for it to count. Shown below is the improved constructs.

<img src = "Ono_implementation_construct.png" class = "figure" />

Here X and Y are substances that bind together to activate PX&Y promoter.

Before arabinose is induced, PTet and PConst express Y and crRBS-sigma. When the arabinose is induced (for longer than time τ0), PBAD becomes activated and TetR and X are expressed. X binds to Y and the transcription of taRNA from PX&Y occur, which leads to counting. At that time, expression of Y is repressed by TetR and the amount of Y decreases exponentially. Thus, PX&Y is again repressed, the amount of taRNA decreases, and the counter never counts more than 1. You might be afraid that PX&Y also begins transcription of taRNA when the induction ends; however, supposed degradation of X is faster than that of TetR, it will not occur. When the induction ends, X first degrades while still a lot of TetR remain and Y is not abundant. Since PTet has a sigmoidal transcriptional response, the production rate of Y will change little even if the concentration of TetR decrease a little. When TetR degrades so much that it finishes repression of Y, most of X have already decomposed, and PX&Y will not be activated to begin transcription of taRNA.

We modeled this construct to test if it can be realized. We did not modeled resetting this time, either.

Figure 7:

<img src = "Ono_implementation_result.png" class = "figure" />

The induction was modeled to be conducted just the same as non-improved version. Although pulse length is long, counts are precisely done. Thus, theoretically, the counter independent of the pulse length is suggested to be available. Only thing we have to do is to research for the substances that satisfy these conditions!

<img src = "Sub_guideformodeling.png" class = "contTitle" />

What is modeling

The model should be sufficiently detailed and precise so that it can in principle be used to simulate the behavior of the system on a computer.

In the context of molecular cell biology, a model may describe the mechanisms involved in transcription, translation, gene regulation, cellular signaling, DNA damage and repair processes, homeostatic processes, the cell cycle, or apoptosis. Indeed any biochemical mechanism of interest can, in principle, be modelled. At a higher level, modeling may be used to describe the functioning of a tissue, organ, or even an entire organism. At still higher levels, models can be used to describe the behavior and time evolution of populations of individual organisms.

The first issue to confront when embarking on a modeling project is to describe on exactly which features to include in the model, and in particular, the level of detail model is intended to capture. Interacting with other cells and/or its environment, the cell realizes four key functions: growth, proliferation, apoptosis, and differentiation. The processes that realize these functions of a cell can be further organized into three processes levels: gene regulation, signaltransduction and metabolism. So, a model of an entire organism is unlikely to describe the detailed functioning of every individual cell, but a model of a cell is likely to include a variety of very detailed description of key cellular processes. Even then, however, a model of a cell is unlikely to contain details of every single gene and protein.

Indeed, really accurate modeling of the process would require a model far more detailed and complex than most biologists would be comfortable with, using molecular dynamic simulations that explicitly manage the position and momentum of every molecule in the system.

The "art" of building a good model is to capture the essential features of the biology without burdening the model with non-essential details. Every model is to some extent a simplification of the biology, but models are valuable because they take ideas that might have been expressed verbally or diagrammatically and make them more explicit, so that they can begin to be understood in a quantitative rather than purely qualitative way.

Aims of modeling

The features of a model depend very much on the aims of the modeling excercise. We therefore need to consider why people model and what they hope to achieve by so doing. Often the most basic aim is to make clear the current state of knowledge regarding a particular system, by attempting to be precise about the elements involved and the interactions between them. Doing this can be a particularly effective way of highlighting gaps in understanding. In addition, having a detailed model of a system allows people to test that their understanding of a system is correct, by seeing if the implications of their models are consistent with observed experimental data. In practice, this model validation stage is central to the systems biology approach. However this work will often represent only the initial stage of the modeling process. Once people have a model they are happy with, they often want to use their models predictively, by conducting "virtual experiments" that might be difficult, time-consuming, or impossible to do in the lab. Such experiments may uncover important indirect relationships between the model components that would be hard to predict otherwise. An additional goal of modern biological modeling is to pool a number of small models of well-understood mechanisms into a large model in order to investigate the effect of interactions between the model components. Models can also be extremely useful for informing the design and analysis of complex biological experiments.

In summary, modeling and computer simulation are becoming increasingly important in post-genomic biology for integrating knowledge and experimental data and making testable predictions about the behavior of complex biological systems.

Stochastic Approaches

The most common formulation of stochastic models for biochemical networks is the chemical master equation (CME). The analytical nature of the early stochastic approaches was highly complicated and, in some cases, intractable so that they received little attention in the biochemical community. Later, the situation changed with the increasing computational power of modern computers. And finally Gillespie presented an ground-breaking algorithm for numerically generating sample trajectories of the abundances of chemical species in chemical reaction networks. The so-called "stochastic simulation algorithm," or "Gillespie algorithm," can easily be implemented in any programming or scripting language that has a pseudorandom number generator. Several software packages implementing the algorithm have been developed. Differernt stochastic approaches and their interrelationships are depicted in Figure.

<img src="Chart.png" class="figure" />

For large biochemical systems, with many species and reactions, stochastic simulations (based on the original Gillespie algorithm) become computationally demanding. Recent years have seen a large interest in improving the efficiency/speed of stochastic simulations by modification/approximation of the original Gillespie algorithm. These improvements include the "next reaction" method of Gibson and Bruck, the "τ-leap" method and its various improvements and generalizations and the "maximal time step method", which combines the next rection and the τ-leap methods.

While stochastic simulations sre a practical way to realize the CME, analytical approximations offer more insights into the influence of noise on cell function. Formally, the CME is a continuous-time discrete-state Markov process. For gaining intuitive insight and a quick characterization of fluctuations in biochemical networks, the CME is usually approximated analytically in different ways, including the frequently used chemical Langevin equation (CLE), the linear noise approximation (LNA), and the two-moment approximation (2MA).

The traditional Langevin approach is based on the assumption that the time-rate of abundance (copy number or concentration) or the flux of a component can be decomposed into a deterministic flux and a Lngevin moise term, which is a Gaussian (white noise) process with zero mean and amplitude determined by the system dynamics. This separation of noise from system dynamics may be a reasonable assumption for external noise that arises from the interaction of the system with the other systems (such as the environment), but cannot be assumed for the internal noise that arises from within the system. Internal noise is not something that can be isolated from the system, because it results from the discrete nature of the underlying molecular events. Any noise term in the model must be derived from the system dynamics and cannot be presupposed in an ad hoc manner. However, the CLE does not suffer from above criticism because because Gillespie derived it from the CME description. The CLE allows much faster simulations compared to the exact stochastic simulation algorithm (SSA) and its variants. The CLE is a stochastic different equation (dealing directly with random variables rather than moments) and has no direct way of representing the mean and (co)ariantce and the coupling between the two. That does not imply the CLE, like the LNA, which has the same mean as the solution of deterministic model, ignores the coupling.

Markov processes form the basis of the vast majority of stochastic models of dynamical systems. At the center of a stochastic analysis is the Chapman-Kolmogorov equation (CKE), which describes the evolution of a Markov process over time. From the CKE stem three equations of practical importance: the master equation for jump Markov processes, the Fokker-Planck equation for continuous Markov processes, and the differential Chapman-Kolmogorov equation (dCKE) for processes made up both the continuous and jump parts.

Stochastic Formulation and Markov Process

Since the occurrence of reactions involves discrete and random events at the microscopic level, it is impossible to deterministically predict the progress of recations interms of the macroscopic variables (obsevables) N(t) and Z(t). To account for this uncertainty, one of the observables N()Z()

Our goal is to determine how the process N(t) of copy numbers evolves in time. Starting at time t=0 from some initial state N(0), every sample path of the process remains in state N(0) for a random amount of time W1 until the occurrence of a reaction takes process to a new state N(W1); it remains in state N(W1) for another random amount of time W2 until the occurrence of another reaction takes the process to a new state N(W1+W2), and so on. In other words, the time-dependent copy number N(t) is a jump process.

The stochastic process N(t) is characterized by a collection of state probabilities and transition probabilities. The state probability P(n,t)=Pr[N(t)=n] is the probability that the process N(t) is the state n at a time t. The transition probability Pr[N(t0+t)=n|N(t0)=m] is the conditional probability that process N(t) has moved from state m to state n during the time interval [t0,t0+t]. The analysis of a stochastic process becomes greatly simplified when the above transition probability depends on (i) the starting state m but not on the states before time t0 and (ii) the interval length t but not on the. Property (i) is the well-known Markov process. The process holding property (ii) is said to be homogeneous process.

References

[1] Uri Alon『An introductio to Systems Biology: Design Principles od Biological Circuits』

[2] Farren J Isaacs, et al engineered riboregulators enable post-trasncriptional control of gene expression.

[3] Jason R Kelly, Adam J Rubin,et al Measuring the activity of BioBrick promoters using an in vivo reference standard

[4] Part:BBa I13453 <http://parts.igem.org/Part:BBa_I13453> ( We finally accessed on 2014/8/20)

[5] Judith A.Megerle, Georg Fritz, et al Timing and Dynamics of Single Cell Gene Expression in the Arabinose Utilization System

[6] iGEM Kyoto 2010 <https://2010.igem.org/Team:Kyoto/Project/Goal_A>

[7] pSB1A2 <http://parts.igem.org/Part:pSB1A2> ( We finally accessed on 2014/8/20)

[8] pSB1C3 <http://parts.igem.org/Part:pSB1C3> ( We finally accessed on 2014/8/20)

[9] JENS BO ANDERSEN, et al New Unstable Variants of Green Fluorescent Protein fot Studies of Transitent Gene Expression in Bacteria.

[10] Daniel T.Gillespi A General Method For Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reaction.

[11] Andrzej M.Kierzek,et al The Effect of Transcription and Translation Initiation Frequencies on the Stochastic Fluctuations in Prokaryotic Gene Expression

[12] Marc R Roussel and Rui Zhu Validation of an algorithm for delay stochastic simulation of transcription and translation in prokaryotic gene expression.

[13] Andre S. Ribeiro Stochastic and delayed stochastic models of gene expression and regulation.

[14] Robert Schlicht and Gerhard Winkler A delay stochastic process with applicartions in molecular biology.

[15] Moises Santillan, et al Influence of Catabolite Repression and Inducer Exclusion on the Bistable Behavior of the lac Operon.

[16] D.J.Wilkinson.Stochastic Modelling for Systems Biology.Mathematical & Computational Biology.Chapman & Hall/CRC, London, Apr. 2006. ISBN 1584885408

[17] Mukhtar Ullah & Olaf Wolkenhauer Stochastic Approaches for Systems Biology.