Team:TU Delft-Leiden/Modeling/Curli

From 2014.igem.org

Revision as of 14:49, 13 October 2014 by WMRozemuller (Talk | contribs)

Curli Module

The goal of our project for the conductive curli module is to produce a biosensor that consists of E. coli that are able to build a conductive biofilm, induced by any promoter, in our case a promoter that gets activated in the presence of DNT/TNT. The biofilm consists of curli containing His-tags that can connect to gold nanoparticles. When the curli density is sufficiently high, a dense network of connected curli fibrils is present around the cells. Further increasing the amount of curli results in conductive pathways connecting the cells, thereby forming conductive clusters. When the amount of curli increases even further, enough curli fibrils are present for a cluster to connect the two electrodes, giving a conducting system.
The goal of the modeling of the curli module is to prove that our biosensor system works as expected and to capture the dynamics of our system. So, we want to answer the question: "Does a conductive path between the two electrodes arise at a certain point in time, and if so, at which time?" However, we not only want to answer qualitatively whether our system works as expected; we also want to make quantitative predictions about the resistivity between the two electrodes of our system in time.

The conductive curli module has different dynamics on different length scales:

The behavior of the system on the gene level, that is the dynamics of the activation of the promoter and the dynamics of the production of proteins needed for curli growth.
The behavior of the system on the cell level, that is the curli production of each cell in time.
The behavior of the system on the colony level, that is the change of the resistivity between the two electrodes of our system in time.

To capture the dynamics of our system, we have implemented a three-layered model, consisting of the gene level layer, the cell level layer and the colony level layer.
The gene level layer is used to determine characteristic parameters that will be used in the cell level layer. Subsequently, the cell level layer is used to determine characteristic parameters that will be used in the colony level layer. Lastly, the colony level layer is used to determine if our system works as expected, i.e. whether a conductive path between the two electrodes arises at a certain point in time and at which time this happens, and to determine the change of the resistivity between the two electrodes of our system in time. A figure of our three-layered model is displayed below.

Figure 0: A schematic view of our model. We aim to have a three layered model. Each level brings information to the next level. In the gene level, we calculate the curli production rates. In the cell level, we use this to calculate the curli growth over time. In the colony level, we use the curli growth to make predictions for the conductivity as function of time.

Gene Level Modeling

We will start with the modeling of the expression of curli on the gene level. Proteins that are dedicated to the curli formation are CsgA/B/D/E/F/G [1]. CsgA is the main building block of the curli. When produced, this protein is secreted out of the cell by the CsgEFG complex. In the absence of CsgB, there is no curli formation, since the CsgA proteins remain unpolymerized. CsgB is the starting block of the curli fibrils and connects the cell membrane to the first CsgA protein in the curli fibril. Once CsgB is located on the outside of the cell surface, CsgA can polymerize onto the starting curli fibril.
In the constructs we made in the wet lab, CsgA is continuously being produced. However, in our constructs the CsgB gene is placed under the control of a landmine promoter, activated by either TNT or DNT [reference]. So, when the cells get induced by TNT or DNT, CsgB protein production starts, while CsgA is already present in the system because it is continuously being produced.

Extensive Gene Level Modeling

to be written, low priority

Simplified Gene Level Modeling

Though the model described above, provided that all rates are known, gives a more accurate (though still simplified) representation of the curli assembly system, we have chosen to reduce the complexity further to the bare essentials, as most of the production rates cannot be found in literature. Measuring accurate rates in the wet lab is infeasible within the scope of this project. Therefore, we constructed a model that only includes the rate limiting step of the system, as this will mostly determine the dynamics of the system.
First of all, we investigated if the diffusion of the CsgA and CsgB proteins to their final destination is the rate limiting step in curli formation. From the literature and the wet lab, we know that the system response to the induction by TNT or DNT is in the order of hours [reference]. If diffusion were the rate limiting step, CsgA and CsgB proteins would pile up inside and outside the cell, because it would take a long time for them to travel to their final destination, the end of a growing curli fibril and the outer membrane, respectively. A quick calculation using equation 1 (with $k_b$ the Boltzmann constant, $T$ the temperature, $t$ the elapsed time and $\eta$ the viscosity of water) shows that after one second, the root-mean-square displacement of a spherical particle with radius $r = \ 10 \ nm$ due to Brownian motion in liquid water at room temperature is 6.6 μm, many times the bacterial radius. Hence, we conclude that diffusion is not rate limiting [4].

$$ \overline{x^2} = \ \frac{k_b T t}{3 \pi \eta r} \tag{1}$$
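The estimate from equation 1 can be reproduced with a few lines of Python. This is a minimal sketch: the temperature (298 K) and the viscosity of water (1.0 mPa·s) are assumed values that the text does not state explicitly, and the function name is only illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
T = 298.0            # assumed room temperature, K
ETA = 1.0e-3         # assumed viscosity of liquid water, Pa*s
R = 10e-9            # particle radius from the text, m


def rms_displacement(t, radius=R, temperature=T, viscosity=ETA):
    """Root-mean-square displacement (m) after t seconds, from equation 1."""
    mean_square = K_B * temperature * t / (3.0 * math.pi * viscosity * radius)
    return math.sqrt(mean_square)


x = rms_displacement(1.0)
print(f"{x * 1e6:.1f} um")  # prints "6.6 um"
```

Because the displacement scales with the square root of time, crossing a distance comparable to the cell size takes far less than a second, which is negligible compared with a response time of hours.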

What we do expect to be the rate limiting step for curli formation is the large amount of CsgA and CsgB proteins that have to be produced. Hence, we expect the production rate of one of these proteins to be the rate limiting step. Instead of including the intermediate steps, we have implemented the production of the CsgA and CsgB proteins with one reaction and associated production rate each. These rates have to be measured in the lab. We will use the following system of equations:

$$ \emptyset \xrightarrow{p_{A}} \ CsgA_{free} \tag{2} $$ $$ \emptyset \xrightarrow{p_{B}} \ CsgB \tag{3} $$ $$ CsgA_{free} + \ CsgB \xrightarrow{k} \ CsgA_{curli} + \ CsgB \tag{4} $$

Reactions 2 and 3 represent the production of CsgA and CsgB proteins, respectively. Reaction 4 represents the growth of a curli fibril, where a curli fibril reacts with a free CsgA protein, which then becomes part of the curli. In reality, this reaction only happens at the end of the curli fibrils. In our model, we assume a homogeneous concentration of all the substances and we cannot discriminate between curli subunits. It is theoretically possible to model the system as an infinite set of reactions, each extending a curli fibril of length i to length i+1 at rate k [7]. However, we are merely interested in the growth rates of the curli, since the distribution of the curli length will follow from the model at the cell level. Therefore, we decided to model the growth of curli at the gene level as reaction 4. We assume that each CsgB protein is the start of a curli fibril, thus the concentration of CsgB equals the concentration of curli fibrils. We can do this because we showed that the diffusion of CsgA and CsgB proteins to their final destination is not the rate limiting step. Therefore, nearly all the CsgB proteins will be the beginning of a curli fibril in reality and our assumption is valid.
So, in reaction 4 a free CsgA protein reacts with a curli fibril (represented by CsgB) to form a CsgA protein that is part of that curli. The fibril itself is returned as a product, as it is immediately available again for the next reaction with a free CsgA protein. Therefore, curli growth depends on the rate k and the concentrations of $CsgA_{free}$ and CsgB.

Writing reactions 2-4 into differential equations results in:

$$ \frac{d}{dt} [CsgA_{free}] = \ p_{A} - \ k [CsgA_{free}][CsgB] \tag{5.1} $$ $$ \frac{d}{dt} [CsgB] = \ p_{B} \tag{5.2} $$ $$ \frac{d}{dt} [CsgA_{curli}] = \ k [CsgA_{free}][CsgB] \tag{5.3} $$
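As a cross-check, equations 5.1 to 5.3 can also be integrated numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme in plain Python. The scheme, the function names and the step count are illustrative choices and not necessarily what was used for the figures on this page. The parameter values are those of Table 1, with both production rates assumed to be in M/s.

```python
def rhs(state, p_a, p_b, k):
    """Right-hand sides of equations 5.1-5.3 for state = (A_free, B, A_curli)."""
    a_free, b, _a_curli = state
    growth = k * a_free * b
    return (p_a - growth, p_b, growth)


def rk4_step(state, dt, p_a, p_b, k):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, factor):
        return tuple(s + factor * d for s, d in zip(base, slope))

    k1 = rhs(state, p_a, p_b, k)
    k2 = rhs(shifted(state, k1, dt / 2), p_a, p_b, k)
    k3 = rhs(shifted(state, k2, dt / 2), p_a, p_b, k)
    k4 = rhs(shifted(state, k3, dt), p_a, p_b, k)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(p_a, p_b, k, a0, t_end, n_steps):
    """Integrate from t = 0 (no CsgB, no curli, A_free = a0) to t_end seconds."""
    state = (a0, 0.0, 0.0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt, p_a, p_b, k)
    return state


# Table 1 values; production rates assumed to be in M/s.
P_A, P_B, K, A0 = 1.0e-10, 1.3e-13, 4.0e4, 6.0e-6
a_free, b, a_curli = integrate(P_A, P_B, K, A0, t_end=36000.0, n_steps=1000)
```

Two simple sanity checks follow from the equations themselves: `b` should equal $p_B t$, and `a_free + a_curli` should equal $A_0 + p_A t$, since every CsgA protein that is produced is either free or part of a curli.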

Fortunately, this system can be solved analytically. To do this, we need the initial conditions. Say the CsgB promoter is activated at $t= \ 0$. At this time there are no curli present, so $[CsgB]|_{t=0} = \ [CsgA_{curli}]|_{t=0}= \ 0$. However, the CsgA promoter is continuously active, so we expect to have an initial concentration $A_0$ of free CsgA proteins at time $t= \ 0$.

The solution to equation 5.2 is trivial:

$$ [CsgB] = \ p_B t \tag{6}$$

Substituting this into equation 5.1 results in:

$$ \frac{d}{dt} [CsgA_{free}] = \ p_{A} - \ k p_B t [CsgA_{free}] \tag{7} $$

It can easily be proven that a first order differential equation of the form

$$ y(t)' + \ f(t)y(t) = \ g(t) $$

has a solution of the form

$$ y(t) = \ e^{-F(t)} \int{g(t) e^{F(t)} dt} + \ y_0 e^{-F(t)} $$

where $F(t)= \int{f(t) dt}$. In our case, $f(t) = \ k p_B t$ and $g(t) = \ p_A$. This yields equation 8.

$$ [CsgA_{free}] = \ p_A e^{\frac{-k \ p_B t^2}{2}} \int{e^{\frac{k \ p_B t^2}{2}} dt} + \ C_{1} e^{\frac{-k \ p_B t^2}{2}} = \ p_A e^{\frac{-k \ p_B t^2}{2}} \int_{0}^{t}{e^{\frac{k \ p_B \tau^2}{2}} d\tau} + \ C_{2} e^{\frac{-k \ p_B t^2}{2}} \tag{8} $$

One with a keen eye may recognize the Dawson function (equation 9):

$$ D_+ (x) = \ e^{-x^2 } \int_{0}^x{e^{y^2} dy} \tag{9} $$

In our case, $x^2 = \ \frac{k p_B t^2}{2}$ and $y^2 = \ \frac{k p_B \tau^2}{2}$, so that $d\tau = \ dy / \sqrt{\frac{k p_B}{2}}$, and equation 10 is obtained.

$$ [CsgA_{free}] = \ \frac{p_A D_+ (t\sqrt{\frac{k \ p_B}{2}})}{\sqrt{\frac{k \ p_B}{2}}} + \ C_{2} e^{\frac{-k \ p_B t^2}{2}} \tag{10}$$

Using the initial condition $[CsgA_{free}]|_{t=0}= \ A_0$ and $D_+(0) = \ 0$, we find $C_2 = \ A_0$, and the expression for the concentration of free CsgA proteins becomes:

$$ [CsgA_{free}] = \ \frac{p_A D_+ (t\sqrt{\frac{k \ p_B}{2}})}{\sqrt{\frac{k \ p_B}{2}}} + \ A_0 e^{\frac{-k \ p_B t^2}{2}} \tag{11}$$

Now, we can fill in equations 11 and 6 into equation 5.3, which gives us equation 12.

$$ \frac{d}{dt} [CsgA_{curli}] = \ k p_B t \left( \frac{p_A D_+ (t\sqrt{\frac{k \ p_B}{2}})}{\sqrt{\frac{k \ p_B}{2}}} + \ A_0 e^{\frac{-k \ p_B t^2}{2}} \right) \tag{12} $$
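Equations 11 and 12 can be evaluated directly once the Dawson function is available. The sketch below computes $D_+$ with Simpson's rule so that it only needs the standard library; numerical libraries usually offer a ready-made Dawson function instead. The function names are illustrative, and the parameter values are those of Table 1 with the production rates assumed to be in M/s.

```python
import math


def dawson(x, n=400):
    """Dawson function D_+(x) = exp(-x^2) * int_0^x exp(y^2) dy (Simpson's rule).

    The factor exp(-x^2) is folded into the integrand to avoid overflow.
    n is the (even) number of Simpson intervals.
    """
    if x == 0.0:
        return 0.0
    h = x / n
    total = math.exp(-x * x) + 1.0  # integrand at y = 0 and at y = x
    for i in range(1, n):
        y = i * h
        total += (4 if i % 2 else 2) * math.exp(y * y - x * x)
    return total * h / 3.0


def csga_free(t, p_a, p_b, k, a0):
    """Concentration of free CsgA at time t (s), equation 11."""
    s = math.sqrt(k * p_b / 2.0)
    return p_a * dawson(t * s) / s + a0 * math.exp(-k * p_b * t * t / 2.0)


def curli_growth_rate(t, p_a, p_b, k, a0):
    """Rate of CsgA incorporation into curli at time t (s), equation 12."""
    return k * p_b * t * csga_free(t, p_a, p_b, k, a0)


# Table 1 values; production rates assumed to be in M/s.
P_A, P_B, K, A0 = 1.0e-10, 1.3e-13, 4.0e4, 6.0e-6
rate_after_one_hour = curli_growth_rate(3600.0, P_A, P_B, K, A0)
```

A quick way to gain confidence in the closed form is to check that `csga_free` returns $A_0$ at $t = 0$ and that its numerical time derivative matches the right-hand side of equation 5.1.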

For the parameters $p_{A}$, $p_{B}$, $k$ and $A_0$, we have estimated the following values:

Table 1: Parameters used to obtain quantitative results from the analytical solution for the curli production.
Parameters	Value	Unit
$\boldsymbol{p_{A}}$	$1.0 \cdot 10^{-10}$	$\frac{M}{s}$
$\boldsymbol{p_{B}}$	$1.3 \cdot 10^{-13}$	$\frac{M}{s}$
$\boldsymbol{k}$	$4.0 \cdot 10^{4}$	$\frac{1}{Ms}$
$\boldsymbol{A_0}$	$6.0 \cdot 10^{-6}$	$M$

Plotting equation 12 with the parameter values in table 1 yields the graph shown in figure 1.

Figure 1: The production rates of curli (blue) and CsgB (green) in units per second as function of time.

Figure 1 shows a steady production of CsgB. The curli growth rate at $t= \ 0$ is zero as expected, since there is no CsgB at that point. In the next few hours, the curli growth rate peaks. We think that this is due to the high concentration of $CsgA_{free}$ that is present at $t= \ 0$. In figure 2, curli growth as function of time is plotted for different initial concentrations of $CsgA_{free}$.

Figure 2: The curli subunit growth in units per second for various initial concentrations $ A_0 $ of CsgA as function of time. Initial concentrations that equal 0, 5, 10 or 15 hours of CsgA production are shown.

We conclude the following from figure 2:
Firstly, as expected, curli growth stabilizes to a rate equal to $p_{A}$ after approximately 2 hours, independent of the initial concentration of $CsgA_{free}$, $A_0$.
Secondly, increasing the initial concentration of $CsgA_{free}$, $A_0$, increases the height of the peak. Even with zero initial $CsgA_{free}$ concentration, a small peak can be found at one hour. This is a consequence of $CsgA_{free}$ build-up when the CsgB concentration is still very small.
Thirdly, during the first two hours, few CsgB proteins are present in the system. We therefore expect that the curli fibrils that started in the first few hours are much longer than the fibrils that started at later times.

Cell Level Modeling

Now that the growth rate of curli and production of CsgB protein as function of time is obtained, the conductivity as a function of time can be computed. The relevant length scale is the cell length, or the micrometre scale. The approach we used for this is relatively simple:

We discretize the amount of curli subunits ($CsgA_{curli}$ in the gene level model) and CsgB proteins that have to be added for each time step.
At each time step, we add more curli subunits to growing curli fibrils. Also, we add more new curli fibrils to the model.
From the density of the curli fibrils around the cell as a function of the radius, we calculate the conductive radius of the cell.

Discretization of Gene Level Model

We have discretized equations 5.2 and 12 in N time steps. The solutions of these two equations, plotted in figures 1 and 2, give the expected number of new CsgB proteins and curli subunits for each time step. However, a fundamental assumption in deterministic modeling is that the concentration is continuous. In reality, the amount of added curli subunits is discrete, since we cannot add half a curli subunit.
Furthermore, in the gene level model we did not take into account the statistical variation of gene transcription and of the addition of curli subunits; sometimes fewer and sometimes more curli subunits are added than the expected value. To include this in the cell level model, we drew the amount of new curli subunits from a Poisson distribution where λ equals the expected amount of added subunits.
So, for each time step we now have $B_n$ new CsgB proteins and $C_n$ new curli subunits, where $C_n$ varies for each time step, as it is drawn from a Poisson distribution. An assumption of this distribution is that the time at which a new curli subunit is added is uncorrelated with the time at which the previous curli subunit was added; we think this is a fair assumption. Note that our cell level model accounts for the stochasticity of adding curli subunits, but not for the stochasticity of gene expression, and hence not for that of the production of CsgB protein. The value $B_n$ and the Poisson distribution are determined from figures 1 and 2. We have used 1000 discrete times between 0 hr and 10 hr, so each time step represents 36 s.
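The Poisson draw described above can be sketched as follows. The sampler counts the arrivals of a unit-rate Poisson process in an interval of length λ, which matches the assumption that successive additions are uncorrelated in time. The expected counts used here (a constant 5 subunits per step) are purely hypothetical placeholders; in the model they would come from the discretized gene level solution.

```python
import random

N_STEPS = 1000                          # number of discrete times (from the text)
T_END_HR = 10.0                         # simulated time span in hours (from the text)
DT_S = T_END_HR * 3600.0 / N_STEPS      # duration of one time step: 36 s


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) by counting unit-rate arrivals in [0, lam)."""
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival < lam:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def draw_subunit_counts(expected_counts, seed=0):
    """Return one Poisson draw C_n for every expected count lambda_n."""
    rng = random.Random(seed)
    return [sample_poisson(lam, rng) for lam in expected_counts]


# Hypothetical input: a constant expectation of 5 new subunits per time step.
expected = [5.0] * N_STEPS
counts = draw_subunit_counts(expected)
mean_count = sum(counts) / len(counts)
```

The sampler takes a number of iterations proportional to λ, which is acceptable for the small per-step counts that a 36 s time step implies; for large λ a library routine would be the better choice.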

Building the Curli Fibrils

Firstly, $B_n$ CsgB proteins are added to our model that mark the starting points for new curli fibrils. These new curli fibrils are located at random points on a sphere with radius r, which represents the cell. The radius r is chosen such that the volume of the cell is $\sim 1.1 \ \mu m^3$ [5]. A CsgB protein is modeled by a line of length 4 nm that points radially outward, perpendicular to the cell surface [source]. In reality, the distribution of CsgB on the cell surface is not uniform [6]. However, we assumed uniformly distributed CsgB to keep our model tractable. This is a point that may be used to further improve the model.

Next, $C_n$ new curli subunits, with $C_n$ drawn from the Poisson distribution where λ equals the expected amount of added curli subunits, are added to curli fibrils by repeating the following process $C_n$ times:

Firstly, a random curli fibril is selected, e.g. curli number k. A curli fibril is represented by a 3 (the x, y and z coordinates) by l+1 matrix, where l is the amount of curli subunits of the curli fibril and the origin is chosen to be the center of the sphere. Thus, by storing the ending coordinates of each curli subunit, we know the starting and end coordinates of each curli subunit. The curli subunits are modeled by a line of length 4 nm [source].
Secondly, the polar angle in spherical coordinates of the last curli subunit is computed, $\theta_{1}$.
Thirdly, the new curli subunit has a small angular deviation with respect to the previous one. This polar angle $\theta_{2}$ is drawn from a Gaussian distribution N(0,σ). σ is chosen such that the persistence length, the distance over which a fibril has bent by $90^{\circ}$ and has ‘lost’ its directional information, is 4 µm. The azimuthal angle ϕ is completely random between 0 and 2π radians, drawn from a uniform distribution.
Fourthly, for the new curli subunit for which we determined $\theta_{2}$ and ϕ, the polar angle is determined to be $\theta_{1} + \theta_{2}$. We now know the length of the new curli subunit (4 nm), its polar angle and its azimuthal angle. Subsequently, we add it to the previous curli subunit of the fibril and calculate the ending coordinate of the added curli subunit from its length, polar angle and azimuthal angle and the ending coordinate of the previous curli subunit. This calculated ending coordinate of the added curli subunit is stored in the matrix that represents the curli fibril.
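The steps above can be sketched in Python as follows. This is one possible reading of the procedure: the angular deviation and the azimuthal angle are interpreted in a local frame attached to the previous subunit, which is an assumption, as are all function names. The cell radius follows from the cell volume of 1.1 µm³, and σ = 3.47° is the value quoted on this page.

```python
import math
import random

SUBUNIT_LENGTH = 4e-9                  # m, length of CsgB and of a curli subunit
CELL_VOLUME = 1.1e-18                  # m^3, i.e. 1.1 um^3
CELL_RADIUS = (3 * CELL_VOLUME / (4 * math.pi)) ** (1 / 3)  # about 0.64 um
SIGMA = math.radians(3.47)             # angular deviation per subunit


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    norm = math.sqrt(sum(c * c for c in a))
    return tuple(c / norm for c in a)


def new_fibril(rng):
    """Start a fibril: a uniformly random surface point plus a radial CsgB segment."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    direction = (s * math.cos(phi), s * math.sin(phi), z)
    start = tuple(CELL_RADIUS * c for c in direction)
    end = tuple((CELL_RADIUS + SUBUNIT_LENGTH) * c for c in direction)
    return [start, end]          # l + 1 points for l segments


def add_subunit(fibril, rng, sigma=SIGMA):
    """Append one subunit that deviates slightly from the previous direction."""
    p0, p1 = fibril[-2], fibril[-1]
    d = normalize(tuple(b - a for a, b in zip(p0, p1)))
    # Orthonormal basis (u, v) of the plane perpendicular to d (assumed local frame).
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(d, helper))
    v = cross(d, u)
    theta = rng.gauss(0.0, sigma)               # angular deviation theta_2
    phi = rng.uniform(0.0, 2.0 * math.pi)       # azimuthal angle
    new_d = tuple(
        math.cos(theta) * dc
        + math.sin(theta) * (math.cos(phi) * uc + math.sin(phi) * vc)
        for dc, uc, vc in zip(d, u, v)
    )
    fibril.append(tuple(p + SUBUNIT_LENGTH * c for p, c in zip(p1, new_d)))


def grow(fibrils, n_subunits, rng):
    """Add n_subunits subunits, each to a randomly selected fibril."""
    for _ in range(n_subunits):
        add_subunit(rng.choice(fibrils), rng)


rng = random.Random(0)
fibrils = [new_fibril(rng) for _ in range(5)]
grow(fibrils, 200, rng)
```

Storing only the end points of the subunits keeps the representation identical to the 3 by l+1 matrix described above, so the length of a fibril in subunits is simply the number of stored points minus one.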

The angular deviation σ is a critical parameter in our model. Increasing this value increases the flexibility of our curli, whereas decreasing this value increases the stiffness of the curli. This is shown in figure 3. If the length of one subunit is 4 nm and the total persistence length is 4 µm, then $\sigma = \ 3.47^{\circ}$. Furthermore, we think that it is justified to add the curli subunits one at a time to a random curli. We expect CsgA proteins not to discriminate between binding to a long or a short curli, or to one that has recently gotten a new curli subunit.

Figure 3: The persistence length in number of units of a curli fibril as function of the angular deviation per subunit in degrees.

An illustrative view of what our cell looks like during the adding of curli subunits is shown in figure 4. This figure was created when just a few curli had been added ($ \sim 1/2 \ hour$). A similar figure after $t = \ 10 \ hr$ would look like a fuzzy ball of curli.

Figure 4: Schematic view of our cell at t=1/2 hr after initiation (black sphere centred at x=y=z=0). The wires represent the curli fibrils. The labels on the axes are in meters.

One interesting thing to look at is the length of the curli fibrils at t=10 hr, shown in figure 6. Curli fibrils that are created first (low numbers) are much longer than ones that are created last (high numbers). The steep drop in curli fibril length for the first couple of hundred fibrils comes from the peak in curli production between 0 hr and 2 hr. After that, the curli length is linear in the time the fibril has existed, precisely what you'd expect from the model.

Figure 6: The length of the curli fibrils in number of subunits on the y-axis at t=10 hours. On the x-axis is the time. A dot at height 1000 at 1 hour means that the curli fibril that was started at t=1 hour had length 1000 at time=10 hours.

Since adding curli on the colony level would result in unreasonable computation times, we decided to extract our parameters for the colony level modeling from the curli density around the cell. Figure 5 contains a histogram with the amount of curli subunits as a function of the distance from the cell centre after 10 hours. [add figure] Note how no curli are found below the actual cell radius. It can be seen from the figure that there is a large peak, followed by a plateau. When this histogram is followed in time, one notices that at first long curli are being created. Figure 6 shows the length of all curli after 10 hours.

Extracting information for the colony level

Now that we have a model of a cell with growing curli, we want to extract relevant data for the colony level modeling. Ideally, the resistance as function of radius and time would be calculated by looking at connections between the curli fibrils. However, this requires insight into the behavior of the curli on the nanoscopic scale. For instance, what is the conductivity of a single curli fibril with gold nanoparticles, and what is the critical distance between fibrils at which they connect? Furthermore, when interactions between the curli fibrils have to be taken into account, the model becomes computationally too expensive. After an extensive literature study [], we have decided to simplify this model. The simplest approach is to say that there is a critical density of curli that is needed to make connections. We also tried to parametrize the curli density to obtain more quantitative results.

Our model is subject to stochastic processes. Therefore, to acquire enough in silico results, we have repeated the script that builds the curli fibrils for 10 hours a hundred times. This should give us insight into the variation we might expect. The left panel of figure 9 displays the curli density at $\ t= \ 2 \ hours$ for all cells. The orange line represents the average of the simulations. It can be concluded that the intercellular variation is relatively small. This makes sense, since the relative deviation of stochastic processes decreases with the sample size. In the right panel, the mean and standard deviation of the curli density as a function of the radius are shown.

Figure 9: Left) The curli density in curli units $ \mu m ^{-3} $ as function of radial distance from the centre of the cell in $ \mu m$ for 100 different simulations at t=2 hr. The orange line represents the mean of all densities. Right) The orange line represents the mean curli density, and the green lines represent the variation within the simulations.

It is also interesting to study the curli density as a function of the radius at different times, shown in figure 10. This figure shows that, in agreement with what we have seen previously, $\rho_{curli}$ decreases as a function of the radius. Also, it decreases faster as a function of the radius in the first two hours. After two hours, we can see that the curli density increases only for small r, as mainly short curli are added to the system. This agrees with our previous results.

Figure 10: The mean curli density in curli units $ \mu m ^{-3} $ as function of radial distance from the centre of the cell in $ \mu m$, plotted at different times (.5 hr, 1hr, 2hr, 5hr and 10hr).

Conductive Radius of the Cell

We think that a reasonable first approximation of the conductivity is the density of the curli around the cell as a function of the radius. When the density is higher, there are more gold particles, and thus a higher conductivity. In our simplest approach we say that there is a critical density $\rho_{crit}$ of curli that is needed to have conductivity. The density $\rho_{curli}$ decreases as function of the radius. We call the largest radius where $\rho_{curli} > \rho_{crit}$ the conductive radius $r_{cond}$. Let's take a look at what this would look like in figure 10. If the critical density were $ 1 \cdot 10^3 $ curli subunits $ \mu m^{-3} $, then at 30 min the conductive radius would be $\approx 2.5 \mu m $, at 1 hour it would be $ \approx 4.5 \mu m $ and at 2 hours it would be $ \approx 5 \mu m $. How this looks for 100 different cells is shown in figure 12. With only this simple approximation we can calculate some interesting properties of our system: the time at which we expect percolation to happen and the resistivity of our system. Though this approximation may seem rather arbitrary, we do have some reasoning for it:

First of all, the goal of this parameter is to provide information about our system for the colony level modeling, where we use it to find connections between cells. To have a continuous path from one electrode to the other electrode, we must have a lot of cells that are connected to each other. In order to know when cells are connected to each other, we have to assume that everything within a certain radius from the cell is conductive; this radius, $r_{cond}$, follows from the critical density $\rho_{crit}$. However, for this to be true the fibrils on one side of the cell must be connected to the fibrils on the other side. Percolation theory predicts that this is a sharp transition as a function of the density, so we can choose $\rho_{crit}$ in such a way that we are very sure that everything within $r_{cond}$ from the cell is conductive.
While the precise value of $\rho_{crit}$ may be unknown and should be measured, we think that we can still get plenty of information about the qualitative behaviour of our system in advance. Figure 8 at the bottom shows the conductive radius $r_{cond}$ as function of time using $\rho_{crit}$ as shown in figure 8 in the middle as the red line. Increasing or decreasing $\rho_{crit}$ would result in a similar $r_{cond}$ as function of time. Hence, the qualitative behaviour is preserved.
Due to the simplifications that we made in order to be able to model our system, we cannot include interactions or cluster forming between the curli themselves. Using $\rho_{crit}$, we have an elegant way to filter out modeling errors.
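The definition of the conductive radius given above translates into a few lines of code. In this sketch the density profile is a made-up example that only loosely resembles the curves described for figure 10, and returning 0.0 when the density never exceeds $\rho_{crit}$ is an assumption.

```python
def conductive_radius(radii, densities, rho_crit):
    """Largest radius at which the curli density still exceeds rho_crit.

    radii and densities are equally long sequences describing one density
    profile. Returns 0.0 if the density is below rho_crit everywhere
    (an assumed convention).
    """
    above = [r for r, rho in zip(radii, densities) if rho > rho_crit]
    return max(above) if above else 0.0


# Hypothetical profile: radius in um, density in curli subunits per um^3.
example_radii = [0.7, 1.5, 2.5, 4.5]
example_densities = [5000.0, 2000.0, 1200.0, 800.0]
r_cond = conductive_radius(example_radii, example_densities, rho_crit=1000.0)
print(r_cond)  # prints 2.5
```

Taking the largest qualifying radius, rather than the first radius at which the density drops below the threshold, follows the wording of the definition and makes the result insensitive to local dips in a noisy profile.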

Figure 12: The green lines are the conductive radius plotted versus the time for 100 cells with a critical density of $ \rho_{crit}=1204 $ curli subunits $ \mu m ^{-3} $. The orange line represents the mean conductive radius. A sharp increase in the conductive radius can be observed for $t < 1 \ hour$, and after $t = \ 1 \ hour$ the conductive radius increases slowly. The cellular variation in the second regime is relatively large, as is shown by the dark blue lines that represent two standard deviations from the mean. Note how the conductive radius increases in discrete steps. This is a result of the fact that density is a quantity that only exists over a certain volume. We have divided the volume around the cell in hollow spheres with thickness $ dr=0.08 \mu m $. Increasing $dr$ would increase the accuracy of the mean, but would decrease the spatial resolution. Decreasing $dr$ would increase the variation between the conductive radii, but would increase the spatial resolution.
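The binning into hollow spheres mentioned in the caption can be sketched as follows. The function takes the distances of all curli subunits to the cell centre and divides the count in each shell by the shell volume. The function name and the input format are illustrative assumptions.

```python
import math


def shell_densities(distances, dr=0.08):
    """Curli density per spherical shell of thickness dr.

    distances: distances (um) of the curli subunits to the cell centre
               (must be non-empty).
    Returns a list of (shell mid radius in um, density in subunits per um^3).
    """
    n_shells = int(max(distances) // dr) + 1
    counts = [0] * n_shells
    for d in distances:
        counts[int(d // dr)] += 1

    profile = []
    for i, count in enumerate(counts):
        r_in, r_out = i * dr, (i + 1) * dr
        volume = 4.0 / 3.0 * math.pi * (r_out ** 3 - r_in ** 3)
        profile.append((0.5 * (r_in + r_out), count / volume))
    return profile


# Tiny hypothetical example: 10 subunits at 0.04 um and 5 subunits at 0.12 um.
profile = shell_densities([0.04] * 10 + [0.12] * 5)
```

Since the radius is only resolved up to the shell thickness, any quantity derived from this profile, such as the conductive radius, can only change in steps of `dr`.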

Different values of $\rho_{crit}$ result in different characteristic curves for $r_{cond}$, see figure 11. In this figure, we set $\rho_{crit}$ equal to a fraction of the maximum $ \rho_{curli} $ ($ 1.2 \cdot 10^5 \# \mu m^{-3} $ ) as observed in figure 10. So, we set $ \rho_{crit} = \max{ (\rho) } /K $ for the $ K $ shown in the legend.

Figure 11: The conductive radius in $ \mu m $ versus the time from t=0 to 10 hr for different values of $ \rho_{crit} $. The thick lines represent the mean conductive radius of 100 cells with a $ \rho_{crit} $ equal to a fraction of the maximum ( $ 1.2 \cdot 10^5 \# \mu m^{-3} $ ) corresponding with the legend. The thinner lines of the same color are the mean $ \pm $ the standard deviation.

From figure 11, we conclude that low values of $\rho_{crit}$ result in a sharp increase of $r_{cond}$ followed by a steady, slow increase of $r_{cond}$ in time. During the steady, slow increase of $r_{cond}$ in time, the cellular variation is relatively large. For high values of $\rho_{crit}$, there is a delayed sharp increase of $r_{cond}$ and less cellular variation. Unfortunately, we have no wet lab data to fit this parameter. We can speculate, however. A conductive radius of more than 5 $ \mu m $ seems unlikely to us, as the cell diameter is only about a micron. We set the critical value to $ 1.2 \cdot 10^5 \ \# \ \mu m^{-3} $. Even though this value might be off by a factor, we claim that this will change little in what we try to achieve with this approximation, namely that there is a sharp transition at which the conductivity increases.

Parametrization of the curli density

We aim to not only say something about the moment of percolation, but also to predict the conductivity as function of time. Using a conductive radius captures only little information of our simulations. We have therefore fitted the function $$ \rho_n = C_{1_n} e^{-\frac{r}{C_{2_n}}} \tag{13} $$ to our curli density curves (see figure 13) at each time $ n $. Here $ C_{1_n} $ and $ C_{2_n} $ are parameters that have to be fitted, and $ r $ is the distance from the cell centre. A weighted fitting method is used, where the weights are inversely proportional to the variance of the density (green lines).

Figure 13: Orange: The cell sensitivity as function of time with the standard deviation (green lines). The black line is a weighted fit of $ \rho_n = C_{1_n} e^{-\frac{r}{C_{2_n}}} $.

It can be seen that the fit is certainly not perfect, but it is a reasonable approximation of the characteristics. The reason for fitting such a simple function is that, in the colony level, we need to quantify the conductivity between the cells. The integral for this is rather complicated. In further research, we could improve our fit by fitting a sum of decaying exponentials.
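A weighted fit of equation 13 can be sketched with a linear least-squares fit on the logarithm of the density. This is a simple stand-in and not necessarily the fitting routine used for figure 13. Weights inversely proportional to the variance of the density translate, to first order, into weights $\rho^2 / \mathrm{var}(\rho)$ on the log scale, which is the assumption made here. All densities and variances must be positive.

```python
import math


def fit_exponential(radii, densities, variances):
    """Fit rho(r) = C1 * exp(-r / C2) and return (C1, C2).

    Weighted linear regression of ln(rho) against r. The weight of each
    point is rho^2 / var(rho), the first-order equivalent of weighting
    the density itself by 1 / var(rho).
    """
    ys = [math.log(rho) for rho in densities]
    ws = [rho * rho / var for rho, var in zip(densities, variances)]

    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, radii))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, radii))
    sxy = sum(w * x * y for w, x, y in zip(ws, radii, ys))

    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return math.exp(intercept), -1.0 / slope


# Synthetic check with hypothetical values C1 = 1.2e5 and C2 = 0.8 um.
radii = [0.7 + 0.08 * i for i in range(50)]
densities = [1.2e5 * math.exp(-r / 0.8) for r in radii]
variances = [(0.1 * rho) ** 2 + 1.0 for rho in densities]
c1_fit, c2_fit = fit_exponential(radii, densities, variances)
```

Fitting on the log scale has a closed-form solution, which keeps the sketch free of dependencies; a nonlinear least-squares routine working directly on the density would be the more faithful counterpart of the weighting described above.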

Colony Level Modeling

The goal of the modeling of the curli module is to prove that our system works as expected and to capture the dynamics of our system. The product we aim for is a chip where two parallel electrodes are a distance w apart. Between the electrodes, cells will grow and start building curli in the presence of DNT/TNT. Then, we will measure the conductivity of the resulting biofilm, which is related to the amount of DNT/TNT. Since even with bound gold nanoparticles the conductivity of the curli is very low, the chip is designed such that the electrodes are as long as possible.
The first question we are interested in is: can we prove that our system works as expected? So, does a conductive path between the two electrodes arise at a certain point in time and at which time does this happen? We do this by modeling the curli growth on the colony level; each cell is now modeled individually and has curli growth. First we have to make some approximations. Since the cells are grown on a chip, we assume that the cells and curli grow on a surface. This reduces our problem from 3D to 2D, which saves much computational time and memory. For this model, we take a chip of 500 by 500 µm. The electrodes are placed parallel to the y axis at x = 20 µm and x = 480 µm. The next approximation is that the cells are already present when they are induced by DNT/TNT, so we neglect cell growth. In our model, E. coli are present with a density of $\rho_{cell}$. Furthermore, we assume there is no spatial correlation between the cells; hence we place them at random on our chip. The cell density we use is $ 2 \cdot 10^4 $ cells $ mm^{-2} $, which corresponds to 5000 cells on this chip. We'd like to model higher cell densities and larger chips. However, the memory cost of the solution increases with the amount of cells squared and, even when the code is neatly vectorized, the computational time increases even more drastically.

We have come up with two different approaches. The first is that we let the cells increase their conductive radius in time, in accordance with our findings on the cellular level (figure 12). A connection is created from one electrode to the other electrode when there is a conductive path between them. Conductive paths consist of cells that are connected to each other; cells connect when their conductive radii overlap. This problem is very similar to problems in percolation theory. From this, we can draw conclusions about how our system works in an experimental setting. However, we not only want a yes or no answer to the question whether our system works as expected; we also want to make predictions about the resistivity between the two electrodes of our system in time. Therefore, in the second approach we used graph theory to translate the cells on the chip to a graph and used an algorithm from graph theory to calculate the resistivity between the two electrodes. The conductivity between the cells is computed from an integral that we have set up starting from equation 13.
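To illustrate the graph-based approach, the sketch below computes the resistance between two nodes of a resistor network by nodal analysis: the sink node is grounded, a unit current is injected at the source node, and Kirchhoff's current law is solved for the node voltages. This is a generic textbook method and not necessarily the algorithm used in our scripts. The edge conductances are hypothetical inputs; in the model they would follow from the integral based on equation 13.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def effective_resistance(edges, source, sink):
    """Resistance between source and sink for (node_i, node_j, conductance) edges."""
    if source == sink:
        return 0.0
    neighbours = {}
    for i, j, g in edges:
        neighbours.setdefault(i, []).append((j, g))
        neighbours.setdefault(j, []).append((i, g))

    # Only the cluster that contains the source can carry current.
    component, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for other, _ in neighbours.get(node, []):
            if other not in component:
                component.add(other)
                stack.append(other)
    if sink not in component:
        return math.inf  # no conductive path

    # Unknown voltages: every node of the cluster except the grounded sink.
    nodes = [n for n in component if n != sink]
    index = {n: pos for pos, n in enumerate(nodes)}
    size = len(nodes)
    laplacian = [[0.0] * size for _ in range(size)]
    for n in nodes:
        for other, g in neighbours[n]:
            laplacian[index[n]][index[n]] += g
            if other != sink:
                laplacian[index[n]][index[other]] -= g

    current = [0.0] * size
    current[index[source]] = 1.0          # inject 1 A at the source
    voltages = solve_linear(laplacian, current)
    return voltages[index[source]]        # R = V / I with I = 1 A
```

In the colony model, the two electrodes would be the source and sink nodes, and every pair of connected cells would contribute one edge. The dense solver used here scales cubically with the cluster size, so for thousands of cells a sparse solver would be needed.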

Percolation

We have now designed our chip as a 500 by 500 µm square with an electrode on the left and right side. On this chip, we place cells randomly with a density of $\rho_{cell}$. Subsequently, we increase the conductive radius of each cell in time, corresponding with our findings on the cellular level, see figure 12. A connection is created from one electrode to the other when there is a conductive path between them, that is, when there is percolation. Conductive paths consist of cells that are connected to each other; two cells are connected when their conductive radii overlap. In practice we have programmed this by comparing the distances between all pairs of cells with the conductive radius. A simulation of our resulting model is shown in figure 14. Percolation is detected by applying an algorithm that finds clusters of connected cells. When one of the clusters connects both electrodes, we have percolation.
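The percolation check described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for figure 14. It uses a union-find structure to group connected cells, assumes that two cells connect when their centres are closer than twice the conductive radius, and assumes that a cell touches an electrode when its conductive disc reaches the electrode line. The function names `place_cells` and `percolates` are hypothetical.

```python
import math
import random


def place_cells(n_cells, width, height, rng):
    """Place cells uniformly at random on a width x height chip (in µm)."""
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n_cells)]


def _find(parent, i):
    """Return the cluster representative of node i (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def _union(parent, i, j):
    """Merge the clusters that contain nodes i and j."""
    root_i, root_j = _find(parent, i), _find(parent, j)
    if root_i != root_j:
        parent[root_i] = root_j


def percolates(cells, r_cond, x_left=20.0, x_right=480.0):
    """Return True if one cluster of overlapping cells touches both electrodes.

    Assumptions of this sketch: all cells share the same conductive radius
    r_cond, and the electrodes are the vertical lines x = x_left and
    x = x_right.
    """
    n = len(cells)
    left, right = n, n + 1          # two extra nodes represent the electrodes
    parent = list(range(n + 2))
    for i, (xi, yi) in enumerate(cells):
        if xi - r_cond <= x_left:
            _union(parent, i, left)
        if xi + r_cond >= x_right:
            _union(parent, i, right)
        # Compare cell i with every later cell: O(n^2) distance checks.
        for j in range(i + 1, n):
            xj, yj = cells[j]
            if math.hypot(xi - xj, yi - yj) < 2.0 * r_cond:
                _union(parent, i, j)
    return _find(parent, left) == _find(parent, right)
```

The double loop mirrors the all-against-all distance comparison mentioned in the text. In pure Python it is slow for 5000 cells, so a vectorized or grid-based neighbour search would be the natural next step.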

Figure 14: NorthWest: A visual representation of our cells on the plate. The circles represent the cells with a conductive radius of 4.05 µm. In this simulation there are 5000 cells present on a chip of 500 µm x 500 µm. NorthEast: A spy matrix of 5000x5000 where the blue dots represent connections between the individual cells. A blue dot on position (x, y) means that cell x is connected with cell y. Each cell is connected to itself (diagonal). At the point of percolation, $ \approx 0.1 \% $ of the matrix is connected, meaning that each cell is on average connected to 5 others. SouthWest: Each square of nxn represents a cluster of n connected cells. The squares are sorted from small to large. SouthEast: This figure shows the largest clusters of cells in different colours.

We have stochasticity in our model, as we place the cells randomly with a density of $\rho_{cell}$ on the chip. Therefore, we simulated our model 100 times and for each point in time we checked if there was percolation. Each check only gives a yes (1) or no (0) response. Averaging over the simulations gives the chance of percolation at each time point, shown in figure 15 as the yellow line. The yellow line shows a sharp transition between 1.5 and 2 hours. Since each outcome is a Bernoulli trial [reference] with success probability $p$, its variance is exactly equal to $p(1-p)$. The variance must be as low as possible to get trustworthy measurement results, as in that case the transition from no percolation to percolation is as sharp as possible.
At first we assumed in our model that $r_{cond}$ is the same for each cell at each point in time (figure 12, red line). However, figure 12 shows that there clearly is some cellular variation in $r_{cond}$. Therefore, we added a feature to our model: the conductive radius of each cell can now deviate from the mean $r_{cond}$ with the standard deviation as found in figure 12. We again simulated our resulting model 100 times and for each point in time we determined the chance of percolation, shown in figure 15 as the blue line. Fortunately, the resulting curve is very similar to the curve without cellular variation in $r_{cond}$ (yellow line). This means that cellular variation has little influence on the chance of percolation at each point in time. Therefore, the results of our model are robust to cellular variation and it is likely that many factors that could increase the cellular variation, e.g. different CsgA or CsgB protein production rates, are relatively unimportant.
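
The cellular variation can be added to a simulation by drawing one radius per cell. The sketch below is only an illustration under stated assumptions: the radii are taken to be normally distributed (the text only gives a mean and a standard deviation), negative draws are clipped to zero, and two cells with different radii are taken to connect when the sum of their radii exceeds their distance. The names `sample_radii` and `cells_connect` are hypothetical.

```python
import math
import random


def sample_radii(n_cells, mean_radius, std_radius, rng):
    """Draw one conductive radius per cell (assumed normal, clipped at 0)."""
    return [max(0.0, rng.gauss(mean_radius, std_radius))
            for _ in range(n_cells)]


def cells_connect(cell_a, cell_b, radius_a, radius_b):
    """Two cells connect when their conductive discs overlap."""
    distance = math.hypot(cell_a[0] - cell_b[0], cell_a[1] - cell_b[1])
    return distance < radius_a + radius_b
```

With equal radii, `cells_connect` reduces to the criterion used for the yellow line: the distance must be smaller than twice the conductive radius.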

Figure 15: The chance of percolation as a function of time with 5000 cells on a 500x500 $\mu m $ chip. The results are from 100 simulations. The yellow line represents the chance of percolation where all the cells have the same conductive radius. The blue line is the same simulation, but all cells have slightly different conductive radii. Note how there is no notable difference between the two.
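
The chance of percolation at one time point is an estimate from repeated yes/no outcomes. As a small sketch (the function name `percolation_statistics` is hypothetical), the estimate, the Bernoulli variance $p(1-p)$ and the standard error of the estimate over $N$ simulations can be computed as follows. The standard error $\sqrt{p(1-p)/N}$ is the usual result for the mean of $N$ independent Bernoulli trials and is added here for illustration.

```python
import math


def percolation_statistics(outcomes):
    """Summarise a list of 0/1 percolation outcomes at a single time point.

    Returns (p, variance, standard_error), where p is the fraction of runs
    that percolated, variance = p * (1 - p) is the Bernoulli variance and
    standard_error = sqrt(variance / N) is the uncertainty of p itself.
    """
    n_runs = len(outcomes)
    p = sum(outcomes) / n_runs
    variance = p * (1.0 - p)
    standard_error = math.sqrt(variance / n_runs)
    return p, variance, standard_error
```

The variance is zero when all runs agree (p = 0 or p = 1) and largest at p = 0.5, which is why a sharp transition keeps the time window with uncertain outcomes short.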

Influence of the chip geometry on the point of percolation

To further investigate the point of percolation we have varied the shape of our chip. We have decreased the distance between the electrodes by making our chip 250 µm x 500 µm, where the electrodes are 250 µm apart. Furthermore, we also increased the distance between the electrodes to $ 1000 \mu m $ with the same cell density.

Resistivity

To calculate the conductivity as function of time we repeat the following steps:

Place our cells on our chip.
Compute the conductivity between the cells.
Compute the conductivity between the electrodes.

Compute the conductivity between the cells

First, we have to get a quantitative measure for the conductivity between two cells. To do this, we will quantify the overlap of two conducting spheres, where we assumed that the conducting spheres represent cells surrounded by curli filaments. We subdivide the overlapping region in infinitesimal volumes $dV$. The infinitesimal conductivity of such an infinitesimal volume is given by:

$$ d \sigma (y) = \ \frac{\rho_1}{r_1} dV \frac{\rho_2}{r_2} dV \tag{}$$

The factor $ 1/r $ is introduced to account for the conductivity of the wires themselves, which is inversely proportional to the length of the conducting wire. [source: Narinder Kumar (2003). Comprehensive Physics XII. Laxmi Publications. pp. 282–. ISBN 978-81-7008-592-8.] Further away from the cell, the wires have to cover a longer distance to reach the cell. Since we want to know the strength of the connection between the cells, we have to include this factor. For a straight wire, the conductivity is inversely proportional to the distance. For a single curli fibril, this relation does not hold. However, we assume that the curli density is high, so that there are many connections between the curli fibrils. Then there is a pathway from the origin to $ r $ with a length roughly proportional to the distance from the cell. To find the total conductivity, we integrate on both sides. To account for the fact that both volume elements $dV$ are the same, we make use of the Dirac delta function $\delta_3$ [source]. This gives us the following:

$$ \sigma (y) = \int{ \frac{\rho_1(\vec{r_1})\rho_2(\vec{r_2})}{r_1 r_2}\delta_3(\vec{r_2}-f(\vec{r_1}))d^3\vec{r_1}d^3\vec{r_2}} \tag{} $$

The Dirac delta allows us to remove the $\vec{r_2}$ dependence by expressing these in $\vec{r_1}$. The still undetermined relation between $\vec{r_1}$ and $\vec{r_2}$ is given by $\vec{r_2} = f(\vec{r_1})$. Applying this removes one of the two volume integrations. Using spherical coordinates, the resulting single volume integration can be written as:

$$ \sigma (y) = \int_{r_0}^{r_{max}} \int_0^{\theta_{max}(r_1)} \int_0^{2\pi} \rho_1(r_1)\rho_2(f(r_1))\frac{r_1}{f(r_1)} \sin(\theta_1) d\phi_1 d\theta_1 dr_1 \tag{} $$

Here we have made use of the fact that the densities $\rho_1$ and $\rho_2$ depend only on $r$ and not on $\phi$ and $\theta $. The integral over $\phi_1$ is trivial and gives us a multiplication factor of $2 \pi$:

$$ \sigma (y) = \ 2 \pi \int_{r_0}^{r_{max}} \int_0^{\theta_{max}(r_1)} \rho_1(r_1)\rho_2(f(r_1))\frac{r_1}{f(r_1)} \sin(\theta_1) d\theta_1 dr_1 \tag{} $$

Now that we have reduced our integration to two dimensions, we will work out $f(\vec{r_1})$. To do this, we introduce the vector from the origin of cell 1 to the origin of cell 2, $\vec{y}$. This allows us to express $\vec{r_2}$ in terms of $\vec{y}$ and $\vec{r_1}$:

$$ \vec{r_2} = \ \vec{y} - \ \vec{r_1} = \begin{bmatrix}y \\0\\ \end{bmatrix} - \begin{bmatrix} r_1 \cos(\theta_1) \\r_1 \sin(\theta_1)\\ \end{bmatrix} \tag{} $$

Now it is straightforward to express $r_2$ in terms of $y$, $r_1$ and $\theta_1$:

$$ r_2 = \ |\vec{r_2}| = \ \sqrt{(y - r_1 \cos(\theta_1))^2 + \ r_1^2 \sin^2(\theta_1)} \tag{} $$

Plugging this in yields the following integral:

$$ \sigma (y) = \ 2 \pi \int_{r_0}^{r_{max}} \int_0^{\theta_{max}(r_1)} \frac{\rho_1(r_1)\rho_2 \left( \sqrt{(y - r_1 \cos(\theta_1))^2 + r_1^2 \sin^2(\theta_1)}\right) r_1 \sin(\theta_1)}{ \sqrt{(y - r_1 \cos(\theta_1))^2 + r_1^2 \sin^2(\theta_1)}} d\theta_1 dr_1 \tag{} $$

We will now have a closer look at the boundary values for $r_1$ and $\theta_1$. We want to integrate over the entire space. Therefore, $ \theta_{max} = \pi $ and $ r_{max}=\infty $. By not introducing a cut-off radius, we take into account the possibility of a cell having, by chance, a very large conductive radius. Here we have approximated our cells as points in space. Hence $ r_0 =0 $.

We will now use the previously [link] found fact that the curli density can be described as:

$$ \rho(r) = \ C_{1}e^{-\frac{r}{C_{2}}} \tag{} $$

Plugging in the boundary values and our expression for $\rho(r)$, we find the following expression for the conductivity between two cells:

$$ \sigma (y) = \ 2 \pi C_{1}^2 \int_{0}^{\infty} \int_0^{\pi} \frac{e^{-\frac{r_1}{C_{2}}} e^{-\frac{ \sqrt{(y - \ r_1 \cos(\theta_1))^2 + \ r_1^2 \sin^2(\theta_1)}}{C_{2}}} r_1 \sin(\theta_1)}{\sqrt{(y - r_1 \cos(\theta_1))^2 + r_1^2 \sin^2(\theta_1)}} d\theta_1 dr_1 \tag{} $$

This integral looks very complicated, but don't panic! It can be simplified algebraically with some substitutions. We can rewrite this integral by moving all terms independent of $ \theta_1 $ out of the integral over $\theta_1$. Furthermore, using that $ \sin^2 (\theta_1) + \cos^2(\theta_1) = 1 $, we get:

$$ \sigma (y) = \ 2 \pi C_{1}^2 \int_{0}^{\infty} r_1 e^{-\frac{r_1}{C_{2}}} \int_0^{\pi} \frac{e^{-\frac{ \sqrt{y^2+r_1^2-2yr_1 \cos( \theta_1 ) }}{C_{2}}} \sin(\theta_1)}{ \sqrt{y^2+r_1^2-2yr_1 \cos( \theta_1 ) }} d\theta_1 dr_1 \tag{} $$

Now we must recognize that we can substitute $ x= \cos(\theta_1) $ such that $ dx = -\sin(\theta_1) d\theta_1 $. This results in:

$$ \sigma (y) = - \ 2 \pi C_{1}^2 \int_{0}^{\infty} r_1 e^{-\frac{r_1}{C_{2}}} \int_1^{-1} \frac{e^{-\frac{ \sqrt{y^2+r_1^2-2yr_1 x }}{C_{2}}}}{\sqrt{y^2+r_1^2-2yr_1 x }} dx dr_1 \tag{} $$

In the second integral we recognize something of the form $ \int \frac{e^{-\sqrt{a+bx}}}{C_2\sqrt{a+bx}} dx $ with $ a= \frac{y^2+r_1^2}{C^2_2} $ and $b=-\frac{2yr_1}{C^2_2} $. Substituting $ h= \sqrt{a+bx} $ with $ dx= \frac{2h}{b} dh $ yields:

$$ \int_1^{-1} \frac{e^{-\sqrt{a+bx}}}{C_2\sqrt{a+bx}} dx = \frac{2}{bC_2} \int_{\sqrt{a+b}}^{\sqrt{a-b}} e^{-h} dh= \frac{-2}{bC_2} (e^{-\sqrt{a-b}}- \ e^{-\sqrt{a+b}})$$

Now $a$ and $b$ can be substituted:

$$ \int_1^{-1} \frac{e^{-\sqrt{a+bx}}}{C_2\sqrt{a+bx}} dx = \frac{C_2}{yr_1} \left( e^{-\frac{\sqrt{y^2+r_1^2+2yr_1}}{C_2}} - e^{-\frac{\sqrt{y^2+r_1^2-2yr_1}}{C_2}} \right)$$

Hence, the entire integral now becomes

$$ \sigma (y) = \frac{ 2 \pi C_{1}^2 C_2 }{y} \int_{0}^{\infty} \left( e^{-\frac{|y-r_1|+r_1}{C_2} } - e^{-\frac{y+2r_1}{C_2} } \right) dr_1 \tag{} $$

Solving the remaining integral over $r_1$ is fairly easy once we split it at $r_1 = y$ to remove the absolute value:

$$ \sigma (y) = \frac{ 2 \pi C_{1}^2 C_2 }{y} \int_{0}^{\infty} \left( e^{-\frac{|y-r_1|+r_1}{C_2} }-e^{-\frac{y+2r_1}{C_2} } \right) dr_1 = \frac{ 2 \pi C_{1}^2 C_2 }{y} \left( \int_{0}^{y} e^{-\frac{y}{C_2}} dr_1 +\int_{y}^{\infty} e^{-\frac{2r_1-y}{C_2}} dr_1 -e^{\frac{-y}{C_2}}\int_0^{\infty} e^{-\frac{2r_1}{C_2} } dr_1 \right) \tag{} $$

Which brings us to the final result:

$$ \sigma (y) = \ 2 \pi C_{1}^2 C_2 e^{-\frac{y}{C_2}} \tag{} $$
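
As a sanity check, the closed form can be compared with a direct numerical evaluation of the last remaining integral over $r_1$. The sketch below is an illustration only: it uses a simple midpoint rule, truncates the infinite upper limit at a multiple of the largest length scale, and uses hypothetical function names and arbitrary parameter values.

```python
import math


def sigma_closed_form(y, c1, c2):
    """Final result: sigma(y) = 2 pi C1^2 C2 exp(-y / C2)."""
    return 2.0 * math.pi * c1 ** 2 * c2 * math.exp(-y / c2)


def sigma_numeric(y, c1, c2, n_steps=40000, r_max_factor=40.0):
    """Midpoint-rule evaluation of the one-dimensional integral over r_1.

    The infinite upper limit is truncated at r_max_factor * max(c2, y),
    where the integrand has decayed to a negligible value.
    """
    r_max = r_max_factor * max(c2, y)
    dr = r_max / n_steps
    total = 0.0
    for k in range(n_steps):
        r1 = (k + 0.5) * dr
        total += (math.exp(-(abs(y - r1) + r1) / c2)
                  - math.exp(-(y + 2.0 * r1) / c2))
    return 2.0 * math.pi * c1 ** 2 * c2 / y * total * dr
```

For any positive `y`, `c1` and `c2`, the two functions should agree up to the discretisation error of the midpoint rule.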

For future research, we could extend our models such that the cellular variation is included. If $ \rho_1(r) = \ C_{1}e^{-\frac{r}{C_{2}}} $ and $ \rho_2(r) = \ C_{3}e^{-\frac{r}{C_{4}}} $, then the conductivity between the two cells, using the approach described above, is:

$$ \sigma (y) = \ \frac{4 \pi C_{1}C_3 C_2^2 C_4^2}{y \left( C_2^2 - C_4^2 \right)} \left( e^{-\frac{y}{C_2}} -e^{-\frac{y}{C_4}} \right) \tag{} $$
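
The general expression should reduce to the equal-cell result $2 \pi C_1^2 C_2 e^{-y/C_2}$ when $C_3 \to C_1$ and $C_4 \to C_2$. The sketch below checks this numerically by taking $C_4$ very close to $C_2$. The function names and parameter values are hypothetical, and the expression is undefined for $C_4 = C_2$ exactly, where the equal-cell formula has to be used instead.

```python
import math


def sigma_equal(y, c1, c2):
    """Conductivity between two identical cells."""
    return 2.0 * math.pi * c1 ** 2 * c2 * math.exp(-y / c2)


def sigma_general(y, c1, c2, c3, c4):
    """Conductivity between two cells with different density parameters."""
    if c2 == c4:
        raise ValueError("use sigma_equal when both decay lengths are equal")
    prefactor = (4.0 * math.pi * c1 * c3 * c2 ** 2 * c4 ** 2
                 / (y * (c2 ** 2 - c4 ** 2)))
    return prefactor * (math.exp(-y / c2) - math.exp(-y / c4))
```

The expression is also symmetric under exchanging the two cells, as it should be, since swapping $(C_1, C_2)$ with $(C_3, C_4)$ flips the sign of both the denominator and the bracket.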

Compute the conductivity between the electrodes

Now, we use graph theory to translate the cells on the chip to a graph and use an algorithm from graph theory to calculate the resistivity between the two electrodes.
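
The text does not name the graph algorithm that was used. One standard way to obtain the conductance of a resistor network, shown here purely as an assumed illustration, is nodal analysis: build the weighted graph Laplacian, fix the potential at 1 on one electrode node and at 0 on the other, solve Kirchhoff's current law for the remaining nodes and read off the current leaving the first electrode. In this sketch the edge weights would be the cell-to-cell conductivities $\sigma(y)$, and how the electrodes are attached to the cells is left to the caller. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system: a node is disconnected")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def effective_conductance(n_nodes, edges, source, sink):
    """Conductance between two nodes of a weighted graph.

    edges is a list of (i, j, g) with g the conductance of the link i-j.
    """
    lap = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, g in edges:
        lap[i][i] += g
        lap[j][j] += g
        lap[i][j] -= g
        lap[j][i] -= g
    interior = [k for k in range(n_nodes) if k not in (source, sink)]
    potential = [0.0] * n_nodes
    potential[source] = 1.0          # unit voltage on the source electrode
    if interior:
        # Kirchhoff's current law: the net current into interior nodes is 0.
        a = [[lap[k][j] for j in interior] for k in interior]
        b = [-lap[k][source] for k in interior]
        for k, value in zip(interior, solve_linear(a, b)):
            potential[k] = value
    # Current leaving the source at unit voltage equals the conductance.
    return sum(lap[source][j] * potential[j] for j in range(n_nodes))
```

For example, two links of conductance 2 in series give a conductance of 1, and two parallel links of conductance 2 and 3 give 5. The dense elimination used here scales poorly with the number of cells, so a sparse solver would be needed for thousands of cells.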

Results

We have calculated the conductivity as a function of time for various dimensions of our plate. The distance between the electrodes is varied: the plates are 250x500, 500x500 and 1000x500 $ \mu m $. The result is shown in figure 16. For the smaller chips, the process is repeated 10 times to say something about the variance. For the largest plate it is only repeated three times, since a single curve already takes over twelve hours to compute.

Figure 16: The conductivity (arbitrary units) as a function of time (hours) for different dimensions of our plate. The first dimension is the distance between the two electrodes. The second dimension represents the length of the electrodes.

From figure 16 we can draw a couple of conclusions.

It seems that the conductivity increases exponentially over time. We expect that even after a long time, there is low conductivity [paper]. This means that before that time, it is hard to measure changes in conductivity.

The shape of the response curve of the system is independent of the shape of our plate. The blue lines, for which the distance between the electrodes is doubled compared to the green lines, have half the conductivity of the green lines (and a quarter of that of the red lines). Thus the conductance is inversely proportional to the distance between the electrodes. This is precisely what you would expect if you see the system as a single resistor.

Increasing the chip size decreases the relative uncertainty of the response. The red lines are much further apart (also relatively) than the blue lines. This makes sense from a physical point of view, since we're dealing with larger samples. A larger chip is less sensitive to the randomness due to the placement of the cells.

Our findings are in accordance with the behaviour expected from a random resistor network with percolation theory, where the conductivity increases exponentially after percolation.

It is impossible to observe a point of percolation in these curves. This is because we have made a continuous model.

Other simulations show us that, indeed, the conductivity scales linearly with the length of the electrodes. If we want to design our system to have as high a conductivity as possible, we want to decrease the distance between the electrodes as much as possible. At the same time, the total area of the chip should be reasonably large to reduce the effects of the stochastic behaviour of the system.
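
The two scaling observations can be summarised in a single relation. As a rough picture, treat the chip as one homogeneous sheet resistor with electrode length $W$, electrode distance $L$ and an effective sheet conductivity $\sigma_s(t)$. The symbols $G$, $W$, $L$ and $\sigma_s$ are introduced here for illustration only. The conductance between the electrodes is then:

$$ G(t) = \ \sigma_s(t) \frac{W}{L} $$

For example, going from a 500x500 $ \mu m $ plate to a 1000x500 $ \mu m $ plate doubles $L$ at fixed $W$ and therefore halves $G$, while going to a 250x500 $ \mu m $ plate halves $L$ and doubles $G$.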

Parameters	Value	Unit
\(\boldsymbol{p_{A}}\)	\(1.0 \cdot 10^{-10}\)	\(\frac{1}{Ms}\)
\(\boldsymbol{p_{B}}\)	\(1.3 \cdot 10^{-13}\)	\(\frac{M}{s}\)
\(\boldsymbol{k}\)	\(4.0 \cdot 10^{4}\)	\(\frac{1}{Ms}\)
\(\boldsymbol{A_0}\)	\(6.0 \cdot 10^{-6}\)	\(M\)