Team:TU Darmstadt/Results/Modeling/ANS Engineering


Revision as of 03:36, 18 October 2014 by SaHein



The anthocyanidin synthase from Fragaria × ananassa (ANS, EC) catalyzes many reactions in the anthocyanidin pathway. We exploited its ability to catalyze the conversion of the leucoanthocyanidin (2R,3S,4S)-cis-leucopelargonidin to the anthocyanidin pelargonidin. It also catalyzes the conversion of the leucoanthocyanidin to a flavonol (kaempferol). Earlier studies hypothesized that ANS may be involved in metabolic channeling in its native organisms. ANS therefore became the target of an ambitious modeling pipeline, and the project eANS was born.

The modeling pipeline

To optimize the metabolic channeling of ANS, we chose a rational protein engineering approach. The first step of our multiscale, rational engineering project was the creation of a sophisticated 3D model with YASARA Structure. This model was then structurally refined with the SCWRL algorithm and energy minimized with the YASARA NOVA force field.
Afterwards, we took a true mechanical engineering approach to determine the movements within the protein. For this purpose, a Gaussian Network Model (GNM) and an Anisotropic Network Model (ANM) were implemented. Both are simple models that simulate the mechanical behavior of the protein. Moreover, Linear Response Theory (LRT) was used to simulate substrate binding inside the pocket and thus an induced fit mechanism.
Subsequently, we collected our data, defined rational mutations and finally constructed eANS. With this eANS version another MD simulation was started, and the sequence of the protein was given to the wetlab for in vitro construction and in vivo characterization.

Coarse Grained Models (ANM & GNM)

With a computed GNM and ANM we were able to take a closer look at the mechanics of the ANS. The result of the GNM computation showed a large peak at the C-terminus. This led to the assumption that the C-terminal region of the ANS is highly flexible. Unfortunately, this region lies at the active site of the protein. One can imagine that this region may cover the active site and decrease the probability of substrate binding during catalysis.
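To make the GNM result more concrete, the following is a minimal sketch of how per-residue flexibility can be derived from a Gaussian Network Model. It is not the project's own implementation: the function name `gnm_fluctuations`, the 7 Å contact cutoff and the toy coordinates are assumptions chosen for illustration. Residues are treated as nodes, contacts within the cutoff as identical springs, and the relative mean-square fluctuation of each node is summed over all non-zero modes of the Kirchhoff (contact) matrix.

```python
import numpy as np


def gnm_fluctuations(coords, cutoff=7.0, tol=1e-8):
    """Relative mean-square fluctuation of each node in a Gaussian Network Model.

    coords: (n, 3) array of C-alpha positions in Angstrom.
    cutoff: contact distance in Angstrom (assumed value, not taken from the project).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise distances between all nodes.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Kirchhoff matrix: -1 for every contact, number of contacts on the diagonal.
    contact = (dist <= cutoff) & ~np.eye(n, dtype=bool)
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))
    # Sum all modes weighted by 1/eigenvalue, skipping the zero (rigid-body) mode.
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    keep = eigvals > tol
    return (eigvecs[:, keep] ** 2 / eigvals[keep]).sum(axis=1)


# Hypothetical toy structure: a compact four-node core followed by a three-node tail.
toy = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3),
       (6.8, 0, 0), (10.6, 0, 0), (14.4, 0, 0)]
fluctuations = gnm_fluctuations(toy)
most_flexible = int(np.argmax(fluctuations))  # index of the most mobile node
```

In this toy case the most mobile node is the free end of the tail, which mirrors the kind of terminal peak described for the ANS.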

The following figure shows the flexibility of wild-type ANS, represented by the slow modes, in a three-dimensional model. The results are extracted from an ANM. The spatial directions of the simulated movement are represented as arrows with color-coded strength (from red ~ strong to blue ~ weak). Only the C-terminus exhibits a large correlated movement.


If we simulate substrate binding in the pocket of the ANS by applying a force vector to the active site and binding region, we observe a strong deformation of the enzyme. This process is called induced fit. The result reveals that the C-terminal region of the ANS remains highly flexible during induced fit. This is a problem because both substrate binding and substrate release are perturbed, which limits the reaction rate.
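The force-based perturbation described above can be sketched with linear response theory in one common formulation: the displacement of every residue is approximated as the pseudo-inverse of the network Hessian applied to the force vector. The sketch below assumes an ANM Hessian as the stiffness matrix; the function names `anm_hessian` and `linear_response`, the 15 Å cutoff, the unit spring constant and the toy coordinates are all illustrative assumptions, not details taken from the project.

```python
import numpy as np


def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3n x 3n ANM Hessian for nodes connected within `cutoff` (assumed)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue
            # Off-diagonal block of one spring; diagonal blocks collect the negatives.
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def linear_response(coords, forces, cutoff=15.0, tol=1e-8):
    """Displacement of every node in response to external forces (shape (n, 3))."""
    hessian = anm_hessian(coords, cutoff)
    eigvals, eigvecs = np.linalg.eigh(hessian)
    # Drop the zero modes (rigid-body translation and rotation) before inverting.
    keep = eigvals > tol
    pseudo_inverse = (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T
    flat_forces = np.asarray(forces, dtype=float).ravel()
    return (pseudo_inverse @ flat_forces).reshape(-1, 3)


# Hypothetical toy network: push nodes 0 and 1 apart along the x axis.
toy = np.array([[0, 0, 0], [4, 0, 0], [0, 4, 0], [0, 0, 4], [4, 4, 4]], dtype=float)
forces = np.zeros_like(toy)
forces[0] = [-1.0, 0.0, 0.0]
forces[1] = [1.0, 0.0, 0.0]
shift = linear_response(toy, forces)
```

For a real enzyme, the force would be placed on the residues lining the binding pocket, and the resulting displacements of the remaining residues would indicate which regions respond to binding.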

Design Prediction

We concluded that we would have to remove the C-terminal region in order to improve substrate binding and eliminate the fluctuating C-terminal tail near the active site. A model depicting the protein flexibility is presented below; it encouraged us to pursue this approach.

Molecular Dynamics Simulation (MD)


The root mean square deviation (RMSD) can be computed as follows:

\[ RMSD(v,w) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} ||v_i - w_i ||^2} \]

\[ = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (v_{ix} - w_{ix})^2 + (v_{iy} - w_{iy})^2 + (v_{iz} - w_{iz})^2 \right]} \]

Here n is the number of atoms (Cα or backbone atoms) and v_i denotes the coordinates of atom i in protein v. The RMSD is used to quantify the difference between the structures of two protein folds (v and w). The RMSD was computed from the atomic coordinates of the Cα atoms in R using the bio3d package.
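As an illustration of the formula, here is a minimal Python sketch. The original analysis used R with bio3d, so this is a stand-in and not the code that produced the results. The function name `rmsd` and the example coordinates are hypothetical, and no structural superposition is performed before the comparison; whether the original analysis superposed the frames is not stated.

```python
import math


def rmsd(v, w):
    """Root mean square deviation between two equally long lists of (x, y, z) points."""
    if len(v) != len(w) or not v:
        raise ValueError("both structures need the same, non-zero number of atoms")
    # Sum of squared coordinate differences over all atoms and all three axes.
    total = sum((a - b) ** 2 for p, q in zip(v, w) for a, b in zip(p, q))
    return math.sqrt(total / len(v))


# Hypothetical two-atom example: each atom is displaced by 2 Angstrom.
structure_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_w = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
deviation = rmsd(structure_v, structure_w)  # 2.0
```

Applied to every frame of a trajectory against a fixed reference structure, this yields one RMSD value per time step, which is the kind of curve plotted against simulation time.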

RMSD Results 

As can be seen in the RMSD figures (ANS short; ANS_long), the wild type shows minimal changes in the four calculated distances, which leads to the assumption that the central core stays quite stable during the simulation (the RMSD is computed with the equation given above). If we take a closer look at the RMSD distributions (RMSD histograms), we can observe that the engineered ANS is more stable than the wild type. Additionally, the wild type reaches a higher plateau and a higher overall RMSD.

RMSD of wild type ANS vs. the simulation time in ps is shown below.

RMSD of engineered ANS vs. the simulation time in ps is shown below.


The overall structure of the engineered ANS is more stable over time. Moreover, the RMSF computations (see RMSF plots) reproduced the results derived from the coarse-grained simulations (GNM, ANM and LRT). This shows that coarse-grained simulations are suitable for a rational design approach. These models are not as computationally expensive as MD simulations. For example, the ANM and GNM models can be computed on a single-core N270 processor (512K cache, 1.60 GHz, 533 MHz FSB) in only a few minutes. In contrast, the MD simulations of the ANS and eANS ran for a few months on a Phenom II X6 1090T with 6 cores at 2.8 GHz. This underlines the importance of coarse-grained simulations for rational protein design. With the RMSF we can clearly demonstrate that the C-terminal region is highly flexible and thus obstructs the active site of the ANS.


The root mean square fluctuation (RMSF) describes the dynamic movement of an amino acid residue in a protein. High RMSF values in a certain region indicate high flexibility.

The RMSF can be computed as follows:

\[ RMSF_i = \sqrt{ \frac{1}{T} \sum_{t_j = 1}^{T} (x_i(t_j) - \tilde{x}_i)^2 } \]

Here T is the duration of the simulation (number of time steps), x_i(t_j) are the coordinates of atom i at time t_j, and \(\tilde{x}_i\) is the mean coordinate of atom i over the trajectory. We sum the squared differences between x_i(t_j) and the mean coordinate over all time steps, divide the sum by T and take the square root. This gives the fluctuation of an atom around its mean position in a trajectory file. The RMSF was computed from the atomic coordinates of the Cα atoms in R using the bio3d library.
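The calculation can be sketched in a few lines of Python. Again, this is only an illustrative stand-in for the R/bio3d analysis; the function name `rmsf`, the trajectory layout (a list of frames, each a list of (x, y, z) tuples) and the example values are assumptions.

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation over a list of frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Squared distance of atom i from its mean position, summed over all frames.
        squared = sum((frame[i][k] - mean[k]) ** 2
                      for frame in trajectory for k in range(3))
        result.append(math.sqrt(squared / n_frames))
    return result


# Hypothetical two-frame trajectory: atom 0 is fixed, atom 1 moves by 2 Angstrom.
frames = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
fluctuation = rmsf(frames)  # [0.0, 1.0]
```

Plotting the returned list against the residue index gives an RMSF profile in which flexible regions appear as peaks.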

RMSF Results 

Plots of the RMSF are shown below. Here, the RMSF in Ångström is plotted against the residue position. The first plot shows the native ANS simulation, whereas the second displays the engineered eANS.


Removing the C-terminal region was necessary to improve metabolic channeling at the active site. Moreover, the overall stability of eANS is increased (as represented by the RMSD histograms). Our wetlab experiments demonstrated that this design approach helped to increase the pelargonidin yield in vivo.

E. coli BL21 (DE3) pellet containing the pelargonidin-producing operon after fermentation. According to Yan et al. (2007), a pelargonidin-producing E. coli should be red after pelargonidin production. The operon with the engineered anthocyanidin synthase produces more pelargonidin.