Team:UFMG Brazil/Project/Modelling

From 2014.igem.org

(Difference between revisions)

Revision as of 02:43, 17 October 2014

- Modelling -

Using our PCs to do the fun stuff!

To obtain the three-dimensional structure of our conditional sensor, which is designed to bind repetitive DNA sequences, we employed comparative modelling. We began by searching for appropriate templates for the selected BioBrick sequences: TALE (BBa_K747027, BBa_K747043, BBa_K747059 and BBa_K747075, obtained from the Registry, plus BBa_K1514002 and BBa_K1514003, which we synthesized) + linker + hemiCherry1 (BBa_K1514000) or hemiCherry2 (BBa_K1514001). These sequences were submitted to a PSI-BLAST similarity search against the Protein Data Bank (PDB). Templates for each domain were selected based on percentage of residue identity, e-value, alignment score and sequence coverage.
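Two of the selection criteria above, residue identity and sequence coverage, can be computed directly from a pairwise alignment. The following is a minimal sketch, not the procedure PSI-BLAST itself uses: the function name `alignment_stats` and the example sequences are hypothetical, and identity is assumed here to be counted over columns where both rows carry a residue (other tools divide by the full alignment length instead).

```python
from typing import Tuple


def alignment_stats(query_aln: str, template_aln: str) -> Tuple[float, float]:
    """Return (percent identity, percent query coverage) for two aligned rows.

    Both rows must have the same length; '-' marks a gap.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned rows must have equal length")

    # Number of real residues in the query (gaps excluded).
    query_len = sum(1 for q in query_aln if q != "-")

    # Columns where both rows carry a residue (no gap on either side).
    aligned = [(q, t) for q, t in zip(query_aln, template_aln)
               if q != "-" and t != "-"]
    if query_len == 0 or not aligned:
        return 0.0, 0.0

    identical = sum(1 for q, t in aligned if q == t)
    identity = 100.0 * identical / len(aligned)
    coverage = 100.0 * len(aligned) / query_len
    return identity, coverage


# Hypothetical toy alignment: one mismatch, one gap in the query row.
identity, coverage = alignment_stats("ACDE-F", "ACDQGF")
print(f"identity = {identity:.1f}%, coverage = {coverage:.1f}%")
```

A template that covers the whole query with high identity, as reported for 3UGM and 2H5Q in the Results, would score close to 100% on both measures.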

To start modelling, Chimera 1.9 (Pettersen et al., 2004) was used for sequence alignment with default parameters. To build the three-dimensional models of our chimeric proteins, a different template was chosen for each protein region. The resulting alignments were manually curated using DNATagger (Scherer and Basso, 2008). A set of at least 100 models was then generated using Modeller 9.10 (Eswar et al., 2006), and the structural characteristics of each protein part were analysed for the best models. The torsional angles in the linker region were afterwards adjusted manually using Swiss-PdbViewer (Johansson et al., 2012), and the quality of the final models was validated using the QMEAN Z-score (Benkert et al., 2008).

After obtaining our final models, we performed a structural alignment of both mCherry parts against the active mCherry structure (PDB 2H5Q). This alignment enabled us to estimate the final structure of our models bound to DNA and the distance between the two TALE domains on the DNA, which we then used in our mathematical modelling.

Results

Two PDB structures were selected as templates for model building. For the N-terminal part of our molecule, the crystal structure of a TAL effector (3UGM) was selected, and for the C-terminal part, mCherry (2H5Q) was used. Except for the linker region, the templates had 100% coverage and close to 100% identity to our sequences (99.4% for mCherry1, 100% for mCherry2 and 92.9% for both TALE parts). After modelling, we selected the model with the best Z-DOPE score for each protein (Figure 1). Our model consists of six concatenated, self-associated TALEs forming a right-handed superhelix wrapped around the DNA major groove, connected by a linker to a hemiCherry beta-barrel structure.
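Selecting the best model from a set by Z-DOPE (normalized DOPE) amounts to taking the minimum, since more negative values indicate a more native-like model. The sketch below illustrates this under stated assumptions: the file names and score values are hypothetical placeholders, not our actual Modeller output, and the helper `best_model` is introduced only for illustration.

```python
def best_model(models):
    """Return the model record with the lowest (most negative) z-DOPE score."""
    if not models:
        raise ValueError("no models to choose from")
    return min(models, key=lambda m: m["z_dope"])


# Hypothetical records standing in for a set of generated models.
candidates = [
    {"name": "model_001.pdb", "z_dope": -0.42},
    {"name": "model_002.pdb", "z_dope": -1.10},
    {"name": "model_003.pdb", "z_dope": -0.87},
]

chosen = best_model(candidates)
print(chosen["name"], chosen["z_dope"])
```

In practice the same selection would be run once per protein (the hemiCherry1 and hemiCherry2 fusions) over all models generated for it.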

Figure 1: Both protein models, with all parts built, bound to a repetitive (GT)n DNA sequence. Shown are the six TALE parts (dark and light green), the linker (grey) and each mCherry part (pink).

Assessment of model quality for each protein with the QMEAN Server showed that our models are of high quality, with |Z-scores| of less than 1. QMEAN is a composite scoring function that derives both global and local error estimates from a single model. The QMEAN Z-score indicates how many standard deviations the model's score lies from the values expected for experimental structures. This is illustrated in the two graphs in Figure 2, where regions closer to black correspond to a lower |Z-score|, that is, fewer standard deviations from the mean.
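The Z-score idea described above can be illustrated with a few lines of code. This is a simplified sketch, not the QMEAN Server's implementation: the server compares a model against experimental structures of similar size, whereas the hypothetical `z_score` helper below uses a single flat list of reference scores and the sample standard deviation, and the numbers are made up for illustration.

```python
from statistics import mean, stdev


def z_score(model_score, reference_scores):
    """Number of standard deviations between a model score and the reference mean."""
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference scores")
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("reference scores have zero spread")
    return (model_score - mu) / sigma


# Hypothetical reference scores and model score, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
z = z_score(3.5, reference)
print(f"Z = {z:.2f}, within one standard deviation: {abs(z) < 1}")
```

A model whose |Z| is below 1 scores within one standard deviation of the reference mean, which is the criterion quoted for our models.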

Figure 2: Comparison of both query models (red) with a non-redundant set of PDB structures.

To estimate whether our models would be able to bind DNA while maintaining the reconstituted mCherry conformation, we aligned both parts to the structure of the active form and kept their TALE domains in a linear spatial configuration. This showed that our models are compatible with both DNA binding and mCherry reconstitution. We also calculated the inner distance between the two linked TALEs, which was 35 Å. At a rise of roughly 3.4 Å per base pair in B-DNA, this suggests that there must be approximately 10 DNA base pairs between the (GT)6 binding regions (Figure 3).
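The conversion from the measured distance to a base-pair spacing is simple arithmetic. The sketch below assumes ideal B-form DNA with a rise of 3.4 Å per base pair; the constant and the function name `base_pairs_spanned` are introduced here for illustration only.

```python
B_DNA_RISE_ANGSTROM = 3.4  # assumed axial rise per base pair for ideal B-DNA


def base_pairs_spanned(distance_angstrom, rise=B_DNA_RISE_ANGSTROM):
    """Convert a distance along the DNA axis (in Å) to a number of base pairs."""
    if rise <= 0:
        raise ValueError("rise per base pair must be positive")
    return distance_angstrom / rise


spacing = base_pairs_spanned(35.0)
print(f"{spacing:.1f} bp, i.e. about {round(spacing)} bp")
```

Because the TALE domains were kept in a linear configuration, the distance is treated here as lying along the DNA axis; any bending of the DNA would change the estimate.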

References

Scherer, N.M. and Basso, D.M. (2008). DNATagger, colors for codons. Genet. Mol. Res., 7(3):853-860.

Eswar, N., Marti-Renom, M.A., Webb, B., Madhusudhan, M.S., Eramian, D., Shen, M., Pieper, U. and Sali, A. (2006). Comparative Protein Structure Modeling With MODELLER. Current Protocols in Bioinformatics, John Wiley & Sons, Inc., Supplement 15, 5.6.1-5.6.30.

Johansson, M.U., Zoete, V., Michielin, O. and Guex, N. (2012). Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinformatics, 13:173.

Benkert, P., Tosatto, S.C.E. and Schomburg, D. (2008). QMEAN: A comprehensive scoring function for model quality assessment. Proteins: Structure, Function, and Bioinformatics, 71(1):261-277.

@@ Line 131: / Line 131: @@
 <br>
-<div style = "margin: 0 auto; height: 582px; width: 884px"><img src="https://static.igem.org/mediawiki/2014/2/27/UFMG_Brazil_modelling1.jpg" width=884px height=582px></img></div>
+<div style = "margin: 0 auto; height: 582px; width: 884px"><img src="https://static.igem.org/mediawiki/2014/2/27/UFMG_Brazil_modelling1.jpg" width=884px height=582px></img>
 <span style = "text-align: center;"> Figure 1: Both protein models with all parts builded linked to a repetitive (GT)n DNA string. They are:The six TALE parts (Dark and light green), Linker (grey) and each mCherry part (pink). </span>
+</div>
 <br>
@@ Line 142: / Line 142: @@
 <br>
-<div style = "margin: 0 auto; height: 489px; width: 801px"><img src="https://static.igem.org/mediawiki/2014/2/27/UFMG_Brazil_modelling1.jpg" width=801px height=489px></img></div>
+<div style = "margin: 0 auto; height: 489px; width: 801px"><img src="https://static.igem.org/mediawiki/2014/2/27/UFMG_Brazil_modelling1.jpg" width=801px height=489px></img>
 <span style = "text-align: center;"> Figure 2: Comparison with non redundant set of PDB structures for both query models (Red).</span>
+</div>
 <br>