Team:Bordeaux/Modeling

From 2014.igem.org

PolySMA: Modeling of proteins and their self-binding in a 3D environment

Introduction

This software was developed for iGEM 2014, and more precisely for the Elasticoli project. This project aims to produce elastic proteins artificially in the bacterium Escherichia coli. These elastic proteins, called ELPs (Elastin-Like Proteins), are able to bind to each other and may form aggregates (and even strings under particular conditions).
PolySMA aims to better understand how these proteins bind to each other. To that end, the software models these proteins and their interactions in a 3D environment.

Analysis

In this case, we only want to model one type of protein. The software needs to model several proteins in a 3D environment; these proteins are then moved and may bind to each other.

Proteins modeling

To model a protein simply, it can be represented as a sphere when it is alone (not bound to other proteins). In the laboratory, it was observed that, after binding, the resulting aggregate has a volume lower than the sum of the initial volumes of its elements. Indeed, the volume of an aggregate of 100 proteins is about 1.5 times that of an isolated protein. So the initial spherical shape of a protein may change according to its number of bindings, and the protein takes up a smaller volume once bound.
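
As a rough illustration of the figures quoted above, the following sketch computes the radius of a sphere equivalent to the aggregate and the average shrink factor per bound protein. It assumes, which the project does not state, that the aggregate can itself be treated as a single sphere; the function names and the unit radius are hypothetical.

```python
import math

def sphere_volume(radius):
    """Volume of a sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius ** 3

def equivalent_radius(volume):
    """Radius of the sphere that has the given volume."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

# Observation quoted in the text: an aggregate of 100 proteins occupies
# about 1.5 times the volume of one isolated protein.
N_PROTEINS = 100
VOLUME_RATIO = 1.5

r_isolated = 1.0  # arbitrary unit (hypothetical value)
v_isolated = sphere_volume(r_isolated)
v_aggregate = VOLUME_RATIO * v_isolated

# Average volume left to each protein once it is part of the aggregate,
# and the corresponding radius relative to the isolated protein.
v_per_bound_protein = v_aggregate / N_PROTEINS
shrink_factor = equivalent_radius(v_per_bound_protein) / r_isolated

print(f"aggregate radius: {equivalent_radius(v_aggregate):.3f}")
print(f"per-protein radius factor: {shrink_factor:.3f}")
```

Under these assumptions, each bound protein would on average keep about a quarter of its isolated radius, which gives an order of magnitude for the compression applied when proteins bind.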

Bindings and move rules

To know whether proteins bind to each other, we need to take several elements into consideration. The first is the meeting probability: to account for it, the software needs to simulate protein movements in a 3D space (which can be a cell, for example). However, knowing that a protein was at location A and then at location B is not sufficient; we also need to know where it was between these two points, because it does not move in a straight line.
To account for this meeting probability, we chose to move proteins in small steps. Each step is randomly generated and is smaller than or equal to the protein size, so we have all the information we need.
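
A minimal sketch of such a bounded random step is given here. The direction is drawn uniformly on the sphere and the length uniformly between 0 and the protein size; this length distribution is an assumption, since the text only states the upper bound. The names `random_step` and `move` are hypothetical.

```python
import math
import random

def random_step(max_length, rng=random):
    """Return a random 3D displacement whose length is at most max_length."""
    # A normalised Gaussian vector gives a direction that is uniform on the sphere.
    while True:
        dx, dy, dz = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm > 0.0:
            break
    # Assumption: the step length is uniform in [0, max_length].
    scale = rng.uniform(0.0, max_length) / norm
    return (dx * scale, dy * scale, dz * scale)

def move(position, protein_size, rng=random):
    """Move a protein by one small random step bounded by its own size."""
    step = random_step(protein_size, rng)
    return tuple(p + s for p, s in zip(position, step))
```

Because no step is longer than the protein itself, two proteins cannot jump past each other between two consecutive positions without coming within contact range.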
If two proteins are close enough, the software must decide whether they bind together. Since the binding reaction occurs with a certain probability, we need to include an association probability. If binding does not occur, the proteins repel each other (attraction-repulsion phenomenon).
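
The decision rule could be sketched as follows. The contact threshold (centres closer than one protein diameter) and the function name `try_binding` are assumptions made for illustration; the text only says that the proteins must be "close enough".

```python
import math
import random

def try_binding(pos_a, pos_b, diameter, association_probability, rng=random):
    """Classify an encounter as 'far', 'bound' or 'repelled'."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(pos_a, pos_b)))
    # Assumption: two spheres are in contact when their centres are
    # at most one diameter apart.
    if distance > diameter:
        return "far"
    # Bernoulli trial with the association probability.
    if rng.random() < association_probability:
        return "bound"
    return "repelled"
```

The caller would then either record the new bond or push the two proteins apart, depending on the returned value.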
The temperature can be modified in the simulated model. The effect of temperature is to favor the binding reaction by increasing the meeting probability. Indeed, this probability is modified by thermal agitation. It can therefore be modeled by an increase in protein speed, and thus in the distance traveled during the same time step. Finally, when a protein is moved, we need to decide what happens when it reaches an edge of the simulated environment.
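
One possible way to express the temperature effect is to scale the maximum step length with the temperature. The sketch below assumes a square-root dependence on the absolute temperature (as for thermal speeds) and caps the step at the protein size; neither the scaling law nor the hypothetical `max_temperature_c` parameter comes from the project.

```python
import math

def max_step_length(protein_size, temperature_c, max_temperature_c=60.0):
    """Maximum step length for one time step at the given temperature.

    Assumption: the step grows like the square root of the absolute
    temperature and equals the protein size at max_temperature_c
    (a hypothetical upper bound), so it never exceeds the protein size.
    """
    kelvin = temperature_c + 273.15
    max_kelvin = max_temperature_c + 273.15
    if kelvin <= 0.0 or max_kelvin <= 0.0:
        raise ValueError("temperatures must be above absolute zero")
    factor = min(1.0, math.sqrt(kelvin / max_kelvin))
    return protein_size * factor
```

A warmer simulation therefore explores more space per round, which raises the number of encounters without changing the association probability itself.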

State of the art of 3D modeling software

Now that we know what we want to simulate, we need to display it and let the user interact with it. To this end, we need to choose a software package or libraries that can model in three dimensions and give the user the means to interact with the resulting visualization (the user must be able to "move" within it).

Blender

The Blender interface allows the user to interact with the simulated 3D environment. With this software, we can easily animate objects (and so we can model time). Finally, the software has an easy-to-use Python API, which makes it possible to write programs that drive it.

Soya3D and vPython

Soya3D and vPython are Python libraries for modeling objects in 3D. However, by default, they display a very simple window that allows few interactions with the user.
So Blender was chosen for greater simplicity. The programming language is Python 3 (because it is the language used by the Blender API).

Design

Simulated environment

The simulated environment, meaning the area where proteins can move, is represented by a cube whose size is defined by the user.
The user can thus choose the environment size and also the number of simulated proteins. Setting these two variables changes the protein concentration in the environment.
As previously stated, when a protein is moved, we need to decide what happens when it reaches an edge of the simulated environment. Here, the simulation mimics an infinite space by joining opposite faces of the cube (a toric configuration). For example, if a protein crosses the bottom face of the simulated cube, it is placed back at the top of the cube.
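
The toric configuration can be sketched with a modulo on each coordinate. This sketch assumes that the cube spans the interval from 0 to `box_size` on each axis; the function names are hypothetical. The second function measures distances across the folded edges, so that two proteins on either side of a face are still seen as neighbours.

```python
import math

def wrap(position, box_size):
    """Fold a position back into the cube [0, box_size) on every axis."""
    return tuple(c % box_size for c in position)

def wrapped_distance(a, b, box_size):
    """Shortest distance between two points in the toric cube."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % box_size
        d = min(d, box_size - d)  # going through the face may be shorter
        total += d * d
    return math.sqrt(total)

# A protein leaving through one face re-enters through the opposite one.
print(wrap((10.5, -0.5, 3.0), 10.0))
print(wrapped_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), 10.0))
```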

Simulated proteins

As previously stated, proteins are simulated by grains (spherical objects). In this situation, we need two types of information: information linked to the biological grain (coordinates, positions of binding sites...) and information linked to the simulated grain (Blender object, ...).

It makes sense to use a "biological grain" structure which stores, for each protein:
• coordinates
• positions of the binding sites
• proteins which are bound to it
• the association probability with another protein

The "simulated grain" structure could contain:
• the biological grain
• the Blender object
• a flag indicating whether the grain has already been moved during the current movement round.
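
The two structures listed above could be written as Python data classes. This is only a sketch: the class and field names are hypothetical, and the Blender object is typed loosely so that the code can be read outside Blender.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

Vector = Tuple[float, float, float]

@dataclass(eq=False)
class BiologicalGrain:
    """Biological information attached to one protein."""
    coordinates: Vector
    binding_sites: List[Vector]
    association_probability: float
    # Proteins bound to this one; kept out of repr because bonds are mutual.
    bound_to: List["BiologicalGrain"] = field(default_factory=list, repr=False)

@dataclass(eq=False)
class SimulatedGrain:
    """Simulation-side information wrapping a biological grain."""
    biological: BiologicalGrain
    blender_object: Optional[Any] = None  # the Blender object, once created
    moved_this_round: bool = False        # reset at the start of each round
```

Keeping the biological data separate from the Blender object means the binding logic can be exercised without opening Blender.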

This information is summarized in the figure below:

Initial positioning of proteins

For the initial placement of proteins, the program first determines their positions randomly.
The active sites are positioned at the same locations on all spheres; these locations are determined randomly and are evenly distributed. Then, the program applies a random rotation to each sphere created.
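
The placement steps above can be sketched as follows. To obtain evenly spread sites, this sketch swaps in a Fibonacci lattice, which is our assumption and not necessarily the method used in PolySMA; the random rotation is drawn as a random unit quaternion. All names are hypothetical, site coordinates are relative to the sphere centre, and overlaps between spheres are not checked.

```python
import math
import random

def fibonacci_sphere(n_sites, radius):
    """n_sites points spread approximately evenly on a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n_sites):
        z = 1.0 - (2.0 * i + 1.0) / n_sites   # from near +1 down to near -1
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((radius * r * math.cos(theta),
                       radius * r * math.sin(theta),
                       radius * z))
    return points

def random_rotation(rng=random):
    """Random unit quaternion (w, x, y, z), uniform over 3D rotations."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 0.0:
            return tuple(c / norm for c in q)

def rotate(v, q):
    """Rotate vector v by the unit quaternion q."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v), then v' = v + w * t + q_vec x t
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    return (vx + w * tx + (y * tz - z * ty),
            vy + w * ty + (z * tx - x * tz),
            vz + w * tz + (x * ty - y * tx))

def place_proteins(n_proteins, box_size, n_sites, radius, rng=random):
    """Random centres; the same site template, randomly rotated per sphere."""
    template = fibonacci_sphere(n_sites, radius)
    proteins = []
    for _ in range(n_proteins):
        centre = tuple(rng.uniform(0.0, box_size) for _ in range(3))
        q = random_rotation(rng)
        proteins.append({"centre": centre,
                         "sites": [rotate(s, q) for s in template]})
    return proteins
```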

Protein movements

As mentioned previously, proteins are moved randomly in small steps (each smaller than the protein size). The order in which proteins are moved is random, and each protein is moved once per round.
During each movement, if another protein is close enough to the current protein, the program tests whether association occurs and determines the two sites that take part in the binding. If binding occurs, the area around the binding site is moved closer to the center of the sphere (compression) and the binding probability of the adjacent sites increases.
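
One movement round could be sketched as below. This is a deliberately simplified illustration with hypothetical names: proteins are reduced to their centres, contact is assumed when two centres are within one diameter, and the choice of binding sites, the compression, the repulsion and the joint movement of bound proteins are all left out.

```python
import math
import random

def toric_distance(a, b, box_size):
    """Shortest distance between two points in a cube with folded edges."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % box_size
        d = min(d, box_size - d)
        total += d * d
    return math.sqrt(total)

def run_round(positions, bonds, diameter, box_size,
              association_probability, rng=random):
    """Move every protein once, in random order, and test for bindings.

    positions: list of (x, y, z) tuples, updated in place.
    bonds: set of frozenset({i, j}) index pairs, updated in place.
    """
    order = list(range(len(positions)))
    rng.shuffle(order)  # random order, each protein exactly once
    for i in order:
        # Rejection sampling: a step drawn uniformly in a ball of radius `diameter`.
        while True:
            step = [rng.uniform(-diameter, diameter) for _ in range(3)]
            if math.sqrt(sum(s * s for s in step)) <= diameter:
                break
        positions[i] = tuple((p + s) % box_size
                             for p, s in zip(positions[i], step))
        # Test association with every protein that is now close enough.
        for j in range(len(positions)):
            pair = frozenset((i, j))
            if j == i or pair in bonds:
                continue
            if toric_distance(positions[i], positions[j], box_size) <= diameter:
                if rng.random() < association_probability:
                    bonds.add(pair)
```

Calling `run_round` repeatedly on the same `positions` and `bonds` gives the successive runs of the simulation; the all-pairs neighbour search is quadratic and would need a spatial grid for large numbers of proteins.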

User interface

When the user runs the Python script in the free software Blender, a popup window appears with the different parameters. The user can configure the simulation by choosing parameters such as grain diameter, environment size, number of proteins, number of runs, temperature… Default values are provided.

The user can type in the parameters or use the arrows to change the values. Once satisfied, the user confirms by clicking on "OK". The script then uses the parameters to calculate the protein positions and the simulation begins.

Afterwards, the result appears in the Blender window. The user can change the viewing axes to see the proteins in the 3D environment. At the beginning all proteins are single; during the simulation, groups of proteins can form randomly. The user can follow the evolution between runs in Blender.

This is an example of a simulation with 100 proteins at the beginning (image on the left) and after 5 runs (image on the right) at a temperature of 37°C. Proteins are purple when they are single (unbound) and become pink when they are bound, which helps visualization.


