Team:UCSD Software/Project

From 2014.igem.org

(Difference between revisions)
Line 541: Line 541:
<h3><b>Algorithm</b></h3>
<h3><b>Algorithm</b></h3>
<h4><b>Overview of the Algorithm</b></h4>
<h4><b>Overview of the Algorithm</b></h4>
 +
SBiDer's search algorithm is fundamentally based on multiple breadth first searches (BFS) on the SBiDer database. From a set of input species, the algorithm searches for any operon that can be activated by the input species (see Tutorial for how to format the input query). If such an operon is found, the output species of the activated operon and/or other existing species in the system (details below) is used to search for the next operon. In the process, the search algorithm constructs an operon paths by successively adding activated operons to the path. There are two ways of activating an operon: linear successional activation and nonlinear successional activation (more details below), and the search algorithm can create an operon path using either or both of the activation methods. This search algorithm continues until either: the search algorithm concludes that there is no operon path that can lead to the production of the output species; or an operon path producing the output species is found. <br>
 +
 +
<h4><b>Linear Successional Activation Search (LSAS)</b></h4>
 +
The search algorithm uses linear successional activation search (LSAS) to generate an operon path in which the output species of a preceding operon produces all of the required input species for the succeeding operon. LSAS extends an operon path by adding only operons that can be activated by the species produced by the last operon in the path. Thus, for each potential activated operon by the input species, LSAS uses a tree data structure in searching an operon path. From an input species (or multiple input species with an AND relationship), LSAS builds a tree, and if an output species is found in the process of growing the tree, the path from the root operon to the leaf operon that has produced the output species is returned as a resulting operon path. Finally, this resulting set operon paths is used to construct a Petri net path including all of the input species, input transitions, input transition logic, operons, output transitions, output transition logic, and output species. <br>
 +
 +
<h4><b>Nonlinear Successional Activation Search (NLSAS)</b></h4>
 +
An operon may not produce all of the input species required for an activation of another operon. Yet, multiple operons may together produce all of the required input species for the other operon. Here, each of the activating operons produces some, but not all, of the required species for the activated operon. The search algorithm captures such operon activation mechanism using nonlinear successional search (NLSAS) in building an operon path. As LSAS, NLSAS also uses a tree data structure, but appends an operon whose input species, or activation requirements, are produced by any of the preceding operons in the path. While NLSAS assumes no degradation of the species in the system, it allows for a wider range of operon sets that could produce our desired output from our given input. As a result, each resulting operon path can be linear or nonlinear. Since a NLSAS result captures operon activation in which a single or multiple preceding operons produce species required for the activation of a succeeding operon, LSAS result is a subset of a corresponding NLSAS result. <br>
 +
 +
<h4><b>Overview of LSAS and NLSAS</b></h4>
 +
Overall, LSAS generates a single or multiple operon paths in which a preceding operon directly activates a succeeding operon, and NLSAS paths in which any of the preceding operons can activate a succeeding operon. In general, a resulting LSAS operon path includes operons mined from same or similarly conducted experimental protocols, where the operons are optimized to coexist in a system. As a result, LSAS can return well optimized operon paths and interactions, which may be used in biochemical reactions with higher confidence. The SBiDer web application can further facilitate the use of such operons paths by providing literature and other published resources. NLSAS, on the other hand, can generate an operon path by including operons from dissimilar experimental protocols. So, using NLSAS can potentially lead to the discovery of novel operon paths and thus new biochemical systems. By providing LSAS and NLSAS variations of the search algorithm, SBiDer assists a user in capturing the biological reality underlying an operon activation as accurately as possible, and provides a means to evaluate, analyze, and apply the results.
 +
</div>
</div>

Revision as of 10:18, 16 October 2014


Project Description

Abstract

Genetic circuits are often difficult to engineer, requiring months to design, build, and test each individual genetic device involved in the circuit. SBiDer, a web tool developed by the UCSD Software iGEM team, will leverage existing devices to construct a database with consideration for the function of each device interpreted as boolean logic. The data can be queried by the user through SBiDer's visual interface to explore circuit designs. The displayed circuit's literature reference, characterization data, and images of included devices can be viewed through the built-in table. Basic validation of the circuit performance is also provided within in the interface. SBiDer's web of information can be expanded through user-generated additions to the database to improve the efficiency of the application and the accuracy of the models. (Partial STUB PAGE)

Future Directions

Synthetic genetic circuits created by synthetic biologists have yielded exciting applications such as biofuels production and cancer killing bacteria. These circuits are often difficult to engineer, requiring months to design, build, and test each individual genetic device involved in the circuit. Although there are many genetic devices that have been built, re-using these devices often requires a time-consuming review of the literature. The UCSD Software iGEM team will address this challenge by creating a web-tool that leverages existing genetic devices to create complex genetic circuits. (STUB PAGE)

Web Application

We will connect known genetic devices together via device input and outputs to create a network of devices that can interact. We define a genetic device as a DNA construct transformed into cells that can cause expression of some protein in response to stimuli (or input). We will develop a web interface to facilitate access to the complex network that we have constructed. Our Web interface makes extensive use of Cytoscape, an open source bioinformatics software package for metabolic network visualization and simulation. In addition, the interface will generate SBOL Visual Images, a standard language that is easily understood by synthetic biologists all over the world. Users can also update our database with additional devices through this interface. Using the Cynetshare framework, users can share their circuit designs. (stub page - going to write about the technologies that we used and why we used them).

The Search Algorithm

Introduction to the SBiDer Search Algorithm

Purpose of the Algorithm

SBiDer is fundamentally constructed on a manually curated database of existing genetic devices, or operons. This SBiDer database stores operons, species, and most importantly the biochemical interactions of these elements. Also, the database can be represented as an operon interaction network via species. Using this network representation of the database, SBiDer’s search algorithm searches for feasible operon paths connecting a set of species to another set of species. As a result, given a set of input species and a set of output species, the search algorithm returns a subnetwork of operon paths connecting the input species to the output species. Each path within the subnetwork is an operon path that can be used to produce the output species from the input species. In other words, placing operons from a returned operon path as well as the input species into a system can produce the output species.

Biology Captured by the Algorithm

The search algorithm conducts a complex, effective, and robust operon path search over the network representation of the database, ultimately generating single or multiple operon paths connecting a set of input species to a set of output species. A resulting path represents a system that can produce the output species given the input species. Also, each path attempts to capture biological reality by considering two major mechanisms of operon activation (detailed description below). Finally, SBiDer represents a resulting path using Petri net model, such as chemical reactions (C.A. Petri and W. Reisig). The Petri net generated by the search algorithm consists of all of the input species, input transitions, input transition logic, operons, output transitions, and output species in a path. As a result, the Petri net model captures the fundamental components of operon interactions, allowing the biological phenomena to be better understood, modeled, and analyzed.

Robustness and Modularity of the Algorithm

Due to the current number of operons in the database, the search algorithm should scale with the increasing size of the database, which will expand as the synthetic biology community adds more species, operons, and plasmids to the database (see complexity analysis of the algorithm is located below). In addition to this robustness, the search algorithm is easily modifiable since it has been implemented independently from the database and the SBiDer web application. Therefore, the search algorithm can be applied to a broad range of networks. Furthermore, the search algorithm was developed on an open source platform to encourage global collaboration in further optimizing the search algorithm and minimizing the barrier for algorithm improvement. As a result, the database as well as the search algorithm will be easily, effectively, and accurately expanded, extended, and optimized by the global synthetic biology community - together.

Algorithm

Overview of the Algorithm

SBiDer's search algorithm is fundamentally based on multiple breadth first searches (BFS) on the SBiDer database. From a set of input species, the algorithm searches for any operon that can be activated by the input species (see Tutorial for how to format the input query). If such an operon is found, the output species of the activated operon and/or other existing species in the system (details below) is used to search for the next operon. In the process, the search algorithm constructs an operon paths by successively adding activated operons to the path. There are two ways of activating an operon: linear successional activation and nonlinear successional activation (more details below), and the search algorithm can create an operon path using either or both of the activation methods. This search algorithm continues until either: the search algorithm concludes that there is no operon path that can lead to the production of the output species; or an operon path producing the output species is found.

Linear Successional Activation Search (LSAS)

The search algorithm uses linear successional activation search (LSAS) to generate an operon path in which the output species of a preceding operon produces all of the required input species for the succeeding operon. LSAS extends an operon path by adding only operons that can be activated by the species produced by the last operon in the path. Thus, for each potential activated operon by the input species, LSAS uses a tree data structure in searching an operon path. From an input species (or multiple input species with an AND relationship), LSAS builds a tree, and if an output species is found in the process of growing the tree, the path from the root operon to the leaf operon that has produced the output species is returned as a resulting operon path. Finally, this resulting set operon paths is used to construct a Petri net path including all of the input species, input transitions, input transition logic, operons, output transitions, output transition logic, and output species.

Nonlinear Successional Activation Search (NLSAS)

An operon may not produce all of the input species required for an activation of another operon. Yet, multiple operons may together produce all of the required input species for the other operon. Here, each of the activating operons produces some, but not all, of the required species for the activated operon. The search algorithm captures such operon activation mechanism using nonlinear successional search (NLSAS) in building an operon path. As LSAS, NLSAS also uses a tree data structure, but appends an operon whose input species, or activation requirements, are produced by any of the preceding operons in the path. While NLSAS assumes no degradation of the species in the system, it allows for a wider range of operon sets that could produce our desired output from our given input. As a result, each resulting operon path can be linear or nonlinear. Since a NLSAS result captures operon activation in which a single or multiple preceding operons produce species required for the activation of a succeeding operon, LSAS result is a subset of a corresponding NLSAS result.

Overview of LSAS and NLSAS

Overall, LSAS generates a single or multiple operon paths in which a preceding operon directly activates a succeeding operon, and NLSAS paths in which any of the preceding operons can activate a succeeding operon. In general, a resulting LSAS operon path includes operons mined from same or similarly conducted experimental protocols, where the operons are optimized to coexist in a system. As a result, LSAS can return well optimized operon paths and interactions, which may be used in biochemical reactions with higher confidence. The SBiDer web application can further facilitate the use of such operons paths by providing literature and other published resources. NLSAS, on the other hand, can generate an operon path by including operons from dissimilar experimental protocols. So, using NLSAS can potentially lead to the discovery of novel operon paths and thus new biochemical systems. By providing LSAS and NLSAS variations of the search algorithm, SBiDer assists a user in capturing the biological reality underlying an operon activation as accurately as possible, and provides a means to evaluate, analyze, and apply the results.

Database (the foundation)

We will mine the scientific literature for existing genetic devices and then construct a database that captures device characteristics such as:
  • composition of devices
  • function
  • characterization data
  • literature reference
We will design our database by rigorously constructing an entity relationship diagram and then normalizing these relationships to construct tables for a relational database. (another stub)

Modeling

We will connect known genetic devices together via device input and outputs to create a network of devices that can interact. We define a genetic device as a DNA construct transformed into cells that can cause expression of some protein in response to stimuli (or input). We will develop a web interface to facilitate access to the complex network that we have constructed. Our Web interface makes extensive use of Cytoscape, an open source bioinformatics software package for metabolic network visualization and simulation. In addition, the interface will generate SBOL Visual Images, a standard language that is easily understood by synthetic biologists all over the world. Users can also update our database with additional devices through this interface. Using the Cynetshare framework, users can share their circuit designs. (stub page - same as web app stub page).