Team:StanfordBrownSpelman/Modelling
From 2014.igem.org
Revision as of 04:03, 17 October 2014
Bioinformatics & Modeling
While working on our synthetic biology projects for this year's iGEM competition, we needed computational tools that did not yet exist, so we developed our own. In particular, we developed two pieces of software designed for the needs of synthetic biologists:
- DoubleOptimizer: a tool that facilitates synthesis of repetitive genes by optimizing codon usage both to match the codon usage distribution of a desired organism and to minimize repetitive nucleotide sequences, allowing for easier synthesis.
- CompositionSearch: a tool that quickly ranks all proteins in a proteome by their similarity to a given amino acid distribution.
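To make the ranking idea behind CompositionSearch concrete, here is a minimal sketch in Python. It is not the tool's actual implementation: the function names, the toy proteome, and the use of Euclidean distance as the similarity measure are all assumptions made for illustration, since the description above does not specify a metric.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the fraction of each of the 20 standard amino acids in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]

def distance(p, q):
    """Euclidean distance between two composition vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def rank_by_composition(proteome, target):
    """Sort protein names from most to least similar to the target distribution."""
    return sorted(
        proteome,
        key=lambda name: distance(composition(proteome[name]), target),
    )

# Hypothetical toy proteome; a real run would load every protein of an organism.
toy_proteome = {
    "lysRich": "KKKKKKGA",
    "glyAla": "GAGAGAGA",
    "mixed": "MKLFGAVS",
}
target = composition("GGGGAAAA")  # target: half glycine, half alanine
ranking = rank_by_composition(toy_proteome, target)
print(ranking)
```

Each protein is reduced to a 20-element frequency vector, so ranking a whole proteome takes a single pass over the sequences followed by a sort.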
DoubleOptimizer
Gene synthesis as a tool for biological engineering presents both opportunities and challenges. One opportunity is the ability to optimize codon usage in a gene to match that of a host organism. Compared to traditional cloning methods, this can increase protein yields in the host organism severalfold.[1] However, while many freely usable programs perform codon optimization, there is no guarantee that the sequences they produce can actually be synthesized. Specifically, for genes with repetitive amino acid sequences, these programs often generate outputs that contain too many repeated short DNA sequences to be synthesized commercially.
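The problem described above can be illustrated with a small Python sketch. It assumes a made-up codon usage table for two amino acids (the frequencies are illustrative, not real host data) and compares two hypothetical strategies: always choosing the most frequent codon, and simply rotating through synonymous codons. Neither is DoubleOptimizer's algorithm; the point is only that the choice among synonymous codons changes how often short DNA sequences repeat.

```python
from collections import Counter
from itertools import cycle

# Hypothetical codon preferences (illustrative numbers, not real host data).
CODON_USAGE = {
    "G": {"GGT": 0.5, "GGC": 0.3, "GGA": 0.2},
    "A": {"GCT": 0.4, "GCC": 0.35, "GCA": 0.25},
}

def naive_optimize(protein):
    """Always pick the single most frequent codon for each residue."""
    return "".join(
        max(CODON_USAGE[aa], key=CODON_USAGE[aa].get) for aa in protein
    )

def rotate_codons(protein):
    """Cycle through each residue's synonymous codons (ignores usage frequencies)."""
    pools = {aa: cycle(codons) for aa, codons in CODON_USAGE.items()}
    return "".join(next(pools[aa]) for aa in protein)

def max_kmer_count(dna, k):
    """Return how many times the most common k-mer occurs (overlapping windows)."""
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    return max(counts.values())

protein = "GA" * 6  # a short, highly repetitive peptide
naive = naive_optimize(protein)
rotated = rotate_codons(protein)
print(naive, max_kmer_count(naive, 9))
print(rotated, max_kmer_count(rotated, 9))
```

In this toy case the most-frequent-codon output repeats the same 9-nucleotide window more often than the rotated output does, but rotation drifts away from the host's preferred codon frequencies. Balancing the two goals is the trade-off DoubleOptimizer is described as addressing.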
As an example, the hypothetical protein X777_06170 from the ant species Cerapachys biroi has an amino acid sequence that appears to be somewhat repetitive:
001 mklfkclvpv vvlllikdss arpglirdfv ggtvgsilep fqilkpkdsy adanshasah
061 nlggtfslgp vslggglssa sasssasang gglasasska daqaggygyg gsnanaqasa
121 sanaqgggyg nggihgiypg qqgvhggnpf lggagsnana naiananaqa naggnngglg
181 syggyqqggn ypidsstgpi gnnpflsggh gdgnanaaan anagasaign gggpidvnnp
241 flhggaansg agginyqpgn aggiilsekp lglptiypgq hppayldsig spgansnaga
301 napcsecgss gatilgyegq glggikesgs sgatilgyeg qglggikesg ssgatilgye
361 gqglggikes gssgatilgs ydgqgpsgat ilgdyngqgl ggikessgvt vlgdyegqgl
421 ggisgphggh gqaganagan ananagatvg ssggvlggvg dhggyhgyng hdgssglnlg
481 gygggsnana qassnalass ggsssatsda lsnahssggs alanssskas angsgsanan
541 ahassnassg shglgsktsa ssqasasadt rdmlifs[2]
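The repetitiveness of this sequence can be checked with a short Python sketch. The excerpt below copies residues 301 to 420 from the listing above; the helper names `parse_listing` and `repeated_kmers` and the window length of 10 residues are assumptions chosen for illustration.

```python
from collections import Counter

# Residues 301-420 of X777_06170, copied from the listing above
# (position numbers and spacing kept as in the source).
EXCERPT = """
301 napcsecgss gatilgyegq glggikesgs sgatilgyeg qglggikesg ssgatilgye
361 gqglggikes gssgatilgs ydgqgpsgat ilgdyngqgl ggikessgvt vlgdyegqgl
"""

def parse_listing(text):
    """Drop position numbers and whitespace, keeping only residue letters."""
    return "".join(ch for ch in text if ch.isalpha())

def repeated_kmers(seq, k):
    """Return {k-mer: count} for every k-mer seen more than once (overlapping windows)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}

protein = parse_listing(EXCERPT)
repeats = repeated_kmers(protein, 10)

# Show the most frequently repeated 10-residue windows.
for kmer, n in sorted(repeats.items(), key=lambda kv: (-kv[1], kv[0]))[:5]:
    print(kmer, n)
```

Within this excerpt the 21-residue unit gatilgyegqglggikesgss occurs three times in a row, which is the kind of amino acid repeat that leads to repeated DNA after codon optimization.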