Team:Vanderbilt Software/Project/darwin
From 2014.igem.org
Figure: darwin eliminates extra lines in the output file.
Figure: darwin's unique method of parsing ORF.
Figure: Representation of darwin's block processor increasing processing speed.
Latest revision as of 16:37, 26 January 2015
Version control systems such as git and svn track differences between lines. Since most DNA file formats split DNA into fixed-length lines, a small change, such as a single insertion, can alter many lines at once. darwin avoids this by producing a formatted file that represents each ORF on its own line of text, so that each edit modifies only a single line of the output.
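The line-shifting effect can be illustrated with a minimal Python sketch (this is not darwin's actual code). The sequences, the 10-character line width, and the helper names `wrap_fixed` and `changed_lines` are all assumptions made up for illustration.

```python
def wrap_fixed(seq, width=10):
    """Split a sequence into fixed-length lines, as many DNA file formats do."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def changed_lines(old, new):
    """Count line positions whose content differs between two line lists."""
    longest = max(len(old), len(new))
    old_padded = old + [None] * (longest - len(old))
    new_padded = new + [None] * (longest - len(new))
    return sum(a != b for a, b in zip(old_padded, new_padded))


# Two hypothetical ORFs; the sequences are invented for this example.
orfs_old = ["ATGAAACCCGGGTTTTAA", "ATGCCCTTTAAAGGGTGA"]
# The same ORFs after inserting a single base (G) into the first one.
orfs_new = ["ATGGAAACCCGGGTTTTAA", "ATGCCCTTTAAAGGGTGA"]

# Fixed-length layout: the insertion shifts every later character.
fixed_changed = changed_lines(wrap_fixed("".join(orfs_old)),
                              wrap_fixed("".join(orfs_new)))

# One ORF per line: only the edited ORF's line differs.
per_orf_changed = changed_lines(orfs_old, orfs_new)

print(fixed_changed, per_orf_changed)
```

With the fixed-length layout every line differs after the insertion, while the one-ORF-per-line layout confines the change to a single line.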
Genes can be very long, so recording a whole ORF for every change would be wasteful. To avoid this, darwin samples a section of every newly inserted ORF and compares it to nearby ORFs; if the new ORF is similar to another ORF, it is counted as “edited,” and darwin records only the character-by-character changes required to transform the old ORF into the new one.
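A rough Python sketch of this idea follows. It is not darwin's implementation: sampling the leading 12 characters, the 0.8 similarity threshold, the use of Python's `difflib.SequenceMatcher`, and the function names `find_similar`, `char_edits`, and `apply_edits` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def find_similar(new_orf, neighbors, sample_len=12, threshold=0.8):
    """Return the index of the first neighbor whose sample resembles
    the sample of new_orf, or None if no neighbor is similar enough."""
    sample = new_orf[:sample_len]
    for index, old_orf in enumerate(neighbors):
        matcher = SequenceMatcher(None, sample, old_orf[:sample_len],
                                  autojunk=False)
        if matcher.ratio() >= threshold:
            return index
    return None


def char_edits(old, new):
    """List the character-level edits that turn old into new.
    Each edit is (tag, start, end, replacement_text) against old."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(tag, i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_edits(old, edits):
    """Rebuild the new sequence from old plus the recorded edits."""
    pieces, position = [], 0
    for _tag, start, end, text in edits:
        pieces.append(old[position:start])  # unchanged stretch
        pieces.append(text)                 # inserted or replacement text
        position = end                      # skip deleted or replaced span
    pieces.append(old[position:])
    return "".join(pieces)


# Hypothetical data: the new ORF is one substitution away from neighbors[1].
neighbors = ["ATGCCCTTTAAAGGGTGA", "ATGAAACCCGGGTTTTAA"]
new_orf = "ATGAAACCCGGCTTTTAA"

match = find_similar(new_orf, neighbors)
edits = char_edits(neighbors[match], new_orf) if match is not None else None
```

Storing only `edits` rather than the full new sequence is what keeps the record small when a long ORF changes by a few characters.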
Finally, darwin uses concurrency to speed up the process. File I/O is typically very slow, much slower than processing file data already in memory. Splitting the work so that reading and processing run concurrently helps relieve that bottleneck.
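One way to picture block-wise concurrent processing is the Python sketch below. It only illustrates the general pattern and is not darwin's block processor: the block size, the worker count, the placeholder `process_block` work, and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def read_blocks(lines, block_size):
    """Yield successive lists of up to block_size lines from an iterable."""
    iterator = iter(lines)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def process_block(block):
    """Placeholder work on one block: strip whitespace and uppercase."""
    return [line.strip().upper() for line in block]


def process_concurrently(lines, block_size=2, workers=4):
    """Hand blocks to a thread pool as they are read, so that workers can
    process earlier blocks while later ones are still being read."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order the blocks were submitted.
        results = pool.map(process_block, read_blocks(lines, block_size))
        return [line for block in results for line in block]


# Usage with an in-memory list; an open file object can be passed instead.
output = process_concurrently(["atg\n", "ccc\n", "ggg\n"], block_size=2)
```

In CPython, threads mainly help by overlapping I/O waits with processing rather than by running CPU-bound work in parallel, so the benefit of this pattern depends on how much time is spent waiting on the file.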