Team:HFUT CHINA/Examples.html

From 2014.igem.org

(Difference between revisions)
 
(10 intermediate revisions not shown)
Line 1: Line 1:
-
<!DOCTYPE HTML>
 
<html>
<html>
<head>
<head>
Line 5: Line 4:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
-
<link href='http://fonts.googleapis.com/css?family=Numans' rel='stylesheet' type='text/css'>
+
 
<link href="https://2014.igem.org/Team:HFUT_CHINA/style.css?action=raw&amp;ctype=text/css" rel="stylesheet" type="text/css" />
<link href="https://2014.igem.org/Team:HFUT_CHINA/style.css?action=raw&amp;ctype=text/css" rel="stylesheet" type="text/css" />
<link href="https://2014.igem.org/Team:HFUT_CHINA/webwidget_menu_dropdown.css?action=raw&amp;ctype=text/css" rel="stylesheet" type="text/css">
<link href="https://2014.igem.org/Team:HFUT_CHINA/webwidget_menu_dropdown.css?action=raw&amp;ctype=text/css" rel="stylesheet" type="text/css">
Line 15: Line 14:
<!--
<!--
.STYLE1 {font-size: 120%}
.STYLE1 {font-size: 120%}
 +
.STYLE2 {font-size: 24px}
 +
.STYLE5 {
 +
font-size: 24px;
 +
color: #0000CC;
 +
font-family: Verdana, Arial, Helvetica, sans-serif;
 +
}
 +
.STYLE6 {font-size: 200%}
 +
.STYLE9 {
 +
font-size: 150%;
 +
font-weight: bold;
 +
}
 +
.STYLE12 {font-size: 120%}
-->
-->
</style>
</style>
Line 44: Line 55:
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/Devblogs.html">Devblogs</a></li>
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/Devblogs.html">Devblogs</a></li>
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/Database&Algorithm.html">Algorithm</a></li>
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/Database&Algorithm.html">Algorithm</a></li>
-
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/Tutorial.html">Tutorial</a></li>
+
                         <li><a href="https://2014.igem.org/Team:HFUT_CHINA/user_guide.html">Tutorial</a></li>
                        
                        
Line 90: Line 101:
<div class="wrapper">
<div class="wrapper">
<div class="main">
<div class="main">
-
<div class="details">
+
-
  <div class="font_1" >Results</div>
+
  <div class="det-para">
  <div class="det-para">
-
  </div>
+
</div>
</div>
  </div>
  </div>
 +
  <span class="details">
 +
      <h2 class="font_1">Time Complexity Analysis</h2>
 +
      </span>
 +
 +
<span class="STYLE12">The algorithms used in  Biodesigner could be grouped into two categories: preprocessing part and executing  part.</span></p>
 +
        <p align="center"><br>
 +
              <span class="STYLE9">Preprocessing Part</span></p>
 +
        <p align="justify"><br>
 +
          <span class="STYLE12">The preprocessing part  includes three algorithms. Here we assume there are n devices and each device  contains at most m biobrickes.</span></p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          The first one is to count  the probabilities of two successive biobrickes, three successive biobrickes,  four successive biobrickes and five successive biobrickes. Here successive  biobrickes means the biobrickes are neighbors. Thus the number of two successive  biobrickes, three successive biobrickes, four successive biobrickes and five successive  biobrickes are<strong> n(m-1),  n(m-2), n(m-3),  n(m-4)</strong>, respectively. Hash tables are used to store these successive  biobrickes. Therefore, the probabilities could be obtained by scanning the  device chains four times. Thus, the time complexity of this algorithm is <strong>O(2n(m-1)+3n(m-2)+4n(m-3)+5n(m-3))=O(nm). </strong>   </p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          The second one is to count  the probabilities of the first biobrick of each device chain. This work could  be done by scanning the first biobrick of each chain. Thus, the complexity of  this algorithm is <strong>O(n)</strong>.</p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          The third one is to  calculate the hamming distances of each paired biobrickes.  Since there are n device chains and each  device contains at most m biobrickes. Therefore, there are at most nm  biobrickes, which means there are at most <strong>nm*nm</strong> paired biobrickes. The  calculation of hamming distances between two sequences with length l could be  done in <strong>O(l)</strong> time. Thus, the complexity of this algorithm is<strong> O(l*nm*nm).</strong></p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          Hence, the time complexity  of preprocessing part is <strong>O(nm)+O(n)+O(l*nm*nm) =O(l*nm*nm)</strong>.</p>
 +
        <p align="center"><br>
 +
          <span class="STYLE9">Executing Part</span></p>
 +
        <p align="justify"><br>
 +
          <span class="STYLE12">The executing part contains  two algorithms. Here we assume the given device chain contains at most m  biobrickes.</span></p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          The first one is to  inquire the probabilities of two successive biobrickes, three successive  biobrickes, four successive biobrickes and five successive biobrickes of a  given device chain. Since all the data is stored in a hash table and the access  of hash table needs O(1) time. Thus, the time complexity of this algorithm is  O(m).</p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          The second one is to store  the given device chain. This work is done by store the two successive  biobrickes, three successive biobrickes, four successive biobrickes and five successive  biobrickes of the given device chain into the hash table. Thus, the time  complexity of this algorithm is O(m).</p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          Hence, the time complexity  of executing part is O(m).</p>
 +
        <p align="justify" class="STYLE12"><br>
 +
          Note, we only need to  execute the preprocessing part once. After the execution of preprocessing part,  all the data is store in database. When using our software, users do not need  to waste their time for the preprocessing part as it has been executed.</p>
 +
 +
<br>
 +
<br>
 +
<span class="details">
 +
      <h2 class="font_1">Experimental Results</h2>
 +
      </span>
 +
<p align="justify" class="STYLE1">In order to test the  performance of our software, we collected about 3600 device chains from  Registry of Standard Parts, and extracted the associate relationships among  these chains. All these information are stored in our database.</p>
<p align="justify" class="STYLE1">In order to test the  performance of our software, we collected about 3600 device chains from  Registry of Standard Parts, and extracted the associate relationships among  these chains. All these information are stored in our database.</p>
<p align="justify" class="STYLE1"><br>
<p align="justify" class="STYLE1"><br>

Latest revision as of 03:13, 18 October 2014

HFUT_CHINA | meetup

Time Complexity Analysis

The algorithms used in Biodesigner could be grouped into two categories: preprocessing part and executing part.


Preprocessing Part


The preprocessing part includes three algorithms. Here we assume there are n devices and each device contains at most m biobrickes.


The first one is to count the probabilities of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes. Here successive biobrickes means the biobrickes are neighbors. Thus the number of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes are n(m-1),  n(m-2), n(m-3), n(m-4), respectively. Hash tables are used to store these successive biobrickes. Therefore, the probabilities could be obtained by scanning the device chains four times. Thus, the time complexity of this algorithm is O(2n(m-1)+3n(m-2)+4n(m-3)+5n(m-3))=O(nm).   


The second one is to count the probabilities of the first biobrick of each device chain. This work could be done by scanning the first biobrick of each chain. Thus, the complexity of this algorithm is O(n).


The third one is to calculate the hamming distances of each paired biobrickes.  Since there are n device chains and each device contains at most m biobrickes. Therefore, there are at most nm biobrickes, which means there are at most nm*nm paired biobrickes. The calculation of hamming distances between two sequences with length l could be done in O(l) time. Thus, the complexity of this algorithm is O(l*nm*nm).


Hence, the time complexity of preprocessing part is O(nm)+O(n)+O(l*nm*nm) =O(l*nm*nm).


Executing Part


The executing part contains two algorithms. Here we assume the given device chain contains at most m biobrickes.


The first one is to inquire the probabilities of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes of a given device chain. Since all the data is stored in a hash table and the access of hash table needs O(1) time. Thus, the time complexity of this algorithm is O(m).


The second one is to store the given device chain. This work is done by store the two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes of the given device chain into the hash table. Thus, the time complexity of this algorithm is O(m).


Hence, the time complexity of executing part is O(m).


Note, we only need to execute the preprocessing part once. After the execution of preprocessing part, all the data is store in database. When using our software, users do not need to waste their time for the preprocessing part as it has been executed.



Experimental Results

In order to test the performance of our software, we collected about 3600 device chains from Registry of Standard Parts, and extracted the associate relationships among these chains. All these information are stored in our database.


Firstly, we tested the response time of different functions of our software. Table 1 shows the results of different query functions. The results are averaged from 10 independent runs. From the results, we can find that all the time is really short, less than 0.1 seconds.


Table 1 Response time for different functions

 

Function

Response Time (sec.)

Query of Biobrick information

0.04

Query of Biobrick’s twins

0.03

Ambiguous Query of Biobrick

0.1

 

We also tested the response time of recommendation. Table 2 listed the results in terms of different number of given biobricks. Since our software recommend biobrick based on at most 4 biobricks’ information. Thus, the maximum number of given biobricks is 4. These times are really short, less than 0.4 seconds. Users could get the recommendation almost instantly. This could greatly reduce the time used for designing.


Table 2 Response time for recommending based on different number of biobricks

 

Number of Biobricks

Response Time(sec.)

0

0.3

1

0.19

2

0.14

3

0.27

4

0.35

 

Secondly, we tested the performance of recommendation. 10 device chains were randomly selected for this test. Table 3 shows the results, where the recommendation ratio is the ratio of the number of correctly recommended biobricks over the length of device chains, and the correctly recommended biobricks means these biobricks are in recommended lists. The average recommendation ratio is 75.4%. This ratio actually is really high, since only 24.6% of biobricks are not in the recommended lists.  

 

Device Chain

Correctly recommended biobricks

Recommendation ratio

BBa_R0040-BBa_B0034-BBa_J36835-BBa_J36837-BBa_K283009-BBa_B0010-BBa_B0012

BBa_B0034 BBa_J36837
BBa_K283009 BBa_B0010 BBa_B0012

71.4%

BBa_R0010-BBa_B0030-BBa_K233307-BBa_I732005-BBa_E0040-BBa_B0010-BBa_B0012

BBa_B0030 BBa_K233307 BBa_I732005 BBa_E0040 BBa_B0010
BBa_B0012

85.7%

BBa_J23118-BBa_B0030-BBa_K233306-BBa_I732005-BBa_E0040-BBa_B0010-BBa_B0012

BBa_K233306 BBa_I732005 BBa_E0040 BBa_B0010
BBa_B0012

71.4%

BBa_R0040-BBa_B0034-BBa_K255000-BBa_B0010-BBa_B0012-BBa_R0040-BBa_B0034-BBa_E0030-BBa_B0010-BBa_B0012 

BBa_B0034 BBa_B0010 BBa_B0012 BBa_R0040
BBa_B0034 BBa_E0030
BBa_B0010 BBa_B0012

80%

BBa_J23119-BBa_B0034-BBa_C0040-BBa_B0034-BBa_E1010-BBa_B0010-BBa_B0012

BBa_B0034 BBa_C0040
BBa_B0034 BBa_E1010
BBa_B0010 BBa_B0012

85.7%

BBa_J23106-BBa_B0034-BBa_C0012-BBa_B0034-BBa_C0062-BBa_B0012-BBa_B0011 

BBa_C0012 BBa_B0034
BBa_C0062 BBa_B0012
BBa_B0011

71.4%

BBa_K145150-BBa_B0034-BBa_K145151-BBa_B0034-BBa_E1010-BBa_B0012-BBa_B0011

BBa_K145151 BBa_B0034 BBa_E1010 BBa_B0012 BBa_B0011

71.4%

BBa_R0040-BBa_B0034-BBa_C0179-BBa_B0010-BBa_B0012-BBa_R0079

BBa_B0034 BBa_B0010
BBa_B0012 BBa_R0079

66.7%

BBa_R0040-BBa_K118012-BBa_I742151-BBa_J15001-BBa_K118002

BBa_I742151 BBa_J15001
BBa_K118002

60%

BBa_I14032-BBa_B0034-BBa_C0062-BBa_B0010-BBa_B0012-BBa_R0062-BBa_J61127-BBa_K228000-BBa_B0010-BBa_B0012

BBa_B0034 BBa_C0062
BBa_B0010 BBa_B0012 BBa_R0062 BBa_J61127 BBa_K228000 BBa_B0010 BBa_B0012

90%

 

We also use the new designed chains from BIT team to test the performance of our software. Table 4 shows the results. The averaged recommendation ratio is 66.9%. This ratio is smaller than the ratio tested above. The reason is that the new designed chains consist of some new biobricks that the database does not contain. These biobricks are hardly to recommend and effect the recommendation of next biobrick. However, this ratio still proves our software works well even when using new designed biobricks. We could believe that as more and more biobricks and device chains are added into our database, our software could recommend more and more efficiently and accurately.


Table 4 Performance of BioDesigner with BIT’s new designed device chains

 

Device Chain

Correctly recommended biobricks

Recommendation ratio

BBa_B0034-BBa_C0071-BBa_B0010-BBa_B0012-BBa_R0071-BBa_B0030-BBa_E1010

BBa_B0010 BBa_B0012
BBa_R0071 BBa_E1010

57.1%

BBa_J23100-BBa_B0034-BBa_C0179-BBa_B0010-BBa_B0012-BBa_R0079-BBa_B0030-BBa_E1010

BBa_B0034 BBa_B0010
BBa_B0012 BBa_R0079
BBa_E1010

62.5%

BBa_R0079-BBa_B0034-BBa_E0040-BBa_B0010-BBa_B0012

BBa_B0034 BBa_E0040
BBa_B0010 BBa_B0012

80%

BBa_R0079-BBa_B0030-BBa_E1010-BBa_B0010-BBa_B0012

BBa_E1010 BBa_B0010
BBa_B0012

60%

BBa_J23100-BBa_B0034-BBa_C0179-BBa_B0010-BBa_B0012-BBa_R0079-BBa_B0034-BBa_E0040

BBa_B0034 BBa_B0010
BBa_B0012 BBa_R0079
BBa_B0034 BBa_E0040

75%

BBa_B0034-BBa_C0171-BBa_B0010-BBa_B0012-BBa_R0071-BBa_B0030-BBa_E1010-BBa_B0010-BBa_B0012

BBa_B0010 BBa_B0012
BBa_R0071 BBa_E1010
BBa_B0010 BBa_B0012

66.7%