Team:HFUT CHINA/Time Complexity Analysis

From 2014.igem.org

HFUT_CHINA

Time Complexity Analysis

The algorithms used in Biodesigner could be grouped into two categories: preprocessing part and executing part.

Preprocessing Part

The preprocessing part includes three algorithms. Here we assume there are n devices and each device contains at most m biobrickes.

The first one is to count the probabilities of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes. Here successive biobrickes means the biobrickes are neighbors. Thus the number of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes are n(m-1), n(m-2), n(m-3), n(m-4), respectively. Hash tables are used to store these successive biobrickes. Therefore, the probabilities could be obtained by scanning the device chains four times. Thus, the time complexity of this algorithm is O(2n(m-1)+3n(m-2)+4n(m-3)+5n(m-3))=O(nm).

The second one is to count the probabilities of the first biobrick of each device chain. This work could be done by scanning the first biobrick of each chain. Thus, the complexity of this algorithm is O(n).

The third one is to calculate the hamming distances of each paired biobrickes. Since there are n device chains and each device contains at most m biobrickes. Therefore, there are at most nm biobrickes, which means there are at most nm*nm paired biobrickes. The calculation of hamming distances between two sequences with length l could be done in O(l) time. Thus, the complexity of this algorithm is O(l*nm*nm)=O(ln2m2).

Hence, the time complexity of preprocessing part is O(nm)+O(n)+O(ln2m2) =O(ln2m2).

Executing Part

The executing part contains two algorithms. Here we assume the given device chain contains at most m biobrickes.

The first one is to inquire the probabilities of two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes of a given device chain. Since all the data is stored in a hash table and the access of hash table needs O(1) time. Thus, the time complexity of this algorithm is O(m).

The second one is to store the given device chain. This work is done by store the two successive biobrickes, three successive biobrickes, four successive biobrickes and five successive biobrickes of the given device chain into the hash table. Thus, the time complexity of this algorithm is O(m).

Hence, the time complexity of executing part is O(m).

Note, we only need to execute the preprocessing part once. After the execution of preprocessing part, all the data is store in database. When using our software, users do not need to waste their time for the preprocessing part as it has been executed.