Team:SJTU-Software/Database/AssessmentModel
From 2014.igem.org
Assessment Model
Our assessment model assigns each biobrick a default score that reflects its quality. In the “Easy BBK” search engine, biobricks related to the input keyword are listed in descending order of score by default, although users can define their own sorting order. Sorting by default score brings high-quality biobricks to the top of the results. The model considers 4 general properties, namely status, reliability, feedback and publication, which are based on 13 attributes of a biobrick. The weights of the attributes within each property are optimized and fixed; the weights of the 4 properties can be adjusted by users, although the optimized default weights are recommended.
Evaluation criteria
Based on the descriptions on the website of the Registry of Standard Biological Parts and on advice from our instructor, 13 attributes of biobricks were selected as evaluation criteria in the assessment model.
Considering the aspects users frequently take into account when choosing a biobrick, we divided these attributes into 4 properties, which are listed in Table 1.
Property | Description |
---|---|
Status | Will I get this official part? This property measures the availability of the part. |
Reliability | Will this part work? This property measures the working quality of the part. |
Feedback | How have previous teams or experimenters reviewed this part? This property collects the feedback of previous users of the part. |
Publication | How many publications are related to this part? This property contains the related results on Google Scholar. |
Table 1 The four properties in the assessment model
A more detailed description of the attributes in each property is given in Table 2.
Property | Attribute | Description |
---|---|---|
Status | Part Status | The status of a part based on the completeness of its documentation and characterization. |
Status | DNA Status | States the DNA status of the part: Deleted, Planning, Sent, Available, etc. These statuses are generated by the Registry, so the user cannot edit them. |
Status | Whether or Not Deleted | Whether the part is deleted or not. |
Status | Confirmed Times | Number of times the sequence of the part has been confirmed. Part samples are sequenced using the VF2 and VR primer sites on their plasmid backbones. Sequence results are then uploaded and compared to their target sequence (the part's documented sequence) through Registry software. |
Status | Length of Documentation | Length of the documentation for the part on the Registry. |
Reliability | Part Results | The experience status for a part, as documented by the part authors. The part's experience is not a validation of the part by iGEM HQ. |
Reliability | Star Rating | Stars given by the Registry. |
Reliability | Group Favorite | Whether the part is a favorite of a team or group. |
Feedback | Used Times | The number of times the part has been specified in composite parts in the Registry. |
Feedback | Average Rating | The average rating given to this part by other users. |
Feedback | Number of Comments | Number of times the part has been reviewed by other users. |
Publication | Number of Publication | Number of related publications and web pages on Google Scholar. |
Table 2 Detailed attributes in the assessment model
Scores given to the values in each criterion
There are two types of values across the criteria: categorical variables (such as part status and part results) and continuous variables (such as used times and average rating).
For categorical variables, we give reasonable scores to different values in each criterion based on the researchers' experience and description of the values.
For continuous variables, we normalize the values in those criteria so that all continuous variables range from 0 to 1, like the categorical variables. In the normalization, x is an original value and x' is the normalized value. The scores given to the different values of all attributes are presented in Table 3.
Value | Score | Number of biobricks |
---|---|---|
Part Status | ||
Released HQ 2013 | 1 | 3 |
Deleted; Not Released | 0 | 25304 |
Sample Status | ||
In Stock | 1 | 3238 |
It’s complicated | 0.5 | 9368 |
For Reference Only | 0.25 | 20 |
Not In Stock; No Part Sequence; Discontinued | 0 | 14457 |
DNA Status | ||
Available | 1 | 3247 |
Planning; Informational | 0.5 | 10274 |
Unavailable; Deleted | 0 | 13572 |
Whether or Not Deleted | ||
Not Deleted | 1 | 25251 |
Deleted | 0 | 1842 |
Confirmed Times | ||
Normalized values ranging | 0-1 | 27093 |
Length of Documentation | ||
Normalized values ranging | 0-1 | 27093 |
Part Results | ||
Works | 1 | 5164 |
Issues | 0.5 | 450 |
Fails; None; NULL | 0 | 21479 |
Star Rating | ||
1 | 1 | 811 |
NULL | 0 | 26282 |
Group Favorite | ||
Yes | 1 | 3009 |
No | 0 | 24084 |
Used Times | ||
Normalized values ranging | 0-1 | 27093 |
Average Rating | ||
Normalized values ranging | 0-1 | 27093 |
Number of Comments | ||
Normalized values ranging | 0-1 | 27093 |
Number of Publication | ||
Normalized values ranging | 0-1 | 27093 |
Table 3 Different scores given to the different values in all the criteria
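The scoring of individual values can be sketched in Python as follows. The categorical scores are copied from Table 3 for three of the attributes. The normalization function is an assumption: the text only states that continuous values are mapped to the range 0 to 1, and min-max scaling is one common way to do that, not necessarily the formula the team used. The names `CATEGORICAL_SCORES`, `categorical_score` and `min_max_normalize` are hypothetical.

```python
# Scores for categorical values, copied from Table 3 (three attributes shown).
CATEGORICAL_SCORES = {
    "Part Results": {"Works": 1.0, "Issues": 0.5, "Fails": 0.0, "None": 0.0, "NULL": 0.0},
    "DNA Status": {
        "Available": 1.0,
        "Planning": 0.5,
        "Informational": 0.5,
        "Unavailable": 0.0,
        "Deleted": 0.0,
    },
    "Group Favorite": {"Yes": 1.0, "No": 0.0},
}


def categorical_score(attribute, value):
    """Look up the fixed score of a categorical value."""
    return CATEGORICAL_SCORES[attribute][value]


def min_max_normalize(values):
    """Scale a list of numbers to the range 0..1.

    ASSUMPTION: min-max scaling, x' = (x - min) / (max - min).
    The source does not preserve the actual formula.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `categorical_score("Part Results", "Issues")` gives 0.5, and `min_max_normalize([0, 5, 10])` maps the three values onto 0, 0.5 and 1.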
Weights given to each criterion
50 biobricks were picked out as positive samples, some of which are listed in Table 4. The weight of each criterion was then trained and adjusted so that the positive samples rank highest in the final scores.
Positive Examples | Type | Reason |
---|---|---|
BBa_B0034 | RBS | BBa_B0034 is the RBS used most. |
BBa_B0015 | Terminator | BBa_B0015 is the terminator used most. |
BBa_E1010 BBa_E0040 | Coding Region | These two parts are well documented and they perform well in other attributes. |
BBa_J23114 BBa_J23113 | Regulatory | These two parts are not outstanding in any single attribute; however, they are better than most other parts across all attributes. |
BBa_R0040 BBa_R0010 | Promoter | These two parts are used most frequently among promoters and they get good feedback from users. |
Table 4 Part of the positive samples in our assessment model
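The source does not say which procedure was used to train the weights. As an illustration only, the idea of adjusting weights so that positive samples rank highest can be sketched as a random search over non-negative weights that sum to 1. The search method, the top-k objective, all function names and the toy data below are assumptions made for this sketch; none of the numbers are Registry values.

```python
import random


def score(weights, features):
    """Weighted sum of attribute scores that are already scaled to 0..1."""
    return sum(w * f for w, f in zip(weights, features))


def positives_in_top_k(weights, parts, positives, k):
    """Count how many positive samples are among the k highest-scoring parts."""
    ranked = sorted(parts, key=lambda name: score(weights, parts[name]), reverse=True)
    return sum(1 for name in ranked[:k] if name in positives)


def fit_weights(parts, positives, k, n_trials=2000, seed=0):
    """Random search for weights that put the most positives in the top k."""
    rng = random.Random(seed)
    n = len(next(iter(parts.values())))
    best = [1.0 / n] * n  # start from uniform weights
    best_hits = positives_in_top_k(best, parts, positives, k)
    for _ in range(n_trials):
        raw = [rng.random() for _ in range(n)]
        total = sum(raw)
        candidate = [r / total for r in raw]  # non-negative, sums to 1
        hits = positives_in_top_k(candidate, parts, positives, k)
        if hits > best_hits:  # keep a candidate only if it strictly improves
            best, best_hits = candidate, hits
    return best, best_hits


# Made-up toy data: part name -> three attribute scores in 0..1.
toy_parts = {
    "good_1": [0.2, 0.9, 0.8],
    "good_2": [0.1, 0.8, 0.9],
    "other_1": [1.0, 0.5, 0.5],
    "other_2": [0.9, 0.2, 0.0],
    "other_3": [0.3, 0.3, 0.3],
}
toy_positives = {"good_1", "good_2"}

weights, hits = fit_weights(toy_parts, toy_positives, k=2)
```

With uniform weights only one of the two toy positives reaches the top two; the search lowers the weight of the first attribute until both do. A real fit over the full Registry data would need a finer objective than a top-k count, for example the mean rank of the positive samples.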
The weights of the criteria within each property are fixed and cannot be adjusted by users. In “Easy BBK”, users can adjust the weights of the 4 properties (Status, Reliability, Feedback and Publication), although optimized weights and default scores have already been given to the biobricks in the assessment model. The optimized weights are presented in Table 5.
Property | Attribute | Default Weight of Property | Weight of Attribute |
---|---|---|---|
Status | Part Status | 40.6% | 1.6% |
Status | Sample Status | | 6.9% |
Status | DNA Status | | 8.3% |
Status | Whether or Not Deleted | | 6.6% |
Status | Confirmed Times | | 11.4% |
Status | Length of Documentation | | 5.8% |
Reliability | Part Results | 23.3% | 11.9% |
Reliability | Star Rating | | 10.2% |
Reliability | Group Favorite | | 1.2% |
Feedback | Used Times | 24.4% | 11.3% |
Feedback | Average Rating | | 2.2% |
Feedback | Number of Comments | | 10.9% |
Publication | Number of Publication | 11.7% | 11.7% |
Table 5 Weights given to the attributes in the default score of the biobricks
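A minimal sketch of how a default score could be computed from these weights is given below. The attribute weights are taken from Table 5; each property's default weight is the sum of its attribute weights (for example, the six Status weights add up to 40.6%). How “Easy BBK” combines user-adjusted property weights with the fixed attribute weights is not described, so the proportional rescaling used here is an assumption, and `ATTRIBUTE_WEIGHTS` and `total_score` are hypothetical names.

```python
# Attribute weights from Table 5, in percent, grouped by property.
ATTRIBUTE_WEIGHTS = {
    "Status": {
        "Part Status": 1.6,
        "Sample Status": 6.9,
        "DNA Status": 8.3,
        "Whether or Not Deleted": 6.6,
        "Confirmed Times": 11.4,
        "Length of Documentation": 5.8,
    },
    "Reliability": {"Part Results": 11.9, "Star Rating": 10.2, "Group Favorite": 1.2},
    "Feedback": {"Used Times": 11.3, "Average Rating": 2.2, "Number of Comments": 10.9},
    "Publication": {"Number of Publication": 11.7},
}


def total_score(attribute_scores, property_weights=None):
    """Combine per-attribute scores (each in 0..1) into one score in 0..1.

    ASSUMPTION: when the user supplies property weights, the fixed attribute
    weights keep their proportions inside each property, and the property
    weights are renormalized to sum to 1.
    """
    total = 0.0
    weight_sum = 0.0
    for prop, attrs in ATTRIBUTE_WEIGHTS.items():
        default_weight = sum(attrs.values())
        weight = default_weight if property_weights is None else property_weights[prop]
        # Score of the property alone, using the fixed attribute proportions.
        within = sum(w * attribute_scores[a] for a, w in attrs.items()) / default_weight
        total += weight * within
        weight_sum += weight
    return total / weight_sum
```

With the default weights, a part that scores 1 on Used Times and 0 elsewhere gets about 0.113. If a user puts all weight on Feedback, the same part gets 11.3 / 24.4, about 0.46.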
A glance at the assessment results
Table 6 shows the top three biobricks of each type:
Type | Biobrick | General information |
---|---|---|
Coding | BBa_E1010 | Commented 7 times with average rating 3.14, 264 uses in composite parts, 23 records on Google Scholar |
Coding | BBa_E0040 | Commented 5 times with average rating 3, 435 uses in composite parts, 32 records on Google Scholar |
Coding | BBa_I712019 | Commented 4 times with average rating 4 |
Composite | BBa_K081022 | 1 registry star, group favorite, commented one time with average rating 5 |
Composite | BBa_K145279 | 1 registry star, group favorite, commented one time with average rating 5 |
Composite | BBa_K137055 | 1 registry star, group favorite, 2 records on Google Scholar |
RBS | BBa_B0034 | Commented 10 times with average rating 5, 2935 uses in composite parts, 62 records on Google Scholar |
RBS | BBa_B0032 | Commented 2 times with average rating 5, 487 uses in composite parts, 24 records on Google Scholar |
RBS | BBa_B0030 | Commented 3 times with average rating 4.33, 673 uses in composite parts, 9 records on Google Scholar |
Regulatory | BBa_R0040 | Commented 7 times with average rating 4.14, 792 uses in composite parts, 31 records on Google Scholar |
Regulatory | BBa_R0010 | Commented 7 times with average rating 4.71, 549 uses in composite parts, 19 records on Google Scholar |
Regulatory | BBa_R0011 | Commented 9 times with average rating 3.44, 373 uses in composite parts, 13 records on Google Scholar |
Reporter | BBa_J04450 | More documentation on website than others, commented 3 times with average rating 5 |
Reporter | BBa_E0840 | Group favorite, 138 uses in composite parts, 10 records on Google Scholar |
Reporter | BBa_J52008 | Group favorite, commented 3 times with average rating 5 |
Terminator | BBa_B0015 | Commented 6 times with average rating 4.5, 2650 uses in composite parts, 70 records on Google Scholar |
Terminator | BBa_B0014 | Commented 3 times with average rating 5, 248 uses in composite parts, 10 records on Google Scholar |
Terminator | BBa_B0011 | 76 uses in composite parts |
The distribution of default scores of all biobricks is shown as follows:
BBa_B0034 and BBa_B0031 are both RBS parts, ranked 1st and 4th respectively among RBS parts in our assessment model. Information on the two biobricks is listed below.
Category | Part name | BBa_B0034 | BBa_B0031 |
---|---|---|---|
Basic Information | Short Description | RBS(Elowitz 1999)--defines RBS efficiency | RBS.2(weak)--derivative of BBa_0030 |
Basic Information | Part type | RBS | RBS |
Status | DNA Status | Available | Available |
Status | Release status | Released HQ 2013 | Released HQ 2013 |
Status | Sample status | Sample In stock | Sample In stock |
Status | Delete This Part | Not Deleted | Not Deleted |
Rating | Qualitative Experience | Works | Works |
Rating | Group Favorite | No | No |
Rating | Star Rating | 1 | 1 |
Feedback | Uses number | 2935 Uses | 196 Uses |
Feedback | Comments number | 10 | 2 |
Feedback | Comment stars | 5 | 5 |
Criterion | Number of Results in Google Scholar | 62 | 28 |
The two biobricks are identical in many attributes, and the overall information suggests that both parts are of good quality.
However, the number of uses, the number of comments and the number of results on Google Scholar suggest that BBa_B0031 is less popular and less studied than BBa_B0034. Moreover, the Registry lists the efficiency of BBa_B0034 as 1.0, much higher than the 0.07 of BBa_B0031. This difference is also reflected in their short descriptions.
This comparison and our user studies indicate that the scores in our assessment model are consistent with the quality of the biobricks.