Team:SJTU-Software/Database/AssessmentModel

From 2014.igem.org




Assessment Model

The assessment model is designed to judge the quality of biobricks. Each biobrick receives a default score, so the biobricks that match the keyword entered in the “Easy BBK” search engine can be listed in descending order of score, although users can also define their own sorting order. Sorting by default score places high-quality biobricks first. The model considers 4 general properties, namely status, reliability, feedback and publication, which are based on 13 attributes of a biobrick. The weights of the attributes within each property are optimized and fixed; the weights of the 4 properties can be adjusted by users, although the optimized default weights are recommended.



Evaluation criteria

Based on the descriptions on the website of the Registry of Standard Biological Parts and on advice from our instructor, 13 attributes of biobricks were selected as evaluation criteria in the assessment model.
Considering the aspects users frequently take into account when choosing a biobrick, we divided these attributes into 4 properties, which are listed in Table 1.

| Property | Description |
|---|---|
| Status | Will I get this official part? This property measures the availability of the part. |
| Reliability | Will this part work? This property measures the working quality of the part. |
| Feedback | How have previous teams or experimenters reviewed this part? This property collects the feedback of previous workers on the part. |
| Publication | How many publications are related to this part? This property contains the related results on Google Scholar. |

Table 1 The four properties in the assessment model

The attributes within each property are described in more detail in Table 2.

| Property | Attribute | Description |
|---|---|---|
| Status | Part Status | The status of a part based on the completeness of its documentation and characterization. |
| | DNA Status | States the DNA status of the part: Deleted, Planning, Sent, Available, etc. These statuses are generated by the Registry, so the user cannot edit them. |
| | Whether or Not Deleted | Whether the part is deleted or not. |
| | Confirmed Times | Number of times the sequence of the part has been confirmed. Part samples are sequenced using the VF2 and VR primer sites on their plasmid backbones. Sequence results are then uploaded and compared to their target sequence (the part's documented sequence) through Registry software. |
| | Length of Documentation | Length of the documentation for the part on the Registry. |
| Reliability | Part Results | The experience status for a part, as documented by the part authors. However, the part's experience is not a validation of the part by iGEM HQ. |
| | Star Rating | Stars given by the Registry. |
| | Group Favorite | Whether the part is a favorite of a team/group. |
| Feedback | Used Times | The number of times the part has been specified in composite parts in the Registry. |
| | Average Rating | The average rating given to this part by other users. |
| | Number of Comments | Number of times the part has been reviewed by other users. |
| Publication | Number of Publications | Number of related publications and webpages on Google Scholar. |

Table 2 Detailed attributes in the assessment model



Scores given to the values in each criterion

The criteria contain two types of values: categorical variables (such as part status and part results) and continuous variables (such as used times and average rating).
For categorical variables, we assign a score to each value in a criterion based on researchers' experience and the description of the value.
For continuous variables, we normalize the values in those criteria using the following formula, so that all the continuous variables range from 0 to 1 like categorical variables:

where x is an original value and x' is the normalized value. The scores given to the different values of all the attributes are presented in Table 3.
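A minimal sketch of the normalization step, assuming min-max scaling, x' = (x - min) / (max - min). The text only specifies that x' ranges from 0 to 1, so the exact formula is an assumption, as is the function name `min_max_normalize`:

```python
def min_max_normalize(values):
    """Scale raw values to [0, 1] with x' = (x - min) / (max - min).

    Assumption: min-max scaling; the source only states that the
    normalized values range from 0 to 1.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero, map everything to 0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# "Used times" of three RBS parts quoted in the results section of this page.
used_times = [2935, 487, 673]
normalized = min_max_normalize(used_times)
```

Under this assumption the most-used part maps to 1 and the least-used part to 0; a real run would take the minimum and maximum over all biobricks rather than over three parts.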

| Attribute | Value | Score | Count |
|---|---|---|---|
| Part Status | Released HQ 2013 | 1 | 3 |
| | Deleted; Not Released | 0 | 25304 |
| Sample Status | In Stock | 1 | 3238 |
| | It’s complicated | 0.5 | 9368 |
| | For Reference Only | 0.25 | 20 |
| | Not In Stock; No Part Sequence; Discontinued | 0 | 14457 |
| DNA Status | Available | 1 | 3247 |
| | Planning; Informational | 0.5 | 10274 |
| | Unavailable; Deleted | 0 | 13572 |
| Whether or Not Deleted | Not Deleted | 1 | 25251 |
| | Deleted | 0 | 1842 |
| Confirmed Times | Normalized values ranging | 0-1 | 27093 |
| Length of Documentation | Normalized values ranging | 0-1 | 27093 |
| Part Results | Works | 1 | 5164 |
| | Issues | 0.5 | 450 |
| | Fails; None; NULL | 0 | 21479 |
| Star Rating | 1 | 1 | 811 |
| | NULL | 0 | 26282 |
| Group Favorite | Yes | 1 | 3009 |
| | No | 0 | 24084 |
| Used Times | Normalized values ranging | 0-1 | 27093 |
| Average Rating | Normalized values ranging | 0-1 | 27093 |
| Number of Comments | Normalized values ranging | 0-1 | 27093 |
| Number of Publications | Normalized values ranging | 0-1 | 27093 |

Table 3 Different scores given to the different values in all the criteria
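The categorical part of Table 3 amounts to a lookup from value to score. A minimal sketch for three of the attributes; the dictionary layout, the key names, and the choice to score unknown values as 0 are illustrative assumptions:

```python
# Scores for categorical values, transcribed from Table 3.
# Key names such as "dna_status" are hypothetical identifiers.
CATEGORICAL_SCORES = {
    "dna_status": {
        "Available": 1.0,
        "Planning": 0.5,
        "Informational": 0.5,
        "Unavailable": 0.0,
        "Deleted": 0.0,
    },
    "part_results": {
        "Works": 1.0,
        "Issues": 0.5,
        "Fails": 0.0,
        "None": 0.0,
        "NULL": 0.0,
    },
    "group_favorite": {"Yes": 1.0, "No": 0.0},
}


def categorical_score(attribute, value):
    """Return the Table 3 score for a categorical value.

    Assumption: values not listed in the table score 0.
    """
    return CATEGORICAL_SCORES[attribute].get(value, 0.0)
```

For example, `categorical_score("part_results", "Issues")` returns 0.5, matching the "Issues" row of the table.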



Weights given to each criterion

Fifty biobricks were picked out as positive samples; some of them are listed in Table 4. The weight of each criterion was then trained and adjusted so that the positive samples rank highest in the final scores.

| Positive examples | Type | Attribute |
|---|---|---|
| BBa_B0034 | RBS | BBa_B0034 is the RBS used most. |
| BBa_B0015 | Terminator | BBa_B0015 is the terminator used most. |
| BBa_E1010, BBa_E0040 | Coding Region | These two parts are well documented and they perform well in other attributes. |
| BBa_J23114, BBa_J23113 | Regulatory | These two parts are not outstanding in any attribute; however, they are better than most of the other parts in all attributes. |
| BBa_R0040, BBa_R0010 | Promoter | These two parts are used most frequently among promoters and they get good feedback from users. |

Table 4 Part of the positive samples in our assessment model
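The page does not describe the training procedure itself. One way to make the goal "positive samples rank highest" measurable is the average rank of the positive samples under a candidate set of weights. The sketch below assumes that objective; the function name and all scores are hypothetical, not data from the model:

```python
def mean_positive_rank(scores, positives):
    """Average 1-based rank of the positive samples when all parts are
    sorted by descending score. Lower is better."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    ranks = [ranked.index(part) + 1 for part in positives]
    return sum(ranks) / len(ranks)


# Hypothetical final scores under two candidate weight vectors.
# "part_x" and "part_y" are made-up non-positive parts.
scores_a = {"BBa_B0034": 0.90, "BBa_B0015": 0.80, "part_x": 0.85, "part_y": 0.10}
scores_b = {"BBa_B0034": 0.90, "BBa_B0015": 0.88, "part_x": 0.85, "part_y": 0.10}
positives = ["BBa_B0034", "BBa_B0015"]

rank_a = mean_positive_rank(scores_a, positives)
rank_b = mean_positive_rank(scores_b, positives)
```

Here the second weight vector would be preferred because it places both positive samples ahead of the other parts. Any search procedure (grid search, hill climbing, and so on) could be used to minimize such a value.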

The weights of the criteria within each property are fixed and cannot be adjusted by users. In “Easy BBK”, users can adjust the weights of the 4 properties (Status, Reliability, Feedback, Publication), although optimized weights and default scores have already been given to the biobricks in the assessment model. The optimized weights are presented in Table 5.

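| Property | Default weight of property | Attribute | Weight of attribute |
|---|---|---|---|
| Status | 40.6% | Part Status | 1.6% |
| | | Sample Status | 6.9% |
| | | DNA Status | 8.3% |
| | | Whether or Not Deleted | 6.6% |
| | | Confirmed Times | 11.4% |
| | | Length of Documentation | 5.8% |
| Reliability | 23.3% | Part Results | 11.9% |
| | | Star Rating | 10.2% |
| | | Group Favorite | 1.2% |
| Feedback | 24.4% | Used Times | 11.3% |
| | | Average Rating | 2.2% |
| | | Number of Comments | 10.9% |
| Publication | 11.7% | Number of Publications | 11.7% |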

Table 5 Weights given to the attributes in the default score of the biobricks
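With the weights of Table 5, a default score can be sketched as a weighted sum of the per-attribute scores (each between 0 and 1). The code below is an illustration under two assumptions that the page does not state: the score is a plain weighted sum, and when a user changes a property weight the attribute weights inside that property are rescaled proportionally. All identifiers are hypothetical.

```python
# Attribute weights in percent, transcribed from Table 5, grouped by property.
WEIGHTS = {
    "Status": {
        "part_status": 1.6, "sample_status": 6.9, "dna_status": 8.3,
        "not_deleted": 6.6, "confirmed_times": 11.4, "doc_length": 5.8,
    },
    "Reliability": {"part_results": 11.9, "star_rating": 10.2, "group_favorite": 1.2},
    "Feedback": {"used_times": 11.3, "average_rating": 2.2, "num_comments": 10.9},
    "Publication": {"num_publications": 11.7},
}


def weighted_score(attr_scores, property_weights=None):
    """Weighted sum of attribute scores, returned as a fraction in [0, 1].

    attr_scores: {attribute: score in [0, 1]}; missing attributes count as 0.
    property_weights: optional {property: weight} chosen by the user.
    Assumption: attribute weights keep their relative proportions inside a
    property when the property weight changes.
    """
    total = 0.0
    weight_sum = 0.0
    for prop, attrs in WEIGHTS.items():
        default_w = sum(attrs.values())
        user_w = default_w if property_weights is None else property_weights[prop]
        weight_sum += user_w
        for attr, w in attrs.items():
            total += user_w * (w / default_w) * attr_scores.get(attr, 0.0)
    return total / weight_sum if weight_sum else 0.0


# A part that only scores on publications, with default and with equal weights.
only_publication = {"num_publications": 1.0}
equal = {"Status": 25, "Reliability": 25, "Feedback": 25, "Publication": 25}
score_default = weighted_score(only_publication)
score_equal = weighted_score(only_publication, equal)
```

The attribute weights in each property add up to the property's default weight (for example 11.9% + 10.2% + 1.2% = 23.3% for Reliability), and the four property weights add up to 100%.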



A glance at the assessment results

Table 6 shows the three top-ranked biobricks of each type:

| Type | Biobrick | General information |
|---|---|---|
| Coding | BBa_E1010 | Commented 7 times with average rating 3.14, 264 uses in composite parts, 23 records on Google Scholar |
| | BBa_E0040 | Commented 5 times with average rating 3, 435 uses in composite parts, 32 records on Google Scholar |
| | BBa_I712019 | Commented 4 times with average rating 4 |
| Composite | BBa_K081022 | 1 registry star, group favorite, commented one time with average rating 5 |
| | BBa_K145279 | 1 registry star, group favorite, commented one time with average rating 5 |
| | BBa_K137055 | 1 registry star, group favorite, 2 records on Google Scholar |
| RBS | BBa_B0034 | Commented 10 times with average rating 5, 2935 uses in composite parts, 62 records on Google Scholar |
| | BBa_B0032 | Commented 2 times with average rating 5, 487 uses in composite parts, 24 records on Google Scholar |
| | BBa_B0030 | Commented 3 times with average rating 4.33, 673 uses in composite parts, 9 records on Google Scholar |
| Regulatory | BBa_R0040 | Commented 7 times with average rating 4.14, 792 uses in composite parts, 31 records on Google Scholar |
| | BBa_R0010 | Commented 7 times with average rating 4.71, 549 uses in composite parts, 19 records on Google Scholar |
| | BBa_R0011 | Commented 9 times with average rating 3.44, 373 uses in composite parts, 13 records on Google Scholar |
| Reporter | BBa_J04450 | More documentation on website than others, commented 3 times with average rating 5 |
| | BBa_E0840 | Group favorite, 138 uses in composite parts, 10 records on Google Scholar |
| | BBa_J52008 | Group favorite, commented 3 times with average rating 5 |
| Terminator | BBa_B0015 | Commented 6 times with average rating 4.5, 2650 uses in composite parts, 70 records on Google Scholar |
| | BBa_B0014 | Commented 3 times with average rating 5, 248 uses in composite parts, 10 records on Google Scholar |
| | BBa_B0011 | 76 uses in composite parts |
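A listing like Table 6 can be produced by grouping the parts by type and sorting each group by descending score. A minimal sketch; the function name is illustrative and the scores are hypothetical placeholders, since the page does not list the actual default scores:

```python
from collections import defaultdict


def top_n_by_type(parts, n=3):
    """Return {part_type: [names]} with the n highest-scoring parts per type.

    parts: iterable of (name, part_type, score) tuples.
    """
    grouped = defaultdict(list)
    for name, part_type, score in parts:
        grouped[part_type].append((score, name))
    return {
        part_type: [
            name
            for _, name in sorted(items, key=lambda item: item[0], reverse=True)[:n]
        ]
        for part_type, items in grouped.items()
    }


# Hypothetical scores, chosen only to illustrate the grouping and sorting.
parts = [
    ("BBa_B0031", "RBS", 0.60),
    ("BBa_B0034", "RBS", 0.90),
    ("BBa_B0030", "RBS", 0.70),
    ("BBa_B0032", "RBS", 0.80),
    ("BBa_B0015", "Terminator", 0.95),
]
top = top_n_by_type(parts)
```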

The distribution of default scores of all biobricks is shown as follows:

BBa_B0034 and BBa_B0031 are both RBS parts, ranked 1st and 4th among RBS parts in our assessment model, respectively. The information on the two biobricks is listed below.

| Category | Item | BBa_B0034 | BBa_B0031 |
|---|---|---|---|
| Basic Information | Short Description | RBS(Elowitz 1999)--defines RBS efficiency | RBS.2(weak)--derivative of BBa_0030 |
| | Part type | RBS | RBS |
| Status | DNA Status | Available | Available |
| | Release status | Released HQ 2013 | Released HQ 2013 |
| | Sample status | Sample In stock | Sample In stock |
| | Delete This Part | Not Deleted | Not Deleted |
| Rating | Qualitative Experience | Works | Works |
| | Group Favorite | No | No |
| | Star Rating | 1 | 1 |
| Feedback | Uses number | 2935 Uses | 196 Uses |
| | Comments number | 10 | 2 |
| | Comment stars | 5 | 5 |
| Criterion | Number of Results in Google Scholar | 62 | 28 |

The two biobricks share the same values for many attributes, and the overall information suggests that both parts are of good quality.

However, the number of uses, the number of comments, and the number of results on Google Scholar suggest that BBa_B0031 is less popular and less studied than BBa_B0034. Moreover, on the Registry the efficiency of BBa_B0034 is defined as 1.0, which is much higher than the 0.07 of BBa_B0031. This is also reflected in their short descriptions.

The comparison above and our user studies indicate that the quality of biobricks is consistent with the scores in our assessment model.



iGemdry2014@163.com

SJTU-Software,Shanghai,China
