Team:Cambridge-JIC/Marchantia/Promoter
From 2014.igem.org
Revision as of 23:42, 17 October 2014
One of the key aims of our project is to introduce Marchantia polymorpha to iGEM, together with a toolset that enables future teams to develop it further and capitalise on its benefits. Because little is currently known about its genetic makeup, we searched for candidate inducible promoters in Marchantia that could be used as parts.
Method
To start off, we hypothesised that inducible promoters would be associated with responses to environmental cues or pressures and that, given their importance, the related genetic motifs would be largely conserved through the evolution of land plants. As Marchantia polymorpha is an early land plant[1], we thought it likely that homologues of many of its genes could still be found in later plants, and that these genes could in turn be used to locate promoters.
We reviewed research papers to create a shortlist of inducible plant promoters for which we might find homologues in the Marchantia genome. We narrowed our search to promoters regulated under a limiting supply of the nutrients nitrates, sulphates and phosphates (although inducers related to light variation, circadian rhythm, metabolism and development were also considered initially). For the majority of our analyses, we selected Arabidopsis thaliana as the model organism from which to identify target genes, given the quality of the genetic information available for this plant. However, we also used genes from the following organisms where data were available: B. nigra, L. esculentum, B. napus, C. reinhardtii, G. max, N. plumbaginifolia, P. patens.
We identified a shortlist of 27 genes that might be regulated under limiting supply of the essential nutrients nitrates, sulphates, and phosphates.
We obtained the peptide sequences for these genes from the following online databases:
Thalemine - https://apps.araport.org/thalemine/begin.do [2]
GenBank - http://www.ncbi.nlm.nih.gov/genbank [3]
TAIR - http://www.arabidopsis.org/ [4]
UniProt - http://www.uniprot.org/ [5]
We used Geneious™ to run tblastn[6], querying the peptide sequences against the nucleotide sequences of the M. polymorpha scaffolds. Our dataset consisted of large gap read mapping transcripts obtained by mRNA sequencing conducted by the Haseloff Lab on the M. polymorpha Cam strain. Open Reading Frame (ORF) and Coding Sequence (CDS) predictions were made using the CLC bio Transcript Discovery plugin[7].
Figure 1: Example of a BLAST hit, matching a nitrate transporter protein sequence to a Marchantia gene.
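For readers unfamiliar with tblastn: it compares a protein query against a nucleotide database translated in all six reading frames. The snippet below is only a minimal sketch of that translation step, assuming the standard genetic code and an uppercase A/C/G/T alphabet. It is not the BLAST implementation, and the function names are our own illustrative choices.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to translated peptide strings."""
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames
```

A protein query can then, in principle, be aligned against each of the six translated frames, which is why a hit can land on either strand of a scaffold.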
We took the most convincing hits to be those with a grade above 30%, and produced a shortlist by ranking them on concurrence with an existing gene prediction, as this made the selections more reliable. We isolated candidate promoter regions as the 2 kbp upstream of the start of the purported gene.
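The 2 kbp upstream window can be sketched as follows. This is a hedged illustration rather than the procedure we ran in Geneious: it assumes 0-based, half-open gene coordinates on the forward strand of the scaffold, it assumes that for a gene on the reverse strand "upstream" means the bases after the gene end (returned reverse-complemented), and it clips the window at the scaffold edges. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def upstream_window(gene_start, gene_end, strand, scaffold_length, window=2000):
    """Return (start, end) of the region upstream of a gene.

    Coordinates are 0-based and half-open on the forward strand.
    The window is clipped so it never runs off the scaffold.
    """
    if strand == "+":
        return max(0, gene_start - window), gene_start
    if strand == "-":
        return gene_end, min(scaffold_length, gene_end + window)
    raise ValueError("strand must be '+' or '-'")


def extract_promoter(scaffold, gene_start, gene_end, strand, window=2000):
    """Return the candidate promoter, oriented 5'->3' relative to the gene."""
    start, end = upstream_window(
        gene_start, gene_end, strand, len(scaffold), window
    )
    region = scaffold[start:end].upper()
    return reverse_complement(region) if strand == "-" else region
```

Clipping matters for genes close to a scaffold end, where fewer than 2 kbp of upstream sequence may be available.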
Hits for very short regions of homology were not selected. This generally corresponded to hits shorter than 5% of the sequence length of the predicted gene, although slightly shorter hits were noted as support for the reliability of a good match.
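The selection criteria above can be summarised in a short sketch. It assumes each hit has been exported as a record carrying its grade (in percent), the length of the hit, the length of the predicted gene, and a flag for whether it coincides with an existing gene prediction. The `Hit` class and its field names are hypothetical, not a Geneious export format, and the secondary sort on grade is our own assumption about how ties might be broken.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str                  # illustrative label for the hit
    grade: float               # grade in percent
    hit_length: int            # length of the matched region
    gene_length: int           # length of the predicted gene
    overlaps_prediction: bool  # True if the hit coincides with a gene prediction


def shortlist(hits, min_grade=30.0, min_fraction=0.05):
    """Keep hits above the grade threshold that are not too short, then rank.

    Hits that agree with an existing gene prediction come first;
    within each group, higher grades come first (an assumed tie-break).
    """
    kept = [
        h for h in hits
        if h.grade > min_grade
        and h.hit_length >= min_fraction * h.gene_length
    ]
    return sorted(kept, key=lambda h: (not h.overlaps_prediction, -h.grade))
```

With made-up records, a hit graded 25% or one covering only 4% of its predicted gene would be dropped, and the remaining hits would be listed with prediction-supported ones first.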
We identified 30 candidate promoters this way, which we plan to screen by inserting each into a construct driving expression of the yellow fluorescent protein Venus. For each promoter, we will make one construct with and one without amplification by GAL4 and the GAL4 UAS, to evaluate promoter strength and to control for any leakiness introduced by the use of GAL4.
References
1. Wellman CH, Osterloff PL, Mohiuddin U. 2003. Fragments of the earliest land plants. Nature 425, pp. 282-285.
2. Thalemine - https://apps.araport.org/thalemine/begin.do [Accessed: July – September 2014]
3. GenBank - http://www.ncbi.nlm.nih.gov/genbank [Accessed: July – September 2014]
4. TAIR - http://www.arabidopsis.org/ [Accessed: July – September 2014]
5. UniProt - http://www.uniprot.org/ [Accessed: July – September 2014]
6. NCBI Blast ®, http://blast.ncbi.nlm.nih.gov/Blast.cgi
7. Qiagen, CLC bio Transcript Discovery ®, http://www.clcbio.com/clc-plugin/transcript-discovery/#description