Difference between revisions of "Team:DTU-Denmark/Software"


Revision as of 13:17, 18 October 2016

New HTML template for the wiki





Section 2

Regardless of the topic, iGEM projects often create or adapt computational tools to move the project forward. Because they are born out of a direct practical need, these software tools (or new computational methods) can be surprisingly useful for other teams. Without necessarily being big or complex, they can make the crucial difference to a project's success. This award tries to find and honor such "nuggets" of computational work.

Inspiration

Here are a few examples from previous teams:


Theory

The central issue in codon optimization is to determine which codons are most efficiently translated for each amino acid. The quantity needed for this task is called 'translatability' and is denoted \(W_i\) for the \(i\)'th codon.

To accomplish this, we have chosen a method based on the tRNA Adaptation Index (tAI) (dos Reis et al., 2004). The fundamental assumption behind this method is that the genes of highly expressed proteins are encoded with a set of codons that is, overall, more readily bound by tRNA and translated than the codons used for less expressed proteins. Hence, the optimization estimates the codon preferences such that the correlation between protein level and tAI is maximized.

The formulas for calculating \(W_i\) are stated in Table 1 of dos Reis et al. (2004). Using these, all 64 \(W_i\)'s can be calculated in one matrix multiplication by letting \(G\) be the 4\(\times\)16 matrix consisting of the tRNA gene copy numbers (tGCN's, referred to as 'gcn' in TaiCO) and letting \(S\) be the 4\(\times\)4 matrix containing the \((1 - s_{ij})\) values. Hence,

$$W = SG$$

The computed \(W_i\)'s are then normalized by setting \(w_i = \frac{W_i}{W_{\text{max}}}\), and these normalized translatabilities, \(w_i\), form the basis for codon selection: for each amino acid, codons with higher \(w_i\)-values are selected over codons with lower values. This concludes the method for codon selection.
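The normalization and selection step can be sketched in a few lines of Python. This is a minimal illustration, not the TaiCO implementation: the function names `normalize` and `best_codon` are hypothetical, and the raw \(W_i\) values are made up for the example.

```python
def normalize(W):
    """Scale raw translatabilities so the largest becomes 1 (w_i = W_i / W_max)."""
    w_max = max(W.values())
    return {codon: value / w_max for codon, value in W.items()}


def best_codon(w, synonymous_codons):
    """Return the synonymous codon with the highest normalized translatability."""
    return max(synonymous_codons, key=lambda codon: w[codon])


# Made-up raw W values for four codons (illustration only, not measured data).
W_example = {"TTT": 5.9, "TTC": 10.0, "CCA": 10.0, "CCG": 3.2}
w_example = normalize(W_example)

# Phenylalanine is encoded by TTT and TTC, so selection is restricted to those two.
print(best_codon(w_example, ["TTT", "TTC"]))  # TTC
```

In a full run, `W` would hold all 64 codons, so \(W_{\text{max}}\) is taken over the whole table rather than per amino acid.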

The \(G\) matrix

\(G\) consists of 64 tGCN values, i.e. the gene copy numbers of the tRNAs recognizing each codon. Available gcn-files normally list the tGCN's by anticodon, so to obtain the recognized codon, each triplet in the raw gcn-file is reversed and its bases are replaced by the complementary ones. For instance, in S. cerevisiae the gcn of tRNAs recognizing TTC (encoding phenylalanine) is 10; in the raw file, this value is listed under the anticodon GAA. After conversion to codon form, the tGCN's are put into the \(G\) matrix such that each column has the first two positions fixed and each row has a fixed third position:

AAA ACA AGA ATA CAA CCA CGA CTA GAA GCA GGA GTA TAA TCA TGA TTA
AAC ACC AGC ATC CAC CCC CGC CTC GAC GCC GGC GTC TAC TCC TGC TTC
AAG ACG AGG ATG CAG CCG CGG CTG GAG GCG GGG GTG TAG TCG TGG TTG
AAT ACT AGT ATT CAT CCT CGT CTT GAT GCT GGT GTT TAT TCT TGT TTT
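The conversion from anticodon to codon and the layout of \(G\) can be sketched as follows. This is only an illustration under stated assumptions: the raw gcn data is assumed to be available as a dictionary keyed by anticodon, anticodons missing from it are assumed to have a copy number of 0, and the names `anticodon_to_codon` and `build_G` are hypothetical rather than taken from TaiCO.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def anticodon_to_codon(anticodon):
    """Reverse the anticodon and complement each base to get the codon it pairs with."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon))


def build_G(gcn_by_anticodon):
    """Arrange tGCN values as a 4x16 matrix: rows = third base, columns = first two bases."""
    gcn_by_codon = {anticodon_to_codon(a): n for a, n in gcn_by_anticodon.items()}
    prefixes = [b1 + b2 for b1 in BASES for b2 in BASES]  # AA, AC, ..., TT
    # Assumption: anticodons absent from the input have a gene copy number of 0.
    return [[gcn_by_codon.get(prefix + third, 0) for prefix in prefixes]
            for third in BASES]


# Only the GAA entry (10 copies) comes from the text; a real file lists many more.
G = build_G({"GAA": 10})
print(anticodon_to_codon("GAA"))  # TTC
print(G[1][15])                   # 10 (row C, column TT)
```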

The \(S\) matrix

While \(G\) is precisely known, \(S\) needs to be optimized. In dos Reis et al. (2004), the optimized \(s_{ij}\)-values for S. cerevisiae are published, yielding the \(S\)-matrix

$$ S = \begin{pmatrix} 1 & 0 & 0 & 0.0001 \\ 0 & 1 & 0 & 0.72 \\ 0.32 & 0 & 1 & 0 \\ 0 & 0.59 & 0 & 1 \end{pmatrix} $$

where both rows and columns are ordered as A, C, G, T. Thus, each \(W_i\) computed from the \(SG\) multiplication is influenced by two tGCN's. As an example, the translatability of CCG equals the dot product of the third row of \(S\) (because the third position is a G) and the sixth column of \(G\) (because the first two positions are CC):

$$ W_{CCG} = 0.32 \cdot \text{tGCN}_{CCA} + 1 \cdot \text{tGCN}_{CCG} $$

This takes the wobble potential between G and A in the third position into account.
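The CCG example can be checked numerically with a short Python sketch. The \(S\) values are the ones given above for S. cerevisiae; the two tGCN values are made up for illustration, and the helper `matmul` is a hypothetical plain-Python stand-in for whatever matrix routine is actually used.

```python
BASES = "ACGT"

# (1 - s_ij) values for S. cerevisiae as given above; rows and columns in A, C, G, T order.
S = [
    [1.0, 0.0, 0.0, 0.0001],
    [0.0, 1.0, 0.0, 0.72],
    [0.32, 0.0, 1.0, 0.0],
    [0.0, 0.59, 0.0, 1.0],
]


def matmul(A, B):
    """Plain-Python matrix product of A (n x k) and B (k x m)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# 4x16 G matrix filled with zeros, then two made-up tGCN values in the CC column.
G = [[0] * 16 for _ in range(4)]
cc = 4 * BASES.index("C") + BASES.index("C")  # zero-based index 5, i.e. the sixth column
G[BASES.index("A")][cc] = 10  # hypothetical tGCN_CCA
G[BASES.index("G")][cc] = 2   # hypothetical tGCN_CCG

W = matmul(S, G)
W_CCG = W[BASES.index("G")][cc]  # 0.32 * 10 + 1 * 2
print(round(W_CCG, 2))  # 5.2
```

Because every row of \(S\) has exactly two non-zero entries, each entry of \(W\) combines the tGCN of the codon itself with one wobble partner, as in the formula for \(W_{CCG}\).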


Sponsors


  • FIND US AT:
Facebook Twitter
  • DTU BIOBUILDERS
  • DENMARK
  • DTU - SØLTOFTS PLADS, BYGN. 221/002
  • 2800 KGS. LYNGBY

  • E-mail:
  • dtu-biobuilders-2016@googlegroups.com
  • MAIN SPONSORS:
Lundbeck fundation DTU blue dot Lundbeck fundation Lundbeck fundation