Difference between revisions of "Team:IIT-Madras/Model"

Line 74: Line 74:
 
<html><figure>
 
<html><figure>
 
<img style="display:block;margin-left:auto;margin-right:auto;" src="https://static.igem.org/mediawiki/2016/d/d0/Igemiitm_exp_model2.png">
 
<img style="display:block;margin-left:auto;margin-right:auto;" src="https://static.igem.org/mediawiki/2016/d/d0/Igemiitm_exp_model2.png">
<figcaption>Our model gives a correlation coefficient of 0.87 for 90% data, 0.83 for 95% data; 5% outliers are shown in light green color, additional 5% outliers are shown in little dark green color</figcaption>
+
<figcaption>Experiment model</figcaption>
 
</figure></html>
 
</figure></html>
  
 
<html><figure><img style="display:block;margin-left:auto;margin-right:auto;" src="https://static.igem.org/mediawiki/2016/d/d3/Igemiitm_exp_model1.png">
 
<html><figure><img style="display:block;margin-left:auto;margin-right:auto;" src="https://static.igem.org/mediawiki/2016/d/d3/Igemiitm_exp_model1.png">
<figcaption>Our null model (excluding codon preference feature) gives a correlation coefficient of 0.83 for 90% data, 0.81 for 95% data; 5% outliers are shown in light green color, additional 5% outliers are shown in little dark green color</figcaption>
+
<figcaption>Null model</figcaption>
 
</figure></html>
 
</figure></html>
  

Revision as of 17:36, 15 October 2016


Modularity of RBS parts

Introduction

Methodology

The dataset was taken from the paper "Causes and effects of N-terminal codon bias in bacterial genes". Protein expression data were available for the following constructs: 2 promoters x 3 RBSs x 1781 (137 x 13) sfGFP variants differing in the first 11 codons at the N-terminus (the 3 RBS parts were B0034, B0032 and B0030), and 2 promoters x 137 natural RBSs x 13 sfGFP variants differing in the first 11 codons at the N-terminus (the 2 promoters were J23100 and J23108).


Hypothesis and Algorithm

At the outset, we hypothesized the following, based on information available in the literature: (1) Expression is inversely proportional to the stability of the mRNA secondary structure near the RBS part. (2) Rare codons among the first 11 codons of a protein can increase or decrease the translational score of RBS parts. (3) Each RBS part has a native strength, irrespective of the promoter and protein coding part it is used with.

We designed an algorithm to compute the translational score of a given protein-expressing construct as follows: \begin{equation*} TS = \dfrac{S*C_{pref}}{1+dG} + \alpha \end{equation*}

\begin{equation*} C_{pref}= C_{1}*C_{2}*C_{3}*...*C_{11}*C_{sfGFP} \end{equation*}

Objective function: minimize \(\sum \mid TS_{model}-TS_{experiment} \mid\)

Outlier removal: the constructs with the top scores in \(\mid TS_{model}-TS_{experiment} \mid\) are removed

where \(C_{i}\) is the codon preference of the codon at the \(i^{th}\) position; \(C_{sfGFP}\) is a constant for the sfGFP protein codons; TS is the translational score of the RBS part; S is the native strength of the RBS part; dG is the stability of the RNA strand, from the RBS to the \(11^{th}\) codon of the protein; and \(\alpha\) is a constant.
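The two equations above can be sketched in Python as follows. This is a minimal illustration, not the team's implementation: the function names are hypothetical, the input values are made up, and the sketch assumes dG is expressed so that 1 + dG is nonzero (for example, as a non-negative magnitude).

```python
from math import prod


def codon_preference(codon_prefs, c_sfgfp):
    """C_pref = C_1 * C_2 * ... * C_11 * C_sfGFP."""
    if len(codon_prefs) != 11:
        raise ValueError("expected preferences for the first 11 codons")
    return prod(codon_prefs) * c_sfgfp


def translational_score(S, codon_prefs, c_sfgfp, dG, alpha):
    """TS = S * C_pref / (1 + dG) + alpha."""
    return S * codon_preference(codon_prefs, c_sfgfp) / (1.0 + dG) + alpha


# Illustrative values only; these are not fitted parameters from the model.
ts = translational_score(S=1.0, codon_prefs=[1.0] * 11, c_sfgfp=2.0, dG=3.0, alpha=0.5)
print(ts)  # 1.0 * 2.0 / (1 + 3.0) + 0.5 = 1.0
```

Because C_pref is a plain product, a single codon with a preference below 1 lowers the whole score, and one above 1 raises it, which matches the hypothesis that rare codons can shift the translational score in either direction.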


Optimization

The above model was optimized to compute the unknown variables (PiRi, the combined promoter-RBS strengths, and the codon matrix values) using the data from the above-mentioned paper. In MATLAB, the fmincon function was used to minimize the sum of (model-experimental)^2 over all 14137 constructs. The system was then optimized further after removing 5% and 10% of the data as outliers, which were identified as the top scores in abs(model-experimental).
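The squared-error objective and the outlier-trimming step can be sketched as below. This is a hedged Python illustration only: the optimization itself was done with MATLAB's fmincon and is not reproduced here, the helper names are hypothetical, and the data are invented for the example.

```python
def sum_squared_error(model, experiment):
    """Sum of (model - experimental)^2 over all constructs."""
    return sum((m - e) ** 2 for m, e in zip(model, experiment))


def trim_outliers(model, experiment, fraction):
    """Drop the given fraction of points with the largest |model - experimental|."""
    n_drop = int(len(model) * fraction)
    # Indices ordered by absolute residual, smallest first.
    order = sorted(range(len(model)), key=lambda i: abs(model[i] - experiment[i]))
    # Keep all but the n_drop largest residuals, restoring the original order.
    keep = sorted(order[: len(order) - n_drop])
    return [model[i] for i in keep], [experiment[i] for i in keep]


# Made-up scores: the last point is a clear outlier.
model = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
experiment = [1.1, 2.1, 2.9, 4.0, 5.2, 6.0, 7.1, 8.0, 9.0, 15.0]

m_kept, e_kept = trim_outliers(model, experiment, 0.10)
print(len(m_kept))  # 9 points remain after removing 10% of 10
print(sum_squared_error(model, experiment) > sum_squared_error(m_kept, e_kept))  # True
```

In a workflow like the one described, the parameters would be re-fitted on the retained points after each trimming pass.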


Results

After several iterations of optimization, we obtained the following results. Optimization was done in MATLAB on a supercomputer facility at IIT Madras.

Relative native strength: B0034: 0.6163, B0032: 1.0, B0030: 0.5473, averaged over the J23100 and J23108 promoters.

Global modularity of RBSs w.r.t. promoters (defined as std(score)/mean(score)): B0034: 0.3372, B0032: 0.2974, B0030: 1.1370

Global modularity of RBSs w.r.t. protein coding parts (defined as mean(std(score)/mean(score)) over the different promoters): B0034: 0.7815, B0032: 0.7127, B0030: 0.9958
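The two modularity measures defined above are coefficients of variation, and could be computed along the following lines. This is a sketch under assumptions: the function names and the input layout are hypothetical, the scores are invented, and std is taken to be the sample standard deviation (the default of MATLAB's std), which the page does not state.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """std(score) / mean(score), using the sample standard deviation."""
    return stdev(scores) / mean(scores)


def modularity_wrt_coding_parts(scores_by_promoter):
    """Mean over promoters of std(score)/mean(score) across coding variants.

    scores_by_promoter maps a promoter name to the list of scores of one RBS
    measured with different protein coding variants (assumed layout).
    """
    return mean([coefficient_of_variation(s) for s in scores_by_promoter.values()])


# Made-up scores of one RBS with each of the two promoters.
print(coefficient_of_variation([1.0, 3.0]))

# Made-up scores of one RBS across three coding variants, per promoter.
scores = {"J23100": [1.0, 2.0, 3.0], "J23108": [2.0, 4.0, 6.0]}
print(modularity_wrt_coding_parts(scores))  # 0.5
```

Under this definition a smaller value means the RBS behaves more consistently across contexts, that is, it is more modular.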


Conclusion

We achieved a heuristic solution with a correlation of 0.87 on 90% of the data points. The model gives the combined promoter-RBS strength for 280 (2 promoters x 140 RBSs) combinations. It also gives the codon preference matrix for 64 codons (shown below).

Experiment model

Null model

We observe a significant decrease (on average xx%) in the strength of RBSs (xx out of 140) when they are used with high-strength promoters.