Difference between revisions of "Team:Rice/Modeling"

 
(36 intermediate revisions by 2 users not shown)
Line 6: Line 6:
 
<head>
 
<head>
 
<style>
 
<style>
         h1 {
+
         .h1 {
             font-family: "Abadi MT Condensed Extra Bold", Helvetica, Arial;
+
             font-family: "DIN Alternate Bold", Helvetica, Arial;
             font-size: 30pt;
+
             font-size: 30px;
 
             font-style: normal;
 
             font-style: normal;
 
             font-variant: normal;
 
             font-variant: normal;
 
             font-weight: bold;
 
             font-weight: bold;
             line-height: 30pt;
+
             line-height: 26.4pt;
            color: white;
+
        }
+
        h3 {
+
            font-family: "DIN Alternate Bold", Helvetica, Arial;
+
            font-size: 18pt;
+
            font-style: normal;
+
            font-variant: normal;
+
            font-weight: bold;
+
            line-height: 18pt;
+
 
             color: white;
 
             color: white;
 
         }
 
         }
Line 29: Line 20:
  
 
<body>
 
<body>
 
+
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
   <div class='h1'>Introduction</div>
+
   <div class="fixed_flyer" id = "sec1" style="position:relative;z-index:1">
 
+
    <div class = "h1">Introduction</div>
   <div class='para'>
+
  </div>
   Violacein is a fluorescent reporter with anticancer activity (Ref) that has been
+
  <div class="pagediv">
  used in several other igem projects (Cambridge 2009, Slovenia 2010, Johns Hopkins 2011,
+
  <br>
  UCSF 2012). Although it would be a good pigment candidate for our project, it has a
+
   <div class="para">
  complex synthetic pathway requiring five specialized enzymes, and oxygen (Fig 1.)
+
   Violacein is a fluorescent reporter with anticancer activity (Carvalho et al., 2006)
  (Michael E. Lee et al, 2013). It also presents multiple off-path reactions that can
+
  that has been used in several other iGEM projects (<a href="https://2009.igem.org/Team:Cambridge">Cambridge 2009</a>, <a href="https://2009.igem.org/Team:Cambridge">Slovenia 2010</a>,
  reduce the efficiency of the pathway. Before building constructs to use for violacein
+
  <a href="https://2009.igem.org/Team:Cambridge">Johns Hopkins 2011</a>, <a href="https://2012.igem.org/Team:UCSF">UCSF 2012</a>). Although it would be a good pigment candidate for
  production, we needed to find a way to determine which promoters to use for the five
+
    our project, it has a complex synthetic pathway requiring five specialized enzymes
  genes involved in the pathway. Although there are studies focused on the optimization
+
    and oxygen (Fig. 2). It also presents multiple off-path reactions that can reduce
  of the production of violacein, none of the studies gives a biochemical model of the
+
      the efficiency of the pathway. Before building constructs to use for violacein
  rates of the reactions that take place in the bacteria (Ref).
+
      production, we needed to find a way to determine which promoters to use for
 +
      the five genes involved in the pathway. Although there are studies focused on
 +
        the optimization of the production of violacein (Lee et al., 2013), none of
 +
        the studies give a biochemical model of the rates of the reactions that take
 +
        place in the bacteria.
 +
<br><br>
 +
  </div>
 
   </div>
 
   </div>
  
  
   <div class='h1'>Objective</div>
+
   <div class="fixed_flyer" id = "sec2" style="position:relative;z-index:2">
 
+
    <div class = "h1">Objective</div>
<div class='para'>
+
  </div>
 +
  <div class="pagediv">
 +
  <br>
 +
<div class="para">
 
Create a biochemical model of the violacein production based on the synthetic
 
Create a biochemical model of the violacein production based on the synthetic
pathway and violacein production data from bacteria with different promoters
+
pathway and violacein production data from bacteria with different promoters for
for each of the five genes involved in the pathway.
+
each of the five genes involved in the pathway.
</div>
+
<br><br>
 +
  </div>
 +
  </div>
  
  <div class='h1'>Model assumptions</div>
 
  
<div class='para'>
+
  <div class="fixed_flyer" id = "sec3" style="position:relative;z-index:3">
 +
    <div class = "h1">Model Assumptions</div>
 +
  </div>
 +
  <div class="pagediv">
 +
    <br>
 +
<div class="para">
 
<ol>
 
<ol>
 
<li>The rate of dilution of the enzymes and the intermediaries is much greater than
 
<li>The rate of dilution of the enzymes and the intermediaries is much greater than
Line 70: Line 76:
 
only requires one parameter per reaction and is less susceptible to overfitting
 
only requires one parameter per reaction and is less susceptible to overfitting
  
 +
</div>
 
</div>
 
</div>
  
<div class='h1'>Model Building Process</div>
+
<br><br>
 
+
  <div class="fixed_flyer" id = "sec4" style="position:relative;z-index:4">
<div class='para'>
+
    <div class = "h1">Model Building Process</div>
<div class='h3'>1. Modeling Promoter Strength</div>
+
  </div>
 +
  <div class="pagediv">
 +
<br>
 +
<div class="para">
 +
<div class="h3">1. Modeling Promoter Strength</div>
 +
<br>
 
Because a major goal of the model is to predict the effects of the selection of
 
Because a major goal of the model is to predict the effects of the selection of
 
promoters on the final production of violacein, we decided to find a way to
 
promoters on the final production of violacein, we decided to find a way to
Line 81: Line 93:
 
strength as a single standard to characterize the promoters. Moreover, we assumed
 
strength as a single standard to characterize the promoters. Moreover, we assumed
 
the degradation rate of proteins only depends on the growth rate of E.coli. Then,
 
the degradation rate of proteins only depends on the growth rate of E.coli. Then,
  every enzyme has the same degradation rate. The bacteriophage T7 promoter
+
  every enzyme has the same degradation rate. The bacteriophage T7 promoter has
has been widely used for protein expression and purification (J. Andrew Jones
+
been widely used for protein expression and purification (Jones et al., 2015),
et al., 2013), so we used data of five mutant T7 promoters to create a
+
so we used data of five mutant T7 promoters to create a proof-of-concept model.
proof-of-concept model. If this model is functional, we can implement the same
+
If this model was functional, we could implement the same modeling technique to
modeling technique to the promoters we are working with.
+
the promoters we were working with. The five mutant T7 promoters have distinct
 +
promoter strength over time after induction. The experimental data from the
 +
  literature are shown in the figure below (Jones et al., 2015).
  
The five mutant T7 promoters have distinct promoter strength over time after
+
<br><br>
induction. The experimental data are shown in the figure below.
+
<img src="https://static.igem.org/mediawiki/2016/3/32/Promoter_Strengh_vs_Time_paper.png" style="display: block; margin: auto; width: 80%">
 +
<br><br>
  
 +
The first step of our model is to describe the rate of change of enzymes based on promoter strength. Here we assumed that the enzyme production rate is directly proportional to strength of the promoter. Therefore, we were able to use a mass-action kinetics equation of promoters to describe the enzyme concentration. The equation is shown below:
 +
 +
<br><br>
 +
<img src="https://static.igem.org/mediawiki/2016/2/28/Protomter_Equations_new.png" style="display: block; margin: auto; width: 25%">
 
<br><br>
 
<br><br>
<img src="https://static.igem.org/mediawiki/2016/3/32/Promoter_Strengh_vs_Time_paper.png">
+
 
 +
In this equation, Ai is the concentration of enzyme i, ki is the production rate of enzyme i, kd is the degradation rate shared by all enzymes, and t is time. By solving this equation, we obtained the enzyme concentration as a function of time.
  
 +
<br><br>
 +
<img src="https://static.igem.org/mediawiki/2016/5/54/Promoter_ODE_new.png" style="display: block; margin: auto; width: 25%">
 
<br><br>
 
<br><br>
  
The first step of our model is to describe the rate of change of enzymes based
+
Since we assumed that the promoter strength is proportional to the promoter concentration, we used this equation to fit the data by the least-squares method (Fig. 1).
on promoter strength. Here we assumed that the enzyme production rate is
+
<br><br>
directly proportional to strength of the promoter. Therefore, we were
+
<img src="https://static.igem.org/mediawiki/2016/thumb/3/3a/Fitted_Lines_of_Promoter_Strength_vs_Time.png/800px-Fitted_Lines_of_Promoter_Strength_vs_Time.png" style="display: block; margin: auto; width: 80%">
able to use a mass-action kinetics equation of promoters to describe
+
<br><br>
the enzyme concentration. The equation is shown below:
+
 
 +
<b>Figure 1.</b> Linear regressions fitted to normalized fluorescence vs. time. The circles represent data from Jones et al., 2015. The solid lines are our regression lines. The colors indicate which promoter each set of circles and lines corresponds to.
 +
<br><br>
 +
In general, the regression lines capture the change in strength of each promoter over time. The parameters are determined from these fits. The table below lists the parameter values.
  
 
<br><br>
 
<br><br>
<img src="Promoter Equation.JPG">
+
<img src="https://static.igem.org/mediawiki/2016/7/7e/Promoter_Strength_Fit_Parameters.png" style="display: block; margin: auto; width: 60%">
 
<br><br>
 
<br><br>
  
In this equation, Ai is the concentration of enzyme i, ki­ is the
+
<b>Table 1.</b> Parameters related to promoter strength and degradation of molecules. In the table, ki (i = 1, 2, 3, 4, 5) is the production rate coefficient of promoter i, and kd is the degradation rate coefficient shared by all promoters.
production rate of each  enzyme i, kd is the degradation rate of all
+
enzymes, and t is time. By solving this equation, we derived the
+
equation of enzyme concentration against time.
+
  
 
<br><br>
 
<br><br>
<img src="Promoter ODE.JPG">
+
<div class='h3'>2. Modeling the Steady-state Violacein Yield</div>
 +
<br>
 +
After fitting the regression model for each promoter, we created a second model to describe the violacein biosynthetic pathway. The pathway (Fig. 2) involves five enzyme-catalyzed reactions and one non-enzymatic reaction (Lee et al., 2013).
 
<br><br>
 
<br><br>
 +
<img src= "https://static.igem.org/mediawiki/2016/thumb/2/2f/Violacein_Biosynthetic_Pathway.png/737px-Violacein_Biosynthetic_Pathway.png" style="display: block; margin: auto; width: 80%">
 +
<br><br>
 +
 +
<b>Figure 2.</b> Violacein synthetic pathway. The purple arrows highlight the five enzymatic and one non-enzymatic steps of violacein production from two molecules of tryptophan. The five enzymes are indicated by bolding (VioA, VioB, etc.).
 +
 +
<br><br>
 +
 +
The model was developed in three major parts. Pseudocode for the model is provided here.
 +
 +
<br>
 +
 +
<b>Define ODE System</b>
 +
<ol>
 +
<li>Calculate the production and degradation rate of each molecule in the pathway from the concentration of reagents and parameters.</li>
 +
<li>Obtain the rate of change of each molecule based on the production and degradation rates.</li>
 +
</ol>
 +
<b>Solve the System of Nonlinear Equations at Steady State</b>
 +
<ol>
 +
<li>Solve the system of nonlinear equations at steady state starting at an initial guess X0.</li>
 +
<li>Use the result as a new initial guess; repeat the numerical method to solve the system of equations again.</li>
 +
<li>Calculate the relative error of each chemical in the new result.</li>
 +
<li>If the maximum error is smaller than 0.0001%, output violacein concentration at steady state as the final result.</li>
 +
</ol>
 +
<b>Optimize Parameters to Fit Experimental Data</b>
 +
<ol>
 +
<li>Set the initial guess of the parameters.</li>
 +
<li>Load the data from literature, which include the choice of promoter for each gene and the corresponding violacein yield determined experimentally.</li>
 +
<li>For each promoter selection scenario, pass the promoter types and the temporary parameters to the steady-state model.</li>
 +
<li>Obtain the violacein yield predicted by the steady-state model for each promoter selection scenario.</li>
 +
<li>Compute the residual sum of squares (RSS) between the predicted violacein yields and the violacein yields given by experiment.</li>
 +
<li>Determine the optimal parameters by minimizing the RSS (least-squares method).</li>
 +
</ol>
  
Since we assumed that the promoter strength is proportional to the promoter
+
<br>
concentration, we can use the equation to fit our data using least
+
squares method. The regression lines are overlaid on the data.
+
  
 +
Using the principles of mass-action kinetics, we derived the system of ODEs in the model. The equations involve 17 parameters (Table 2). Five parameters (kA, kB, kC, kD and kE) are related to the production rates of the five enzymes, which depend only on the strength of the promoter. Another parameter, kd, is the degradation coefficient of all molecules due to the growth of E. coli. The value of this parameter is fixed and is shown in Table 1. In addition to these known parameters, the equations include 11 undetermined parameters related to the reaction rates at specific steps in the violacein synthetic pathway. As described in the pseudocode, we used least-squares regression to determine the optimal values of these parameters.
 
<br><br>
 
<br><br>
<img src="Fitted Lines of Promoter Strength vs Time.png">
+
Each of the 11 differential equations describes the rate of change of a specific molecule in the system. The equations account for the production, consumption, and degradation rates of the molecules. Degradation is described by first-order decay, so the rate of degradation of a molecule depends on its concentration and the degradation coefficient. The degradation coefficient is identical for all molecules since it depends only on the E. coli growth rate.
 
<br><br>
 
<br><br>
 +
<div class="h3"> Differential Equations in the Model</div>
  
In the plot, circles represent data from paper. (J. Andrew Jones et al., 2013). The solid lines are regression lines. In general the regression lines are able to capture the change of strength of each enzyme over time. In this way, the parameters are determined. The table below lists the parameter values.
 
  
 
<br><br>
 
<br><br>
<img src="Promoter Strength Fit Parameters.png">
+
<img src="https://static.igem.org/mediawiki/2016/c/cf/Enzyme_Production_Rate.png" style="display: block; margin: auto; width: 80%">
 +
<img src="https://static.igem.org/mediawiki/2016/thumb/7/7f/Chemical_Production_Rate_1.png/1200px-Chemical_Production_Rate_1.png" style="display: block; margin: auto; width: 100%">
 +
<img src="https://static.igem.org/mediawiki/2016/thumb/f/fd/Chemical_Production_Rate_2.png/1199px-Chemical_Production_Rate_2.png" style="display: block; margin: auto; width: 100%">
 
<br><br>
 
<br><br>
  
In the table, ki­ (i = 1,2,3,4,5) are the production rate coefficients of promoter I (i = 1,2,3,4,5), and kd is the degradation rate coefficient of all promoters.
+
<div class="fixed_flyer" id = "sec5" style="position:relative;z-index:5">
 +
  <div class = "h1">Results</div>
 +
</div>
 +
<div class="pagediv">
 +
<br>
 +
<div class="para">
  
 +
Our model is able to compute the average violacein yield across all the strains tested experimentally, but it cannot capture the differences in violacein yield among strains with different promoter strengths. The comparison between the violacein yields determined by experiment and those predicted by our model is shown in Figure 3. The optimal parameters determined by the model are listed in Table 2.
 +
 +
<br><br>
 +
<img src="https://static.igem.org/mediawiki/2016/thumb/0/01/Violacein_Yields_Model_Prediction_vs_Data.png/800px-Violacein_Yields_Model_Prediction_vs_Data.png" style="display: block; margin: auto; width: 100%">
 +
<br><br>
 +
 +
<b>Figure 3.</b> Violacein yield with different promoter combinations. This graph compares the violacein yields measured for various promoter combinations by Jones et al., 2015 (shown in blue) with the violacein concentrations that our model predicted for the same promoter combinations. The root-mean-square error (RMSE) is 52.04.
 +
 +
<br><br>
 +
<img src="https://static.igem.org/mediawiki/2016/b/bb/Full_parameter_table.png" style="display: block; margin: auto; width: 100%">
 +
<br><br>
 +
 +
<b>Table 2.</b> Notations of parameters.
 +
<br><br>
 +
</div>
 
</div>
 
</div>
  
  
  
 +
</div>
 +
</div>
 +
  <div class="fixed_flyer" id = "sec6" style="position:relative;z-index:6">
 +
    <div class = "h1">Discussion</div>
 +
  </div>
  
 +
  <div class="pagediv">
 +
  <br>
 +
  <div class="para">
 +
The current model is not able to show the expected dependence of violacein yield on promoter strength. After reevaluating our assumptions, we identified some potential flaws of the model that might cause the unexpected results.
 +
<br><br>
 +
One of the assumptions of our model is that the rate of production of L-tryptophan is constant and independent of promoter strength. Jones et al. suggest that the L-tryptophan production rate may be affected by the metabolic burden of producing the recombinant enzymes (VioA, VioB, etc.). This phenomenon may be caused by the depletion of essential metabolic resources, such as amino acids, mRNA and ATP. Therefore, the L-tryptophan production rate might need to depend on the enzyme production rates.
 +
<br><br>
 +
Another effect that we did not consider is the saturation of the enzymes. To improve our model, we could include this effect by employing Michaelis-Menten kinetics in our next step. Nevertheless, we have been cautious about including it, since increasing the number of parameters without increasing the number of data points usually leads to overfitting.
 +
<br><br>
 +
Finally, since the violacein pathway has not been fully characterized, it is possible that we ignored some reactions in the complete pathway. Moreover, there may be feedback loops that regulate the pathway. We will need to investigate these possible components and incorporate them into our model if they prove to be present in the pathway.
 +
<br><br>
 +
  </div>
 +
    </div>
 +
 +
    <div class="fixed_flyer" id = "sec7" style="position:relative;z-index:7">
 +
      <div class = "h1">Conclusion</div>
 +
    </div>
 +
    <div class="pagediv">
 +
    <br>
 +
    <div class="para">
 +
    Here we present a method to fit a model of violacein production in E. coli to experimental data on violacein yield with different promoters using nonlinear regression. Although it fails to capture the dependence on promoter strength, our model is able to predict the average violacein concentration. We expect that small changes to the model, such as making L-tryptophan production depend on the metabolic burden, would allow us to successfully predict violacein production in response to variation in promoter strength. Once the predictive model is complete, we will be able to find the strains that lead to optimal violacein yield computationally.
 +
    </div>
 +
    </div>
 +
 +
  <div class="fixed_flyer" id = "sec8" style="position:relative;z-index:8">
 +
    <div class = "h1">References</div>
 +
  </div>
 +
  <div class="para">
 +
 +
<ol>
 +
<li>Carvalho, D. D., Costa, F. T. M., Duran, N., & Haun, M. (2006). Cytotoxic activity of violacein in human colon cancer cells. <i>Toxicology in Vitro</i>, 20(8), 1514–1521. <br><a href="http://dx.doi.org/10.1016/j.tiv.2006.06.007">http://dx.doi.org/10.1016/j.tiv.2006.06.007</a></li>
 +
<li>Jones, J. A., Vernacchio, V. R., Lachance, D. M., Lebovich, M., Fu, L., Shirke, A. N., … Koffas, M. A. G. (2015). ePathOptimize: A Combinatorial Approach for Transcriptional Balancing of Metabolic Pathways. <i>Scientific Reports</i>, 5, 11301. <br><a href="http://doi.org/10.1038/srep11301">http://doi.org/10.1038/srep11301</a></li>
 +
<li>Lee, M. E., Aswani, A., Han, A. S., Tomlin, C. J., & Dueber, J. E. (2013). Expression-level optimization of a multi-enzyme pathway in the absence of a high-throughput assay. <i>Nucleic Acids Research</i>, 41(22), 10668–10678. <br> <a href="http://doi.org/10.1093/nar/gkt809">http://doi.org/10.1093/nar/gkt809</a></li>
 +
</ol>
 +
</div>
 +
</div>
 
<br><br><br><br><br>
 
<br><br><br><br><br>
 
</body>
 
</body>
 
+
<script>
 +
$(document).ready(function(){
 +
    var sec1=$(".fixed_flyer#sec1");
 +
    var sec1_pos=sec1.offset().top;
 +
    var sec2=$(".fixed_flyer#sec2");
 +
    var sec2_pos=sec2.offset().top;
 +
    var sec3=$(".fixed_flyer#sec3");
 +
    var sec3_pos=sec3.offset().top;
 +
    var sec4=$(".fixed_flyer#sec4");
 +
    var sec4_pos=sec4.offset().top;
 +
    var sec5=$(".fixed_flyer#sec5");
 +
    var sec5_pos=sec5.offset().top;
 +
    var sec6=$(".fixed_flyer#sec6");
 +
    var sec6_pos=sec6.offset().top;
 +
    var sec7=$(".fixed_flyer#sec7");
 +
    var sec7_pos=sec7.offset().top;
 +
    var sec8=$(".fixed_flyer#sec8");
 +
    var sec8_pos=sec8.offset().top;
 +
    $(window).scroll(function () {
 +
        var y=$(this).scrollTop();
 +
        console.log('new position-----------------------');
 +
        console.log('y:',y);
 +
        console.log('sec1:', sec1_pos);
 +
        console.log('sec2:', sec2_pos);
 +
        console.log('sec3:', sec3_pos);
 +
        console.log('sec4:', sec4_pos);
 +
        if(y<sec2_pos-50){
 +
          if (y>sec1_pos){
 +
            console.log('sec 1 supposed to move');
 +
            sec1.stop().animate({'top':y-sec1_pos+18},1);
 +
          }
 +
        } else {
 +
          sec1.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec3_pos-50){
 +
          if(y>sec2_pos){
 +
            console.log('sec 2 supposed to move');
 +
            sec2.stop().animate({'top':y-sec2_pos+18},1);
 +
          }
 +
        } else {
 +
          sec2.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec4_pos-40){
 +
          if(y>sec3_pos){
 +
            console.log('sec 3 supposed to move');
 +
            sec3.stop().animate({'top':y-sec3_pos+18},1);
 +
          }
 +
        } else {
 +
          sec3.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec5_pos-40){
 +
          if(y>sec4_pos){
 +
            console.log('sec 4 supposed to move');
 +
            sec4.stop().animate({'top':y-sec4_pos+18},1);
 +
          }
 +
        } else {
 +
          sec4.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec6_pos-40){
 +
          if(y>sec5_pos){
 +
            console.log('sec 5 supposed to move');
 +
            sec5.stop().animate({'top':y-sec5_pos+18},1);
 +
          }
 +
        } else {
 +
          sec5.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec7_pos-40){
 +
          if(y>sec6_pos){
 +
            console.log('sec 6 supposed to move');
 +
            sec6.stop().animate({'top':y-sec6_pos+18},1);
 +
          }
 +
        } else {
 +
          sec6.stop().animate({'top':10},1);
 +
        };
 +
        if(y<sec8_pos-40){
 +
          if(y>sec7_pos){
 +
            console.log('sec 7 supposed to move');
 +
            sec7.stop().animate({'top':y-sec7_pos+18},1);
 +
          }
 +
        } else {
 +
          sec7.stop().animate({'top':10},1);
 +
        };
 +
        if(y<5000){
 +
          if(y>sec8_pos){
 +
            console.log('sec 8 supposed to move');
 +
            sec8.stop().animate({'top':y-sec8_pos+18},1);
 +
          }
 +
        } else {
 +
          sec8.stop().animate({'top':10},1);
 +
        };
 +
    });
 +
  });
 +
</script>
  
 
</html>
 
</html>

Latest revision as of 03:25, 20 October 2016


           
Introduction

Violacein is a fluorescent reporter with anticancer activity (Carvalho et al., 2006) that has been used in several other iGEM projects (Cambridge 2009, Slovenia 2010, Johns Hopkins 2011, UCSF 2012). Although it would be a good pigment candidate for our project, it has a complex synthetic pathway requiring five specialized enzymes and oxygen (Fig. 2). The pathway also includes multiple off-path reactions that can reduce its efficiency. Before building constructs for violacein production, we needed a way to determine which promoters to use for the five genes involved in the pathway. Although some studies have focused on optimizing violacein production (Lee et al., 2013), none of them gives a biochemical model of the rates of the reactions that take place in the bacteria.

Objective

Create a biochemical model of violacein production based on the synthetic pathway and on violacein production data from bacteria with different promoters for each of the five genes involved in the pathway.

Model Assumptions

  1. The rate of dilution of the enzymes and the intermediates is much greater than their rate of degradation (for example, by ubiquitination for the proteins or by conversion to products not included in the pathway).
  2. The enzymes are not saturated, and all the reactions follow the law of mass action.
  3. The reactions are independent of external factors such as oxygen and NADH.
  4. None of the reactions is reversible.
We use mass-action kinetics because this type of equation requires only one parameter per reaction and is less susceptible to overfitting.


Model Building Process

1. Modeling Promoter Strength

Because a major goal of the model is to predict the effect of promoter selection on the final production of violacein, we first needed a way to characterize promoters. To simplify the computation, we used promoter strength as the single measure characterizing each promoter. Moreover, we assumed that the degradation rate of proteins depends only on the growth rate of E. coli, so every enzyme has the same degradation rate. The bacteriophage T7 promoter has been widely used for protein expression and purification (Jones et al., 2015), so we used data from five mutant T7 promoters to create a proof-of-concept model. If this model proved functional, we could apply the same modeling technique to the promoters we were working with. The five mutant T7 promoters show distinct promoter strengths over time after induction. The experimental data from the literature are shown in the figure below (Jones et al., 2015).



The first step of our model is to describe the rate of change of the enzyme concentrations based on promoter strength. Here we assumed that the enzyme production rate is directly proportional to the strength of the promoter. Therefore, we were able to use a mass-action kinetics equation to describe the enzyme concentration. The equation is shown below:



In this equation, Ai is the concentration of enzyme i, ki is the production rate of enzyme i, kd is the degradation rate shared by all enzymes, and t is time. By solving this equation, we obtained the enzyme concentration as a function of time.
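Written out from the symbol definitions above (production at a constant rate ki, first-order loss with coefficient kd), the enzyme balance and its solution would take the following form. This is a reconstruction from the text, and the zero initial concentration is an assumption made here for illustration.

```latex
\frac{dA_i}{dt} = k_i - k_d A_i
\quad\Longrightarrow\quad
A_i(t) = \frac{k_i}{k_d}\left(1 - e^{-k_d t}\right),
\qquad A_i(0) = 0 .
```

Under this form the enzyme level approaches the plateau k_i / k_d at long times, so a stronger promoter (larger k_i) raises the plateau without changing how quickly it is reached.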



Since we assumed that the promoter strength is proportional to the promoter concentration, we used this equation to fit the data by the least-squares method (Fig. 1).



Figure 1. Linear regressions fitted to normalized fluorescence vs. time. The circles represent data from Jones et al., 2015. The solid lines are our regression lines. The colors indicate which promoter each set of circles and lines corresponds to.

In general, the regression lines capture the change in strength of each promoter over time. The parameters are determined from these fits. The table below lists the parameter values.
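The fitting step can be illustrated with a small Python sketch. It is not the team's code: it assumes the model A(t) = (k/kd)(1 - exp(-kd t)) with zero initial concentration, uses synthetic noise-free data in place of the published measurements, and all function names and numbers are hypothetical. For a fixed kd the best k for each promoter has a closed form, so only the shared kd needs a one-dimensional search (golden-section here).

```python
import math

def basis(t, kd):
    """Response per unit production rate: (1 - exp(-kd*t)) / kd."""
    return (1.0 - math.exp(-kd * t)) / kd

def fit_ki(times, values, kd):
    """Closed-form least-squares k_i for one promoter at a fixed kd."""
    g = [basis(t, kd) for t in times]
    return sum(v * gi for v, gi in zip(values, g)) / sum(gi * gi for gi in g)

def total_rss(times, curves, kd):
    """Residual sum of squares over all promoters, refitting each k_i."""
    rss = 0.0
    for values in curves:
        ki = fit_ki(times, values, kd)
        rss += sum((v - ki * basis(t, kd)) ** 2 for t, v in zip(times, values))
    return rss

def fit_shared_kd(times, curves, lo=1e-3, hi=5.0, iters=100):
    """Golden-section search for the shared kd, then refit every k_i."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if total_rss(times, curves, c) < total_rss(times, curves, d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    kd = 0.5 * (a + b)
    return kd, [fit_ki(times, values, kd) for values in curves]

# Synthetic, noise-free "measurements" for five hypothetical promoters.
true_kd = 0.4
true_ki = [0.2, 0.5, 1.0, 1.5, 2.0]
times = [0.5 * n for n in range(1, 21)]
curves = [[k * basis(t, true_kd) for t in times] for k in true_ki]

kd_fit, ki_fit = fit_shared_kd(times, curves)
print("kd:", round(kd_fit, 4), "ki:", [round(k, 4) for k in ki_fit])
```

Sharing a single kd across all five curves mirrors the assumption that every enzyme is lost at the same rate; only the production coefficients differ between promoters.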



Table 1. Parameters related to promoter strength and degradation of molecules. In the table, ki (i = 1, 2, 3, 4, 5) is the production rate coefficient of promoter i, and kd is the degradation rate coefficient shared by all promoters.

2. Modeling the Steady-state Violacein Yield

After fitting the regression model for each promoter, we created a second model to describe the violacein biosynthetic pathway. The pathway (Fig. 2) involves five enzyme-catalyzed reactions and one non-enzymatic reaction (Lee et al., 2013).



Figure 2. Violacein synthetic pathway. The purple arrows highlight the five enzymatic steps and one non-enzymatic step of violacein production from two molecules of tryptophan. The five enzymes are shown in bold (VioA, VioB, etc.).

The model was developed in three major parts. Pseudocode for the model is provided here.
Define ODE System
  1. Calculate the production and degradation rate of each molecule in the pathway from the concentration of reagents and parameters.
  2. Obtain the rate of change of each molecule based on the production and degradation rates.
Solve the System of Nonlinear Equations at Steady State
  1. Solve the system of nonlinear equations at steady state starting at an initial guess X0.
  2. Use the result as a new initial guess; repeat the numerical method to solve the system of equations again.
  3. Calculate the relative error of each chemical in the new result.
  4. If the maximum error is smaller than 0.0001%, output violacein concentration at steady state as the final result.
Optimize Parameters to Fit Experimental Data
  1. Set the initial guess of the parameters.
  2. Load the data from literature, which include the choice of promoter for each gene and the corresponding violacein yield determined experimentally.
  3. For each promoter selection scenario, pass the promoter types and the temporary parameters to the steady-state model.
  4. Obtain the violacein yield predicted by the steady-state model for each promoter selection scenario.
  5. Compute the residual sum of squares (RSS) between the predicted violacein yields and the violacein yields given by experiment.
  6. Determine the optimal parameters by minimizing the RSS (least-squares method).
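The first two parts of the pseudocode above can be sketched in Python on a toy pathway. The sketch makes several assumptions that are not in the original: a three-species pathway (two tryptophan molecules combine into one intermediate, which is converted to violacein) stands in for the full 11-equation system, forward-Euler relaxation stands in for the unspecified numerical solver, and every name and parameter value is hypothetical. The stopping rule follows the pseudocode: repeat the solve from the previous result until the largest relative change is below 0.0001% (1e-6).

```python
def rates(x, p):
    """Part 1: net rate of change of each species under mass action."""
    trp, mid, vio = x
    dimerise = p["r1"] * p["A"] * trp * trp  # 2 Trp -> intermediate (enzyme A)
    convert = p["r2"] * p["B"] * mid         # intermediate -> violacein (enzyme B)
    return [
        p["supply"] - 2.0 * dimerise - p["kd"] * trp,
        dimerise - convert - p["kd"] * mid,
        convert - p["kd"] * vio,
    ]

def relax(x, p, dt=0.01, steps=2000):
    """One solver pass: integrate the ODEs forward with explicit Euler steps."""
    for _ in range(steps):
        x = [xi + dt * di for xi, di in zip(x, rates(x, p))]
    return x

def steady_state(p, x0=(1.0, 1.0, 1.0), tol=1e-6, max_passes=500):
    """Part 2: restart from the previous result until it stops changing."""
    x = list(x0)
    for _ in range(max_passes):
        new = relax(x, p)
        err = max(abs(n - o) / max(abs(n), 1e-12) for n, o in zip(new, x))
        x = new
        if err < tol:  # maximum relative error below 0.0001%
            return x
    raise RuntimeError("steady state not reached")

# Hypothetical parameter values; A and B are steady-state enzyme levels.
params = {"supply": 1.0, "kd": 0.1, "r1": 1.0, "r2": 1.0, "A": 1.0, "B": 1.0}
trp, mid, vio = steady_state(params)
print("steady-state violacein:", round(vio, 4))
```

A useful sanity check on any such model is mass balance: at steady state the tryptophan supplied per unit time must equal what is diluted out, counted in tryptophan units, which for this toy pathway is kd * (trp + 2*mid + 2*vio).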

Using the principles of mass-action kinetics, we derived the system of ODEs in the model. The equations involve 17 parameters (Table 2). Five parameters (kA, kB, kC, kD and kE) are related to the production rates of the five enzymes, which depend only on the strength of the promoter. Another parameter, kd, is the degradation coefficient of all molecules due to the growth of E. coli. The value of this parameter is fixed and is shown in Table 1. In addition to these known parameters, the equations include 11 undetermined parameters related to the reaction rates at specific steps in the violacein synthetic pathway. As described in the pseudocode, we used least-squares regression to determine the optimal values of these parameters.
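The parameter-optimization step can be sketched in the same spirit. The example below is an illustration under stated assumptions, not the model used here: a linear two-step pathway with a closed-form steady state replaces the full system, only two rate constants (r1, r2) are fitted instead of 11, the "experimental" yields are synthetic, and a simple compass (pattern) search in log-parameter space stands in for whatever optimizer was actually used. All names and values are hypothetical.

```python
import math

KD, SUPPLY = 0.1, 1.0  # assumed dilution coefficient and substrate supply rate

def predicted_yield(r1, r2, a, b):
    """Steady-state product of a linear two-step mass-action pathway."""
    step1 = r1 * a / (r1 * a + KD)  # fraction of substrate converted by enzyme A
    step2 = r2 * b / (r2 * b + KD)  # fraction of intermediate converted by enzyme B
    return SUPPLY / KD * step1 * step2

def rss(log_params, data):
    """Residual sum of squares between predicted and 'measured' yields."""
    r1, r2 = (math.exp(v) for v in log_params)
    return sum((y - predicted_yield(r1, r2, a, b)) ** 2 for a, b, y in data)

def compass_search(data, start=(0.0, 0.0), step=0.5, min_step=1e-6, max_iter=2000):
    """Minimize the RSS by trying +/- step along each log-parameter axis."""
    best = list(start)
    best_rss = rss(best, data)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_rss = rss(trial, data)
                if trial_rss < best_rss:
                    best, best_rss, improved = trial, trial_rss, True
        if not improved:
            step *= 0.5  # no axis move helps: refine the step size
    return [math.exp(v) for v in best], best_rss

# Synthetic data: nine promoter combinations (enzyme levels a, b) and yields
# generated from "true" rate constants r1 = 0.5, r2 = 2.0.
levels = [0.2, 1.0, 5.0]
data = [(a, b, predicted_yield(0.5, 2.0, a, b)) for a in levels for b in levels]

initial_rss = rss([0.0, 0.0], data)
(r1_fit, r2_fit), final_rss = compass_search(data)
print("r1:", round(r1_fit, 3), "r2:", round(r2_fit, 3), "RSS:", final_rss)
```

Searching in log space keeps the rate constants positive without explicit bounds. With real data and 11 free parameters, the number of promoter combinations relative to the number of parameters determines how well the minimum is pinned down.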

Each of the 11 differential equations describes the rate of change of a specific molecule in the system. The equations account for the production, consumption, and degradation rates of the molecules. Degradation is described by first-order decay, so the rate of degradation of a molecule depends on its concentration and the degradation coefficient. The degradation coefficient is identical for all molecules since it depends only on the E. coli growth rate.
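As a sketch of the general shape each equation takes under these assumptions, the balance for a molecule X_j can be written as below. The symbols X_j, P_j, C_j, k, E and S are introduced here for illustration only; the actual terms for each molecule are those in the equations that follow.

```latex
\frac{dX_j}{dt}
  = \underbrace{P_j}_{\text{production}}
  - \underbrace{C_j}_{\text{consumption}}
  - \underbrace{k_d\,X_j}_{\text{first-order loss}},
\qquad
\text{with mass-action terms of the form } k\,[E]\,[S].
```

Here [E] is the concentration of the enzyme catalyzing a step, [S] is the concentration of its substrate, and k is the single rate parameter of that step.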

Differential Equations in the Model




Results

Our model is able to compute the average violacein yield across all the strains tested experimentally, but it cannot capture the differences in violacein yield among strains with different promoter strengths. The comparison between the violacein yields determined by experiment and those predicted by our model is shown in Figure 3. The optimal parameters determined by the model are listed in Table 2.



Figure 3. Violacein yield with different promoter combinations. This graph compares the violacein yields measured for various promoter combinations by Jones et al., 2015 (shown in blue) with the violacein concentrations that our model predicted for the same promoter combinations. The root-mean-square error (RMSE) is 52.04.
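For reference, the RMSE is related to the residual sum of squares minimized during fitting by the standard definition below, where n is the number of promoter combinations compared (n is not stated in the text), y_m is a measured yield and \hat{y}_m the corresponding prediction.

```latex
\mathrm{RMSE}
  = \sqrt{\frac{1}{n}\sum_{m=1}^{n}\left(y_m - \hat{y}_m\right)^2}
  = \sqrt{\frac{\mathrm{RSS}}{n}}
```

The RMSE therefore carries the same units as the violacein yield itself.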



Table 2. Notations of parameters.

Discussion

The current model is not able to show the expected dependence of violacein yield on promoter strength. After reevaluating our assumptions, we identified some potential flaws of the model that might cause the unexpected results.

One of the assumptions of our model is that the rate of production of L-tryptophan is constant and independent of promoter strength. Jones et al. suggest that the L-tryptophan production rate may be affected by the metabolic burden of producing the recombinant enzymes (VioA, VioB, etc.). This phenomenon may be caused by the depletion of essential metabolic resources, such as amino acids, mRNA and ATP. Therefore, the L-tryptophan production rate might need to depend on the enzyme production rates.

Another effect that we did not consider is the saturation of the enzymes. To improve our model, we could include this effect by employing Michaelis-Menten kinetics in our next step. Nevertheless, we have been cautious about including it, since increasing the number of parameters without increasing the number of data points usually leads to overfitting.
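The trade-off can be made concrete by comparing the two rate laws for a single enzymatic step. The notation below ([E] for enzyme, [S] for substrate, k, k_cat and K_M for the rate parameters) is introduced here for illustration and is the textbook form, not taken from the model equations.

```latex
v_{\text{mass action}} = k\,[E]\,[S],
\qquad
v_{\text{MM}} = \frac{k_{\text{cat}}\,[E]\,[S]}{K_M + [S]}
\;\approx\; \frac{k_{\text{cat}}}{K_M}\,[E]\,[S]
\quad \text{when } [S] \ll K_M .
```

Mass action is thus the low-substrate limit of Michaelis-Menten kinetics with k = k_cat / K_M, and switching to the saturating form replaces one parameter per enzymatic step with two.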

Finally, since the violacein pathway has not been fully characterized, it is possible that we ignored some reactions in the complete pathway. Moreover, there may be feedback loops that regulate the pathway. We will need to investigate these possible components and incorporate them into our model if they prove to be present in the pathway.

Conclusion

Here we present a method to fit a model of violacein production in E. coli to experimental data on violacein yield with different promoters using nonlinear regression. Although it fails to capture the dependence on promoter strength, our model is able to predict the average violacein concentration. We expect that small changes to the model, such as making L-tryptophan production depend on the metabolic burden, would allow us to successfully predict violacein production in response to variation in promoter strength. Once the predictive model is complete, we will be able to find the strains that lead to optimal violacein yield computationally.
References
  1. Carvalho, D. D., Costa, F. T. M., Duran, N., & Haun, M. (2006). Cytotoxic activity of violacein in human colon cancer cells. Toxicology in Vitro, 20(8), 1514–1521.
    http://dx.doi.org/10.1016/j.tiv.2006.06.007
  2. Jones, J. A., Vernacchio, V. R., Lachance, D. M., Lebovich, M., Fu, L., Shirke, A. N., … Koffas, M. A. G. (2015). ePathOptimize: A Combinatorial Approach for Transcriptional Balancing of Metabolic Pathways. Scientific Reports, 5, 11301.
    http://doi.org/10.1038/srep11301
  3. Lee, M. E., Aswani, A., Han, A. S., Tomlin, C. J., & Dueber, J. E. (2013). Expression-level optimization of a multi-enzyme pathway in the absence of a high-throughput assay. Nucleic Acids Research, 41(22), 10668–10678.
    http://doi.org/10.1093/nar/gkt809