Team:OUC-China/Model

Model

modeling-banner


MOTIVATION


With the rapid development of synthetic biology, there is an increasing requirement of an accurate quantitative regulation of gene expression. Moreover, researchers can provide insight into how the quantitative regulation system can be improved with the application of mathematical modeling.

Briefly, our team was in the devotion to developing a brand new method of accurate quantitative regulation at post-transcriptional level by means of utilizing the effect of inhibiting degradation of mRNA stem loop. More specifically, we hoped to realize a gradient amount of expression at protein or mRNA level by quantitatively manipulating the free energy (∆G) of designed stem loops downstream. However, the designed system is quite complicated than one can think—it consists of a variety of reactions and complex physical chemical process. To validate the effectiveness of initial designs and realize quantitative manipulating, we constructed a series of mathematical expression and built a mathematical modeling to simulate the theoretic curves of mRNA and protein. Thus, we could mutually authenticate the theoretic curves with the experimental ones and prove the feasibility and potential capacity of our designs’ functionality.



INTRODUCTION


Firstly, we analyzed and constructed the whole metabolic network of the system. The whole network could be separated into several independent processes: (1)mRNA transcription, (2) mRNA translation, (3) mRNA Cleavage by RNase E,(4) mRNA decay by RNase II. The complete metabolic network was listed in given blow. To determine the parameters in these processes, we respectively sought for Michaelis-Menten Kinetics, Empirical Formula and Partition Function (literature material) to respectively calculate the parameters’ value.

Secondly, our designed stem loops shared a gradient free energy. In order to simulate the theoretic curves of mRNA and protein and add the free energy (∆𝐺) into our model, we expect to determine the mathematical expression between the values of free energy (∆𝐺) of stem loops and the corresponding parameters—the constant of decay. To solve this problem, we sought for the Hansch-Fujita Equation which was a regression models widely used in the chemical and biological sciences. We also combined this sub model with some data from literature and experiments due to some theoretical defects.

Finally, we got the simulated results and evaluated it with statistical method, with the expectation to improve our model and our designs and make them more realistic and practical.

Metabolic reaction networks

(1)Transcription $ \phi \xrightarrow{{K}_{r}} [mRNA] $

(2)Primary translation $ [mRNA] \xrightarrow{{K}_{p_1}} [mRNA] + [{Protein}_{1}] + [{Protein}_{2}] $

(3)Cleavage by RNase E $ [mRNA] \xrightarrow{{K}_{d_1}} [{mRNA}_{1}] + [{mRNA}_{2}] $

(4)Secondary translation of GFP mRNA $ [{mRNA}_{1}] \xrightarrow{{K}_{{p}_{11}}} [{mRNA}_{1}] + [{Protein}_{1}] $

(5)Secondary translation of mCherry mRNA $ [{mRNA}_{2}] \xrightarrow{{K}_{{p}_{12}}} [{mRNA}_{2}] + [{Protein}_{2}] $

(6)Primary decay by RNase II $ [mRNA] \xrightarrow{{K}_{d_0}} \phi $

(7)Secondary decay of GFP mRNA by RNase II $ [{mRNA}_{1}] \xrightarrow{{K}_{{d}_{11}}} \phi $

(8)Secondary decay of mCherry mRNA by RNase II $ [{mRNA}_{2}] \xrightarrow{{K}_{{d}_{12}}} \phi $

(9)Decay of GFP $ [{Protein}_{1}] \xrightarrow{{K}_{{d}_{p_1}}} \phi $

(10)Decay of mCherry $ [{Protein}_{2}] \xrightarrow{{K}_{{d}_{p_2}}} \phi $

[Table 1]Symbol Description

Symbol Definition Unit
$[mRNA]$ The concentration of transcription product designed $mol/L$
$[{mRNA}_{1}]$ The concentration of GFP mRNA $mol/L$
$[{mRNA}_{2}]$ The concentration of mCherry mRNA $mol/L$
$[{Protein}_{1}]$ The concentration of GFP $mol/L$
$[{Protein}_{2}]$ The concentration of mCherry $mol/L$
$K_r$ The constant of Transcription ${s}^{-1}$
${K}_{p_1}$ The constant of Primary translation ${s}^{-1}$
${K}_{d_1}$ The constant of Cleavage by RNase E ${s}^{-1}$
${K}_{{p}_{11}}$ The constant of Secondary translation of GFP mRNA ${s}^{-1}$
${K}_{{p}_{12}}$ The constant of Secondary translation of mCherry mRNA ${s}^{-1}$
${K}_{{d}_{0}}$ The constant of Primary decay by RNase II ${s}^{-1}$
${K}_{{d}_{11}}$ The constant of Secondary decay of GFP mRNA by RNase II ${s}^{-1}$
${K}_{{d}_{12}}$ The constant of Secondary decay of mCherry mRNA by RNase II ${s}^{-1}$
${K}_{{d}_{p_1}}$ The constant of Decay of GFP ${s}^{-1}$
${K}_{{d}_{p_2}}$ The constant of Decay of mCherry ${s}^{-1}$


MODEL DEVELOPMENT

1) Michaelis-Menten Kinetics

This submodel aims at calculating the rate constant of reaction (1). In order to make the result more accurate, we considered the details of transcription and used Michaelis-Menten equation to describe the whole process. In the end, we obtained a satisfactory result.

[Table 2]

Symbol Definition Unit
$[AraC]$ The concentration of dissociative repressor protein AraC $mol/L$
$[AraB]$ The concentration of arabinose $mol/L$
$[AraC·AraB]$ The concentration of complex [AraC·AraB] $mol/L$
${[AraC]_T}$ The sum of the concentration of both dissociative repressor protein Arac and complex AraC·AraB $mol/L$
$K_i$,i = 1, 2, 3 reaction rate constant ${s}^{-1}$
$k_m$ Michaelis constant ${s}^{-1}$
$v$ transcription rate ${mol}·{{L}^{-1}}·{{s}^{-1}}$

In our circuit design, we chose araBAD promoter which will be combined with repressor protein AraC and the latter represses transcription of mRNA without arabinose. Then, Arabinose of reagent addition will bind to AraC and form the AraB·AraC compound, allowing transcription to occur.

Hypothesis

We make an assumption that AraC are always in large concentration and that its binding to arabinose happens on a faster time scale than transcription. Therefore, we do not need to consider the individual concentrations of arabinose and AraC, instead we just need to include the concentration of the complex AraC · AraB.

The process boils down to the following formula:

$$Arac + AraB \overset{k_1}{\underset{k_2}{\rightleftharpoons}} AraC·AraB \xrightarrow{{k}_{3}} mRNA + Arac$$

according to law of mass action,

$$ { {d[Arac·AraB]} \over {dt} } = {{k_1}·{({{[AraC]}_T}-{[AraC·AraB]})} - {k_2}·{[AraC·AraB]} - {k_3}·{[AraC·AraB]} }$$

Because $ { {d[Arac·AraB]} \over {dt} } = 0 $

Therefore $ { { {k_2} + {k_3}} \over {k_1} } = { { {({{[AraC]}_T}-{[AraC·AraB]})}·[AraB] } \over {[AraC·AraB]} } $

Define $ {k_m} = { { {k_2} + {k_3}} \over {k_1} } $, then

$$ { [AraC·AraB] } = { { { { [AraC] }_T } · [AraB] } \over { {k_m} + [AraB] } } $$

Then the transcription rate can be confirmed like the following mathematical expression:

$$ {v} = {k_3}·{ [AraC·AraB] } = {k_3} · { {{[AraC]}_T} } · { {[AraB]} \over { {k_m} + [AraB] } } $$

Further, according to the theory of order of reaction, transcription rate can be convert into reaction rate constant.

$$ {K_r} = { v \over { {{[AraC]}_T} } } = {k_3} · { {[AraB]} \over { {k_m} + [AraB] } } $$

2) Empirical Formula

To determine the reaction rate constant of mRNA cleavage by RNase E, we chose the empirical formula, obtained from document literature.

After transcription, RNaseE, an enzyme, recognized specific site and then split mRNA into two sections. The process could be described as an reaction as follow:

$$ [mRNA] \xrightarrow {{k}_{d_1}} {{[mRNA]}_1} + {{[mRNA]}_2} $$

The reaction rate constant was obtained from document literature[2] as follow:

$$ {K_{d_1}} = { {{[H^+]}{K_{E1}}{k_0}} \over { {{K_{E1}}·{K_{E2}}} + { {[H^+]}{K_{E1}} } + {{[H^+]}^2} } } (2.1)$$

where ${K_{E1}}$ and ${K_{E2}}$ were acid dissociation constants of the free enzyme. $k_0$ is the maximal rate constant of the catalytically competent form of the enzyme. ${K_{d_1}}$ is the rate constant at a given pH.

The experimental data showed the pH data in the table below:

[Table 3]

Time/min 90 240 1385 2700 Mean value
Circuit 2 6.723 6.447 7.773 / 6.981
Circuit 5 6.613 6.480 7.360 8.170 7.156
Circuit 9 6.450 6.470 7.313 / 6.744
Circuit 11 6.660 6.480 7.830 8.390 7.340

Substituting the mean values of pH in each circuit into the formula (2.1), we got the reaction rate constant of each circuit:

[Table 4]

Circuit 2 5 9 11
${K_{d_1}} / {h^{-1}}$ 0.310 0.449 0.185 0.653

3) Partition Function

To determine the reaction rate constant ${K}_{{p}_{1}}$ , ${K}_{{p}_{11}}$ , ${K}_{{p}_{12}}$ , we seek for partition function. Inspired by the references [3], we applied the partition function to the dynamic description of the translation process, in which we can obtain the probability that the ribosomes bind to RNA and then succeeded in converting the probability into translation rate. Further, we can calculate the reaction rate constant ${K}_{{p}_{1}}$ , ${K}_{{p}_{11}}$ , ${K}_{{p}_{12}}$ by utilizing the concentration data of (2),(4),(5) from our wet lab experiments.

Hypothesis

(1)Ribosomes are always in large concentration.
(2)Ribosomes are more inclined to bind with RBS.

[Table 5]

Symbol Definition Unit
$P_{bound}$ Probability of ribosome binding to RBS /
$P$ Effective number of ribosome available for binding to RBS $number$
$N_{NS}$ The number of nonspecific site of mRNA $number$
$K^{S}_{pd}$ Dissociation constants for specific binding $nM$
$K^{NS}_{pd}$ Dissociation constants for non-specific binding $nM$
${\epsilon }^{S}_{pd}$ Binding energy for ribosome on the RBS $J$
${\epsilon }^{NS}_{pd}$ Average binding energy of ribosome to the genomic background $J$
$k_B$ Boltzmann constants /
$T$ Temperature $K$
$v$ Rate of reaction $mol/(L·s)$
$Volume$ Volume $L$
$Avogadro$ Avogadro constants /
$[mRNA](0)$ Initial concentration of mRNA $mol/L$
$[mRNA](90)$ Concentration of mRNA in 90 min $mol/L$

According to the document literature [3], the probability that the ribosomes bind to RNA can be obtained as follow:

$$ { P_{bound} } = { 1 \over { 1 + { { N_{NS} } \over P }exp({ {{{\epsilon }^{S}_{pd}}-{{\epsilon }^{NS}_{pd}}} \over {{k_B}T} }) } } $$

Because $ { {{\epsilon }^{S}_{pd}} - {{\epsilon }^{NS}_{pd}} } \approx { {{k_B}T}ln({ {K^{S}_{pd}} \over {K^{NS}_{pd}} }) } $ , therefore the expression above can be further simplified into the following:

$$ { P_{bound} } = { 1 \over { 1 + { { {N_{NS}} \over {P} } · { {{K^{S}_{pd}}} \over {{K^{NS}_{pd}}} } } } } $$

Then, convert the probability into reaction rate [4]:

$$ v = {{1000*{ P_{bound} }} \over {Volume*Avogadro}} $$

Then, in terms of the theory of order of reaction, translation rate can be converted to reaction rate constant.

$$ k = { {1000*{ P_{bound} }} \over {Volume*Avogadro*{{[mRNA]}(0)}} } $$

If you want to know more about how to calculate our parameters, please click here

4) Hansch-Fujita Equation

Hansch-Fujita equation models are regression models used in biological sciences[8]. It firstly summarizes a supposed relationship between chemical structures and biological activity in a data-set of chemicals. For our model, we treated the stem loop as our target chemical substances and hoped to determine a quantitative mathematical relation between the values of the free energy (ΔG) of different stem loops (the target substances) with the constant of decay.

Hypothesis

The target chemical’s activity was the function of physiochemical properties, structural properties and error. The terminator could function like a stem loop.

Following was the hansch-fujita equation:

$$ {lg{1 \over {C_{inh}}}} = { {lg{{C_{m}} \over {C_0}}} - {0.0434{ {\Delta G} \over {RT} }} } $$

[Table 6]

Symbol Definition Unit
$C_{inh}$ Inhibition constant $mol/L$
$C_{m}$ The maximum concentration of target chemical $mol/L$
$C_0$ The initial concentration of target chemical $mol/L$
$|\Delta G|$ Absolute value of Δ𝐺 $kcal/mol$
$R$ Thermodynamics constant , R = 8.314J/(mol · K) $J/(mol·K)$
$T$ Temperature $K$

Derived from Hansch-Fujita equation , we arrived at the equations:

$$ { { K_{ d_{11} } } , { { K_{ d_{12} } } } } = { { {C_0} \over {C_m} } · { 1 \over { {10}^{ {0.0434} · { {|\Delta G|} \over {RT}} } } } } $$
16

The main algorithms of RNA free energy prediction were mfold, RNAfold, Sfold and so on. We fully compared both advantages and disadvantages of these algorithms and selected the mfold algorithms.[11]

[Table 7]

Circuit mfold RNAfold Sfold Length(bp)
∆G values of Stem loop $(kcal·{mol^{-1}})$
5 -51.4 -51.4 -50.8 63
11 -44.9 -45.0 -44.5 66
13 -38.8 -38.9 -38.3 60
15 -38.8 -38.7 -38.3 57
7 -38.7 -39.0 -38.1 45
14 -38.5 -38.6 -38.1 58
12 -34.4 -34.5 -33.6 54
10 -30.1 -30.2 -29.4 50
1 -25.6 -25.5 -25.0 37
9 -20.0 -20.1 -19.5 42
6 -14.9 -15.0 -14.1 36
8 -10.0 -10.3 -9.7 34

We chose the following circuits of design to simulate. Specially, the terminator’s free energy.$∆G=-19.10kcal/mol$

A diagram of circuit

Figure 1. A diagram of circuit

[Table 8]

Circuit ∆G of Stem loop (kcal·mol^(-1))

∆G values calculated by Mfold

5 -51.4
11 -44.9
9 -20
2 -25.6

[Table 9]

Circuit 9 Circuit 2 Circuit 11 Circuit 5
${K_{d_0}}{( {10}^{-4}{s}^{-1} )}$ 0.236 0.0450 0.0067 0.0037
${K_{d_{11}}}{( {10}^{-4}{s}^{-1} )}$ 5.24 0.998 0.150 0.0827
${K_{d_{12}}}{( {10}^{-4}{s}^{-1} )}$ 6.06 2.86 9.83 1.56


SIMULATION & ANALYSIS


We chose the Gillespie algorithm[12] to get the stochastic simulation of coupled. The stoichiometric matrix and constants obtained above were inputs of our MATLAB program.
Firstly, we simulated the theoretical curves at mRNA level. Theoretical curve of Circuit 9 at mRNA level was showed below, where the stem loop free energy ∆G=-20.0kcal/mol and terminator’s free energy was ∆G=-19.10kcal/mol. From the figure below, we can see the trends of GFP mRNA and mCherry mRNA are similar, which proves that the free energy has the relationship with mRNA.

Simulation of circuit 9 at mRNA level

Figure 2. Simulation of circuit 9 at mRNA level

Then we simulated the theoretical curves at protein level. Theoretical curve of Circuit 5 at protein level was showed below, where the stem loop free energy ∆G=-51.4kcal/mol and terminator’s free energy was ∆G=-19.10kcal/mol.

GFP simulation of Circuit 5

Figure 3. GFP simulation of Circuit 5

Our dry lab measured GFP and mCherry expression in E. coli to extract experimental values. A plot showing below was the fit of our theoretical model to experimental data. Thus, we could calculate the limiting concentrations that our products would be expressed. By replacing the reaction rate constants of circuit obtained above, we could stimulate the all theoretical curves of mRNA and proteins, and directly mutually validated each other. However, the experimental data were limited.

Comparison of GFP expression data and model

Figure 4. Comparison of GFP expression data and model

So, we only presented several group of results here.To further validate the theoretical data of circuit 5’s GFP , correlation analysis was made[13].

[Table 10]

Symbol Definition
$X$ The experimental data of GFP concentration
$Y$ The theoretical data of GFP
$n$ The amounts of data
${\sigma}_x$ Mean variance of the experimental data
${\sigma}_y$ Mean variance of the theoretical data
$\gamma$ Correlative coefficient
$\alpha$ Significance level
11
12
13

According to the level of significant test:

n=9, n-2=7; significance level α = 0.1 , ${γ_0.1}$=0.5822

$$ γ = 0.9026 ≥ { γ _ {0.1} } = 0.5822 $$

So the correlation is significant, and we could think the simulation of GFP was good evaluation

14

Further, an regression equation could be established:

15


CONCLUSION


In a conclusion, we validated the reaction rate constants in our system using Michaelis-Menten Equation, Empirical Formula, Partition Function and Hansch-Fujita Equation. With those constants we successfully simulated the whole processes and the results fit with the experimental data well. Besides, we comprehensively utilized research literature and our experimental data and generated a universal mathematical equation between the values of free energy of different stem loops and the decay constant.

We believe our model is beneficial to help us understand the biological system and further evaluate our project’s potential capacity.



REFERANCE


[1] https://2011.igem.org/Team:St_Andrews/modelling
[2] Redko Y, Tock M R, Adams C J, et al. Determination of the catalytic parameters of the N-terminal half of Escherichia coli ribonuclease E and the identification of critical functional groups in RNA substrates[J]. Journal of Biological Chemistry, 2003, 278(45): 44001-44008.
[3] Bintu L, Buchler N E, Garcia H G, et al. Transcriptional regulation by the numbers: models[J]. Current opinion in genetics & development, 2005, 15(2): 116-124.
[4] Eric Sobic, Lecture 29-Stochastic Modeling-Part1, https://www.coursera.org/learn/dynamical-modeling/lecture/LxcTP/lecture-29-stochastic-modeling-part-1
[5] Filonava L. Kinetic dissection of translation initiation in prokaryotes[D]. Georg-August-Universitat Gottingen, 2013.
[6]https://en.wikipedia.org/wiki/Escherichia_coli#cite_note-16
[7]https://2011.igem.org/Team:ETH_Zurich/Modeling/Parameters
[8] Dong Qing Wei, Ruo Xu Gu, et al.. Molecular Simulation And Computer-Aided Drug Design [M]. Shanghai: SJTU Press, 2011
[9] WANG Xiao—jia,WANG Yue,JIN Qi, et al.. Establishment and evaluation of mathematical model of circadian rhythm mechanism [J]. Med J West China 2005(3):672—3511
[10] Carlos Fraza˜o, et al.. Unravelling the dynamics of RNA degradation by ribonuclease II and its RNA-bound complex [J]. Nature 2006,doi:10.1038
[11] http://unafold.rna.albany.edu/?q=mfold/RNA-Folding-Form
[12] Danlel T. Gillespi . Exact Stochastic Simulation of Coupled Chemical Reactions[J]. The Journal of Physical Chemistry, Vol, 8 1, No. 25, 1977
[13] Rong Heng Sun, Applied Mathematical Statistics [M]. Beijing, Science Press,2014


Cistrons Concerto

Thanks:

1.Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences

2.NEW ENGLAND Biolabs

3.GenScript

Contact us:

E-mail: oucigem@163.com

Designed and built by @ Jasmine Chen and @ Zexin Jiao

We are OUC-iGEM logo-one logo-two