Team:OUC-China/Model

Model

Motivation
Introduction
Model development
Simulation & Analysis
Conclusion
References

MOTIVATION

With the rapid development of synthetic biology, there is an increasing requirement of an accurate quantitative regulation of gene expression. Moreover, researchers can provide insight into how the quantitative regulation system can be improved with the application of mathematical modeling.

Briefly, our team was in the devotion to developing a brand new method of accurate quantitative regulation at post-transcriptional level by means of utilizing the effect of inhibiting degradation of mRNA stem loop. More specifically, we hoped to realize a gradient amount of expression at protein or mRNA level by quantitatively manipulating the free energy (∆G) of designed stem loops downstream. However, the designed system is quite complicated than one can think—it consists of a variety of reactions and complex physical chemical process. To validate the effectiveness of initial designs and realize quantitative manipulating, we constructed a series of mathematical expression and built a mathematical modeling to simulate the theoretic curves of mRNA and protein. Thus, we could mutually authenticate the theoretic curves with the experimental ones and prove the feasibility and potential capacity of our designs’ functionality.

INTRODUCTION

Firstly, we analyzed and constructed the whole metabolic network of the system. The whole network could be separated into several independent processes: (1)mRNA transcription, (2) mRNA translation, (3) mRNA Cleavage by RNase E, (4) mRNA decay by RNase II. The complete metabolic network was listed in [ ]. To determine the parameters of these reactions, we respectively sought for Michaelis-Menten Kinetics, Empirical Formula and Partition Function (literature material) to respectively calculate the parameters’ value.

Secondly, our designed stem loops shared a gradient free energy. In order to simulate the theoretic curves of mRNA and protein and add the free energy (∆𝐺) into our model, we expect to determine the mathematical expression between the values of free energy (∆𝐺) of stem loops with the corresponding parameters—the constant of decay. To solve this problem, we sought for the Hansch-Fujita Equation which was a regression models widely used in the chemical and biological sciences. We also combined this sub model with some data from literature and experiments due to some theoretical defects.

Finally, we got the simulated results and evaluated it with statistical method, with the expectation to improve our model and our designs and make it more realistic and practical.

Metabolic reaction networks

(1)Transcription $ \phi \xrightarrow{{K}_{r}} [mRNA] $

(2)Primary translation $ [mRNA] \xrightarrow{{K}_{p_1}} [mRNA] + [{Protein}_{1}] + [{Protein}_{2}] $

(3)Cleavage by RNase E $ [mRNA] \xrightarrow{{K}_{d_1}} [{mRNA}_{1}] + [{mRNA}_{2}] $

(4)Secondary translation of GFP mRNA $ [{mRNA}_{1}] \xrightarrow{{K}_{{p}_{11}}} [{mRNA}_{1}] + [{Protein}_{1}] $

(5)Secondary translation of mCherry mRNA $ [{mRNA}_{2}] \xrightarrow{{K}_{{p}_{12}}} [{mRNA}_{2}] + [{Protein}_{2}] $

(6)Primary decay by RNase II $ [mRNA] \xrightarrow{{K}_{d_0}} \phi $

(7)Secondary decay of GFP mRNA by RNase II $ [{mRNA}_{1}] \xrightarrow{{K}_{{d}_{11}}} \phi $

(8)Secondary decay of mCherry mRNA by RNase II $ [{mRNA}_{2}] \xrightarrow{{K}_{{d}_{12}}} \phi $

(9)Decay of GFP $ [{Protein}_{1}] \xrightarrow{{K}_{{d}_{p_1}}} \phi $

(10)Decay of mCherry $ [{Protein}_{2}] \xrightarrow{{K}_{{d}_{p_2}}} \phi $

Symbol Description

Symbol	Definition	Unit
$[mRNA]$	The concentration of transcription product designed
$[{mRNA}_{1}]$	The concentration of GFP mRNA
$[{mRNA}_{2}]$	The concentration of mCherry mRNA
$[{Protein}_{1}]$	The concentration of GFP
$[{Protein}_{2}]$	The concentration of mCherry
$K_r$	The constant of Transcription
${K}_{p_1}$	The constant of Primary translation
${K}_{d_1}$	The constant of Cleavage by RNase E
${K}_{{p}_{11}}$	The constant of Secondary translation of GFP mRNA
${K}_{{p}_{12}}$	The constant of Secondary translation of mCherry mRNA
${K}_{{d}_{0}}$	The constant of Primary decay by RNase II
${K}_{{d}_{11}}$	The constant of Secondary decay of GFP mRNA by RNase II
${K}_{{d}_{12}}$	The constant of Secondary decay of mCherry mRNA by RNase II
${K}_{{d}_{p_1}}$	The constant of Decay of GFP
${K}_{{d}_{p_2}}$	The constant of Decay of mCherry

MODEL DEVELOPMENT

1) Michaelis-Menten Kinetics

To determine the reaction constant $K_r$ in (1), we seek for Michaelis-Menten kinetics. Further, we take the pilot process of DNA transcription into consideration for the sake of validating the accuracy of calculation and finally got satisfactory results.

Symbol	Definition	Unit
$[AraC]$	The concentration of dissociative repressor protein - AraC
$[AraB]$	The concentration of arabinose
$[AraC·AraB]$	The concentration of complex - [AraC·AraB]
${[AraC]_T}$	The sum of the concentration of both dissociative repressor protein - Arac and complex - AraC·AraB
$K_i$,i = 1, 2, 3	reaction rate constant
$k_m$	Michaelis constant
$v$	transcription rate

In our circuit design, we chose araBAD promoter which will be combined with repressor protein - AraC and the latter represses transcription of mRNA without arabinose. Then, Arabinose of reagent addition will bind to AraC and form the AraB·AraC compound, allowing transcription to occur.

[Hypothesis] We make an assumption that AraC are always in large concentration and that its binding to arabinose happens on a faster time scale than transcription. Therefore, we do not need to consider the individual concentrations of arabinose and AraC, instead we just need to include the concentration of the complex AraC · AraB.

The process boils down to the following formula:

$$Arac + AraB \overset{k_1}{\underset{k_2}{\rightleftharpoons}} AraC·AraB \xrightarrow{{k}_{3}} mRNA + Arac$$

according to law of mass action,

$$ { {d[Arac·AraB]} \over {dt} } = {{k_1}·{({{[AraC]}_T}-{[AraC·AraB]})} - {k_2}·{[AraC·AraB]} - {k_3}·{[AraC·AraB]} }$$

Because $ { {d[Arac·AraB]} \over {dt} } = 0 $

Therefore $ { { {k_2} + {k_3}} \over {k1} } = { { {({{[AraC]}_T}-{[AraC·AraB]})}·[AraB] } \over {[AraC·AraB]} } $

Define $ {k_m} = { { {k_2} + {k_3}} \over {k_1} } $, then

$$ { [AraC·AraB] } = { { { { [AraC] }_T } · [AraB] } \over { {k_m} + [AraB] } } $$

Then the transcription rate can be confirmed like the following mathematical expression:

$$ {v} = {k_3}·{ [AraC·AraB] } = {k_3} · { {{[AraC]}_T} } · { {[AraB]} \over { {k_m} + [AraB] } } $$

Further, according to the theory of order of reaction, transcription rate can be convert into reaction rate constant.

$$ {K_r} = { v \over { [AraC·AraB] } }$$

2) Partition Function

To determine the reaction rate constant ${K}_{{p}_{1}}$ , ${K}_{{p}_{11}}$ , ${K}_{{p}_{12}}$ , we seek for partition function. Inspired by the references [ ], we applied the partition function to the dynamic description of the translation process, in which we can obtain the probability that the ribosomes bind to RNA and then succeeded in converting the probability into translation rate. Further, we can calculate the reaction rate constant ${K}_{{p}_{1}}$ , ${K}_{{p}_{11}}$ , ${K}_{{p}_{12}}$ by utilizing the concentration data of (2),(4),(5) from our wet lab experiments.

[Hypothesis] Ribosomes are always in large concentration and they are more inclined to bind with RBS.

Symbol	Definition	Unit
$P_{bound}$	Probability of ribosome binding to RBS	/
$P$	Effective number of ribosome available for binding to RBS
$N_{NS}$	The number of nonspecific site of mRNA
$K^{S}_{pd}$	Dissociation constants for specific binding	nM
$K^{NS}_{pd}$	Dissociation constants for non-specific binding	nM
${\epsilon }^{S}_{pd}$	Binding energy for ribosome on the RBS	J
${\epsilon }^{NS}_{pd}$	Average binding energy of ribosome to the genomic background	J
$k_B$	Boltzmann constants	/
$T$	Temperature	K
$v$	Rate of reaction
$Volume$	Volume	L
$Avogadro$	Avogadro constants	/
$[mRNA](0)$	Initial concentration of mRNA

According to the document literature [], the probability that the ribosomes bind to RNA can be obtained as follow:

$$ { P_{bound} } = { 1 \over { 1 + { { N_{NS} } \over P }exp({ {{{\epsilon }^{S}_{pd}}-{{\epsilon }^{NS}_{pd}}} \over {{k_B}T} }) } } $$

Because $ { {{\epsilon }^{S}_{pd}} - {{\epsilon }^{NS}_{pd}} } \approx { {{k_B}T}ln({ {K^{S}_{pd}} \over {K^{NS}_{pd}} }) } $ , therefore the expression above can be further simplified into the following:

$$ { P_{bound} } = { 1 \over { 1 + { { {N_{NS}} \over {P} } · { {{K^{S}_{pd}}} \over {{K^{NS}_{pd}}} } } } } $$

Then, convert the probability into reaction rate [ ]：

$$ v = {1000*{ P_{bound} }} \over {Volume*Avogadro} $$

Then, to integrate experimental data and emulation analysis, we use the method of curve fitting to mRNA concentration in the initial time. Finally, we obtain the rate constant:

$$ k = { {1000*{ P_{bound} }} \over {Volume*Avogadro*{{[mRNA]}(0)}} } $$

3) Empirical Formula

To determine the reaction rate constant of mRNA cleavage by RNase E, we chose the empirical formula, obtained from document literature.

After transcript, RNaseE, a enzyme, recognized particular loci and then split mRNA into two sections. The process could be described as an reaction like the following:

$$ [mRNA] \xrightarrow {{k}_{d_1}} {{[mRNA]}_1} + {{[mRNA]}_2} $$

The reaction rate constant was obtained from document literature was following:

$$ {K_{d_1}} = { {{[H^+]}{K_{E1}}{k_0}} \over { {{K_{E1}}·{K_{E2}}} + { {[H^+]}{K_{E1}} } + {{[H^+]}^2} } } $$

where ${K_{E1}}$ and ${K_{E2}}$ were acid dissociation constants of the free enzyme. $k_0$ is the maximal rate constant of the catalytically competent form of the enzyme. ${K_{d_1}}$ is the rate constant at a given pH.

The experimental data showed the pH data in the table below:

Time/min	90	240	1385	2700	Mean value
Circuit 2	6.723	6.447	7.773	/	6.981
Circuit 5	6.613	6.480	7.360	8.170	7.156
Circuit 9	6.450	6.470	7.313	/	6.744
Circuit 11	6.660	6.480	7.830	8.390	7.340

Substituting the mean values of pH in each circuit into the formula ( ), we got the reaction rate constant of each circuit:

Circuit	2	5	9	11
${K_{d_1}} \over {s^{-1}}$	0.310	0.449	0.185	0.653

4) Hansch-Fujita Equation

Hansch-Fujita equation models are regression models used in biological sciences. It firstly summarizes a supposed relationship between chemical structures and biological activity in a data-set of chemicals. For our model, we treated the stem loop as our target chemical substances and hoped to determine a quantitative mathematical relation between the values of the free energy (ΔG) of different stem loops (the target substances) with the constant of decay.

[Hypothesis]

Following was the hansch-fujita equation:

$$ {lg{1 \over {C_n}}} = { {lg{{C_{max}} \over {C_0}}} - {0.0434{ {\Delta G} \over {RT} }} } $$

Symbol	Definition	Unit
$C_n$	The concentration that target chemical begins to work	mol/L
$C_{max}$	The maximum concentration of target chemical	mol/L
$C_0$	The initial concentration of target chemical	mol/L
$\|\Delta G\|$	Absolute value of Δ𝐺	kcal/mol
$R$	Thermodynamics constant , R = 8.314J/(mol · K)	J/(mol· K)
$T$	Temperature	K

Derived from Hansch-Fujita equation , we arrived at the equations:

$$ { { K_{ d_{11} } } \over { { K_{ d_{12} } } } } = { { {C_0} \over {C_m} } · { 1 \over { {10}^{ {0.0434} · { {|\Delta G|} \over {RT}} } } } } $$ $$ { { K_{ d_0 } } } = { {C_0} \over {C_m} · { 1 \over { {10}^{ {0.0434} · { {{\Sigma}{|\Delta G|}} \over {RT}} } } } } $$

Circuit	mfold	RNAfold	Sfold	Length(bp)
Circuit	∆G values of Stem loop $(kcal·{mol^{-1}})$			Length(bp)
5	-51.4	-51.4	-50.8	63
11	-44.9	-45.0	-44.5	66
13	-38.8	-38.9	-38.3	60
15	-38.8	-38.7	-38.3	57
7	-38.7	-39.0	-38.1	45
14	-38.5	-38.6	-38.1	58
12	-34.4	-34.5	-33.6	54
10	-30.1	-30.2	-29.4	50
1	-25.6	-25.5	-25.0	37
9	-20.0	-20.1	-19.5	42
6	-14.9	-15.0	-14.1	36
8	-10.0	-10.3	-9.7	34

Circuit	∆G of Stem loop (kcal·mol^(-1))	∆G values calculated by Mfold
5	-51.4
11	-44.9
9	-20
2	-25.6

Circuit 9	Circuit 2	Circuit 11	Circuit 5

SIMULATION & ANALYSIS

At the beginning of the Mfold replace the parameters with simple characters.
We chose the Gillespie algorithm to get the Exact Stochastic Simulation of Coupled.
Now, we are ready to simulate the system by Matlab.

CONCLUSION

We designed a series of stem loops of gradient free energy to explore the relationship between free energy and quantitative expression. And measured the relative expression of the up and down stream genes on both mRNA and protein level. The result are as follows:

REFERANCE

After we got the relationship between free energy and quantitative expression, we wanted to test our result in the tri-fluorescent reporter system.and we constructed the tri-fluorescent reporter system as follows:

The result are as follows:

[1] Carrier, T. A., & Keasling, J. D. (1997). Engineering mRNA stability in E. coli by the addition of synthetic hairpins using a 5′ cassette system.Biotechnology and bioengineering, 55(3), 577-580.
[2] Smolke, C. D., & Keasling, J. D. (2002). Effect of gene location, mRNA secondary structures, and RNase sites on expression of two genes in an engineered operon. Biotechnol Bioeng, 80(7), 762-776. doi: 10.1002/bit.10434
[3] Nojima, Takahiko, et al. "Controlling the expression ratio of two proteins by inserting a terminator between the two genes." Nucleic Acids Symposium Series. Vol. 50. No. 1. Oxford University Press, 2006.
[4] Nilsson, P., & Uhtin, B. E. (1991). Differential decay of a polycistronic Escherichia coli transcript is initiated by RNaseE‐dependent endonucleolytic processing. Molecular microbiology, 5(7), 1791-1799.

Cistrons Concerto

Thanks:

1.Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences

2.NEW ENGLAND Biolabs

3.Genscript

Contact us:

E-mail: oucigem@163.com

Designed and built by @ Jasmine Chen and @ Zexin Jiao

We are OUC-iGEM logo-one logo-two