Team:Sydney Australia/InternalCellular

Introduction and Motivations


Our cellular model develops on the characterization of the EtnR1 and EtnR2 regulation pathway as well as exploring possible targets for future development and optimization of our final biosensor construct.
Using this model, we predicted the behaviour of our biosensor under it’s designed conditions. Should this match up to experimental observations, we can confirm that the EtnR1 and EtnR2 regulation network behaves as originally hypothesized. This would be a key milestone in the characterization of these unknown proteins.

The main modelling environment utilized in this project is the MATLAB SimBiology toolbox, a plugin designed specifically for computational systems biology. The toolbox converts the underpinning mathematics of systems biology into a graphical interface, making it ideal for gene regulatory network modelling. SimBiology will be utilized with a deterministic differential equation solver method.In other words, the computations are identical every time the simulation is run, giving the same solutions. This means no randomization of the results. In real life, there is some degree of random variation in the quantities - this is where a stochastic model would be useful. However, over a cell population these random variations average out to give deterministic results.

As the cellular model is quite versatile, it will be used to model the similar 2-component regulatory system by iGEM Hamburg team as a collaboration between our two teams..This can be found on our collaboration page

The Signalling Cascade

Hypothesized signalling pathway

The signalling cascade is the unknown component of our model. While we have evidence to suggest it behaves as hypothesized below, it must be confirmed. If the model is successful and can be compared against experimental values, we may be able to confirm the characterization of EtnR1 and EtnR2.

To begin with characterization, the signalling cascade model has been assumed to act as a standard two-component signalling pathway. These types of systems have been characterized by well established mathematical models in the past (Chen, He and Church, 1999) (Ingalls, 2012).

More specifically, our model hypothesizes the following process, represented visually in the video below.
  1. Ethylene diffuses into the cell, and binds to EtnR2.
  2. The EtnR2-Eth complex phosphorylates EtnR1.
  3. Phosphorylated EtnR1 (now activated) binds to DNA, forming a DNA-EtnR1 complex.
  4. The DNA-EtnR1 complex activates the ethylene response promoter and transcribes the AmilCP gene.

This pathway seems to be plausible, since EtnR1 has confirmed DNA-binding activity. For more evidence supporting this hypothesis, see Sense.

View the video for a visual representation of our hypothesized signalling mechanism.

Signalling Kinetics

In cell signalling, molecules (called ligands) bind to their receptor sites which triggers a conformational change in the receptor. This results in a change of function. These ligands bind to their receptors with varying rates and affinities depending on the molecules involved.

These rates of binding are crucial, as they determine how fast the biosensor responds to the ethylene it is detecting. Low rates will lead to low expression (no blue), and high rates will lead to high expression (lots of blue).

We model these rates using the Law of Mass Action kinetics. This principle governs that the rate of a reaction (d/dt) is proportional to the concentrations of the substrates.
For a particular reaction, for example:

the rate can be written mathematically as:

But most biological reactions are reversible, like this:

So when we want to model the rate of reaction we have to take this reverse into account, by adding an extra term.

Applying these principles to the phosphorylation of EtnR1 and EtnR2 yields:

So, what are the values of these "k" numbers?

The answer is we don't really know all of them. So we take a few guesses. Our model begins and ends with these k values. If the output of our model matches that of the experimental, then we know that the k values we have guessed below are correct, and that the overall system behaves as hypothesized. Before we can do this, we have to consider the downstream processes of our biosensor (transcription, translation) and ensure that these processes have been modelled correctly.

Signalling Cascade Model Parameters Value Units
k1 Ethylene binding constant 4.8 (est)
k-1 Ethylene dissociation 0.4 (est)
k2 EtnR1 phosphorylation constant 9.8 (est)
k-2 EtnR1 dephosphorylation constant 0.02 (est)

Transcription


Transcription rate can be affected by a variety of factors. However, it is well established that initiation of transcription is the major factor in determining overall rate of mRNA production. Transcription initiation is a process influenced by a number of key steps including: promoter escape (Kugel & Goodrich, 1998), binding of the RNA Polymerase onto the promoter, isomerization, and formation of the closed RNA Pol complex from the open complex (Hakkinen et al. 2013). In this model, we will assume that binding of the RNA polymerase is the rate determining step in transcription initiation, and thus the rate determining step of transcription as a whole.

Thus, our key assumptions for transcription are:

  1. Overall transcription rate is determined primarily by the amount of substrate (Etr1) bound to the DNA and the strength of the promoter. Maximum transcription rate occurs when all binding sites are filled.

  2. Promoter strength (in Polymerases per second) is a constant, and represents the rate of transcription at maximal expression.

  3. Thus, overall transcription rate is a function of a) maximum transcription rate, and b) the fraction of signalling molecules bound to their binding sites.

  4. For the constitutively expressed genes (Etr 1 and Etr2), we can assume that they are at maximal transcription for all t.

Finding parameters for EtnR1 (Constitutive Transcription)

In our final construct, Etr1 is placed under the control of promoter J23100, and Etr2 is placed under the control of a T7 promoter.
J23100, originally submitted by Berkeley iGEM2006 team, has been characterized on the Standard Registry . This promoter has been characterized in terms of relative units. For example, looking at the database entry we can see that J23101 is 1791 times stronger than promoter J23112, which has a value of 1. Similarly, J23100 (our promoter), is 2547/1791 times stronger than J23101. Now, J23101 has been previously characterized by Kelly et al. 2009, who found that it has a value of 0.03PoPS (polymerases per second). Multiplying 0.03 by the ratio of the J23101/J23100 promoter strengths gives us a value of 0.021905PoPS for J23100. Thus, we can conclude that EtnR1 (under the control J23100), has a constitutive transcription rate of 0.021905PoPS, per molecule of DNA construct. This translates to 4.381 molecules/sec for our plasmid (mean copy number of 200)
However, this is by no means a conclusive value. A study by Yale iGEM2015 revealed that the relative expression rate of J23100 may vary depending on the reporter gene fused to the promoter as well as the particular strain of E. Coli used. In this model, J23100 is fused to EtnR1.

Finding parameters for EtnR2 (Constitutive Transcription)

In the final construct, EtnR2 has been placed under the control of the T7 Promoter. The Imperial College London iGEM2007 team characterized this promoter, giving a value of 8.88*10^13 molecules in 240 minutes, for 4 ug of DNA. This translates to 1 molecule every 905.7 seconds, per molecule of DNA construct. This rate is multiplied by 200, which is the mean copy number of the plasmid in use - yielding 0.22 mRNA molecules/sec.

Finding parameters for the AmilCP reporter protein (Induced Transcription)

For our reporter genes, the model for expression is complicated by the binding of signalling molecules. In contrast to the constitutively expressed genes, we must assume that the transcription rate is a function of the amount of signalling molecules that are bound.

This rate can be modelled by the Hill equation

Where Vm is the maximal transcription rate, [S] is substrate concentration, and K_d is the dissociation constant. In our model, [S] is Etr1. Using promoter strength as the maximum transcription rate would be preferred, however finding literature values for the uncharacterized Etn promoter was extremely difficult. In the absence of existing research on the Etn promoter, an alternative, but more intuitive method for transcription rate determination must be used.
To determine transcription rate, we determine the speed of the RNA Polymerase across the DNA transcript. An approximate value of 42 nucleotides per second, found by Proshkin et al. 2010, will be used for our model in the absence of promoter activity information. This represents a transcription rate of 0.06278 amilCP mRNA transcripts per second, per molecule of DNA construct. This is again multipled by the mean copy number of the plasmid (200) to give the final rate.
So why was this method not used for Etr 1 and Etr 2? It was found that calculating the Etr1/2 transcription rate using 42 ntd/sec produced a rate that was faster than that calculated by the promoter strength method. The promoter strength would be the rate limiting step for Etr1 and 2 but not for AmilCP.

Another factor to consider is the basal expression level of AmilCP. Many promoters have a slight level of expression even with no signalling molecules bound, which would modify our equation to:

where B is the basal expression rate. However, we can assume that B is so low that it can be ignored in our model.


Transcription Model Parameters Value Units
J23100 Promoter Strength 0.02195 Polymerases per Second (PoPS)
T7 Promoter strength 0.22 mRNA molecules/sec
Transcription rate for AmilCP mRNA 0.06278 mRNA molecules/second/DNA construct

Translation


After the AmilCP mRNA transcript has been formed, it must be translated into a functional chromoprotein. Translation involves 3 main stages: initiation, elongation of the polypeptide, and termination. The key molecular machinery driving this process are ribosomes, who's binding and movements determine the overall rate of transcription.
Translation rate can be affected by 3 major processes: a) the binding of the ribosome to the RBS, b) formation of initiation complexes, or c) speed of the ribosome across the mRNA transcript (Proshkin et al. 2010).
In this model, we will assume that the movement of the ribosome (in amino acids formed per second) is the rate determining step in the translation of the proteins.

Thus, our key assumptions for translation are:

  1. The major factor affecting translation rate is by the speed of the ribosome across the mRNA, which can be approximated by a constant rate.

  2. Translation rate is limited by the amount of available mRNA transcripts for ribosomes to bind to.

  3. Thus, translation rate is a function of ribosome speed and mRNA transcript concentration.

Finding parameters for translation

A literature search revealed an average translation rate of 17.1 amino acids/sec (Young and Bremer, 1976 and Proshkin et al. 2010. The rate of transcription therefore, will depend on the length of the mRNA transcript divided by 17.1. This is presented in the table below.

Product Length (amino acids) Translation rate (1/s)
AmilCP 223 0.07668
EtnR1 581 0.02943
EtnR2 199 0.08593

These translation rate coefficients are for ribosomes moving across a single mRNA molecule. In reality, our cell will contain multiple copies of the same mRNA molecule. We multiply these translation rate coefficients by total number of mRNA molecules to give the total translation rate, in molecules per second per DNA construct.

Degradation and Dilution


The concentrations of each species in the model is affected by it’s degradation (inherent to the molecule) and dilution (due to cell growth). While these processes are completely unrelated, they have the same net effect: Decreasing the concentration of the species. Therefore, we represent the two rates as a combined degradation/dilution parameter in our model.

The lifetime of mRNA is usually extremely limited. Compared against the rate of cell growth, or dilution of the cell, mRNA degrades before it encounters any significant dilution.
The opposite is true for proteins. While proteins are victims of degradation, it is not the major factor in the reduction of a protein’s concentration. Proteins tend to far outlive the mRNA molecules that produced them. It is often the changes in the protein’s environment that lead to it’s decrease in concentration. As a cell grows in size, the amount of AmilCP inside it remains relatively constant, but the concentration of AmilCP decreases.

In our model, we consider only the degradation/dilution of AmilCP mRNA and AmilCP. Regardless, it is revealed in the sensitivity analysis below that degradation and dilution have a minimal effect on overall biosensor performance when compared to other more important parameters.

Name Value Units
AmilCP mRNA degradation/dilution 0.0022

(Selinger et al. 2003)

molecules/sec
AmilCP Protein degradation/dilution Estimated Molecules/sec

Cellular Equations


Using the parameters found above, we can construct a series of ordinary differential equations which represent our biosensor network.

The system described by these equations can be represented diagramatically by SimBiology. The following illustration is a full visual representation of the regulatory network.

Results


The simulation was run with an initial ethylene concentration of 20 mM, over a time period of 1 hr. AmilCP concentrations rose to 581 nM at the end of this hour. Longer simulations produced higher concentrations, however this may not be representative as our model has not taken into account long term cell effects such as cell division, or the effect of high AmilCP concentration on cell life cycles.
The graphs below show the time course concentrations of each species, with species of similar concentrations being plotted on the separate axes.

Sensitivity Analysis


Compared to the gene regulation pathways existing in nature, our biosensor is quite simplistic. Nevertheless it has led to a model with a large number of parameters. However, not all of these parameters contribute equally to the overall output of the model.
Often it is useful to know which of these parameters have the greatest influences on the system because these parameters are the ones that should be targeted for any optimization or further study.
Essentially, this section seeks to answer the question “which of these numbers are important and which ones are not?”.

The sensitivity analysis is built into MATLAB SimBiology. The software uses complex-step approximatio to calculate the derivative of the output with respect to each individual parameter. The larger the change in output, the more impact that parameter has and thus the more sensitive our system is to the parameter of interest.

Two crucial results can be drawn from the sensitivity analysis:

1. A large portion of the parameters with estimated values are negligible. This includes dephosphorylation, dissociation, degradation and dilution.
2. AmilCP protein production is most sensitive to the following parameters in the table below.

Input parameter or species Function Sensitivity Value (arbitrary units)
R2 Etr1P’rylate rate constant for phosphorylation of EtnR1 0.9969
R4 DNA binding n Hill coefficient for the binding of phosphorylated EtnR1 to DNA 1.1334
R5 AmilCP Promoter Strength Strength of the promoter controlling AmilCP transcription 0.5670
R6 AmilCP translation Rate of translation of AmilCP mRNA 0.5239
Etr1J23100 Promoter Strength Strength of the promoter controlling EtnR1 transcription 0.9989
Etr1 Translation Rate of translation of EtnR1 0.9953
T7 Promoter Strength Strength of the promoter controlling EtnR2 transcription 0.9981
Etr2 Translation Rate of EtnR2 Translation 0.9881

Of these parameters, 6/8 are transcription or translation terms which have been characterized in the sections above. This leaves us with two parameters: The EtnR1 Phosphorylation rate constant and the Hill coefficient for the DNA binding activity of EtnR1. These two values have the a) largest impact on the overall results and b) are uncharacterized, and thus should be the target of any future optimization of our biosensor.

The full sensitivity analysis output for all parameters is summarized in the matrix below. The sensitivities are represented by a colour scale in arbitrary units.

Conclusions

How do these findings affect our biosensor?

Due to the ambiguous nature of the ethylene receptor system and lack of existing literature on the topic, it was not possible to completely characterize the signalling cascade within the constraints of the iGEM project.
However, our model provides a hypothesis with which we can compare future results to. If graphs of experimental concentration vs time course match the shape of our model, it provides stronger evidence that the signalling cascade behaves as we have predicted. If it does not, the differences in the graphs may give us direction in making constructive changes to our mathematical model.

The sensitivity analysis showed us that among all the unknown parameters, only two important processes have a significant impact on the biosensor output: EtnR1 phosphorylation and its corresponding DNA binding activity. In future characterization, these two parameters should be specifically targeted to save time and resources.

References


Chen, Ting, Hongyu He, and George Church. Modeling Gene Expression With Differential Equations. 1st ed. Boston: Harvard Medical School, 1999. Web. 9 Oct. 2016.
Häkkinen, Antti et al. "Effects Of Rate-Limiting Steps In Transcription Initiation On Genetic Filter Motifs". PLoS ONE 8.8 (2013): e70439. Web.
Ingalls, Brian. Mathematical Modelling In Systems Biology: An Introduction. Waterloo: University of Waterloo Department of Applied Mathematics, 2012. Print.
Kelly, Jason R et al. "Measuring The Activity Of Biobrick Promoters Using An In Vivo Reference Standard". J Biol Eng 3.1 (2009): 4. Web. 8 Oct. 2016.
Kugel, J. F. and J. A. Goodrich. "Promoter Escape Limits The Rate Of RNA Polymerase II Transcription And Is Enhanced By TFIIE, TFIIH, And ATP On Negatively Supercoiled DNA". Proceedings of the National Academy of Sciences 95.16 (1998): 9232-9237. Web.
Proshkin, S. et al. "Cooperation Between Translating Ribosomes And RNA Polymerase In Transcription Elongation". Science 328.5977 (2010): 504-508. Web.
Selinger, D. W. "Global RNA Half-Life Analysis In Escherichia Coli Reveals Positional Patterns Of Transcript Degradation". Genome Research 13.2 (2003): 216-223. Web. 18 Oct. 2016.
Young, R and H Bremer. "Polypeptide-Chain-Elongation Rate In Escherichia Coli B/R As A Function Of Growth Rate". Biochem. J. 160.2 (1976): 185-194. Web.

School of Life and Environmental Sciences
The University of Sydney
City Road, Darlington
2006, New South Wales, Sydney, Australia