Revision as of 03:25, 20 October 2016

Projects :: Syndustry - iGEM Marburg 2016

SynDustry Fuse. Use. Produce.

Quantitative evolutionary stability analysis of kill switches

Introduction

The fact that Synthetic Biology not only opens doors to a new era of science but also threatens humankind and the ecosystem we live in is widely known. The impact of genetically modified organisms (GMOs) in nature can not be foreseen and are practically irreversible once having been in contact with nature.

Genetic constructs designed to kill GMOs on purpose once they have unintentionally escaped the lab environment are a great way to reduce the threat through these organisms. These constructs are called killswitches in analogy to their industrial equivalent. It might be implemented into an otherwise already genetically modified organism. Once the organism escapes the lab, it will die. This way, it cannot harm the environment.

However, one has to keep in mind that all organisms underlie natural occurring mutation. Therefore the killswitch might be destroyed through mutations while the organism is under lab conditions. This problem causes subsequently that the GMO might threatens nature due to its survival after an accidental escape.

A killswitch is a genetic regulatory network. Such a network is built from biological components such as promoters and genes which are interlinked with each other. The network’s structure – also called its topology – plays a crucial role for killswitchs: Weakly design killswitch topologies are prone to destruction through mutation. Therefore, a GMO’s safety classification depends on its killswitch topology. This raises the need for a tool to quantify a killswitch topology’s robustness against mutation. Such a tool does not only provide help for a biologist designing a GMO how to design a specific killswitch but it can be also used to derive general design guidelines.

In the following, we will describe how the escape rate of a killswitch is computed. The escape rate is defined as the probability that a killswitch is destroyed due to mutations during lab conditions such that the organism survives in wild life.

Genetic regulatory networks

Genetic regulatory networks (GRN) describe how genes are expressed inside an organism. This is very comprehensively summarized in wiring diagrams. Taking electrical circuits as comparison, the electrical current is equivalent to the flow of gene expression in GRNs. Genetic promoters represent nodes with different logical behaviors that lead to gene expression and therefore proteins being built. Subsequently, these proteins control promoters again such that a complex behavior arises.

A common way to put this into a context so that quantitative statements are possible is to use the modeling of GRNs based on ordinary differential equations (ODEs).

First, a vector of all involved substances is defined $$\vec{X} := \sum_{ i=1 }^{ n } X_i \cdot \vec{e}_i$$ with $X_i$ being the concentration of the $i$th substance. Substances span from mRNA over proteins to intermediate complexes. This vector $\vec{X}_i$ depends on the degree of model detail and the precise biological processes (for instance dimerization or cooperativity).

The time derivative of the vector $\vec{X}_i$ is then formulated as a continuous function that might depend on the concentrations of all involved substances $$\frac{d}{dt}\vec{X}(t) = \vec{f}(\vec{X},t,k)$$ where $t$ is a given point in time and $k=(k_i)_i$ encapsulates all constant parameters used in the time derivative. The parameters $k_i$ are of very high relevance: They capture the biological processes and determine if the mathematical model in return resembles the involved biology.

Modeled killswitches

(a) BNU China 2014

The killswitch of BNU China’s iGEM team in 2014 was chosen to be modeled first. Topologically, it is a simple cascade of repressing units. The network topology given by BNU China 2014 is shown in figure 1. We investigated the involved biological processes and updated the network’s topology accordingly to resemble the biological structure correctly. This can be found in figure 1 as well.

Figure 1. (a) The topology of the BNU China 2014 killswitch. (b) Our adapted and more detailed topology.

The inducer is IPTG and controls the toxin MazF. We introduced the dimerization of LacI and CI such that the dimers interact with the promoters. Furthermore, we modeled the repression through IPTG by a reversible second order reaction.

With these adaptions, the ODE system of BNU China’s iGEM team 2014 is a 10 dimensional vector and requires 23 parameters. The parameters have been chosen to be in accordance with biological processes. This killswitch is the starting point for other network topologies that have been studied.

(b) Parallel (OR) regulation

While the previous killswitch consists of a serial cascade, another generic topology is two parallel cascades regulating the same toxin. These two cascades regulate the promoter in a OR gate manner: The promoter is repressed if one regulator is present.

Each cascade consists of the BNU China 2014 killswitch explained above. We take the step from specific biological substances to abstract substances that behave like their biological equivalences: IPTG becomes I, lacI (LacI) becomes aa/ab (Aa/Ab), cI (CI) becomes ba/bb (Ba/Bb) and finally mazF (MazF) becomes c (C). This way, we focused on the topology of the killswitches. The topology can be found in figure 2.

Figure 2. The parallel (OR) killswitch's topology.

These changes lead to a 19 dimensional concentration vector with 42 parameters . Since the overall mutation probability depends on the number of parameters, this killswitch is not comparable to the original BNU China 2014 killswitch anymore. For that reason, we introduce a longer cascade in the following that is comparable to this parallel (or) regulation.

(c) Serial regulation

This serial killswitch introduces two chained BNU China 2014 killswitches. That feature increases the number of parameters to 39. Therefore, it is comparable to the parallel (or) killswitch.

Figure 3. The serial killswitch's topology.

(d) Parallel (AND) regulation

The last killswitch is similar to the parallel (or) topology. It differs in the sense that the two branches are now interlinked in an AND gate manner: Both regulating branches must be active in order to repress the promoter (CHECK!). Each branch is equivalent to the BNU China 2014 killswitch. The number of parameters is 41 and the concentration vector is 18 dimensional. Therefore, it is also comparable to the killswitches in (b) and (c).

Figure 4. The parallel (AND) killswitch's topology.

General genetic algorithms

General genetic algorithms are a population-based meta-heuristic approach to optimize a complex target function. The leading question is: Given a large input parameter space, how to choose a configuration that yields a maximal value in terms of a certain measure function.

This problem is formulated in terms of biological evolution. A specific configuration from the large input parameter space is called individual and its value in terms of the chosen measure function is defined as its fitness value. An individual has a representation. A set of $N$ individuals is called a population of size $N$.

A population is simulated over the course of time for a number of generations $g$. After each generation, previously modeled evolutionary processes are applied. These comprise mutation of an individual, parental selection from the population based on the individuals’ fitness values and individual pairing. The so generated offspring replaces the previous generation and forms the next generation.

Thus, genetic algorithms resemble evolution through mutation and selection pressure in order to find the fittest individual.

Adapted genetic algorithm

We use a modified genetic algorithm with fixed population size that computes the escape rate of individuals implementing a killswitch due its destruction through mutations. The killswitches are modeled as genetic regulatory networks. A major difference is that it is not used for an optimization task but rather for the evolution of the population. Below, the algorithm processes are defined and comprehensively summarized in figure 5.

Figure 5. The used genetic algorithm.

The algorithm takes various input parameters: The killswitch to simulate, the population size $N$, the mutation rate per parameter $\eta$ and the generations in incubation $g$. The major result is the escape rate: The probability that an individual survives a release into wild life even though the killswitch should kill it. Since mutations gradually change the killswitch’s parameters over the incubation time, the escape rate is a measure for how robust the killswitch’s topology is in terms of evolutionary stability.

Implementation

The computationally most expensive step is solving the differential equation for $\vec{X}(t)$ for all substances and all $N$ individuals at each time. Various mutation rates $\eta$ are set and each simulation is executed multiple times to obtain statistical relevance.

The algorithm including all numerics has been implemented in Python to run on arbitrary many CPU cores in parallel. The systems of ODEs are generally not solvable analytically. Thus, we have used a stiff ODE integrator to compute the solutions: lsoda with bdf routine from the well-established ODEPACK.

The code corresponding to the algorithm explained above and further optimizations is available upon request (Martin Lellep, martin.lellep@physik.uni-marburg.de).

Summary

We introduced further steps in the general genetic algorithm to adapt it to our needs. Not the individual with the highest fitness is computed but rather the escape rate of individuals after incubation time due to mutations. We accomplished this by using an integration of ODEs to update the population from one to another time step.

The results section will present the simulated killswitches' dynamics and the evolutionary algorithm's findings in terms of escape rates. We provide general statements on how a killswitch topology should be designed for future constructs.

@@ Line 155: / Line 155: @@
              <p>
-First, a vector of all involved substances is defined $$\vec{X} := \sum_{ i=1 }^{ n } X_i cdot \vec{e}_i$$
+First, a vector of all involved substances is defined $$\vec{X} := \sum_{ i=1 }^{ n } X_i \cdot \vec{e}_i$$
 with \(X_i\) being the concentration of the \(i\)th substance. Substances span from mRNA over proteins to intermediate complexes. This vector \(\vec{X}_i\) depends on the degree of model detail and the precise biological processes (for instance dimerization or cooperativity).
              </p>
@@ Line 331: / Line 331: @@
              <div class="figure_text">
-<b>Figure 2.</b> The serial killswitch's topology.
+<b>Figure 3.</b> The serial killswitch's topology.
              </div>
@@ Line 364: / Line 364: @@
              <div class="figure_text">
-<b>Figure 2.</b> The parallel (AND) killswitch's topology.
+<b>Figure 4.</b> The parallel (AND) killswitch's topology.
              </div>

Difference between revisions of "Team:Marburg/Modeling"