Line 257: | Line 257: | ||
</p> | </p> | ||
− | + | ||
<h3 class="h3">Adding B_carotene reactions</h3> | <h3 class="h3">Adding B_carotene reactions</h3> | ||
<p> | <p> |
Revision as of 03:35, 19 October 2016
Overview
A Genome-scale-model (GSM) is a representation of the metabolic network in a form that computers can understand, and make calculations to predict the fluxes (rates of reactions).
We used GSM for two purposes,
- Explore how Yarrowia Lipolytica (Y.Lip) grows in different environments, and identify optimal conditions for growing Y.Lip.
- Predict what genes can be amplified to enhance beta-carotene yield.
Results
We used the most recently published GSM for Y.Lip, solved using Cameo [cameo.bio] and CobraMatlab.
We fulfilled purpose 1: using Phenotype Phase Planes (PHPP). We investigated different Carbon sources as substrates and identified lines of optimality (the optimal relationship between substrate and oxygen uptake rates). The simulation results are supported by the substrate screening experiments, adding the value of predicting the effect of substrate uptake rates on growth behaviour.
We fulfilled purpose 2: using Flux Scanning based on Enforced Objective Flux (FSEOF) simulations. We identified the short list of genes that could be amplified to optimize beta-carotene production.
The results will serve as excellent starting point and guidance for future genetic engineering efforts in the lab.
Theory
As annotated whole genome sequences become common, It is possible to reconstruct a genome scale representations of the metabolic network as a computational model. The GSM has information on what metabolites are in the cell, what reactions does each metabolite take part in, the stoichiometry of the reactions, reactions’ rate limits, reversabilities, charge balances, genes associations etc. Each organism has its own GSM, containing its specific metabolites, reactions and genes.
The living organisms must abide by certain constraints or limitations, such as mass balance, limitation of nutrients in medium etc. The different types of constraints are explained in [Genome scale models of microbial cells: D. price] these constraints can be modelled in GSM by setting balanced equations and limiting reaction rates (fluxes).
Setting different constraints on the fluxes allow us to define different cultivation environments. We can set some fluxes to certain levels to simulate a specific environment, or gene manipulation, allowing us to simulate the Wild-Type’s (WT) and/or Mutant’s growth behaviour in that specific environment.
We have algorithms from literatures which help us to perform these simulations. The algorithms are implemented in Cobrapy extension Cameo and Matlab’ Cobra toolbox, some functions are modified by our team to better suit our needs. The algorithms include FBA (flux balance analysis) based PHPP(PHenotype Phase Plane), and FSEOF (Flux scanning based on enforced objective flux). In the next sections, we will tell our story on how we used these algorithms to achieve our goals of exploring Y.Lip growth and predicting gene manipulations to achieve better b-carotene yield.
Using FBA to calculate steady-state flux distribution
Most GSM problems are solved using the flux balance analysis (FBA). It formulates a constraint based optimization problem from the constraints we defined for the GSM. If solved, FBA reveals the flux distribution within the metabolic network. FBA assumes the network is in steady-state, which means the concentration of metabolites are not changing and in balance. The other constraints are uptake rates of substrate and oxygen. FBA underlies all the algorithms we used to conduct the genome scale modelling.
Genome-scale Modelling
The following explains the mathematical basis for genome-scale modeling (GSM). In short, a metabolic network can be modeled as linear equations and optimized by solving the corresponding linear program. The output is the set of reaction rates for all reactions in the network along with the set of shadow prices associated with the metabolites. Both play important roles in the subsequent phenotype phase plane analysis.
A simple metabolic network
Consider a very simple metabolic network with two metabolites, \(A\) and \(B\), where \(A\) flows into the cell with rate \(r_1\), is converted into \(B\) with rate \(r_2\), which is then excreted from the cell with rate \(r_3\): $$ \begin{split} & \xrightarrow{r_1} A \\ A & \xrightarrow{r_2} B \\ B & \xrightarrow{r_3} \\ \end{split} $$ Under a steady-state assumption, i.e. where all metabolite levels are constant and no metabolite can be accumulated in the system, the formation and degradation rates for each metabolite must cancel each other. For the network above, this is equivalent to the following set of constraints: $$ \begin{split} A:& \quad r_1 - r_2 = 0 \\ B:& \quad r_2 - r_3 = 0 \end{split} $$The linear program
Now, consider the full genome-scale model of Yarrowia lipolytica with \(M\) = 1683 metabolites and \(R\) = 1985 reactions.
Let \(r = (r_1, ... , r_R)\) be the vector of the \(R\) reaction rates, and similarly let \(m = (m_1, ... ,m_M)\) be the vector of the \(M\) metabolites. The reaction of which the rate is to be maximized, for example the biomass function, is then denoted the objective function and can be expressed as, $$ f(r) = r_{\text{obj}} $$ Now, for \(m_i\), the equality constraint \(c_i\) can be expressed as a sum of all \(r_j\) weighted by the stoichiometric coefficients \(a_{ij}\), where \(a_{ij}\) = 0 for reactions that do not affect the concentration of the considered metabolite. Thus, $$ c_i(r) = \sum\limits_{j=1}^R a_{ij}r_j $$ Which is just a generalization of the constraints of the two-metabolite network presented in the previous section. In addition to the steady-state constraints on the metabolites, for each reaction rate, \(r_j\), there is a lower and an upper bound (\(l_j\) and \(u_j\)). These inequality constraints can be expressed as, $$ \begin{split} c_{j,l} &= r_j \geq l_j \iff r_j - l_j \geq 0 \\ c_{j,u} &= r_j \leq u_j \iff -r_j + u_j \geq 0 \end{split} $$ Having the objective function and the constraints well-stated, the linear program can be formulated in standard form: $$ \begin{split} \max \quad & f(x) \\ s.t. \quad c_i &= 0 \quad \forall i \\ c_j &\geq 0 \quad \forall j \end{split} $$ Solving this program is equal to optimizing the genome-scale model. In order to accomplish that, it is necessary to define the so-called Lagrangian Function.
Solution to FBA and shadow prices
For each constraint \(c_i\) on the metabolite \(m_i\), let \(\lambda_i\) be the associated Lagrange multiplier or shadow price, let \(\lambda = (\lambda_i,...,\lambda_M)\) and let \(c = (c_1,...,c_M, c_{M+1}...,c_{M+R})\). Now, define the Lagrangian function: $$ \mathcal{L}(\lambda,r) = f(r) - \lambda^Tc(r) $$ The solution to the linear program is then found by solving the system, $$ \begin{pmatrix} \bigtriangledown \mathcal{L}(\lambda,c,r) \\ c(r) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} $$ Then, the output will consist of the set \(f(r),r,\lambda)\), i.e. the reaction rates (including the one corresponding to the objective function) and the shadow prices. In our case, the linear program was solved in COBRA, which uses a simplex algorithm by default. The objective function and constraints were based on an SBML model. REFERENCE Further mathematical details are omitted. REFERENCE.
Using Phenotype Phase Plane to explore growth and production in different environments
The Phenotype Phase Plane is a great way to explore target flux (e.g. growth rate, product secretion rate), as a function of two fluxes, in our case, carbon source uptake rate and oxygen uptake rate. This allows us to predict how Y.Lip grows using different Carbon(C) sources and different availability of the C source and oxygen. This kind of dual-variable (C and O2) flux investigation is called Phenotype Phase Plane(PHPP) analysis, it predicts the growth rate (or product fluxes, depending on your goal) in different cultivation environments.[characterizing the metaboluc phenotyoe: a phenotype plase plane analysis, Edwards] We can then find clues on how environment affects fluxes, and identify what conditions are favoured by Y.Lip. If we change the substrates in the medium to wastes, we can see how much growth behaviour changes.
There are multiple feasible solutions to a FBA problem, Not all of these solution have a biological counterpart. However, there are regions in the solution space over which the rate of shadow prices are constant. These regions or phases have been shown to resemble real phenotypes .[characterizing the metaboluc phenotyoe: a phenotype plase plane analysis, Edwards] For some uptake rate combinations, we have infeasible phenotype, this means this combination of uptake rates cannot satisfy the constraints, this maybe due to inability to satisfy ATP maintenance energy.
We produced phenotype phase planes for Y.Lip using different carbon sources. This allow us to see what substrates paired with what oxygen level makes better growth.
Adding Beta-Carotene pathway into the GSM
To model the insertion of the beta-carotene pathway, we introduced 2 more genes - crtYB and crtI, and they catalyze 4 new reactions. These reactions are introduced into the GSM as below: [https://2011.igem.org/Team:Johns_Hopkins/Project/VitA]
- phytoeneSynthase (crtYB): 2.0 geranylgeranyl diphosphate < --> 2.0 diphosphate + phytoene
- caroteneDesaturaseStep1 (crtI): phytoene <--> 6.0 H+ + neurosporene
- caroteneDesturaseStep2 (crtI): neurosporene <--> lycopene + 2.0 H+
- lycopeneCyslase (crtYB): lycopene --> b-carotene
To be able to optimize beta-carotene product flux, we must define a sink, or demand reaction in the GSM, this is done by creating a boundary exchange reaction where beta-carotene is in silico secreted into extracellular space, this reaction is defined as:
- DM_bcarotene: b-carotene →
DM_bcarotene is then set consequently as objective function when optimizing for production flux.
Using FSEOF to identify gene amplification and inhibition targets to optimize yield
Any organism will, by its evolutionary nature, maximize growth rate. As bioengineers however, we want a maximum of desired products while keeping a reasonable growth rate. Therefore we want an organism that is optimized for both growth rate and product yield. We do this by manipulating its genes, forcing product yield and growth rate to go hand-in-hand. In other words, we want an engineered strain that, in order to maximize its growth rate, it also produces the desired product. This is called coupling of product yield to growth rate. In our study, we couple beta-carotene yield to growth rate, and find out what genes should be amplified, inhibited, or knocked out, to achieve this coupling. this can be done by manipulating the genes, changing the flux bounds of certain reactions, to force the overall flux distribution towards the optimal solution.
We can also use constraints to reflect the effect of gene manipulation. As genes are defined and associated to reactions and metabolites in the GSM, we can simulate gene amplification, inhibition, and KOs, by giving the flux associated with that gene a value larger, smaller, or zero value, respectively.
We used FSEOF to identify gene amplification targets.
FSEOF is a technique to identify what fluxes increase, when beta-carotene flux is enforced to be a certain value. This is done, by incrementally increasing the beta-carotene production flux from zero towards its theoretical maximum, then under the constraints, optimize growth rate. As we increase the enforced production flux, and solve for the other fluxes, some of these fluxes will increase, while other will decrease. The assumption is, if we amplify the genes associated to the reactions whose fluxes increase, we can force the product and growth rate to couple. The FSEOF is introduced in [InSilico identficicaiton of gene amplificaiotn targets for imporvment of lycoepne prooduciton CHOI,].
Adding B_carotene reactions
Paragraph
Paragraph
what genes to amp? FVA and differential FVA
Paragraph
Paragraph
what genes to amp FSEOF
Paragraph
Paragraph
what genes to ko? optgene
Paragraph
Paragraph
lmoma and room results
Paragraph
Paragraph
Results
Has ut facer debitis, quo eu agam purto. In eum justo aeterno. Sea ut atqui efficiantur, mandamus deseruisse at est, erat natum cum eu. Quot numquam in vel. Salutatus euripidis moderatius qui ex, eu tempor volumus vituperatoribus has, ius ea ullum facer corrumpit.
phpp on y.lip and s.cer on diff. substrates
Paragraph
Paragraph
what genes to amp and ko
Paragraph
Paragraph
so whta does this means?
Paragraph
Paragraph
Section 4
Has ut facer debitis, quo eu agam purto. In eum justo aeterno. Sea ut atqui efficiantur, mandamus deseruisse at est, erat natum cum eu. Quot numquam in vel. Salutatus euripidis moderatius qui ex, eu tempor volumus vituperatoribus has, ius ea ullum facer corrumpit.
Section 5
Has ut facer debitis, quo eu agam purto. In eum justo aeterno. Sea ut atqui efficiantur, mandamus deseruisse at est, erat natum cum eu. Quot numquam in vel. Salutatus euripidis moderatius qui ex, eu tempor volumus vituperatoribus has, ius ea ullum facer corrumpit.