Difference between revisions of "Team:Cambridge-JIC/Model"

Revision as of 13:51, 18 October 2016

Cambridge-JIC

MODELLING

INTRODUCTION


The aim of the modelling section was to create an integrated, kinetic model of our Cas9-guided chloroplast transformation mechanism. This model can be used to understand the internal workings of our proposed transformation method, and to determine whether it is viable and genuinely superior to existing methods. It is also of practical use in estimating the expected timescales for homoplasmy (and thus when re-plating and selection could plausibly begin). The model is fully documented, and the code is available online and thoroughly commented. Beyond predicting homoplasmy times in the chloroplasts of other organisms, it could be adapted to yield useful information about Cas9-driven genetic transformation in any organism, as it encompasses the kinetics of:

  • Cas9-gRNA active complex formation
  • Cas9 cleavage on the chloroplast genome (for both on- and off-target sites)
  • integration of the gene of interest, via homologous recombination

PREVIOUS MODELS

Kinetic modelling of CRISPR/Cas9 action has been attempted by several iGEM teams in the past. However, past iGEM models have largely been geared towards using dCas9, a modified Cas9 molecule with no cleavage activity, to form genetic circuits. To the best of our knowledge, no past iGEM project has included a freely available, integrated model of Cas9-induced cleavage or homologous recombination. The Salis lab at Penn State University, however, have recently published a complete, biophysical model of CRISPR/Cas9 cleavage activity [1], which was a major point of reference for our modelling.

MODEL FORMULATION

1. Gene expression

For active Cas9 cleavage, a gRNA template must be transcribed, the Cas9 gene must be transcribed and translated, and a gRNA-Cas9 complex must form. In our proposed chloroplast transformation method, the gRNA and Cas9 are both expressed from the same “driver” cassette, but under individual promoters (the psaA promoter for the gRNA, and the atpA promoter for Cas9). The cassette is introduced into the chloroplast by a transformation method such as biolistic transformation with a gene gun. The Cas9 and gRNA then diffuse randomly through the chloroplast until they encounter one another, at which point they form a Cas9:gRNA complex. A further isomerisation reaction must then take place for this complex to become capable of cleavage. This process is modelled in the following 5 equations:

Eqn 1

Where:

  • NgRNA is the number of gRNA strands in a single chloroplast
  • NCas9 mRNA is the number of mRNA strands coding for Cas9 in a single chloroplast
  • NCas9 is the number of Cas9 molecules in a single chloroplast
  • Nintermediate is the number of inactive, pre-isomerisation, Cas9:gRNA complexes in a single chloroplast
  • NCas9:gRNA is the number of active Cas9:gRNA complexes in a single chloroplast
  • Ndriver is the number of driver cassettes introduced into the chloroplast by a given transformation method
  • The various k values represent first- or second-order rate constants
  • The various δ values represent degradation rates
  • rc[i] represents the cleavage rate of the ith target site on the chloroplast genome
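
As a minimal sketch, assuming simple mass-action kinetics, the five quantities defined above could be evolved as follows. The exact form of each term is an assumption based on the description in the text, not the team's published equations, and every parameter name and value here is a hypothetical placeholder rather than a fitted number.

```python
from dataclasses import dataclass


@dataclass
class ExpressionParams:
    # Hypothetical placeholder values, chosen only so the sketch runs.
    n_driver: float = 10.0   # N_driver: driver cassettes per chloroplast
    k_tx_grna: float = 1.0   # gRNA transcription rate per cassette
    k_tx_cas9: float = 0.5   # Cas9 mRNA transcription rate per cassette
    k_tl: float = 2.0        # Cas9 translation rate per mRNA
    k_bind: float = 0.01     # second-order Cas9 + gRNA association
    k_iso: float = 0.1       # first-order isomerisation to the active complex
    d_grna: float = 0.1      # degradation rates (the delta values)
    d_mrna: float = 0.1
    d_cas9: float = 0.05
    d_inter: float = 0.05
    d_active: float = 0.05


def derivatives(state, p, total_cleavage_rate=0.0):
    """Assumed right-hand sides for (gRNA, Cas9 mRNA, Cas9, intermediate, active)."""
    grna, mrna, cas9, inter, active = state
    binding = p.k_bind * cas9 * grna  # Cas9 and gRNA meet to form the intermediate
    return (
        p.k_tx_grna * p.n_driver - p.d_grna * grna - binding,
        p.k_tx_cas9 * p.n_driver - p.d_mrna * mrna,
        p.k_tl * mrna - p.d_cas9 * cas9 - binding,
        binding - p.k_iso * inter - p.d_inter * inter,
        # Active complex is consumed by cleavage (sum of rc[i]) because
        # Cas9 is treated as a single-turnover enzyme.
        p.k_iso * inter - p.d_active * active - total_cleavage_rate,
    )


def integrate(state, p, dt=0.01, steps=1000):
    """Forward-Euler integration; adequate for a sketch, not for stiff systems."""
    for _ in range(steps):
        rates = derivatives(state, p)
        state = tuple(max(0.0, s + dt * r) for s, r in zip(state, rates))
    return state


final_state = integrate((0.0, 0.0, 0.0, 0.0, 0.0), ExpressionParams())
```

Starting from an empty chloroplast, `final_state` holds the five molecule counts after `dt * steps` time units. A production model would more likely use an adaptive ODE solver than fixed-step Euler.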

2. Cleavage rates

Once the Cas9:gRNA active complex has been formed, it is free to diffuse through the chloroplast until it finds a site on the genome. Cas9 uses no external energy source for binding, relying solely on the bound state being an energetically favourable configuration. It will only bind to the site if it “recognises” the PAM – i.e., if the binding of Cas9 to the PAM is energetically favourable (notably, this does not require a perfect match).

Once the site has been “recognised”, Cas9 moves down the genome, unwinding it nucleotide by nucleotide, displacing the complementary DNA strand, and allowing the gRNA to bind in its place (by Watson-Crick base pairing). This is known as R-loop formation. Each mismatch between the gRNA and genome carries an energy penalty. Enough mismatches will make the interaction energetically unfavourable, causing Cas9 to dissociate from the genome and diffuse freely until it finds another site. If the interaction is still energetically favourable once the R-loop has been formed along the full length of the gRNA, Cas9 will cause a double-stranded break in the DNA, 3–4 base pairs from the PAM [1]. It then dissociates and, as a single-turnover enzyme, loses its efficacy [2].

The kinetics used to model this process are very heavily influenced by the work of the Salis Lab, although similar work was done by the 2013 iGEM team from Wuhan University in China [3]. From its site of production, the Cas9:gRNA complex can be taken to undergo molecular diffusion in the form of an isotropic, 3-dimensional random walk. This leads to the following equation to describe the rate of contact of the Cas9:gRNA complex with all possible sites on the genome:

Eqn 2

Where:

  • rRW is the rate of DNA site contact
  • D is Cas9’s diffusivity
  • λ is the characteristic length between the site of Cas9:gRNA complex production and binding site
  • V is the chloroplast volume
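
The scaling behind an isotropic, 3-dimensional random walk can be illustrated with the standard relation ⟨r²⟩ = 6Dt. The sketch below is not the contact-rate expression itself; it only shows how D and λ set a characteristic diffusion time, and the numerical values used are hypothetical.

```python
def mean_squared_displacement(diffusivity, time):
    """<r^2> = 6 D t for an isotropic random walk in three dimensions."""
    return 6.0 * diffusivity * time


def characteristic_diffusion_time(diffusivity, length):
    """Time at which the root-mean-square displacement reaches `length`."""
    return length ** 2 / (6.0 * diffusivity)


# Hypothetical values: D in um^2/s, lambda in um.
t_lambda = characteristic_diffusion_time(diffusivity=10.0, length=1.0)
```

Doubling λ quadruples the characteristic time, which is why the distance between the site of complex production and the binding site matters for the contact rate.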

As mentioned previously, Cas9 binding is governed by the sum of the free energy changes involved. Three energy exchanges are involved in this process. The first energy exchange results from binding of the PAM, ΔGPAM.

A second energy exchange, ΔΔGexchange, comes from R-loop formation. Using a nucleic acid nearest-neighbour model, this can be considered a weighted sum of the energy exchanges involved in binding each gRNA duplex (a pair of adjacent nucleotides) to its corresponding DNA duplex. 256 of these energy exchange parameters exist (16 possible RNA duplexes, each binding to any of 16 possible DNA duplexes). They are expected to be near 0 for a match (e.g. rCG/dGC) and to carry a positive energy penalty for a mismatch (e.g. rCG/dAC). Mismatches nearer to the PAM have been shown to carry a larger energy penalty than those further away. Thus, ΔΔGexchange for a 20-nucleotide gRNA is given by:

Eqn 3

Where dk is the weight of a mismatch k nucleotides from the PAM, and ΔΔGgRNA:Cas9[k,k+1] is the free energy change from binding of the duplex formed by the nucleotides at the kth and (k+1)th positions.
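
The weighted sum described above can be sketched in a few lines. The linear weight profile and the 2.0 energy-unit penalty are hypothetical choices for illustration only; real values of dk and of the 256 nearest-neighbour parameters would have to come from the literature.

```python
GUIDE_LENGTH = 20
N_STEPS = GUIDE_LENGTH - 1  # 19 nearest-neighbour steps [k, k+1] in a 20 nt gRNA


def position_weights(n_steps=N_STEPS):
    """Hypothetical linear weights d_k: 1.0 next to the PAM, smaller further away."""
    return [1.0 - k / n_steps for k in range(n_steps)]


def exchange_energy(step_penalties, weights):
    """Weighted sum of per-step free energy penalties (PAM-proximal step first)."""
    if len(step_penalties) != len(weights):
        raise ValueError("one penalty is needed per nearest-neighbour step")
    return sum(d * g for d, g in zip(weights, step_penalties))


weights = position_weights()
# A single mismatched step with a hypothetical penalty of 2.0 energy units.
pam_proximal = exchange_energy([2.0] + [0.0] * 18, weights)
pam_distal = exchange_energy([0.0] * 18 + [2.0], weights)
```

With these illustrative weights the same mismatch costs far more next to the PAM than at the distal end, which is the qualitative behaviour the text describes.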

A final energy consideration proposed by the Salis lab is the supercoiling induced by Cas9’s binding to other, nearby sites. When Cas9 binds to a site, the uncoiling (negative supercoiling) of the genome which is necessary to form an R-loop is likely to further coil (positive supercoiling) all adjacent sites within a certain distance, as DNA’s linking number is conserved. This carries a positive energy penalty, making the nearby sites harder to uncoil. Discrepancies in the degree of supercoiling across the genome will result in a similar effect. However, because most Cas9 binding will occur at a single site (the on-target site) in the regime we are considering, we were able to neglect the first effect. We also neglected the second, as there is only evidence of small variations in supercoiling across the genome [4].

The probability of the Cas9:gRNA complex binding to a potential DNA site is governed by a Boltzmann function, taking into account the Gibbs free energy changes of all potential binding sites:

Eqn 4
Where:
  • Ntarget[i] is the ith potential DNA binding site
  • ΔGtarget[i] is the free energy change associated with Cas9 binding to the ith potential site
  • kB is the Boltzmann constant
  • T is the temperature (taken here as 25°C)
  • N is the length of the genome
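
A Boltzmann-weighted binding probability of this kind can be sketched as below. The normalisation over all candidate sites is an assumption about the form of the expression, energies are taken in kcal/mol (so the gas constant R replaces kB), and the example sites and energies are invented for illustration.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal / (mol K); molar equivalent of k_B


def binding_probabilities(copies, delta_g, temperature_k=298.15):
    """Return the probability of binding each site.

    copies[i]  : number of copies of site i (N_target[i])
    delta_g[i] : free energy change of binding site i, in kcal/mol
    """
    kt = GAS_CONSTANT * temperature_k
    g_min = min(delta_g)  # shifting all energies cancels in the ratio and avoids overflow
    boltzmann = [n * math.exp(-(g - g_min) / kt) for n, g in zip(copies, delta_g)]
    total = sum(boltzmann)
    return [w / total for w in boltzmann]


# Hypothetical example: one strongly favoured on-target site and two weaker off-targets.
probs = binding_probabilities(copies=[80, 80, 80], delta_g=[-10.0, -6.0, -2.0])
```

At 25°C the thermal energy is about 0.59 kcal/mol, so a difference of a few kcal/mol between sites is enough to concentrate almost all binding on the most favourable one.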

It follows that the rate of Cas9 binding to a given site is given by the rate of contact with the site (rRW) multiplied by the probability of binding. Once Cas9 has bound, it may cause a double-stranded break in the site, usually around 3 base pairs from the PAM. However, it may still dissociate from the site before it cleaves, leading to the equation:

Eqn 5

Where kc is the cleavage rate constant, and kd the dissociation rate constant.
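
The competition between cleavage and dissociation can be sketched as follows. When a bound complex either cleaves at rate kc or dissociates at rate kd, the fraction that cleaves is kc / (kc + kd); combining this with the contact rate and binding probability as a simple product is an assumption based on the description above, and the numbers are hypothetical.

```python
def cleavage_rate(contact_rate, p_bound, k_cleave, k_dissociate):
    """Assumed form: rc[i] = rRW * P_bound[i] * kc / (kc + kd)."""
    fraction_cleaved = k_cleave / (k_cleave + k_dissociate)
    return contact_rate * p_bound * fraction_cleaved


# Hypothetical values: contact rate 2.0 per unit time, 50% binding probability,
# and cleavage three times faster than dissociation.
rc = cleavage_rate(contact_rate=2.0, p_bound=0.5, k_cleave=3.0, k_dissociate=1.0)
```

If dissociation is negligible (kd much smaller than kc), every binding event ends in a cut and the cleavage rate reduces to the binding rate.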


3. Homologous recombination

Once a site has been cleaved by Cas9, it will either be repaired by homologous recombination, or the unstable, cleaved copy of the chloroplast genome will degrade until it can no longer be repaired. In the C. reinhardtii chloroplast, there is no evidence of non-homologous end joining, and thus repair seems to happen exclusively through homologous recombination [5]. Homology-directed repair requires another, un-cleaved copy of the broken site to act as a template for repair of the cleaved one. We thus model it as a second-order process, dependent on both the number of cleaved and un-cleaved sites. The rate of degradation is considerably lower than the rate of recombination [6], and so here we neglect it, leading to the following expression for a given target site:

Eqn 6
Where:
  • Ntarget[i] is the number of un-cleaved copies of the Cas9 binding site
  • Ntotal[i] is the total number of both cleaved and un-cleaved copies of the binding site, such that Ntotal[i]-Ntarget[i] is the number of cleaved copies of the binding site
  • kh is the homologous recombination rate parameter
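
The second-order repair term can be sketched directly from these definitions. Treating the net change in un-cleaved copies as repair minus cleavage is an assumption about how the terms combine, and the copy number and rate values are hypothetical.

```python
def repair_rate(n_target, n_total, k_h):
    """Second-order repair: k_h * (un-cleaved templates) * (cleaved copies)."""
    n_cleaved = n_total - n_target
    return k_h * n_target * n_cleaved


def target_site_derivative(n_target, n_total, k_h, r_cleave):
    """Assumed net change in un-cleaved copies: gain from repair, loss from cleavage."""
    return repair_rate(n_target, n_total, k_h) - r_cleave


# Hypothetical values: 80 genome copies, 20 of them currently cleaved.
rate = repair_rate(n_target=60, n_total=80, k_h=0.01)
```

The repair rate vanishes both when nothing is cleaved and when no intact template remains, which is the signature of a process that needs one cleaved and one un-cleaved copy.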

As we are neglecting the possibility of degradation of the cut site, Ntotal[i] will be equal to the chloroplast’s copy number, Ncopy, for all sites but the on-target site (i.e., the site which exactly matches the gRNA). However, for the on-target site, Ntotal[i] will steadily decrease, as these sites are converted to contain the gene of interest and are no longer viable Cas9 binding sites. Here, we assume perfect selection pressure: once the on-target site has been cut, there is a 100% chance of it being repaired using the gene of interest, and once it has been converted to contain the gene of interest (and the antibiotic resistance which is found on the same cassette), there is zero probability of it being converted back. The first assumption was somewhat validated by our sensitivity analysis (see below), and the second seems reasonable considering that once the site is converted, it is no longer a target for Cas9 cleavage. This leads to the expression:

Eqn 7

Here, NGOI is the number of copies of the gene of interest present in the chloroplast, with flanking homology regions. This will initially be the number of “gene of interest” cassettes inserted via the transformation method (e.g. biolistics), but will increase as more copies of the genome contain the gene of interest, following the expression

Eqn 8
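
The on-target bookkeeping described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the team's equations: cleavage is taken as a constant rate while un-cleaved copies remain, every cleaved on-target copy is repaired from a gene-of-interest template at rate kh × NGOI × (cleaved copies), and all numerical values are hypothetical.

```python
def simulate_on_target(n_copy, n_goi0, r_cleave, k_h, dt=0.01, t_end=200.0):
    """Forward-Euler sketch of on-target conversion under perfect selection.

    Returns (time, n_target, n_cleaved, n_goi) when the run stops, which happens
    at t_end or once essentially no unconverted copies remain.
    """
    n_target, n_cleaved, n_goi = float(n_copy), 0.0, float(n_goi0)
    t = 0.0
    while t < t_end and n_target + n_cleaved > 1e-3:
        # Copies cleaved this step (cannot exceed what is left).
        cut = min(r_cleave * dt, n_target)
        # Copies repaired using the gene of interest this step.
        fixed = min(k_h * n_goi * n_cleaved * dt, n_cleaved + cut)
        n_target -= cut
        n_cleaved += cut - fixed
        n_goi += fixed  # each converted copy becomes a new template
        t += dt
    return t, n_target, n_cleaved, n_goi


# Hypothetical run: 80 genome copies and 5 introduced gene-of-interest cassettes.
t_done, n_target, n_cleaved, n_goi = simulate_on_target(80, 5, r_cleave=1.0, k_h=0.01)
```

In this sketch Ntotal corresponds to `n_target + n_cleaved`, and the sum `n_target + n_cleaved + n_goi` is conserved, so every copy lost from Ntotal appears in NGOI. The returned time gives a rough homoplasmy timescale for the chosen, purely illustrative parameters.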

PARAMETERISING THE MODEL

1. Gene expression

As with all chloroplasts, regulation of gene expression in the C. reinhardtii chloroplast is a complex process. The majority of gene regulation happens at the post-transcriptional level, with regulatory factors changing the shape of the mRNA’s 3’ and 5’ untranslated regions (UTRs) in response to a range of inputs, including light conditions and nuclear signals that report conditions such as the phase of the C. reinhardtii cell’s circadian rhythm [7]. This results in extremely dynamic mRNA degradation and translation rates, making gene expression very hard to model accurately. Further complicating the matter is a notable lack of data on absolute gene expression levels in the C. reinhardtii chloroplast.