Team:Paris Saclay/Model


Introduction

One of the goals of our project is to visualize the “bring DNA closer” system with the tri-partite GFP. As a first step, we decided to focus the modeling on one question: “what is the optimal distance between the two dCas9 for fluorescence?” Our system is built from the following parts:

  • Two dCas9
  • Two linkers
  • Tri-partite GFP

This question is essential because the distance between the two dCas9 may cause a major problem: if this reporter system does not work, we may not see the effect of the other system. In particular, the 3D structure of our linker may interfere with the dCas9 or the GFP through steric hindrance.


3D Model

The first step is the 3D model. We used the PDB website to find the structures of all the proteins in our system and PyMOL to assemble them. For this model we have:

  • Two identical dCas9 from Streptococcus pyogenes instead of Streptococcus thermophilus and Neisseria meningitidis
  • The wild-type GFP with the 10th and 11th beta strands removed

The first limitation was the linker: we have no information about it other than its sequence. We therefore used the PEP-FOLD software to obtain a 3D prediction, but the predicted structures were very improbable and not usable for our model. We decided instead to build three different models: one with a short end-to-end distance, one with a long distance, and one with the mean of these two values. This gave us a first answer: the optimal distance lies between 73 and 110 base pairs.


Ideal Chain and Worm-Like Chain Models

We decided to simulate the end-to-end distance of our linker with mathematical models. The first was the Ideal Chain model (or freely jointed chain), the simplest model of a polymer, which treats the chain as a random walk. For N segments of length l, the contour length is the total unfolded length:

T--Paris Saclay--090816 contour length.jpg

R is the total end-to-end vector; it depends on the number of segments and the length of each segment:

T--Paris Saclay--090816 end to end.jpg


The end to end distance is distributed according to this probability density function:

T--Paris Saclay--090816 results1.jpg
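The ideal chain quantities above can be sketched in a few lines of Python. This is a minimal illustration rather than the program used for our graphs. It assumes the standard Gaussian-limit form of the density, and the number of segments (60) and the function names are hypothetical.

```python
import math

def contour_length(n_segments, seg_len):
    """Total unfolded length L = N * l."""
    return n_segments * seg_len

def rms_end_to_end(n_segments, seg_len):
    """Root-mean-square end-to-end distance, sqrt(<R^2>) = l * sqrt(N)."""
    return seg_len * math.sqrt(n_segments)

def fjc_pdf(r, n_segments, seg_len):
    """Gaussian-limit probability density of the end-to-end distance r."""
    mean_sq = n_segments * seg_len ** 2  # <R^2> = N * l^2
    norm = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return 4.0 * math.pi * r ** 2 * norm * math.exp(-3.0 * r ** 2 / (2.0 * mean_sq))

if __name__ == "__main__":
    n, l = 60, 1.5  # hypothetical chain: 60 segments of 1.5 Å
    print("contour length (Å):", contour_length(n, l))
    print("RMS end-to-end distance (Å):", rms_end_to_end(n, l))
    # Crude Riemann sum to check that the density integrates to about 1
    dr = 0.01
    print("integral of the density:",
          sum(fjc_pdf(i * dr, n, l) * dr for i in range(1, 20000)))
```

The Gaussian form is only a good approximation when N is large and the chain is far from full extension.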


With this model, we lose the information about the spatial arrangement of the repeat units. In a real chain, rotation of the bonds around the backbone is restricted by hindered internal rotation and by excluded-volume effects, so we know that these results are biased.


We continued our mathematical modeling with the Worm-Like Chain model, which is suited to describing semi-flexible polymers. We took the probability density function from Huan-Xiang Zhou (2004), Polymer Models of Protein Stability, Folding, and Interactions:

T--Paris Saclay--090816 Zhou.jpg
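As a complement to the density, the mean-square end-to-end distance of a worm-like chain has a simple closed form (the Kratky-Porod result). The sketch below evaluates it in Python. It does not reproduce the density taken from the paper, and the persistence length of 4 Å and contour length of 90 Å are placeholder values, not measurements on our linker.

```python
import math

def wlc_mean_square(contour_len, persistence_len):
    """Kratky-Porod <R^2> = 2 lp L [1 - (lp / L)(1 - exp(-L / lp))]."""
    ratio = contour_len / persistence_len
    return 2.0 * persistence_len * contour_len * (1.0 - (1.0 - math.exp(-ratio)) / ratio)

if __name__ == "__main__":
    contour = 90.0  # Å, hypothetical contour length
    lp = 4.0        # Å, placeholder persistence length (assumption)
    print("RMS end-to-end distance (Å):", math.sqrt(wlc_mean_square(contour, lp)))
```

For a chain much longer than its persistence length the result tends to 2 lp L (flexible limit), and for a chain much shorter it tends to L² (rigid rod).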


With these two models we wrote a Python program to obtain a first approximation of the end-to-end distance of our linker.

T--Paris Saclay--090816 results2.jpg

This graph shows that the two models behave differently, and we could not develop them further because they provide very little information: only the end-to-end distance. We therefore decided to code our own model to describe the behavior of our linker. The freely jointed chain and worm-like chain models still gave us an idea of the results to expect.


Our mathematical model

T--Paris Saclay--090816 angle.jpg

We decided to write a program that simulates our linker. To do so, we treat each segment as a bond and take all the dihedral angles into account. With PyMOL, we were able to define some constants:

  • The length of each segment: 1.5 Å
  • The angle θ: π/3








Our program simulates the linker in 3D. Each segment is represented by a vector, and all the vectors are expressed in the same basis and summed. The result is a single vector that represents the end-to-end distance. We also keep the coordinates to obtain a 3D representation of the linker. To do that, we initialize the first vector along the Oz axis and then express all the other segments in that basis. Each segment of our model is one of three bonds:

  • The N-Cα bond with the Φ angle
  • The Cα-C bond with the Ψ angle
  • The C-N bond, which is the peptide bond

We have to define a change-of-basis matrix for the first two bonds. For the peptide bond, we only consider the Rx matrix because Φ = 0. We define an initial vector:

T--Paris Saclay--090816 init vector.jpg

We then construct the change-of-basis matrix from a translation matrix Tz, a rotation matrix about the Oz axis Rz, and a rotation matrix about the Ox axis Rx.

  • Tz represents the translation of 1.5 Å corresponding to the segment length
  • Rz represents the Φ or Ψ rotation
  • Rx represents the θ rotation, with a fixed angle θ. It is defined relative to the Ox axis and rotates the coordinate system about the Ox axis.
T--Paris Saclay--090816 matrix.jpg

This gives the change-of-basis matrix P = Tz * Rz * Rx. With this matrix, we can express vector n in the basis of vector n-1:

T--Paris Saclay--090816 rotation.jpg
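The product P = Tz * Rz * Rx can be sketched with 4×4 homogeneous matrices in NumPy. This is one possible convention, written under our own assumptions, and not necessarily the exact matrices of our program. The function names are hypothetical, and the defaults reuse the constants given above (1.5 Å and θ = π/3).

```python
import numpy as np

def translation_z(d):
    """Homogeneous translation by d along the Oz axis."""
    m = np.eye(4)
    m[2, 3] = d
    return m

def rotation_z(angle):
    """Homogeneous rotation about the Oz axis (dihedral rotation)."""
    c, s = np.cos(angle), np.sin(angle)
    m = np.eye(4)
    m[:2, :2] = [[c, -s], [s, c]]
    return m

def rotation_x(angle):
    """Homogeneous rotation about the Ox axis (fixed angle theta)."""
    c, s = np.cos(angle), np.sin(angle)
    m = np.eye(4)
    m[1:3, 1:3] = [[c, -s], [s, c]]
    return m

def change_of_basis(dihedral, theta=np.pi / 3, seg_len=1.5):
    """P = Tz * Rz * Rx for one segment."""
    return translation_z(seg_len) @ rotation_z(dihedral) @ rotation_x(theta)

def build_chain(dihedrals, theta=np.pi / 3, seg_len=1.5):
    """Chain the matrices and return the coordinates of every segment end."""
    frame = np.eye(4)
    origin = np.array([0.0, 0.0, 0.0, 1.0])
    coords = [origin[:3].copy()]
    for angle in dihedrals:
        frame = frame @ change_of_basis(angle, theta, seg_len)
        coords.append((frame @ origin)[:3])
    return np.array(coords)

if __name__ == "__main__":
    chain = build_chain([0.3, 1.2, -2.0, 0.7])  # arbitrary dihedral angles
    print(chain)
    print("end-to-end distance (Å):", np.linalg.norm(chain[-1] - chain[0]))
```

With this convention the first segment lies on the Oz axis, every segment has length 1.5 Å, and the angle between consecutive segment vectors is θ.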

For each new segment, we define a value for Φ (which depends on the bond within the amino acid), and we calculate for each segment Ui:

T--Paris Saclay--090816 length U.jpg

In the last step, we keep the coordinates of each vector to plot the 3D visualization. We then completed the model with additional features. Because our linker does not consist entirely of glycine, we added a test that adjusts the dihedral angles, since other amino acids do not share the same Ramachandran plot. We also defined an exclusion zone around each vector, which rules out non-biological overlaps.
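An exclusion zone of this kind can be sketched as a simple distance test on the stored coordinates. The version below is an illustration under our own assumptions, not the exact test of our program: the minimum distance of 2.0 Å, the number of neighbouring segments that are ignored, and the function name are all hypothetical.

```python
import numpy as np

def has_clash(coords, min_dist=2.0, skip=2):
    """Return True if two points more than `skip` segments apart
    are closer than `min_dist` (both values are assumptions)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Only look at pairs (i, j) with j - i > skip
    i, j = np.triu_indices(len(coords), k=skip + 1)
    return bool(np.any(dist[i, j] < min_dist))

if __name__ == "__main__":
    straight = [[0.0, 0.0, 1.5 * k] for k in range(6)]
    folded = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.5], [0.0, 1.5, 1.5], [0.0, 1.5, 0.0]]
    print(has_clash(straight))  # extended chain
    print(has_clash(folded))    # chain that folds back on itself
```

A conformation that fails the test can be rejected and drawn again, which is one simple way to exclude overlapping chains.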


Results

Our program is designed to compute the end-to-end distance over n simulations, to provide an RMSD analysis, and to display one simulation of the linker in 3D.
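The repeated-simulation step can be sketched as follows. This is a simplified stand-in for our program: it draws every dihedral angle uniformly (an assumption, with no Ramachandran test and no exclusion zone), and the number of segments (30) and the function names are hypothetical. It is not expected to reproduce our numbers.

```python
import numpy as np

def end_to_end(n_segments, rng, theta=np.pi / 3, seg_len=1.5):
    """End-to-end distance of one random chain with a fixed angle theta."""
    ct, st = np.cos(theta), np.sin(theta)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    frame = np.eye(3)   # orientation of the current segment
    pos = np.zeros(3)   # current end of the chain
    for _ in range(n_segments):
        pos = pos + frame @ np.array([0.0, 0.0, seg_len])
        phi = rng.uniform(-np.pi, np.pi)  # assumption: uniform dihedral angle
        cp, sp = np.cos(phi), np.sin(phi)
        rot_z = np.array([[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]])
        frame = frame @ rot_z @ rot_x
    return float(np.linalg.norm(pos))

def simulate(n_sims, n_segments, seed=0):
    """Run n_sims chains; return mean, sample standard deviation, and all distances."""
    rng = np.random.default_rng(seed)
    distances = np.array([end_to_end(n_segments, rng) for _ in range(n_sims)])
    return distances.mean(), distances.std(ddof=1), distances

if __name__ == "__main__":
    mean, std, _ = simulate(n_sims=5000, n_segments=30)
    print(f"mean = {mean:.2f} Å, standard deviation = {std:.2f} Å")
```

Fixing the seed makes a run reproducible, which helps when comparing the histogram between versions of the model.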

For 5000 simulations we obtain this graph:

T--Paris Saclay--090816 Gauss.jpg

As we can see, the mean end-to-end distance is 19.66 Å and the standard deviation is 7.82 Å. We can compare these results with those obtained with the freely jointed chain. When we study the distance and the distribution of the last segment, we obtain:

T--Paris Saclay--090816 repartition.jpg

Finally, we want a 3D representation of our linker, and we obtain:

T--Paris Saclay--090816 linker.jpg

With our program we obtain more information, and we get an idea of what our linker looks like in 3D and of the space in which it can fold.


Gromacs software

T--Paris Saclay--090816 end gro.jpg

This graph plots the end-to-end distance to show the dynamics obtained with Gromacs. The results are:

  • Average end-to-end distance of the protein: 2.223 nm
  • Average radius of gyration: 0.934 nm

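These results are close to those of our program: 2.223 nm is 22.23 Å, compared with a mean of 19.66 Å from our simulations, a difference well within one standard deviation (7.82 Å). We can therefore say that our program gives good results for the prediction of unfolded proteins.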
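Both quantities can also be computed from any set of coordinates, for example those kept by our program, which makes the comparison direct. The sketch below is an illustration with hypothetical function names. It is not the Gromacs implementation, and it uses equal masses unless masses are provided.

```python
import numpy as np

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Å

def end_to_end_distance(coords):
    """Distance between the first and the last point of the chain."""
    coords = np.asarray(coords, dtype=float)
    return float(np.linalg.norm(coords[-1] - coords[0]))

def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of the points to their centre of mass."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))  # assumption: equal masses
    masses = np.asarray(masses, dtype=float)
    centre = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - centre) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))

if __name__ == "__main__":
    chain = [[0.0, 0.0, 1.5 * k] for k in range(5)]  # toy extended chain, in Å
    print("end-to-end distance (Å):", end_to_end_distance(chain))
    print("radius of gyration (Å):", radius_of_gyration(chain))
    print("Gromacs average in Å:", 2.223 * NM_TO_ANGSTROM)
```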








We also obtained the RMSD and RMSF graphs from Gromacs to learn more about the dynamics of our linker.

T--Paris Saclay--090816 RMSD.jpg

With this information, we can construct the 3D model of our system.

T--Paris Saclay--090816 final.jpg

With PyMOL, we measured the distance between the two dCas9: 244.7 Å
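As a rough cross-check, this distance can be converted to base pairs. The sketch below assumes straight B-form DNA with a rise of about 3.4 Å per base pair. That value is a textbook approximation and is not taken from our model, and the function names are hypothetical.

```python
RISE_PER_BP = 3.4  # Å per base pair, assuming straight B-form DNA

def angstrom_to_bp(distance):
    """Approximate number of base pairs spanning a distance in Å."""
    return distance / RISE_PER_BP

def bp_to_angstrom(n_bp):
    """Approximate length in Å of n_bp base pairs."""
    return n_bp * RISE_PER_BP

if __name__ == "__main__":
    print("244.7 Å in base pairs:", round(angstrom_to_bp(244.7), 1))
    print("73 bp in Å:", bp_to_angstrom(73))
    print("110 bp in Å:", bp_to_angstrom(110))
```

Under this assumption, 244.7 Å corresponds to about 72 base pairs. The conversion ignores any bending of the DNA, so it is only indicative.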


Discussion

We can compare the results obtained with our program with those obtained with the other models. Of the two polymer models, the freely jointed chain gives the better approximation of the end-to-end distance. The worm-like chain model may not be suitable for our kind of protein.

The Gromacs results show that our program gives a good estimate of the end-to-end distance and of the steric hindrance.

Our program also runs in less than 5 minutes, whereas the Gromacs simulation takes more than 8 hours.

Our model can be improved, because it only shows our linker at a single point in time, without the full dynamics.

On the other hand, our program gives a good estimate of the end-to-end distance and of the 3D conformation of an unfolded protein. We never considered secondary structure such as alpha helices or beta sheets because the predicted disorder of our linker was close to 100%, so the program does not model them.









