Team:TU Darmstadt/Model

iGEM TU Darmstadt 2016

MODELING

ABSTRACT
Protein binding depends strongly on structural properties, which in turn are determined by the amino acid sequence. Changing the amino acid sequence of one binding partner could consequently diminish its binding ability. It is therefore important to estimate the influence of mutations on the protein structure. This is particularly true for mutations from natural to non-natural amino acids.
To estimate the influence of O-methyl tyrosine on Colicin E2's immunity protein we ran several molecular dynamics simulations with a total simulation time of 1300 ns. To do this we estimated O-methyl tyrosine parameters for the CHARMm 22 and the GROMOS36a7 force field. We evaluated our simulations with several well-documented methods: secondary structure analysis, the solvent accessible surface area, RMSD, and RMSF. Our first simulation analysis led to the conclusion that O-methyl tyrosine had no influence on the immunity protein.
To estimate possible influences on the thermodynamics of the system we calculated the binding energy between Colicin E2 and its immunity protein by pulling experiments followed by umbrella sampling molecular dynamics simulations. The binding energy was then calculated using the WHAM algorithm and showed only minor differences.

THEORETICAL OVERVIEW

Molecular Dynamics Simulation
Introduction

Molecular Dynamics (MD) simulation is a method to describe atomic and molecular movements. MD simulations depend on several simplifications that allow simulations ranging from nanoseconds up to several milliseconds in systems of more than one hundred thousand atoms. This makes it possible to study biomolecular processes such as protein-protein binding or enzyme dynamics. Because of the deterministic nature of the system it is possible to calculate thermodynamic properties like free energies or the free enthalpy of binding.

Assumptions

To describe atomistic or molecular behavior, the exact system conditions such as positions and energies have to be known. The energies of an atomic system are described by the Schrödinger equation (eq. \ref{Schrödinger}) with wavefunction \(\Psi\) (eq. \ref{wavefunction}), kinetic energies \(\hat{T}_{e}\) and \(\hat{T}_{N}\) and potential energies \(\hat{V}_{e}\), \(\hat{V}_{N}\) and \(\hat{V}_{eN}\). Terms with subscript \(_{e}\) concern the electrons and terms with subscript \(_{N}\) concern the nuclei.

$$ \begin{equation} (\hat{T}_{e} + \hat{T}_{N} + \hat{V}_{e} + \hat{V}_{N} + \hat{V}_{eN})\Psi=i \hbar \frac{\partial}{\partial t}\Psi \label{Schrödinger} \end{equation}$$ $$ \begin{equation} \Psi=\Psi(\vec{r}_{1},..., \vec{r}_{N_{e}},\vec{r}_{1},...,\vec{r}_{N_{N}}) \label{wavefunction} \end{equation}$$

Since these equations cannot be solved exactly for systems of this size, the system description has to be simplified. The first assumption is the Born-Oppenheimer approximation: the Schrödinger equation can be split into two parts, one for the electrons and one for the nuclei. Since the electrons are far more mobile than the nuclei, the dynamics of the system can be defined by the nuclear positions.
Molecular Dynamics simulations therefore rest on three simplifications. First, in accord with the Born-Oppenheimer approximation, we assume that electronic movement has no influence on the overall atomic momentum because the electrons simply follow the nuclear movements on the simulated time scales. Second, the potential energy function is described by a sum of simple terms. These terms are collected in the so-called force field, which is described later on. Third, the system is propagated by deriving the forces from this potential and applying Newtonian mechanics as shown in equations \ref{newton} and \ref{newton1}.

$$ \begin{equation} M_{K}\frac{d\vec{v}_{K}}{dt} = M_{K}\frac{d^{2}\vec{r}_{K}}{dt^{2}} = \vec{F}_{K} = -\frac{\partial V\left(\vec{r}_{1},...,\vec{r}_{N}\right)}{\partial \vec{r}_{K}} \label{newton} \end{equation} $$

$$ \begin{equation} F_{ij}=-\frac{\partial}{\partial r_{ij}}V_{force~field} \label{newton1} \end{equation} $$

To solve these equations numerically we have to discretize the trajectory and use an integrator that advances the system in small time steps. Several integrators have been developed, of which the velocity-Verlet algorithm is one of the most widely used (eq. \ref{VV1} & \ref{VV2}).

$$ \begin{equation} r_{i}(t_{0} + \Delta t) = r_{i}(t_{0}) + v_{i}(t_{0})\Delta t + \frac{1}{2}a_{i}(t_{0})\Delta t^{2} \label{VV1} \end{equation} $$

$$ \begin{equation} v_{i}(t_{0}+\Delta t) = v_{i}(t_{0}) + \frac{1}{2}[a_{i}(t_{0}) + a_{i}(t_{0} + \Delta t)]\Delta t \label{VV2} \end{equation} $$
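The velocity-Verlet update can be illustrated with a minimal Python sketch. The one-dimensional harmonic oscillator, its force constant, and the function name are assumptions chosen for illustration only; they are not part of the simulations described on this page.

```python
def velocity_verlet(r, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D with the velocity-Verlet scheme."""
    a = force(r) / mass
    trajectory = [(r, v)]
    for _ in range(n_steps):
        # Position update uses the current velocity and acceleration
        r = r + v * dt + 0.5 * a * dt * dt
        # Acceleration at the new position
        a_new = force(r) / mass
        # Velocity update averages the old and the new acceleration
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((r, v))
    return trajectory


# Hypothetical test system: harmonic oscillator with k = 1 and m = 1
k = 1.0
traj = velocity_verlet(r=1.0, v=0.0, force=lambda x: -k * x,
                       mass=1.0, dt=0.01, n_steps=1000)
energies = [0.5 * v * v + 0.5 * k * r * r for r, v in traj]
```

For a sufficiently small time step the total energy in `energies` should stay close to its initial value, which is a common sanity check for an integrator.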


The temperature of the system is directly correlated to the distribution of kinetic energies. Therefore the temperature of the system can be controlled by manipulating the atom velocities. A possible way to do this was proposed by Berendsen: the system is coupled to a heat bath, resulting in an NVT ensemble (eq. \ref{Berendsen}). Here \(T_{B}\) is the bath temperature, \(T_{t}\) the instantaneous temperature, and \(\tau_{T}\) the coupling time constant.

$$ \begin{equation} a_{i}=\frac{F_{i}}{m_{i}} + \frac{1}{2 \tau_{T}} \left( \frac{T_{B}}{T_{t}} -1 \right) v_{i} \label{Berendsen} \end{equation} $$
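The effect of the coupling term can be sketched in Python for force-free particles. This is a toy example under stated assumptions: reduced units with \(m = k_{B} = 1\), one degree of freedom per particle, a simple Euler update, and hypothetical starting velocities.

```python
def instantaneous_temperature(velocities):
    # Reduced units (m = k_B = 1), one degree of freedom per particle
    return sum(v * v for v in velocities) / len(velocities)


def berendsen_step(velocities, t_bath, tau, dt):
    """Apply only the thermostat term of the equation of motion (F = 0)."""
    t_now = instantaneous_temperature(velocities)
    gamma = (t_bath / t_now - 1.0) / (2.0 * tau)
    return [v + gamma * v * dt for v in velocities]


# Hypothetical starting velocities, initially hotter than the bath
velocities = [0.5, -1.5, 2.0, -0.25]
for _ in range(5000):
    velocities = berendsen_step(velocities, t_bath=1.0, tau=0.1, dt=0.001)
final_t = instantaneous_temperature(velocities)
```

The instantaneous temperature relaxes towards the bath temperature with time constant `tau`, so `final_t` ends up very close to `t_bath`.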

Empirical Force Fields

Empirical force fields are the backbone of every Molecular Dynamics simulation. Typically a force field is divided into two parts, bonded and nonbonded interactions. Bonded interactions consist of chemical bond stretching, angle bending, and rotation of dihedrals and impropers. Nonbonded interactions are approximated by Coulomb (electrostatic) interactions and Lennard-Jones potentials. The overall CHARMm (Chemistry at HARvard Macromolecular mechanics) potential is calculated by summing up these main potentials ( \( V_{CHARMm} = V_{bonded} + V_{nonbonded} \) ).
Equations \ref{CHARMM_bonded} and \ref{CHARMM_nonbonded} show the bonded and nonbonded potentials of the CHARMM force field. The harmonic terms consist of an equilibrium value marked with \(0\) and a force constant \(K\).

$$ \begin{equation} V_{bonded} = \sum_{bonds}{K_{b}(b-b_{0})^{2}} + \sum_{angles}{K_{\theta}(\theta-\theta_{0})^{2}} + \sum_{torsions}{K_{\phi}(1+\cos(n\phi-\delta))} + \\ \sum_{impropers}{K_{\psi}(\psi-\psi_{0})^{2}} + \sum_{Urey-Bradley}{K_{UB}(r_{1,3}-r_{1,3,0})^{2}} + \sum_{\phi\psi}{V_{CMAP}} \label{CHARMM_bonded} \end{equation} $$
$$ \begin{equation} V_{nonbonded}=\sum_{nonbonded}{\frac{q_{i}q_{j}}{4\pi D r_{ij}}}+ \sum_{nonbonded}{\epsilon_{ij}\left[\left(\frac{R_{min,ij}}{r_{ij}}\right)^{12}-2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6}\right] } \label{CHARMM_nonbonded} \end{equation} $$

The additional CMAP and Urey-Bradley terms are corrections for backbone atoms and 1,3 interactions, respectively.
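A few of these terms can be written down directly as a short Python sketch. The unit system (nm, kJ/mol, elementary charges) and the conversion factor 138.935458 kJ mol⁻¹ nm e⁻², which absorbs the \(1/4\pi\) prefactor, follow GROMACS conventions and are assumptions here; the function names and parameter values are hypothetical.

```python
COULOMB_FACTOR = 138.935458  # kJ mol^-1 nm e^-2 (assumed GROMACS unit system)


def harmonic(x, x0, k):
    """Harmonic term as used for bonds, angles, impropers, Urey-Bradley."""
    return k * (x - x0) ** 2


def lennard_jones(r, epsilon, r_min):
    """Lennard-Jones term in the R_min form of the nonbonded equation."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb(r, q_i, q_j, dielectric=1.0):
    """Coulomb term between two point charges given in units of e."""
    return COULOMB_FACTOR * q_i * q_j / (dielectric * r)


# Hypothetical parameters: the Lennard-Jones minimum lies at r = R_min
e_min = lennard_jones(r=0.35, epsilon=0.5, r_min=0.35)
```

In this form the Lennard-Jones energy at `r = r_min` equals `-epsilon`, which makes the two parameters easy to interpret as well depth and position of the minimum.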

Simulation Analysis

Because of the vast amount of data produced by Molecular Dynamics simulations it is essential to process the data into more accessible formats. To do this we applied several approaches, such as comparing the solvent accessible surface area (SASA) over time.

Root Mean Square Deviation

The Root Mean Square Deviation (RMSD) is the square root of the mean squared distance of all \(N\) selected atoms \(n\) between their positions at a selected timestep \(\tau\) and at a reference timestep \(r\) (eq. \ref{RMSD}). Plotted over time, it reveals fluctuations of the whole molecular configuration, and plateaus in the RMSD indicate structures of high stability.

$$ \begin{equation} RMSD_{\tau}=\sqrt{\frac{1}{N}\sum_{n=1}^{N}{\left((x_{\tau,n}-x_{r,n})^{2}+(y_{\tau,n}-y_{r,n})^{2}+(z_{\tau,n}-z_{r,n})^{2}\right)}} \label{RMSD} \end{equation} $$
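As a minimal Python sketch, the RMSD between one frame and a reference, normalized by the number of atoms, could be computed as follows. It assumes that both structures are already superimposed and list the same atoms in the same order; the function name and the coordinates are hypothetical.

```python
import math


def rmsd(frame, reference):
    """RMSD between two lists of (x, y, z) tuples with identical atom order."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference need the same number of atoms")
    total = 0.0
    for (x, y, z), (xr, yr, zr) in zip(frame, reference):
        total += (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
    return math.sqrt(total / len(frame))


# Hypothetical three-atom example: every atom is displaced by (3, 4, 0)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
value = rmsd(frame, reference)
```

Because every atom is shifted by a vector of length 5, `value` is 5.0 here; in practice the rigid-body shift would be removed by fitting before the RMSD is calculated.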

By choosing the right atom selection it is possible to evaluate the behaviour of different protein subgroups. For example, if the RMSD between Cα atoms is calculated, it is possible to plot the backbone movement over time and hence detect configurations that differ from the starting structure. This is important if one wants to search for different thermodynamically stable ensembles of the protein or molecule of interest.

Root Mean Square Fluctuation

Similar to the RMSD, the Root Mean Square Fluctuation (RMSF) is based on squared distances of the selected atoms. In this case the distance of each atom \(n\) from its reference position \((x_{0,n}, y_{0,n}, z_{0,n})\) is calculated for each of the \(T\) selected configurations and averaged over time (eq. \ref{RMSF}). It is therefore possible to spot residues with strong mobility and consequently residues that are part of fluctuating and disordered protein subunits.

$$ \begin{equation} RMSF_{n}=\sqrt{\frac{1}{T}\sum_{\tau=1}^{T}{\left((x_{\tau,n}-x_{0,n})^{2}+(y_{\tau,n}-y_{0,n})^{2}+(z_{\tau,n}-z_{0,n})^{2}\right)}} \label{RMSF} \end{equation} $$
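A per-atom RMSF can be sketched in Python as shown below. The sketch assumes that the reference position of each atom is its time-averaged position, which is a common choice but is not stated in the text, and that the trajectory has already been fitted; names and coordinates are hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF; trajectory is a list of frames of (x, y, z) tuples."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for n in range(n_atoms):
        # Assumed reference: time-averaged position of atom n
        mean = [sum(frame[n][d] for frame in trajectory) / n_frames
                for d in range(3)]
        squared = sum((frame[n][d] - mean[d]) ** 2
                      for frame in trajectory for d in range(3))
        result.append(math.sqrt(squared / n_frames))
    return result


# Hypothetical two-atom trajectory: atom 0 is rigid, atom 1 oscillates
trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
fluctuations = rmsf(trajectory)
```

The rigid atom gets an RMSF of zero while the oscillating atom gets a value of 1.0, which mirrors how flexible residues stand out in an RMSF plot.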

DSSP

Define Secondary Structure of Proteins (DSSP) by Wolfgang Kabsch and Christian Sander is a standard program to analyse secondary structure properties of proteins. The main idea is to discriminate between different secondary structures based on the presence of H bonds, because an H bond can be represented by a single energy value. This definition enables the algorithm to distinguish different types of α helices, β sheets, and turns.
The electrostatic interactions between two groups are calculated by assigning partial charges to each C (\(+q_{1}\)), O (\(-q_{1}\)), N (\(-q_{2}\)), and H (\(+q_{2}\)), with \(q_{1} = 0.42~e\) and \( q_{2} = 0.20~e\) and r(AB) being the distance between two atoms A and B in Ångström. An H bond is defined by \( E < -0.5~\frac{kcal}{mol}\).

$$ \begin{equation} E = q_{1} q_{2} \left( \frac{1}{r(ON)} + \frac{1}{r(CH)} - \frac{1}{r(OH)} - \frac{1}{r(CN)} \right) 332~\frac{kcal}{mol} \label{DSSP} \end{equation} $$
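The H bond criterion can be sketched as a small Python function. The pairs with opposite partial charges (O-H and C-N) enter with a negative sign, as follows from the charge assignment above. The example distances are hypothetical values for a roughly linear backbone H bond and are not taken from our structures.

```python
Q1 = 0.42   # partial charge on C (+) and O (-), in e
Q2 = 0.20   # partial charge on H (+) and N (-), in e
F = 332.0   # conversion factor to kcal/mol for distances in Angstrom


def dssp_hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic energy between a C=O and an N-H group in kcal/mol."""
    return Q1 * Q2 * (1.0 / r_on + 1.0 / r_ch
                      - 1.0 / r_oh - 1.0 / r_cn) * F


def is_hbond(energy, cutoff=-0.5):
    """An H bond is assigned if the energy is below the cutoff."""
    return energy < cutoff


# Hypothetical distances in Angstrom
close_pair = dssp_hbond_energy(r_on=2.9, r_ch=3.13, r_oh=1.9, r_cn=4.13)
distant_pair = dssp_hbond_energy(r_on=8.0, r_ch=8.2, r_oh=7.0, r_cn=9.2)
```

With these distances the close pair falls well below the cutoff and is counted as an H bond, while the distant pair is not.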

Binding Energy Calculations

Several methods to calculate binding energies from molecular dynamics simulations have been developed. Each of these methods relies on a stepwise removal of the bound state, either by decoupling the binding interactions (often called alchemical perturbation methods) or by pulling the molecules apart with an additional biasing potential. In each case several molecular dynamics simulations have to be run; in umbrella sampling these differ in the distance between the molecules of interest. This is done to construct a thermodynamic cycle from which the free energy difference between the bound and unbound state, and hence the binding energy, is obtained. In the perturbation methods a coupling parameter λ is introduced that ranges from 0 to 1 over the simulations and regulates the strength of the intermolecular interactions.
In every energy calculation scheme the simulations have to be similar enough that part of each ensemble is also present in the neighbouring simulations, so that sufficient phase space overlap is reached. Unfortunately it is difficult to predict how the simulations have to be set up to obtain this overlap, so careful evaluation is required, which in practice results in a trial-and-error procedure. This is especially true for pulling experiments, which have four major adjustment parameters: the pulling direction, the pulling speed, the pulling force, and the number of and spacing between the conformations that are subsequently used. Two algorithms are commonly applied to evaluate these simulations: the Bennett Acceptance Ratio (BAR) and the Weighted Histogram Analysis Method (WHAM).
To relate the result to the experimentally determinable dissociation constant \(K_{d}\) one can use the relation in equation \ref{gibbs}.

$$ \begin{equation} \Delta G = R T \ln \left( \frac{K_{d}}{mol/L} \right) \label{gibbs} \end{equation} $$
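The conversion between \(K_{d}\) and \(\Delta G\) is a one-line calculation, sketched here in Python. The temperature of 298 K and the example \(K_{d}\) of 1 nM are assumptions for illustration only and are not measured or simulated values.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)


def binding_free_energy(k_d_molar, temperature=298.0):
    """Delta G in J/mol from K_d in mol/L (standard concentration 1 mol/L)."""
    return GAS_CONSTANT * temperature * math.log(k_d_molar)


def dissociation_constant(delta_g, temperature=298.0):
    """K_d in mol/L from Delta G in J/mol."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


# Hypothetical example: K_d of 1 nM
delta_g = binding_free_energy(1e-9)
k_d_back = dissociation_constant(delta_g)
```

A \(K_{d}\) below 1 mol/L gives a negative \(\Delta G\), and a tenfold change in \(K_{d}\) shifts \(\Delta G\) by \(RT \ln 10\), roughly 5.7 kJ/mol at 298 K.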

METHODS

Visualisations
Colicin E2

No 3D model of Colicin E2 could be found in the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) at the Brookhaven National Laboratory or in the Protein Data Bank in Europe (PDBe) at the European Bioinformatics Institute (EMBL-EBI). Crystallographic structures of the DNase subunit of Colicin E2 and of its bacterial import subunit were available. We therefore chose homology modelling to obtain a 3D structure of Colicin E2. For homology modelling the Protein Homology/analogY Recognition Engine V 2.0 (PHYRE2) [] server was used in combination with the amino acid sequence obtained from the Universal Protein Resource (UniProt) Protein Knowledgebase (UniProtKB) entry P04419 [].
The obtained model was based on the known subunits of Colicin E2 and Colicin E3, a close relative. A CHARMm topology was then produced via the pdb2gmx module of GROMACS 5.1.3; the model was solvated in TIP3P water and energy minimized using the steepest descent algorithm to [value missing] kJ/mol.


Force Field Parametrization

A 3D model of O-methyl tyrosine was created using Avogadro 1.1.1 and energy minimized in vacuum using the steepest descent algorithm. The resulting model was subsequently submitted to the SwissParam [] topology server. The created topology was then used to derive parameters for the CHARMm residue force field so that a topology could be created for any protein containing O-methyl tyrosine. Since O-methyl tyrosine is very similar to tyrosine, most parameters could be adopted.
CHARMm is an atom type based force field, meaning that parameters do not depend on the residue but on the atom types assigned to the atoms taking part in an interaction. As a result we had to define several new atom types to account for the changed properties of the methyl ether, since no other benzyl ether atoms could be found in the force field parameter files.


To implement the newly derived parameters we had to update several force field parameter files that we gathered from the GROMACS 5.0.4 software suite, because all parameter files had to be compatible with GROMACS. First we had to add the new amino acid to the aminoacids.rtp file, which basically serves as a register of all known residues. The amino acid entry contains a three letter code by which it is identified, all atoms with name, type and charge, all bonds between these atoms, as well as impropers and CMAPs. Since the last two entries only concern the backbone atoms, we simply copied those entries. Atom charges were copied from tyrosine for all atoms that are not involved in the ether bond. The ether methyl group was represented by a standard methyl group with CT3 carbon and HA hydrogen. These atom types represent alanine's Cβ and the Cε atom of methionine.
A new atom type named OE was introduced to represent the ether oxygen, since there are no ether oxygen parameters for amino acid residues in the CHARMm force field files. We used the generated topology file to derive parameters for all interactions of the OE atom type. To implement these parameters we updated the bondtypes, angletypes, and dihedrals sections in the ffbonded.itp file. Similarly we updated the atomtypes and pairtypes sections of the ffnonbonded.itp file. Subsequently we added an OE entry to the atomtypes.atp file.
CHARMm simulates aromaticity through dummy atoms, which are calculated automatically. To achieve this for our new amino acid we updated the aminoacids.vsd file and inserted every bond length and angle. Finally we had to assign a three letter code to represent our amino acid in the topology and structure files. We chose OMT. This resulted in the empirical force field CHARMm 27 OMT.


Molecular Dynamics Simulations

3D structures of the Colicin E2 DNase subunit (DNase) and its immunity protein were obtained from RCSB PDB entry 3U43 []. Since the main binding occurs between the DNase subunit and the immunity protein, only this Colicin E2 subunit was simulated. Furthermore we established a minimized Colicin E2 in another part of the project and wanted to test its operational capability.
To insert mutations into the immunity protein we placed the energy minimized structure of OMT at the desired position. The positions of its backbone atoms were fitted to those of the replaced amino acid so that the backbone integrity was preserved. This was achieved with the implementation of Kabsch's algorithm for structural alignments in the Bio3D package [] for the statistical computing language R []. Additionally we used Bio3D's PDB processing functions to separate the Colicin E2 DNase subunit and its immunity protein.
All molecular dynamics simulations were performed with the GROningen MAchine for Chemical Simulations (GROMACS) 5.0.3 [] software suite. CHARMm 27 OMT was used as the empirical force field. An explicit water model with TIP3P water was chosen and a cubic box with an edge length of [value missing] nm was constructed. The box was subsequently filled with water and the system was neutralized by inserting chloride ions. After neutralization the system was energy minimized with the steepest descent algorithm until it converged. The exact values can be found in table 1. To generate velocities and bring the system to the target temperature, a short equilibration run of about 500 ps was performed. The target temperature was set to 298 K, and the Berendsen thermostat and the velocity-Verlet integrator with a step size of 2 fs were used. After this equilibration run an NVT ensemble was achieved. To achieve an NPT ensemble the equilibration run was repeated with pressure coupling to 1 bar. Subsequently the final MD production run was performed. Every 5000th step was saved, resulting in a trajectory of 10001 conformations covering 100 ns of simulation time.
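The relation between simulation length, step size, and the number of saved conformations can be checked with a few lines of Python. The function name is hypothetical; the numbers are the ones given above.

```python
def trajectory_frames(total_time_ns, step_fs, save_interval):
    """Return (number of MD steps, saved frames, frame spacing in ps)."""
    n_steps = round(total_time_ns * 1e6 / step_fs)   # 1 ns = 1e6 fs
    frames = n_steps // save_interval + 1            # +1 for the start frame
    frame_spacing_ps = step_fs * save_interval / 1000.0
    return n_steps, frames, frame_spacing_ps


# Production run: 100 ns, 2 fs step size, every 5000th step saved
n_steps, frames, spacing_ps = trajectory_frames(100, 2, 5000)
# -> 50000000 steps, 10001 frames, 10.0 ps between frames
```

The extra frame beyond 10000 is the starting conformation at time zero.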


Molecular Dynamics Simulations Analysis

Simulation analysis was performed using the R [] package Bio3D []. All plots were created using the R package ggplot2 []. For visualization of protein structures and trajectories the PyMOL visualization system [] was used. To compensate for possible jumps due to translations across the periodic boundaries (PBC), a GROMACS internal fitting program was applied (gmx trjconv -center -pbc nojump). To exclude translational and rotational movement of the simulated protein we applied the GROMACS internal fitting algorithm gmx trjconv -fit rot+trans. These trajectory manipulations were performed because several analysis methods like RMSD and RMSF rely on distance calculations between atom positions over time. In these analyses the translational and rotational movements are not of interest, since we only want to visualize the movement of atoms relative to the simulated protein to test for different configurations and structural flexibility. Additionally, crossing a periodic boundary would drastically increase the distance between two atom positions, since the atom would be relocated to the opposite side of the simulated system.
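The idea behind removing periodic jumps can be sketched in Python for a single coordinate of one atom. This is a conceptual illustration, not the GROMACS implementation; it assumes that the atom moves less than half a box length between saved frames, and the positions and box length are hypothetical.

```python
def unwrap_nojump(positions, box_length):
    """Remove periodic jumps from a 1D coordinate series of one atom."""
    unwrapped = [positions[0]]
    for current, previous in zip(positions[1:], positions[:-1]):
        delta = current - previous
        # Minimum image: undo a displacement of a whole box length
        delta -= box_length * round(delta / box_length)
        unwrapped.append(unwrapped[-1] + delta)
    return unwrapped


# Hypothetical atom that leaves a 5 nm box on the right and re-enters left
wrapped = [4.6, 4.9, 0.2, 0.5]
continuous = unwrap_nojump(wrapped, box_length=5.0)
```

In the wrapped series the atom appears to jump by 4.7 nm between the second and third frame, whereas the unwrapped series advances in steps of 0.3 nm, which is what distance-based analyses such as RMSD need.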


Binding Energy Calculations

For binding energy calculations we applied pulling simulations followed by umbrella sampling of ?30? different configurations along the reaction coordinate, namely the x-axis. For the pulling simulation, average structures of Colicin E2 and its immunity protein were simulated in TIP3P water. To do this we constructed a box with the dimensions 16 nm, 5.5 nm, and 6.5 nm. The complex was centered at 3 nm, 2.75 nm, and 3.25 nm. Sodium and chloride ions were added to a concentration of 0.1 M and the system was subsequently energy minimized to a potential energy of [value missing]. The system was equilibrated as in the former simulations. Pulling was performed over 1 ns with a pull rate of 0.01 nm/ps, resulting in 10 nm of overall pulling. A harmonic umbrella pulling potential with a force constant of 250,000 \(\frac{kJ}{mol~nm^2}\) was applied. The pulling simulation resulted in a center of mass difference of about [value missing] nm.
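Choosing umbrella sampling windows from a pulling trajectory can be sketched in Python as follows. This is an assumed selection scheme (the frame closest to each evenly spaced target distance), not necessarily the procedure we used. The bias is written in the usual harmonic form \(\frac{1}{2}k(x-x_{0})^{2}\); the distances and the spacing are hypothetical.

```python
def select_windows(com_distances, spacing):
    """Indices of frames closest to evenly spaced center-of-mass distances."""
    selected = []
    target = min(com_distances)
    largest = max(com_distances)
    while target <= largest + 1e-12:
        best = min(range(len(com_distances)),
                   key=lambda i: abs(com_distances[i] - target))
        if best not in selected:
            selected.append(best)
        target += spacing
    return selected


def umbrella_bias(x, x0, k):
    """Harmonic biasing potential that restrains x around the window x0."""
    return 0.5 * k * (x - x0) ** 2


# Hypothetical center-of-mass distances (nm) from a pulling trajectory
distances = [0.0, 0.05, 0.11, 0.18, 0.21, 0.31, 0.39]
windows = select_windows(distances, spacing=0.1)
```

The spacing controls the overlap between neighbouring windows: a smaller spacing gives more windows and better overlap at the cost of more simulations.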

(Possibly also show different pulling simulations.)

RESULTS

[....]