Difference between revisions of "Team:TU Darmstadt/Model"

 
(83 intermediate revisions by 6 users not shown)
Line 9: Line 9:
 
<body>
 
<body>
 
<div class="vviki" id="vviki">
 
<div class="vviki" id="vviki">
 +
<div id="head">
 
<div id="title">
 
<div id="title">
 
<img id="logoleiste" src="https://static.igem.org/mediawiki/2016/8/83/T--TU_Darmstadt--titel.png" alt="iGEM TU Darmstadt 2016"/>
 
<img id="logoleiste" src="https://static.igem.org/mediawiki/2016/8/83/T--TU_Darmstadt--titel.png" alt="iGEM TU Darmstadt 2016"/>
 
</div>
 
</div>
<div class="navbar">
+
</html>{{Team:TU_Darmstadt/MainMenu}}<html>
<div class="tablet" style="display: none">
+
    <div id="menu">
+
  <button id="tabletbutton">
+
  <img src="https://static.igem.org/mediawiki/2016/c/c9/T--TU_Darmstadt--zahnrad.png" alt="menu" id="tabletbutton"></img></button>
+
    <div id="tabletmenu">
+
<ul>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Achievements">Achievements</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Lab">In the Lab</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Hardware">Robotics</a></li>
+
            <li ><a class="current" href="https://2016.igem.org/Team:TU_Darmstadt/Model">Modelling</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Human_Practices">Human Practices</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Collaborations">Collaborations</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Results_Parts">Results + Parts</a></li>
+
            <li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Notebook">Labbook</a></li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Team">Team</a></li>
+
            </ul>
+
</div>
+
</div>
+
</div>
+
<div class="computer" id="computer" style="display: none">
+
<ul>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Achievements">Achievements</a></li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Lab">In the Lab</a>
+
<!--<div class="drop">
+
<ul>
+
    <li><a href="LabReporter.html">Reporter</a></li>
+
<li><a href="OrthoPair.html">Orthogonal Pair</a></li>
+
<li><a href="ColE2Im.html">KILL(switch)</a></li>
+
<li><a href="GI.html">Genomic Integration</a></li>
+
<li><a href="ChemSyn.html">Chemical Synthesis</a></li>
+
</ul>
+
</div> -->
+
</li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Hardware">Robotics</a></li>
+
<li ><a class="current" href="https://2016.igem.org/Team:TU_Darmstadt/Model">Modeling</a>
+
    <!-- <div class="drop">
+
    <ul>
+
    <li><a href="#ThOv">Theoretical Overview</a></li>
+
    <li><a href="#Mod_m">Methods</a></li>
+
    <li><a href="#Mod_r">Results</a></li>
+
    </ul>
+
    </div> -->
+
</li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Human_Practices">Human Practices</a></li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Collaborations">Collaborations</a></li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Results_Parts">Results + Parts</a>
+
    <!-- <div class="drop">
+
    <ul>
+
    <li><a href="Results.html">Results</a></li>
+
    <li><a href="Parts.html">Parts</a></li>
+
    </ul>
+
    </div> -->
+
</li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Notebook">Labbook</a></li>
+
<li ><a href="https://2016.igem.org/Team:TU_Darmstadt/Team">Team</a></li>
+
</ul>
+
</div>
+
</div>
+
 
<div class="banner"><img id="banner" src="https://static.igem.org/mediawiki/2016/8/86/T--TU_Darmstadt--team.jpg" alt="teamphoto"></img>
 
<div class="banner"><img id="banner" src="https://static.igem.org/mediawiki/2016/8/86/T--TU_Darmstadt--team.jpg" alt="teamphoto"></img>
 
</div>
 
</div>
Line 76: Line 19:
 
<h1>MODELING</h1>
 
<h1>MODELING</h1>
 
    </div>
 
    </div>
 +
</div>
 
<div class="page">
 
<div class="page">
 
<div class="abstract">
 
<div class="abstract">
 
    <p><b>ABSTRACT</b><br/>
 
    <p><b>ABSTRACT</b><br/>
Bonding of proteins is highly depending on structural properties which in turn are determined by the amino acid sequences. Changing the amino acid sequence of one participating partner could consequently diminish it's binding ability. Therefore it is important to estimate the influence of mutations on the protein structure. This is particularly true for mutations from natural to non-natural amino acids.<br/>To estimate the influence of <i>O</i>-methyl tyrosine on Colicin E2's immunity protein we applied several molecular dynamics simulations leading to 1300&nbsp;ns in total simulation time. To do this we estimated <i>O</i>-methyl tyrosine parameters for the CHARMm&nbsp;22 and the GROMOS36a7 force field. We evaluated our simulations by applying several well documented evaluation methods like secondary structure analysis, plotting the solvent accesible surface area, and RMSD and RMSF. Our first simualtion analysis led to the conclusion that <i>O</i>-methyl tyrosine had no influence on the immunity protein.<br/>To estimate possible influences on the thermodynamics of the system we calculated the binding energy between Colicin E2 and it's immunity protein by pulling experiments with following umbrella sampling molecular dynamics simulations. The binding energy was afterwards calculated using the WHAM algorithm showing only minor differences.</p>
+
Protein binding depends strongly on structural properties, which in turn are determined by the amino acid sequence. Changing the amino acid sequence of one binding partner could consequently diminish its binding ability. Therefore it is important to estimate the influence of mutations on the protein structure. This is particularly true for mutations from natural to non-natural amino acids.<br/>To estimate the influence of <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine on Colicin E2's immunity protein we ran several molecular dynamics simulations with 1300&nbsp;ns of total simulation time. To do this we estimated <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine parameters for the CHARMm&nbsp;22 and the GROMOS36a7 force field. We evaluated our simulations with several well-documented methods: secondary structure analysis, the solvent accessible surface area, RMSD, and RMSF. Our first simulation analysis led to the conclusion that <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine had no influence on the immunity protein.<br/>To estimate possible influences on the thermodynamics of the system we tried to calculate the binding energy between Colicin E2 and its immunity protein by alchemical free energy perturbation experiments. This was not successful due to computational errors and computational limitations.</p>
 
</div>
 
</div>
 
<div class="content">
 
<div class="content">
    <div class="verlinked" id="ThOv"><h2>THEORETICAL OVERVIEW</h2></div>
+
    <div class="verlinked" id="ThOv"><h2 style="border-bottom:3px solid #019ac8; padding-top:80px;">THEORETICAL OVERVIEW</h2></div>
<h5 id="MD">Molecular Dynamics Simulation</h5>
+
<h5 id="MD" style="padding-top:0;">Molecular Dynamics Simulation</h5>
<h6>Introduction</h6><br/>
+
<p><h6>Introduction</h6></p>
<p>Molecular Dynamics (MD) Simulations is a method to describe atomic and molecular movements. Molecular Dynamics simulations depend on several simplifications that enables the simulation to range from nanoseconds up to severeal milliseconds in systems containg of over one hundred thousand atoms. This enables the possibility to study different biomolecular processes like protein protein binding or enzyme dynamics. Because of the deterministic nature of the system it is possible to calculate thermodynamic properties like free energy or free binding enthalpies.</p>
+
<p>Molecular Dynamics (MD) simulation is a method to describe atomic and molecular movements. MD simulations depend on several simplifications that enable simulated times ranging from nanoseconds up to several milliseconds in systems consisting of over one hundred thousand atoms. This makes it possible to study biomolecular processes like protein-protein binding or enzyme dynamics. Because of the deterministic nature of the system it is possible to calculate thermodynamic properties like free energies or free enthalpies of binding [1].</p>
 
<h6>Assumptions</h6><br/>
 
<h6>Assumptions</h6><br/>
<p>To describe atomistic or molecular behavior the exact system conditions like positions and energies. The energies of an atomic system are described by the Schr&oumldinger equation (eq. \ref{Schrödinger}) with wavefunction \(\Psi\) (eq. \ref{wavefunction}), kinetic energies \(\hat{T}_{e}\) and \(\hat{T}_{N}\) and potential energies \(\hat{V}_{e}\), \(\hat{V}_{N}\) and \(\hat{V}_{eN}\). Terms with subscript \(_{e}\) are terms concerning the electrons and terms with subscript \(_{N}\) are terms concerning the nuclei.  
+
<p>To describe atomistic or molecular behavior, the exact system conditions such as positions and energies must be known. The energies of an atomic system are described by the Schr&ouml;dinger equation (eq. \ref{Schrödinger}) with wavefunction \(\Psi\) (eq. \ref{wavefunction}), kinetic energies \(\hat{T}_{e}\) and \(\hat{T}_{N}\) and potential energies \(\hat{V}_{e}\), \(\hat{V}_{N}\) and \(\hat{V}_{eN}\). Terms with subscript \(_{e}\) concern the electrons and terms with subscript \(_{N}\) concern the nuclei [1]. 
 
</p><p>
 
</p><p>
 
<center>
 
<center>
Line 102: Line 46:
 
</center>
 
</center>
 
</p><p>
 
</p><p>
Since there is no possibility to solve these equations numerical, it is necessary to simplify the system description. The first assumption depends on the Born-Oppenheimer approximation that the Schr&oumldinger equation can be splitted into two parts, one for the electrons and one for the nuclei respectivly. Since the electrons are far more mobile the dynamic of the system can be defined by the nuclei positions. <br/>Molecular Dynamics simulations depend on several simplifications. First we assume in accord with the Born-Oppenheimer approximation that electronical movement has no influence on the overall atomic momentum because electrons will simply follow the nuclear movements in the simulated time scales. Second we can describe the potential energy function by a sum of simple terms. These terms are described in the so called force field which will be described later on. Third the system potential is evaluated by deriving the forces and applying Newtonian mechanic calculations as shown in equation \ref{newton} and \ref{newton1}.</p>
+
Since it is not feasible to solve these equations numerically for systems of this size, it is necessary to simplify the system description. The first assumption rests on the Born-Oppenheimer approximation that the Schr&ouml;dinger equation can be split into two parts, one for the electrons and one for the nuclei respectively. Since the electrons are far more mobile, the dynamics of the system can be defined by the nuclear positions. <br/>Molecular Dynamics simulations depend on several simplifications. First, we assume in accord with the Born-Oppenheimer approximation that electronic movement has no influence on the overall atomic momentum since electrons will simply follow the nuclear movements on the simulated time scales. Second, we can describe the potential energy function by a sum of simple terms. These terms are collected in the so-called force field, which is described later on. Third, the system potential is evaluated by deriving the forces and applying Newtonian mechanics as shown in equations \ref{newton} and \ref{newton1} [1].</p>
 
<p>
 
<p>
 
<center>
 
<center>
Line 123: Line 67:
 
</center>
 
</center>
 
</p>
 
</p>
<p>To solve these terms numerically we have to discretize the trajectory and therefore use an integrator for the small time steps. Several different integrators were developed today, of which the velocity-Verlet algorithm is the most used (eq. \ref{VV1} & \ref{VV2}).</p>
+
<p>To solve these terms numerically we have to discretize the trajectory and therefore use an integrator for the small time steps. Several different integrators have been developed, of which the velocity-Verlet algorithm is the most used (eq. \ref{VV1} and \ref{VV2}) [1].</p>
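<p>The velocity-Verlet update can be illustrated with a minimal Python sketch. It is not the integrator code of any MD package; the one-dimensional harmonic oscillator, the function names, and all parameter values are assumptions chosen only to show the two update steps for positions and velocities.</p>

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D with the velocity-Verlet scheme."""
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update from current v and a
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


def spring_force(x):
    """Hypothetical test system: harmonic spring with force constant 1."""
    return -1.0 * x


# Start at x = 1 with zero velocity and integrate 1000 steps of dt = 0.01
x_end, v_end = velocity_verlet(1.0, 0.0, spring_force, 1.0, 0.01, 1000)
energy = 0.5 * v_end ** 2 + 0.5 * x_end ** 2   # total energy, initially 0.5
```

<p>For this toy system the total energy stays close to its initial value, which is the main reason symplectic integrators of this type are preferred in MD.</p>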
 
<p>
 
<p>
 
<center>
 
<center>
Line 146: Line 90:
 
<p>
 
<p>
  
<br/>The temperature of the system is directly correlated to the distribution of kinetic energies. Therefore the temperature of the system can be controlled by manipulating the atom velocities. A possible way to do this was proposed by Berendsen by coupling the system to a heat bath resulting in a NVT ensemble (eq. \ref{Berendsen}).</p>
+
<br/>The temperature of the system is directly correlated to the distribution of kinetic energies. Therefore the temperature of the system can be controlled by manipulating the atom velocities. A possible way to do this was proposed by Berendsen: the system is coupled to a heat bath, resulting in an NVT ensemble (eq. \ref{Berendsen}) [1].</p>
 
<p>
 
<p>
 
<center>
 
<center>
Line 158: Line 102:
 
</p>
 
</p>
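<p>As a rough illustration of weak coupling to a heat bath, the following Python sketch computes the Berendsen velocity scaling factor in its commonly quoted form, \(\lambda = \sqrt{1 + \frac{\Delta t}{\tau}\left(\frac{T_0}{T} - 1\right)}\). The function and variable names are assumptions made for this example, and the numbers are arbitrary.</p>

```python
import math


def berendsen_scale(t_current, t_target, dt, tau):
    """Velocity scaling factor for weak coupling to a bath at t_target."""
    return math.sqrt(1.0 + (dt / tau) * (t_target / t_current - 1.0))


def rescale(velocities, factor):
    """Multiply every velocity component by the scaling factor."""
    return [[factor * component for component in v] for v in velocities]


# Hypothetical values: system at 150 K, bath at 300 K, dt = 2 fs, tau = 0.1 ps
lam = berendsen_scale(150.0, 300.0, 0.002, 0.1)
scaled = rescale([[1.0, 2.0, 3.0]], lam)
```

<p>A coupling time \(\tau\) much larger than the time step only nudges the temperature towards the target in each step, while \(\tau = \Delta t\) would reset it completely.</p>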
 
<h5 id="FFs">Empirical Force Fields</h5>
 
<h5 id="FFs">Empirical Force Fields</h5>
<p>Empirical Force Fields are the backbone of every Molecular Dynamics simulation. Typically the force fields are diveded into two parts, bonded and nonbonded interactions. Bonded interactions consist of chemical bond stretching, angle bending, and rotation of dihedrals and impropers. Nonbonded interactions are approximated by Coulomb interactions (ionic) and Lennard-Jones potentials. The overall CHARMm (Chemistry at HARvard Macromolekular mechanics) potential is calculated by summing up these main potentials ( \( V_{CHARMm} = V_{bonded} + V_{nonbonded} \) ).  
+
<p>Empirical Force Fields are the backbone of every Molecular Dynamics simulation. Typically the force fields are divided into two parts, bonded and nonbonded interactions. Bonded interactions consist of chemical bond stretching, angle bending, and rotation of dihedrals and impropers. Nonbonded interactions are approximated by Coulomb interactions (ionic) and Lennard-Jones potentials. The overall CHARMm (Chemistry at HARvard Macromolecular mechanics) potential is calculated by summing up these main potentials (\( V_{CHARMm} = V_{bonded} + V_{nonbonded} \)) [2,3,4]. 
 
<br/>
 
<br/>
In equation \ref{CHARMM_bonded} and \ref{CHARMM_nonbonded} the bonded and nonbonded Potentials of the CHARMM force field are displayed. All terms consist of an equilibrium value marked with \(0\) and a force constant \(K\).</p>
+
In equation \ref{CHARMM_bonded} and \ref{CHARMM_nonbonded} the bonded and nonbonded potentials of the CHARMm force field are displayed. All terms consist of an equilibrium value marked with \(0\) and a force constant \(K\) [2,3,4].</p>
 
<p>
 
<p>
 
<center>
 
<center>
Line 177: Line 121:
 
</center>
 
</center>
 
</p>
 
</p>
<p>The additional terms CMAP and Urey-Bradley are correctional terms for backbone atoms and 1, 3 interactions respectively.</p>
+
<p>The additional terms CMAP and Urey-Bradley are correctional terms for backbone atoms and 1,3 interactions respectively [2,3,4].</p>
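<p>To make the structure of such a potential concrete, the sketch below evaluates three typical terms in Python: a harmonic bond, a Lennard-Jones pair, and a Coulomb pair. The functional forms follow the usual CHARMM conventions (no factor 1/2 in the bond term, Lennard-Jones written with \(R_{min}\)); the function names, the unit choice (kJ/mol, nm, elementary charges) and the example values are assumptions for illustration only.</p>

```python
COULOMB_CONST = 138.935458  # kJ mol^-1 nm e^-2, electric conversion factor


def harmonic_bond(b, b0, k_b):
    """Bond stretching energy K_b * (b - b0)^2 around the equilibrium length b0."""
    return k_b * (b - b0) ** 2


def lennard_jones(r, epsilon, r_min):
    """Lennard-Jones energy with well depth epsilon at distance r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb(r, q_i, q_j):
    """Coulomb energy of two point charges (in e) at distance r (in nm)."""
    return COULOMB_CONST * q_i * q_j / r


# Hypothetical pair: the Lennard-Jones minimum sits exactly at r = r_min
e_lj_min = lennard_jones(0.35, 0.5, 0.35)
```

<p>Summing such terms over all bonds, angles, dihedrals and atom pairs gives the total potential from which the forces are derived.</p>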
 
<h5 id="SimAn">Simulation Analysis</h5>
 
<h5 id="SimAn">Simulation Analysis</h5>
<p>Because of the vast amount of data that is produced by Molecular Dynamics simulations it is essential to process the data into more accesible formats. To perform this task we applied several approaches like comparison of the solvent accesible surface area (SASA) over time.</p>
+
<p>Because of the vast amount of data produced by Molecular Dynamics simulations, it is essential to process the data into more accessible formats. To perform this task we applied several approaches, such as comparing the solvent accessible surface area (SASA) over time.</p>
 
<h6 id="RMSD">Root Mean Square Deviation</h6>
 
<h6 id="RMSD">Root Mean Square Deviation</h6>
 
<p>The Root Mean Square Deviation (RMSD) is the root of the mean squared distance of all selected atoms \(n\) between their positions at a selected timestep \(\tau\) and at a reference timestep \(r\) (eq. \ref{RMSD}). Plotted over time, it reveals fluctuations in the whole molecular configuration, and plateaus in the RMSD indicate structures of high stability.</p>
 
<p>The Root Mean Square Deviation (RMSD) is the root of the mean squared distance of all selected atoms \(n\) between their positions at a selected timestep \(\tau\) and at a reference timestep \(r\) (eq. \ref{RMSD}). Plotted over time, it reveals fluctuations in the whole molecular configuration, and plateaus in the RMSD indicate structures of high stability.</p>
Line 193: Line 137:
 
</p>
 
</p>
 
<p>
 
<p>
By chosing the right atom selection it is possible to evaluate different behaviour of protein subgroups. For example if the RMSD between C&alpha; atoms is calculated it is possible to plot the backbone movement over time and hence detect configurations that differ from the starting structure. This is important if the one wants to search for different thermodynamic stable ensembles of the protein or molecul of interest.
+
By choosing the right atom selection it is possible to evaluate the different behaviours of protein subgroups. For example, if the RMSD between C&alpha; atoms is calculated it is possible to plot the backbone movement over time and hence detect configurations that differ from the starting structure. This is important if one wants to search for different thermodynamically stable ensembles of the protein or molecule of interest.
 
</p>
 
</p>
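<p>A minimal Python sketch of the RMSD between one frame and a reference frame is given below. It assumes that both frames contain the same atoms in the same order and that they have already been superimposed; the function name and the coordinates are made up for the example.</p>

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two lists of (x, y, z) coordinates."""
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    total = 0.0
    for (x, y, z), (xr, yr, zr) in zip(frame, reference):
        total += (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
    return math.sqrt(total / len(frame))


# Hypothetical two-atom selection shifted by 1 along z
reference_frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted_frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(shifted_frame, reference_frame)
```

<p>Restricting the coordinate lists to C&alpha; atoms gives the backbone RMSD discussed above.</p>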
 
<h6 id="RMSF">Root Mean Square Fluctuation</h6>
 
<h6 id="RMSF">Root Mean Square Fluctuation</h6>
<p>Similar to the RMSD the Root Mean Square Fluctuation describes the sum of distances of all selected atoms. In this case the distance per atom between all selected configuartions is calculated and summed over time. Therefore it is possible to spot out residues with strong mobility and consequntly residues that are part of fluctuating and disordered protein subunits.</p>
+
<p>Similar to the RMSD, the Root Mean Square Fluctuation (RMSF) is based on the distances of all selected atoms. In this case the deviation of each atom from its mean position is calculated over all selected configurations and averaged over time. Therefore it is possible to spot residues with strong mobility and consequently residues that are part of fluctuating and disordered protein subunits.</p>
 
<p>
 
<p>
 
<center>
 
<center>
Line 208: Line 152:
 
</p>
 
</p>
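<p>The per-atom RMSF can be sketched in Python as follows. The example assumes a trajectory stored as a list of frames, each frame a list of (x, y, z) tuples, with all frames already fitted to a common reference; names and coordinates are illustrative assumptions.</p>

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation around the mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames
        mean = [sum(frame[i][d] for frame in trajectory) / n_frames for d in range(3)]
        # Mean squared displacement from that mean position
        msf = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Hypothetical trajectory: atom 0 is fixed, atom 1 jumps between x = -1 and x = +1
example = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
values = rmsf(example)
```

<p>The rigid atom yields an RMSF of zero, while the mobile atom stands out, which is how flexible residues are spotted in practice.</p>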
 
<h6 id="DSSP">DSSP</h6>
 
<h6 id="DSSP">DSSP</h6>
<p><i>Define Secondary Structures of Proteins</i> (DSSP) by Wolfgang Kabsch and Christian Sander is a standard program to analyse secondary structure properties of proteins. The main idea to discriminate between different secondary structures is to based on the presence of H bonds because this can be represented by one energy value. This definition enables the algorithm to distinguish different types of &alpha; helices, &beta; sheets, and turns.<br/>
+
<p><i>Define Secondary Structures of Proteins</i> (DSSP) by Wolfgang Kabsch and Christian Sander is a standard program to analyse secondary structure properties of proteins. The main idea to discriminate between different secondary structures is based on the presence of H&nbsp;bonds because this can be represented by one energy value. This definition enables the algorithm to distinguish different types of &alpha;&nbsp;helices, &beta;&nbsp;sheets, and turns.<br/>The electrostatic interactions between two groups are calculated by assigning partial charges to each C (\(+q_{1}\)), O (\(-q_{1}\)), N (\(-q_{2}\)), and H (\(+q_{2}\)), with \(q_{1} = 0.42~e\) and \( q_{2} = 0.20~e\) and r(AB) being the distance between two atoms A and B in &Aring;ngstr&ouml;m. An H&nbsp;bond is defined by \( E < -0.5~\frac{kcal}{mol}\) [5].</p><p>
The electrostatic interations between two groups are calculated by assigning partial charges to each C (\(+q_{1}\)), O (\(-q_{1}\)), N (\(-q_{2}\)), and H (\(+q_{2}\)), with \(q_{1} = 0.42~e\) and \( q_{2} = 0.20~e\) and r(AB) being the distance between two atoms A and B in Angstr&oumlm. An H bond is defined by \( E < -0.5~\frac{kcal}{mol}\).
+
</p>
+
<p>
+
 
$$
 
$$
 
\begin{equation}
 
\begin{equation}
Line 220: Line 161:
 
</p>
 
</p>
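<p>The H&nbsp;bond criterion can be tried out with a short Python sketch. It uses the partial charges given above together with the dimensional factor 332 from the published DSSP energy definition, which is not shown in this excerpt; the function names and the idealized collinear geometry are assumptions for illustration.</p>

```python
import math

Q1 = 0.42             # partial charge on C (+) and O (-), in e
Q2 = 0.20             # partial charge on H (+) and N (-), in e
F = 332.0             # dimensional factor, kcal/mol * Angstrom / e^2
HBOND_CUTOFF = -0.5   # kcal/mol


def dssp_hbond_energy(c, o, n, h):
    """Electrostatic energy between a C=O group and an N-H group (kcal/mol)."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(c, o, n, h):
    """True if the pair energy is below the DSSP cutoff."""
    return dssp_hbond_energy(c, o, n, h) < HBOND_CUTOFF


# Hypothetical collinear geometry C=O ... H-N with an O...H distance of 1.9 Angstrom
c_pos, o_pos = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h_pos, n_pos = (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
energy = dssp_hbond_energy(c_pos, o_pos, n_pos, h_pos)
```

<p>Moving the N-H group 10&nbsp;&Aring; further away brings the energy close to zero, so the pair is no longer counted as an H&nbsp;bond.</p>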
 
<h5 id="BEC">Binding Energy Calculations</h5>
 
<h5 id="BEC">Binding Energy Calculations</h5>
<p>Several methods to calculate binding energies from molecular dynamics simulations have been developed. Each of these methods rely on a stepwise cancelling of the binding state by either decoupling the binding interactions often called alchemistry perturbation mehtods or by pulling the molecules apart by an additional biasing potential. In each of these methods several molecular dynamics simulations have to be run with in case of umbrella sampeling different distances between the molecules of interest. This is done to construct a thermodynamic cycle to calculate the free energy differences between the bound and unbound state and hence gain the binding energy. In case of the perturbation methods a coupling parameter &lambda; is introduced that ranges from 0 to 1 over the simulations and regulates the strength of the intermolecule interactions.<br/>In every energy calculation concept the simualtions have to be similar enough so that a part of the ensembles is present in the neighbouring simulations and consequntly a sufficient phase space overlap is reached. Unfortunately it is difficult to guess how the simulations have to be set up to gain this overlap. Therefore a careful evaluation has to be applied that results in a trial-and-error procedure. This is especially true for pulling experiments that have four major adjustment parameters: The pulling direction, the pulling speed, the pulling force, and the number of and spacing between the conformations that are subsequently used. To evaluate these simulations two algorithms are often in applied today: The Bennet Acceptence Ration (BAR) and the Weighted Histogram Analysis Method (WHAM).<br/>To calculate the experimentally determinable parameter of the binding constant \(K_{d}\) one can use the relation in equation \ref{gibbs}.
+
<p>Alchemical Free Energy Perturbation (FEP) is a method in computational biology to obtain energy differences between two system states from molecular dynamics or Metropolis Monte Carlo simulations. Here the system is slowly transformed from state \(i\) to state \(j\) through non-natural intermediates, which are sampled by slowly decreasing the intermolecular interactions. The core equation (eq. \ref{FEGG}) for the Helmholtz free energy difference \(\Delta A_{ij}\) between states \(i\) and \(j\) is derived from statistical mechanics. \(Q\) represents the canonical partition function, \(k_B\) the Boltzmann constant, \(U\) the corresponding potential energy of the system in relation to the coordinates and momenta \(\vec{q}\), \(T\) the temperature and \(\Gamma\) the volume of potential states of \(\vec{q}\) [16].</p>
 +
 
 +
<p>
 +
$$
 +
\begin{equation}
 +
\Delta A_{ij} = -k_{B} T~ln \frac{Q_{j}}{Q_{i}} = -k_{B} T~ln \left( \frac{\int_{\Gamma_{j}}e^{-\frac{U_{j}(\vec{q})}{k_{B}T}}d \vec{q}}{\int_{\Gamma_{i}}e^{-\frac{U_{i}(\vec{q})}{k_{B}T}}d \vec{q}}\right)
 +
\label{FEGG}
 +
\end{equation}
 +
$$
 +
<p>
 +
To calculate free energy differences between states with low state space overlap a thermodynamic cycle can be constructed, since the Helmholtz free energy is a thermodynamic state function. The most straightforward way to calculate the free energy change of a state transition would be to simulate the naturally occurring process. For example, the binding free energy between two proteins can be calculated by separating both proteins. This approach however has high computational costs, since simulating the whole water-filled box results in large systems.<br/>The alchemical perturbation approach relies on simulating several intermediate states over which Coulomb and Lennard-Jones interactions are slowly decreased. This is controlled via a coupling factor \(\lambda\) (eq. \ref{lambda}). The resulting potential energy \(U\) at state \(\lambda\) is subsequently calculated as a sum of the two end states \(U_0\), \(U_1\) and all interactions that are not decoupled, \(U_{unaffected}\) [16].
 +
<p>
 +
$$
 +
\begin{equation}
 +
U(\lambda,\vec{q}) = (1-\lambda)~U_{0}(\vec{q}) + \lambda~U_{1}(\vec{q}) + U_{unaffected}(\vec{q})
 +
\label{lambda}
 +
\end{equation}
 +
$$
 +
</p>
 +
The overall free energy change \(\Delta A_{0,1}\) is consequently calculated as the sum over all free energy changes (eq. \ref{TC}).
 +
</p>
 +
<p>
 +
$$
 +
\begin{equation}
 +
\Delta A_{0,1} = \sum^{1}_{\lambda=0} {\Delta A_{\lambda,\Delta \lambda}}
 +
\label{TC}
 +
\end{equation}
 +
$$
 +
</p>
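<p>The summation over \(\lambda\) windows can be sketched in Python. The example assumes that each window contributes a list of sampled energy differences \(U(\lambda'') - U(\lambda')\) and that each window is evaluated by plain exponential averaging, \(-k_{B}T~ln \langle e^{-\Delta U / k_{B}T} \rangle\); the function names and the numbers are hypothetical, and all energies are expressed in units of \(k_{B}T\) for simplicity.</p>

```python
import math


def exp_free_energy(delta_u, kt):
    """Exponential averaging estimate of the free energy change of one window."""
    exponents = [-du / kt for du in delta_u]
    shift = max(exponents)  # log-sum-exp shift to avoid overflow
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


def total_free_energy(windows, kt):
    """Sum the window estimates along the lambda path from 0 to 1."""
    return sum(exp_free_energy(window, kt) for window in windows)


# Hypothetical energy differences for two neighbouring lambda windows, kT = 1
windows = [[1.0, 1.0], [0.5]]
total = total_free_energy(windows, 1.0)
```

<p>Because the exponential average is dominated by rare low-energy samples, the estimate for a window is never larger than the plain mean of its energy differences, which is one reason why sufficient overlap between neighbouring windows matters.</p>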
 +
<p>
 +
Gibbs free energy differences between two &lambda; states can be calculated by equation \ref{Calc}.
 +
</p><p>
 +
$$
 +
\begin{equation}
 +
\Delta G~(\lambda^{'} \rightarrow \lambda^{"}) = -k_{B}T~ln~\Bigg \langle exp \left( -\frac{U(\lambda^{"})-U(\lambda^{'})}{k_{B}T} \right) \Bigg \rangle_{\lambda^{'}}
 +
\label{Calc}
 +
\end{equation}
 +
$$
 +
</p><p>
 +
Interactions of molecules are represented by Lennard-Jones and Coulomb interactions. This can lead to problems when Lennard-Jones interactions are decoupled and charge interactions are still active. This ensemble will lead to a clashing of the molecules since the charges will attract each other without any form of antagonistic force. To avoid this problem, position restraints are added which have to be considered in the final calculation.<br/>In order to evaluate the simulations of the intermediate states and calculate the free energy difference between the end states several algorithms have been developed (e.g., Exponential Averaging (EXP) or Bennett Acceptance Ratio (BAR)). BAR computes the energy difference between two simulations generating the trajectories \( n_i \) and \( n_j \) with the corresponding potential energy functions \(U_i\) and \(U_j\). The free energy difference \(\Delta A_{ij}\) can then be written as in equation \ref{Bennett} where \(f()\) stands for the Fermi function (eq. \ref{Fermi}) [16].
 +
</p><p>
 +
$$
 +
\begin{equation}
 +
\Delta A_{ij} = k_B T ~ \left( ln \frac{\sum_j f(U_i - U_j + k_B T~ln \frac{Q_i n_j}{Q_j n_i})}{\sum_i f(U_j - U_i - k_B T~ln \frac{Q_i n_j}{Q_j n_i})} - ln \frac{n_j}{n_i} \right)+ k_B T~ln \frac{Q_i n_j}{Q_j n_i}
 +
\label{Bennett}
 +
\end{equation}
 +
$$
 +
</p>
 +
<p>
 +
$$
 +
\begin{equation}
 +
f(x) = \frac{1}{1 + e^{~\beta x}}
 +
\label{Fermi}
 +
\end{equation}
 +
$$
 +
</p>
 +
<p>
 +
The term \(k_B T~ln \frac{Q_i n_j}{Q_j n_i}\) has to be determined numerically since it cannot be computed analytically. Equation \ref{Bennett2} describes the relation used by Bennett et al. to estimate the term. Once it has been determined, the free energy difference can be calculated by equation \ref{Bennett3} [16].
 +
</p>
 +
<p>
 +
$$
 +
\begin{equation}
 +
\sum_j f\left(U_i - U_j + k_B T~ln \frac{Q_i n_j}{Q_j n_i}\right) = \sum_i f\left(U_j - U_i - k_B T~ln \frac{Q_i n_j}{Q_j n_i}\right)
 +
\label{Bennett2}
 +
\end{equation}
 +
$$
 +
</p>
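<p>The self-consistent condition above can be solved numerically. The following Python sketch finds the constant \(C = k_B T~ln \frac{Q_i n_j}{Q_j n_i}\) by bisection and then evaluates \(\Delta A_{ij} = C - k_B T~ln \frac{n_j}{n_i}\). It is only an illustration under stated assumptions: <code>w_forward</code> holds \(U_j - U_i\) for samples from state \(i\), <code>w_reverse</code> holds \(U_i - U_j\) for samples from state \(j\), and all names and numbers are hypothetical.</p>

```python
import math


def fermi(x, kt):
    """Fermi function 1 / (1 + exp(x / kT)), written to avoid overflow."""
    z = x / kt
    if z >= 0.0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def bar_free_energy(w_forward, w_reverse, kt, n_iter=200):
    """Solve the Bennett condition for C by bisection and return Delta A."""
    n_i, n_j = len(w_forward), len(w_reverse)
    if n_i == 0 or n_j == 0:
        raise ValueError("both states need at least one sample")

    def imbalance(c):
        # Left side minus right side of the condition; decreases with c
        lhs = sum(fermi(w + c, kt) for w in w_reverse)
        rhs = sum(fermi(w - c, kt) for w in w_forward)
        return lhs - rhs

    lo, hi = -1.0, 1.0
    while imbalance(lo) < 0.0:   # widen the bracket downwards
        lo *= 2.0
    while imbalance(hi) > 0.0:   # widen the bracket upwards
        hi *= 2.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return c - kt * math.log(n_j / n_i)


# Hypothetical data with a known answer of 2 kT, using unequal sample sizes
delta_a = bar_free_energy([2.0, 2.0], [-2.0, -2.0, -2.0, -2.0], 1.0)
```

<p>Production tools additionally report an uncertainty estimate and check the overlap between neighbouring states, which this sketch leaves out.</p>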
 +
<p>
 +
$$
 +
\begin{equation}
 +
\Delta A_{ij} = - k_B T ~ ln \frac{n_j}{n_i} + k_B T~ln \frac{Q_i n_j}{Q_j n_i}
 +
\label{Bennett3}
 +
\end{equation}
 +
$$
 +
</p><p>Later on Shirts et al. derived the BAR method by applying maximum likelihood techniques [16].<br/>Since common analysis methods in biochemistry often rely on titration, only dissociation constants (\(K_{d}\)) or association constants (\(K_{a}\)) can be obtained from experiments. Therefore the dissociation constants were calculated by using equation \ref{gibbs}.
 
</p>
 
</p>
 
<p>
 
<p>
 
$$
 
$$
 
\begin{equation}
 
\begin{equation}
\Delta G = R T ln \left( \frac{K_{d}}{mol/L} \right)
+
K_{a/d} = e^{-\frac{\Delta G}{RT}}
 
\label{gibbs}
 
\label{gibbs}
 
\end{equation}
 
\end{equation}
 
$$
 
$$
 
</p>
 
</p>
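<p>The conversion between a free energy difference and an equilibrium constant can be written as a short Python sketch. It assumes \(\Delta G\) in kJ/mol and follows the exponential relation above literally; whether the result is read as \(K_{a}\) or \(K_{d}\) depends on the sign convention chosen for \(\Delta G\) (binding or unbinding), which is an assumption the user has to fix. The function name and the example values are hypothetical.</p>

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def equilibrium_constant(delta_g, temperature):
    """Equilibrium constant exp(-delta_g / (R * T)) for delta_g in kJ/mol."""
    return math.exp(-delta_g / (R * temperature))


# Hypothetical binding free energy of -40 kJ/mol at 300 K
k_assoc = equilibrium_constant(-40.0, 300.0)   # association constant
k_dissoc = equilibrium_constant(40.0, 300.0)   # dissociation constant (reverse process)
```

<p>Reversing the sign of \(\Delta G\) gives the reciprocal constant, so the association and dissociation constants obtained this way multiply to one.</p>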
<div class="verlinked" id="Mod_m"><h2>METHODS</h2></div>
+
<div class="verlinked" id="Mod_m"><h2 style="border-bottom:3px solid #019ac8; padding-top:80px;">METHODS</h2></div>
<h5>Visualisations</h5>
+
<h5 style="padding-top:0;">Visualisations</h5>
 
 
 
<h6>Colicin E2</h6>
 
<h6>Colicin E2</h6>
<p>No obtainable 3D model of Colicin E2 was found in the Research Collaboration for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) at the Brookhaven National Laboratory and the Protein Data Bank in Europe (PDBe) at the European Bioinformatics Institue (EMBL-EBI). Kristallographic structures of the DNase subunit of Colicin E2 and it's bacterial import subunit was available. Therefore we chose to use homology modelling to obtain a 3D structure of Colicin E2. For homology modelling the <a href="http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index">Protein Homology/analogY Recognition Engine V 2.0</a> (PHYRE<sup style="font-size: 12pt; vertical-align:baseline; position: relative; top: -0.4em">2</sup>) [] server was used in combination with the amino acid sequence obtained from the Universal Protein Resource  (UniProt) Protein Knowledgebase (UniProtKB) entrance <a href="http://www.uniprot.org/uniprot/P04419">P04419</a> [].<br/>The obtained model was based on the known subunits of Colicin E2 and Colicin E3, a close realtive. A CHARMm topology was then produced via the pdb2gmx module of GROMACs 5.1.3, the model was solvated in TIP3P water and energy minimized using the steepest descend algorithm to !FEHLTNOCH! kJ/mol.</p>
+
<p>No 3D model of Colicin E2 could be found in the Research Collaboration for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) at the Brookhaven National Laboratory or the Protein Data Bank in Europe (PDBe) at the European Bioinformatics Institute (EMBL-EBI). Crystallographic structures of the DNase subunit of Colicin E2 and its bacterial import subunit were available. Therefore we chose to use homology modeling to obtain a 3D structure of Colicin E2. For homology modeling the <a href="http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index">Protein Homology/analogY Recognition Engine V 2.0</a> (PHYRE<sup style="font-size: 12pt; vertical-align:baseline; position: relative; top: -0.4em">2</sup>) [17] server was used in combination with the amino acid sequence obtained from the Universal Protein Resource (UniProt) Protein Knowledgebase (UniProtKB) entry <a href="http://www.uniprot.org/uniprot/P04419">P04419</a>.<br/>The obtained model was based on the known subunits of Colicin E2 and Colicin E3, a close relative. Afterwards a CHARMm topology was produced via the pdb2gmx module of GROMACS 5.1.3 [6-15], and the model was solvated in TIP3P water and energy minimized using the steepest descent algorithm. Finally we used PyMOL [18] to create molecular images.</p>
  
 
<h5>Force Field Parametrization</h5>
 
<h5>Force Field Parametrization</h5>
<p>A 3D model of <i>o</i>-methyl tyrosine was created using Avogadro 1.1.1 and energy minimized using the steepest descend algorithm in vacuum. The so obtained model was subsequently parsed to the <a href="http://www.swissparam.ch/">SwissParam</a> [] topology server. The created topology was then used to derive parameters for the CHARMm residue force field so that for any protein with <i>o</i>-methyl tyrosine a topology could be created. Since <i>o</i>-methyl tyrosine is very similar to tyrosine most parameters could be adopted.<br/>CHARMm is an atom type based force field, meaning that every parameter is not dependend on the residue but on the assigned atom types taking part in the interactions. As a result we had to define several new atom types to account for the changed properties of the methyl ether since no other benzyl ether atoms could be found in the force field parameter files.<br/>
+
<p>A 3D model of <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine was created using Avogadro 1.1.1 and energy minimized using the steepest descent algorithm in vacuum. The model obtained in this way was subsequently passed to the <a href="http://www.swissparam.ch/">SwissParam</a> [19] topology server. The created topology was then used to derive parameters for the CHARMm residue force field [2,3,4] so that a topology could be created for any protein containing <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine. Since <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine is very similar to tyrosine, most parameters could be adopted.<br/>CHARMm is an atom type based force field, meaning that the parameters do not depend on the residue but on the assigned atom types taking part in the interactions. As a result we had to define several new atom types to account for the changed properties of the methyl ether, since no other benzyl ether atoms could be found in the force field parameter files.</p><p>
<center><img src="https://static.igem.org/mediawiki/2016/f/f4/T--TU_Darmstadt--forec_field_style.png" style="width: 80%;"></center><br/>
+
<center><div class="bild" style="width: 80%;"><img src="https://static.igem.org/mediawiki/2016/f/f4/T--TU_Darmstadt--forec_field_style.png" width=100%><b>Figure 1:</b> Used atom names and types for the implementation.</div></center></p><p>
To implement the newly derived parameters we had to update several force field parameter files that we gathered from the GROMACS&nbsp;5.0.4 software suite. This was due to the case that all parameter files had to be compatible to the GROMACS software suite. First we had to implement the new amino acid in the <b>aminoacids.rtp</b> file which basically serves as a register for all known residues. The amino acid entry contains a three letter code by which it will be identified, all atoms with name, type and charge, all bonds between these atoms as well as impropers and CMAPs. Since the last two entries only concern the backbone atoms, we simply copied those entries. Atom charges were duplicated from tyrosine for all atoms that were not involved in the ether bond. The ether methyl group was represented by a standard methyl group with <i>CT3</i> carbon and <i>HA</i> hydrogen. This atom classes represent alanin's C&beta; and the C&delta; atom of methionin.<br/>A new atom class named <i>OE</i> was introduced to represent the ether oxygen since there are no ether oxygen parameters for amino acid residues in the CHARMm force field files. We used parameters from the generated topology file to derive parameters for all interactions from the <i>OE</i> atom type. To implement these parameters we updated the <i>bondtypes</i>, <i>angletypes</i>, and <i>dihedrals</i> in the <b>ffbonded.itp</b> file. Similar we updated the <b>ffnonbonded.itp</b> file sections <i>atomtypes</i>, and <i>pairtypes</i>. Subsequently we add an <i>OE</i> entry to the <b>atomtypes.atp</b> file.<br/>CHARMm simulates aromaticity through dummy atoms which will be calculated automatically. To achiev this for our new amino acid we updated the <b>aminoacids.vsd</b> file and inserted every bond length and angle. Finally we had to assign a three letter code to represent our amino acid in the topology and structure files. We chose <b>OMT</b>. This resulted in the empirical force field CHARMm&nbsp;27&nbsp;OMT.
+
To implement the newly derived parameters we had to update several force field parameter files taken from the GROMACS&nbsp;5.0.4 software suite [6-15], because all parameter files had to be compatible with GROMACS. First we added the new amino acid to the <b>aminoacids.rtp</b> file, which serves as a register of all known residues. The amino acid entry contains a three-letter code by which the residue is identified, all atoms with name, type and charge, all bonds between these atoms, as well as impropers and CMAPs. Since the last two entries only concern the backbone atoms, we simply copied those entries. Atom charges were copied from tyrosine for all atoms that were not involved in the ether bond. The ether methyl group was represented by a standard methyl group with <i>CT3</i> carbon and <i>HA</i> hydrogen. These atom classes also represent alanine's C&beta; and the C&epsilon; atom of methionine.<br/>A new atom class named <i>OE</i> was introduced to represent the ether oxygen, since there are no ether oxygen parameters for amino acid residues in the CHARMm force field files. We used the generated topology file to derive parameters for all interactions involving the <i>OE</i> atom type. To implement these parameters we updated the <i>bondtypes</i>, <i>angletypes</i>, and <i>dihedrals</i> sections in the <b>ffbonded.itp</b> file. Similarly, we updated the <i>atomtypes</i> and <i>pairtypes</i> sections of the <b>ffnonbonded.itp</b> file. Subsequently we added an <i>OE</i> entry to the <b>atomtypes.atp</b> file.<br/>CHARMm simulates aromaticity through dummy atoms, which are calculated automatically. To achieve this for our new amino acid we updated the <b>aminoacids.vsd</b> file and inserted every bond length and angle. Finally we had to assign a three-letter code to represent our amino acid in the topology and structure files. We chose <b>OMT</b>. 
This resulted in the empirical force field CHARMm&nbsp;27&nbsp;OMT.
 
</p>
 
</p>
  
 
<h5 id="MDS">Molecular Dynamics Simulations</h5>
 
<h5 id="MDS">Molecular Dynamics Simulations</h5>
<p>3D structures of Colicin E2 DNase subunit (SAse) and it's immunity protein were obtained from RCSB PDB entrance <a href="http://www.ebi.ac.uk/pdbe/entry/pdb/3U43">3U43</a> [] . Since the main binding occurs between the DNase subunit and the immunity protein only this Colicin E2 subunit was simulated. Furthermore we established a minimized Colicin E2 in <a href="">another project part</a> and wanted to test it's operational capability. <br/>To insert mutations inside the immunity protein we inserted the energy minimized structure of OMT at the desired position. Positions of backbone atoms were fitted to those of the replaced amino acid so that the backbone integrity was preserved. This was achieved through the implementation of Kabsch's algorithm for structural alignments in the <a href="http://thegrantlab.org/bio3d/index.php">Bio3D package</a> [] for the <a href="https://www.r-project.org/">statistical computing language R</a> []. Additionaly we used Bio3D's pdb processing package to seperate Colicin E2 DNase subunit and its immunity protein.<br/>
+
<p>3D structures of the Colicin E2 DNase subunit (miniColicin) and its immunity protein were obtained from the RCSB PDB entry <a href="http://www.ebi.ac.uk/pdbe/entry/pdb/3U43">3U43</a> [20]. Since the main binding occurs between the DNase subunit and the immunity protein, only this Colicin E2 subunit was simulated. Furthermore, we established a minimized Colicin E2 in <a href="">another project part</a> and wanted to test its operational capability. <br/>To introduce mutations into the immunity protein we inserted the energy minimized structure of OMT at the desired position. Positions of backbone atoms were fitted to those of the replaced amino acid so that the backbone integrity was preserved. This was achieved with the implementation of Kabsch's algorithm for structural alignments in the <a href="http://thegrantlab.org/bio3d/index.php">Bio3D package</a> [21-23] for the <a style="padding-right:0;" href="https://www.r-project.org/">statistical computing language R</a> [24]. Additionally, we used Bio3D's PDB processing functions to separate the Colicin E2 DNase subunit and its immunity protein.<br/>All molecular dynamics simulations were performed with the <a href="http://www.gromacs.org/">GROningen MAchine for Chemical Simulations</a> (GROMACS) 5.0.3 [6-15] software suite. CHARMm&nbsp;27&nbsp;OMT was used as the empirical force field. An explicit water model with TIP3P water was chosen. The box was filled with water and the system was neutralized by inserting chloride ions. After neutralization the system was energy minimized with the steepest descent algorithm until it converged. To generate velocities and bring the system to the target temperature, a short equilibration run of about 500&nbsp;ps was performed. The end temperature was set to 298&nbsp;K, and the Berendsen thermostat and the velocity-Verlet integrator with a step size of 2&nbsp;fs were used. After this equilibration run an NVT ensemble was achieved. 
To achieve an NPT ensemble the equilibration run was repeated with pressure coupling to 1&nbsp;bar. Subsequently the final MD production run was performed. Every 5000th step was saved, resulting in a trajectory of 10001 conformations covering 100&nbsp;ns of simulation time.</p>
All molecular dynamics simulations were performed with the <a href="http://www.gromacs.org/">GROningen MAchine for Chemical Simulations</a> (GROMACS) 5.0.3 [] software suite. CHARMm&nbsp;27&nbsp;OMT was used as the empirical force field. An explicit water model with TIP3P water was chosen and a cubic box with edge length !STILL MISSING!&nbsp;nm was constructed. The box was filled with water and the system was neutralized by inserting chloride ions. After neutralization the system was energy minimized with the steepest descent algorithm until it converged. The exact values can be found in <a href="">table 1</a>. To generate velocities and bring the system to the target temperature, a short equilibration run of about 500&nbsp;ps was performed. The end temperature was set to 298&nbsp;K, and the Berendsen thermostat and the velocity-Verlet integrator with a step size of 2&nbsp;fs were used. After this equilibration run an NVT ensemble was achieved. To achieve an NPT ensemble the equilibration run was repeated with pressure coupling to 1&nbsp;bar. Subsequently the final MD production run was performed. Every 5000th step was saved, resulting in a trajectory of 10001 conformations covering 100&nbsp;ns of simulation time.</p>
+
  
 
<h5>Molecular Dynamics Simulations Analysis</h5>
 
<h5>Molecular Dynamics Simulations Analysis</h5>
<p>Simulation Analysis was performed using the R [] package Bio3D []. All plots were created using the R package ggplot2 []. For visualization of protein structures and trajectories the PyMOL visualization system [] was used. To compensate for eventual jumps due to translations across the PBC barriers a GROMACS internal fitting program was applied (<i>gmx trjconv -center -pbc nojump</i>). To exclude translational and rotational movement of the simulated protein we applied the GROMACS internal fitting algorithms <i>gmx trjconv -fit rot+trans</i>. These trajectory manipulations were performed because several analysis methods like <a href="Theory.html#RMSD">RMSD</a> and <a href="Theory.html#RMSF">RMSF</a> rely on distance calculations between atom positions over time. In these analyses the translational and rotational movements are not of interest since we only want to visualize the movement of atoms in relation to the simulated protein to test for different configuarations and structural flexibility. Additionally the crossing of PBC barriers would increase the distance between two atom positions drastically since the atom would be relocated to the opposing site of the simualted system.</p>
+
<p>Simulation analysis was performed using the R [24] package Bio3D [21,22,23]. All plots were created using the R package ggplot2 [25]. For visualization of protein structures and trajectories the PyMOL visualization system [18] was used. To compensate for possible jumps caused by translations across the periodic boundary condition (PBC) barriers, a GROMACS internal fitting program was applied (<i>gmx trjconv -center -pbc nojump</i>). To exclude translational and rotational movement of the simulated protein we applied the GROMACS internal fitting algorithm <i>gmx trjconv -fit rot+trans</i>. These trajectory manipulations were performed because several analysis methods like <a href="Theory.html#RMSD">RMSD</a> and <a href="Theory.html#RMSF">RMSF</a> rely on distance calculations between atom positions over time. In these analyses the translational and rotational movements are not of interest, since we only want to visualize the movement of atoms relative to the simulated protein in order to test for different configurations and structural flexibility. In addition, the crossing of PBC barriers would drastically increase the distance between two atom positions, since the atom would be relocated to the opposite side of the simulated system.</p>
  
 
<h5>Binding Energy Calculations</h5>
 
<h5>Binding Energy Calculations</h5>
<p>For binding energy calculations we applied pulling simulations followed by umbrella sampling of ?30? different configurations along the reaction coordinate, namely the x-axis. To perform the pulling simulation, average structures of Colicin E2 and its immunity protein were simulated in TIP3P water. To do this we constructed a box with the dimensions 16&nbsp;nm, 5.5&nbsp;nm, and 6.5&nbsp;nm. The complex was centered at 3&nbsp;nm, 2.75&nbsp;nm, and 3.25&nbsp;nm. The system was filled with sodium and chloride ions to a concentration of 0.1&nbsp;M and subsequently energy minimized to a potential energy of !STILL MISSING!. The system was equilibrated according to the <a href="#MDS">former simulations</a>. Pulling was performed over 1&nbsp;ns with a pull rate of 0.01&nbsp;nm/ps, resulting in 10&nbsp;nm of overall pulling. A harmonic umbrella pulling potential with 250,000&nbsp;\(\frac{kJ}{mol~nm^2}\) was applied. The pulling simulation resulted in a center of mass difference of about !STILL MISSING!&nbsp;nm.<p style="color:red">(Possibly also show different pulling simulations.)</p></p>
+
<p>Binding energies were calculated using <a href="#BEC">alchemical Free Energy Perturbation</a> (FEP) in combination with the Bennett Acceptance Ratio (BAR). A thermodynamic cycle was constructed (see fig. 2) and two sets of simulations were performed accordingly. First, Colicin E2's immunity protein was simulated in TIP3P water with 0.6&nbsp;nm space between the box and the protein. The Lennard-Jones potential and the Coulomb interactions were each decoupled over ten simulations, resulting in 20 simulations in total. The simulations were energy minimized over approximately 300&nbsp;steps, NVT equilibrated over 10&nbsp;ps and NPT equilibrated over 100&nbsp;ps. The production run was performed for 2&nbsp;ns. Second, the binding complex consisting of Colicin E2 and its immunity protein was simulated under similar conditions, i.e. equilibration times, simulation times, and steps were chosen alike. In contrast to the immunity protein simulations, additional restraint decoupling steps were performed over ten simulations before the other steps. All simulations were evaluated using the BAR scripts from the <a style="padding-right:0;" href="https://github.com/choderalab/pymbar">pyMBAR python library</a>.</p></p>
 
</p>
 
</p>
<div class="verlinked" id="Mod_r"><h2>RESULTS</h2></div>
+
<div class="verlinked" id="Mod_r"><h2 style="border-bottom:3px solid #019ac8; padding-top:80px;">RESULTS &amp; CONCLUSION</h2></div>
+
<h5>Molecular Dynamics Simulations</h5>
+
<p>Three amino acid positions were chosen for <i>O</i>-methyl-<span style="font-variant: small-caps;">l</span>-tyrosine (OMT) exchange evaluation (tyrosine 8 (Y8), phenylalanine 13 (F13) and phenylalanine 16 (F16)). These positions were selected because the residues differ only slightly from OMT and were therefore expected to cause the smallest structural differences. All molecular dynamics steps were performed on these mutation variants, as well as on the wildtype protein for comparison. All simulations were performed over 100&nbsp;ns, leading to 10001 conformations each. These conformations were evaluated using RMSD, RMSF, SASA and the amount of secondary structures over time.<br/>Figure 2 displays the RMSD relative to the initial conformation. It can be observed that the curve of the mutational variant Y8O closely resembles that of the wildtype. The mutational variant F16O, in contrast, shows a more severe deviation from the wildtype.</p>
+
<p>
+
<center><div class="bild" style="width:70%;"><img src="https://static.igem.org/mediawiki/2016/8/81/T--TU_Darmstadt--RMSDc.png" width=100%><b>Figure 2:</b> RMSD of the mutation variants Y8O (green), F16O (blue) and the wildtype (red).</div></center>
+
</p>
 +
<p>
 +
The solvent accessible surface area (SASA) was calculated for every simulation step using DSSP and is displayed in figure 3. The SASA of all simulated variants shows little to no fluctuation, which argues for the high structural stability of the wildtype and the mutational variants. The disparity between the mutational variants and the wildtype can be traced back to the DSSP algorithm, which is based solely on natural amino acids. It therefore cannot evaluate a non-natural amino acid, which causes a discrepancy between surface areas when non-natural amino acids like OMT are involved.
 +
</p>
 +
<center><div class="bild" style="width:70%;"><img src="https://static.igem.org/mediawiki/2016/1/16/T--TU_Darmstadt--SASAc.png" width=100%><b>Figure 3:</b> SASA of the mutation variants Y8O (green), F16O (blue) and the wildtype (red).</div></center>
 +
</p>
 +
<p>
 +
Additionally, we evaluated the RMSF and the amount of secondary structures over time, which exhibited no differences between the mutational variants and the wildtype. Therefore the results of these evaluation methods are not shown on this wiki.<br/>In conclusion, the mutational variant Y8O fits our requirements best, since its behavior shows the least disparity from the wildtype in the applied evaluation methods. Therefore Y8O was chosen for further analyses.
 +
</p>
 +
<p>
 +
Furthermore, we evaluated the stability of our designed miniColicin by performing 100&nbsp;ns of MD simulations with different force fields (CHARMm27, AMBER03, GROMOS56a7). The RMSD of these simulations is displayed in figure 4. Some deviations between the different force fields can be observed. The simulation run with the GROMOS56a7 force field shows the biggest discrepancy from the AMBER and CHARMm simulations, which display little to no fluctuation over time. This behavior can be traced to the different parametrization approaches of the force fields as well as to the fact that GROMOS56a7 is a united atom force field, in which each CH<sub>3</sub> and CH<sub>2</sub> group is described as a single particle.
 +
</p>
 +
<p>
 +
<center><div class="bild" style="width:76%;"><img src="https://static.igem.org/mediawiki/2016/3/31/T--TU_Darmstadt--RMSD.png" width=100%><b>Figure 4:</b> RMSD of miniColicin simulated using the force fields CHARMm27 (red), AMBER03 (blue) and GROMOS56a7 (green).</div></center>
 +
</p>
 +
<h5>Binding Energy Calculations</h5>
 +
<p>To obtain the binding energy between miniColicin and the immunity protein we simulated 20 &lambda; states per thermodynamic step. The coupling factor &lambda; was varied in a range of 0 to 1 in steps of 0.2 for Coulomb, 0.05 and 0.1 for the Lennard-Jones interactions respectively. Unfortunately, a full set of simulations could not be computed. This was due to unknown errors, limited debugging facilities, malformed error messages received from the GROMACS simulation package [6-15], and limitations in time and computational resources. As a result, it was only possible to evaluate a scaling of the Lennard-Jones potentials in the range from 0.9 to 1.0 with completely coupled Coulomb interactions. We tested simulations for different immunity protein variants on different computer systems (workstation and server) with the same results. Therefore, it seems appropriate to assume that the unknown error occurred due to systematic problems. For example, instabilities in the simulated systems or badly implemented features could account for such errors. Because the error messages provided insufficient debug information, we were not able to debug the error and are consequently unable to present calculated data.<br/>It should be evaluated whether this behavior is a singular event or an indication of a major underlying fault within the implementation. If so, an evaluation of the configuration space seems to be in order. Furthermore, a comparison against alternative simulation frameworks is most likely imperative for future work.</p>
 
    <p>
 
    <p>
 
    </div>
 
    </div>
 +
<div class="references"><h6>References</h6>
 +
            <ul><li>[1] Gonz&agrave;lez, Force fields and molecular dynamics simulations, Les journ&eacute;es de la Diffusion Neutronique (JDN), vol. 12,pp. 2011</li>
 +
<li>[2] R. B. Best et al., Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone phi, psi and Side-Chain chi1 and chi2 Dihedral Angles, Journal of Chemical Theory and Computation, 8: 3257-3273., 2012</li>
 +
<li>[3] A.D. Jr. MacKerell, M. Feig, C.L. III Brooks, Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations, Journal of Computational Chemistry, 25: 1400-1415., 2004</li>
 +
<li>[4] A.D. Jr. MacKerell et al. All-atom empirical potential for molecular modeling and dynamics Studies of proteins, Journal of Physical Chemistry B,  102, 3586-3616.,1998</li>
 +
<li>[5] Wolfgang Kabsch and Christian Sander, Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features, Biopolymers, Vol.22, 2577-2637, 1983</li>
 +
<li>[6] Berendsen, et al., GROMACS: A message-passing parallel molecular dynamics implementation, Comp. Phys. Comm. 91: 43-56, 1995</li>
 +
<li>[7] Lindahl, et al., GROMACS 3.0: a package for molecular simulation and trajectory analysis,  J. Mol. Model. 7: 306-317., 2001</li>
 +
<li>[8] van der Spoel, et al., GROMACS: Fast, flexible, and free, J. Comput. Chem. 26: 1701-1718., 2005</li>
 +
<li>[9] Hess, et al., GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation, J. Chem. Theory Comput. 4: 435-447., 2008</li>
 +
<li>[10] Pronk, et al., GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit, Bioinformatics 29 845-854, 2013</li>
 +
<li>[11] P&aacute;ll, et al., Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS, Proc. of EASC 2015 LNCS, 2015</li>
 +
<li>[12] Abraham, et al., GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX 1-2 19-25, 2015</li>
 +
<li>[13] Wennberg et al., Lennard-Jones Lattice Summation in Bilayer Simulations Has Critical Effects on Surface Tension and Lipid Properties, J. Chem. Theory Comput., 9 3527–3537, 2013</li>
 +
<li>[14] Wennberg et al., Direct-Space Corrections Enable Fast and Accurate Lorentz–Berthelot Combination Rule Lennard-Jones Lattice Summation, J. Chem. Theory Comput., 12 5737–5746, 2015</li>
 +
<li>[15] P&aacute;ll, Hess, A flexible algorithm for calculating pair interactions on SIMD architectures, Comp. Phys. Comm.,184 2641-2650, 2013</li>
 +
<li>[16] Gerhard K&ouml;nig, Stefan Bruckner, Stefan Boresch, Unorthodox Uses of Bennett’s Acceptance Ratio Method, J. Comput. Chem., vol. 30, 1712-1718, 2009</li>
 +
<li>[17] Lawrence A Kelley, Stefans Mezulis, Christopher M Yates, Mark N Wass & Michael J E Sternberg, The Phyre2 web portal for protein modeling, prediction and analysis, Nature Protocols, vol. 10, 845-858, 2015</li>
 +
<li>[18] The PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC.</li>
 +
<li>[19] V. Zoete, M. A. Cuendet, A. Grosdidier, O. Michielin, SwissParam, a Fast Force Field Generation Tool For Small Organic Molecules, J. Comput. Chem, vol. 32(11), 2359-68., 2011</li>
 +
<li>[20] J.A. Wojdyla , S.J. Fleishman, D. Baker, C. Kleanthous, Structure of the ultra-high-affinity colicin E2 DNase--Im2 complex, J. Mol. Biol, vol. 417, 79-94, 2012</li>
 +
<li>[21] Grant, Rodrigues, ElSawy, McCammon, Caves, Bio3D: An R package for the comparative analysis of protein structures., Bioinformatics, vol. 22, 2695-2696 2006</li>
 +
<li>[22] Skj&aelig;rven, Yao, Scarabelli, Grant, Integrating protein structural dynamics and evolutionary analysis with Bio3D., BMC Bioinformatics vol., 15, 399, 2014</li>
 +
<li>[23] Skj&aelig;rven, Jariwala, Yao, Grant, Online interactive analysis of protein structure ensembles with Bio3D-web., Bioinformatics In press., 2016</li>
 +
<li>[24] R Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, <a href="http://www.R-project.org/.">URL</a>, 2014</li>
 +
<li>[25] H. Wickham, ggplot2: elegant graphics for data analysis, Springer New York, 2009</li>
 +
</ul></div>
 
    </div>
 
    </div>
 
<div class="rechts">
 
<div class="rechts">
<div class="highlights"><a href="#ThOv">Theoretical Overview</a><br/><br/><a href="#Mod_m">Methods</a><br/><br/><a href="#Mod_r">Results</a></div>
+
<div class="scrollbox"><div class="highlights"><a href="#ThOv">Theoretical Overview</a><br/><a href="#Mod_m">Methods</a><br/><a href="#Mod_r">Results &amp; Conclusion</a></div>
    <div><a href="#mainHeader"><button class="back_top">back to the top</button></a></div>
+
    <div><a href="#mainHeader"><button class="back_top_full">Back to the Top</button></a></div>
</div>
+
</div></div>
<div class="footer" id="footer">
+
</html>{{Team:TU_Darmstadt/Footer}}<html>
</div>
+
 
</div>
 
</div>
 
</body>
 
</body>
 
</html>
 
</html>

Latest revision as of 03:19, 20 October 2016


ABSTRACT
Protein binding depends strongly on structural properties, which in turn are determined by the amino acid sequence. Changing the amino acid sequence of one binding partner could consequently diminish its binding ability. Therefore it is important to estimate the influence of mutations on the protein structure. This is particularly true for mutations from natural to non-natural amino acids.
To estimate the influence of O-methyl-l-tyrosine on Colicin E2's immunity protein we performed several molecular dynamics simulations amounting to 1300 ns of total simulation time. To do this we estimated O-methyl-l-tyrosine parameters for the CHARMm 22 and the GROMOS36a7 force field. We evaluated our simulations with several well-documented methods, such as secondary structure analysis, the solvent accessible surface area, RMSD, and RMSF. Our first simulation analysis led to the conclusion that O-methyl-l-tyrosine had no influence on the immunity protein.
To estimate possible influences on the thermodynamics of the system we tried to calculate the binding energy between Colicin E2 and its immunity protein by alchemical Free Energy Perturbation experiments. This was not successful due to computational errors and computational limitations.

THEORETICAL OVERVIEW

Molecular Dynamics Simulation

Introduction

Molecular Dynamics (MD) simulation is a method to describe atomic and molecular movements. Molecular Dynamics simulations depend on several simplifications that enable simulations ranging from nanoseconds up to several milliseconds in systems consisting of over one hundred thousand atoms. This makes it possible to study different biomolecular processes like protein-protein binding or enzyme dynamics. Because of the deterministic nature of the system it is possible to calculate thermodynamic properties like free energy or free binding enthalpies [1].

Assumptions

To describe atomistic or molecular behavior, the exact system conditions like positions and energies must be known. The energies of an atomic system are described by the Schrödinger equation (eq. \ref{Schrödinger}) with wavefunction \(\Psi\) (eq. \ref{wavefunction}), kinetic energies \(\hat{T}_{e}\) and \(\hat{T}_{N}\) and potential energies \(\hat{V}_{e}\), \(\hat{V}_{N}\) and \(\hat{V}_{eN}\). Terms with subscript \(_{e}\) concern the electrons and terms with subscript \(_{N}\) concern the nuclei [1].

$$ \begin{equation} (\hat{T}_{e} + \hat{T}_{N} + \hat{V}_{e} + \hat{V}_{N} + \hat{V}_{eN})\Psi=i \hbar \frac{\partial}{\partial t}\Psi \label{Schrödinger} \end{equation}$$ $$ \begin{equation} \Psi=\Psi(\vec{r}_{1},..., \vec{r}_{N_{e}},\vec{r}_{1},...,\vec{r}_{N_{N}}) \label{wavefunction} \end{equation}$$

Since these equations cannot be solved analytically and a direct numerical solution is not feasible for systems of this size, it is necessary to simplify the system description. The first assumption relies on the Born-Oppenheimer approximation, which states that the Schrödinger equation can be split into two parts, one for the electrons and one for the nuclei. Since the electrons are far more mobile, the dynamics of the system can be defined by the nuclei positions.
Molecular Dynamics simulations depend on several simplifications. First, we assume in accordance with the Born-Oppenheimer approximation that electronic movement has no influence on the overall atomic momentum, since electrons will simply follow the nuclear movements on the simulated time scales. Second, we can describe the potential energy function by a sum of simple terms. These terms are defined in the so-called force field, which will be described later on. Third, the system potential is evaluated by deriving the forces and applying Newtonian mechanics as shown in equations \ref{newton} and \ref{newton1} [1].

$$ \begin{equation} M_{K}\frac{d\vec{v}_{K}}{dt} = M_{K}\frac{d^{2}\vec{r}_{K}}{dt^{2}} = \vec{F}_{K} = -\frac{\partial V\left(\vec{r}_{1},...,\vec{r}_{N}\right)}{\partial \vec{r}_{K}} \label{newton} \end{equation} $$

$$ \begin{equation} F_{ij}=-\frac{\partial}{\partial r_{ij}}V_{force~field} \label{newton1} \end{equation} $$

To solve these terms numerically we have to discretize the trajectory and therefore use an integrator with small time steps. Several different integrators have been developed, of which the velocity-Verlet algorithm is the most widely used (eq. \ref{VV1} & \ref{VV2}) [1].

$$ \begin{equation} r_{i}(t_{0} + \Delta t) = r_{i}(t_{0}) + v_{i}(t_{0})\Delta t + \frac{1}{2}a_{i}(t_{0})\Delta t^{2} \label{VV1} \end{equation} $$

$$ \begin{equation} v_{i}(t_{0}+\Delta t) = v_{i}(t_{0}) + \frac{1}{2}[a_{i}(t_{0}) + a_{i}(t_{0} + \Delta t)]\Delta t \label{VV2} \end{equation} $$
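
The velocity-Verlet update can be illustrated with a minimal Python sketch. It is not the GROMACS implementation: it propagates a single particle in one dimension, and the harmonic test potential, the function names, and the chosen parameters are assumptions made only for this illustration. The velocity update uses the average of the old and the new acceleration.

```python
import math

def velocity_verlet(r, v, accel, dt, n_steps):
    """Propagate one particle in 1D with the velocity-Verlet scheme."""
    a = accel(r)
    trajectory = [(r, v)]
    for _ in range(n_steps):
        r = r + v * dt + 0.5 * a * dt ** 2   # position update
        a_new = accel(r)                     # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
        trajectory.append((r, v))
    return trajectory

def harmonic_accel(r, k=1.0, m=1.0):
    """Acceleration in a hypothetical harmonic potential V = k r^2 / 2."""
    return -k * r / m

def total_energy(r, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of the harmonic test system."""
    return 0.5 * m * v ** 2 + 0.5 * k * r ** 2

# Illustrative run: start at r = 1 with zero velocity (arbitrary units).
traj = velocity_verlet(r=1.0, v=0.0, accel=harmonic_accel, dt=0.01, n_steps=1000)
```

For this test system the total energy should stay close to its initial value over the whole run, which is the main reason symplectic integrators of this kind are preferred for long MD trajectories.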


The temperature of the system is directly correlated to the distribution of kinetic energies. Therefore the temperature of the system can be controlled by manipulating the atom velocities. One way to do this was proposed by Berendsen: the system is coupled to a heat bath, resulting in an NVT ensemble (eq. \ref{Berendsen}) [1].

$$ \begin{equation} a_{i}=\frac{F_{i}}{m_{i}} + \frac{1}{2 \tau_{T}} \left( \frac{T_{B}}{T_{t}} -1 \right) v_{i} \label{Berendsen} \end{equation} $$
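
As a minimal sketch of the Berendsen equation above, the modified acceleration can be written as a small Python function. The function and argument names are assumptions for illustration, and the example values are arbitrary. The coupling term vanishes when the current temperature equals the bath temperature and accelerates atoms along their velocity when the system is too cold.

```python
def berendsen_acceleration(force, mass, velocity, t_bath, t_current, tau_t):
    """Acceleration of one atom including the Berendsen heat bath coupling term."""
    coupling = (t_bath / t_current - 1.0) / (2.0 * tau_t)
    return force / mass + coupling * velocity

# At the bath temperature only the physical force contributes.
a_equilibrium = berendsen_acceleration(3.0, 1.5, 2.0, 298.0, 298.0, 0.1)
# A system at half the bath temperature is accelerated along its velocity.
a_cold = berendsen_acceleration(0.0, 1.0, 2.0, 298.0, 149.0, 0.5)
```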

Empirical Force Fields

Empirical Force Fields are the backbone of every Molecular Dynamics simulation. Typically the force fields are divided into two parts, bonded and nonbonded interactions. Bonded interactions consist of chemical bond stretching, angle bending, and rotation of dihedrals and impropers. Nonbonded interactions are approximated by Coulomb interactions (ionic) and Lennard-Jones potentials. The overall CHARMm (Chemistry at HARvard Macromolecular mechanics) potential is calculated by summing up these main potentials (\( V_{CHARMm} = V_{bonded} + V_{nonbonded} \)) [2,3,4].
In equation \ref{CHARMM_bonded} and \ref{CHARMM_nonbonded} the bonded and nonbonded potentials of the CHARMm force field are displayed. All terms consist of an equilibrium value marked with \(0\) and a force constant \(K\) [2,3,4].

$$ \begin{equation} V_{bonded} = \sum_{bonds}{K_{b}(b-b_{0})^{2}} + \sum_{angles}{K_{\theta}(\theta-\theta_{0})^{2}} + \sum_{torsions}{K_{\phi}(1+\cos(n\phi-\delta))} + \\ \sum_{impropers}{K_{\psi}(\psi-\psi_{0})^{2}} + \sum_{Urey-Bradley}{K_{UB}(r_{1,3}-r_{1,3,0})^{2}} + \sum_{\phi\psi}{V_{CMAP}} \label{CHARMM_bonded} \end{equation} $$ $$ \begin{equation} V_{nonbonded}=\sum_{nonbonded}{\frac{q_{i}q_{j}}{4\pi D r_{ij}}}+ \sum_{nonbonded}{\epsilon_{ij}\left[\left(\frac{R_{min,ij}}{r_{ij}}\right)^{12}-2\left(\frac{R_{min,ij}}{r_{ij}}\right)^{6}\right] } \label{CHARMM_nonbonded} \end{equation} $$

The additional CMAP and Urey-Bradley terms are corrections for the backbone dihedrals and for 1,3 interactions, respectively [2,3,4].
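To make the functional forms more tangible, the following sketch evaluates a single harmonic bond term, a single Lennard-Jones pair and a single Coulomb pair as written in the equations above. The function names and all parameter values are hypothetical; real CHARMm parameters come from the force field files.

```python
import math

def harmonic_bond(b, b0, k_b):
    """Bond stretching term K_b * (b - b0)^2 for one bond."""
    return k_b * (b - b0) ** 2

def lennard_jones(r, eps, r_min):
    """Lennard-Jones term in the R_min form used above for one atom pair."""
    ratio6 = (r_min / r) ** 6
    return eps * (ratio6 ** 2 - 2.0 * ratio6)

def coulomb(r, q_i, q_j, d):
    """Coulomb term q_i * q_j / (4 * pi * D * r) for one atom pair."""
    return q_i * q_j / (4.0 * math.pi * d * r)

# Hypothetical parameters, chosen only to show the shape of the potentials
e_bond_eq = harmonic_bond(b=1.53, b0=1.53, k_b=222.5)   # zero at equilibrium
e_lj_min = lennard_jones(r=3.8, eps=0.1, r_min=3.8)     # equals -eps at r = R_min
e_lj_far = lennard_jones(r=12.0, eps=0.1, r_min=3.8)    # tends to zero at long range
```

In this form \(R_{min,ij}\) is the distance of the potential minimum and \(\epsilon_{ij}\) its depth, which the example reproduces.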

Simulation Analysis

Because Molecular Dynamics simulations produce a vast amount of data, it is essential to process the data into more accessible formats. To perform this task we applied several approaches, such as comparing the solvent accessible surface area (SASA) over time.

Root Mean Square Deviation

The Root Mean Square Deviation (RMSD) describes the root of the mean squared distance of all \(N\) selected atoms \(n\) between their positions in a selected timestep \(\tau\) and a reference timestep \(r\) (eq. \ref{RMSD}). Plotted over time, it reveals fluctuations of the whole molecular configuration, and plateaus in the RMSD indicate structures of high stability.

$$ \begin{equation} RMSD_{\tau}=\sqrt{\frac{1}{N}\sum_{n=1}^{N}{\left((x_{\tau,n}-x_{r,n})^{2}+(y_{\tau,n}-y_{r,n})^{2}+(z_{\tau,n}-z_{r,n})^{2}\right)}} \label{RMSD} \end{equation} $$

By choosing the right atom selection it is possible to evaluate the different behaviours of protein subgroups. For example, if the RMSD between Cα atoms is calculated it is possible to plot the backbone movement over time and hence detect configurations that differ from the starting structure. This is important if one wants to search for different thermodynamically stable ensembles of the protein or molecule of interest.
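A minimal NumPy sketch of the RMSD for one frame is given below. It uses the conventional mean over the selected atoms and assumes that the frame has already been superimposed onto the reference; the function name `rmsd` and the toy coordinates are illustrative assumptions, not the Bio3D routine we used.

```python
import numpy as np

def rmsd(frame, reference):
    """RMSD between two (N, 3) coordinate arrays of the same atom selection."""
    diff = np.asarray(frame, dtype=float) - np.asarray(reference, dtype=float)
    squared_dist = (diff ** 2).sum(axis=1)   # squared distance per atom
    return np.sqrt(squared_dist.mean())      # root of the mean over all atoms

# Toy example: every atom is displaced by the vector (3, 4, 0), i.e. by 5 units
reference = np.zeros((4, 3))
frame = reference + np.array([3.0, 4.0, 0.0])
value = rmsd(frame, reference)
```

Restricting the input arrays to the Cα atoms gives the backbone RMSD described above.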

Root Mean Square Fluctuation

Similar to the RMSD, the Root Mean Square Fluctuation (RMSF) is based on distances of the selected atoms. In this case the squared distance of each atom \(n\) from its reference position (index \(0\), typically the time-averaged position) is averaged over all \(T\) selected configurations. This makes it possible to spot residues with strong mobility and consequently residues that are part of fluctuating and disordered protein subunits.

$$ \begin{equation} RMSF_{n}=\sqrt{\frac{1}{T}\sum_{\tau=1}^{T}{\left((x_{\tau,n}-x_{0,n})^{2}+(y_{\tau,n}-y_{0,n})^{2}+(z_{\tau,n}-z_{0,n})^{2}\right)}} \label{RMSF} \end{equation} $$
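The per-atom fluctuation can be sketched in a few lines of NumPy. The sketch assumes that the time-averaged position is used as reference and that the trajectory is already fitted; the function name `rmsf` and the toy trajectory are hypothetical.

```python
import numpy as np

def rmsf(trajectory):
    """RMSF per atom for a trajectory array of shape (T, N, 3)."""
    trajectory = np.asarray(trajectory, dtype=float)
    mean_pos = trajectory.mean(axis=0)                      # reference position per atom
    squared_dist = ((trajectory - mean_pos) ** 2).sum(axis=2)
    return np.sqrt(squared_dist.mean(axis=0))               # average over time

# Toy trajectory: atom 0 is static, atom 1 jumps between x = -1 and x = +1
traj = np.zeros((4, 2, 3))
traj[:, 1, 0] = [-1.0, 1.0, -1.0, 1.0]
fluct = rmsf(traj)
```

The static atom has a fluctuation of zero, while the mobile atom stands out, which is how flexible residues are identified.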

DSSP

Define Secondary Structure of Proteins (DSSP) by Wolfgang Kabsch and Christian Sander is a standard program to analyse secondary structure properties of proteins. Secondary structures are discriminated by the presence of hydrogen bonds, because each hydrogen bond can be characterized by a single energy value. This definition enables the algorithm to distinguish different types of α helices, β sheets, and turns.
The electrostatic interactions between two groups are calculated by assigning partial charges to each C (\(+q_{1}\)), O (\(-q_{1}\)), N (\(-q_{2}\)), and H (\(+q_{2}\)), with \(q_{1} = 0.42~e\) and \( q_{2} = 0.20~e\) and r(AB) being the distance between two atoms A and B in Ångström. An H bond is defined by \( E < -0.5~\frac{kcal}{mol}\) [5].

$$ \begin{equation} E = q_{1} q_{2} \left( \frac{1}{r(ON)} + \frac{1}{r(CH)} - \frac{1}{r(OH)} - \frac{1}{r(CN)} \right) 332~\frac{kcal}{mol} \label{DSSP} \end{equation} $$
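The hydrogen bond criterion can be sketched as follows. The signs of the four terms follow from the partial charges given above (like charges repel, opposite charges attract). The function names and the example distances are assumptions for illustration and are not taken from the DSSP source code.

```python
Q1 = 0.42        # partial charge on C (+) and O (-) in units of e
Q2 = 0.20        # partial charge on H (+) and N (-) in units of e
FACTOR = 332.0   # conversion factor to kcal/mol for distances in Angstrom
CUTOFF = -0.5    # kcal/mol, energy threshold for an H bond

def hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic energy between a C=O group and an N-H group in kcal/mol."""
    return Q1 * Q2 * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn) * FACTOR

def is_hbond(r_on, r_ch, r_oh, r_cn):
    """True if the energy is below the -0.5 kcal/mol threshold."""
    return hbond_energy(r_on, r_ch, r_oh, r_cn) < CUTOFF

# Hypothetical distances in Angstrom for a well-formed backbone hydrogen bond
close_pair = is_hbond(r_on=2.9, r_ch=3.3, r_oh=1.9, r_cn=4.1)
# Hypothetical distances for two groups that are far apart
distant_pair = is_hbond(r_on=9.0, r_ch=9.5, r_oh=8.5, r_cn=10.0)
```

With a short O-H distance the attractive terms dominate and the pair is counted as a hydrogen bond, whereas the distant pair is not.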

Binding Energy Calculations

Alchemical Free Energy Perturbation (FEP) is a method in computational biology to obtain energy differences between two system states from molecular dynamics or Metropolis Monte Carlo simulations. Here the system is slowly transformed from state \(i\) to state \(j\) through non-natural intermediates, which are sampled by slowly decreasing the intermolecular interactions. The core equation (eq. \ref{FEGG}) for the Helmholtz free energy difference \(\Delta A_{ij}\) between states \(i\) and \(j\) is derived from statistical mechanics. \(Q\) represents the canonical partition function, \(k_B\) the Boltzmann constant, \(U\) the corresponding potential energy of the system as a function of the coordinates and momenta \(\vec{q}\), \(T\) the temperature and \(\Gamma\) the volume of potential states of \(\vec{q}\) [16].

$$ \begin{equation} \Delta A_{ij} = -k_{B} T~ln \frac{Q_{j}}{Q_{i}} = -k_{B} T~ln \left( \frac{\int_{\Gamma_{j}}e^{-\frac{U_{j}(\vec{q})}{k_{B}T}}d \vec{q}}{\int_{\Gamma_{i}}e^{-\frac{U_{i}(\vec{q})}{k_{B}T}}d \vec{q}}\right) \label{FEGG} \end{equation} $$

To calculate free energy differences between states with low state space overlap a thermodynamic cycle can be constructed, since the Helmholtz free energy is a thermodynamic state function. The most straightforward way to calculate the free energy change of a state transition would be to simulate the naturally occurring process. For example, the binding free energy between two proteins can be calculated by separating both proteins. This approach, however, has high computational costs, since the water-filled box has to be large enough to hold the separated proteins, which results in large systems.
The alchemical perturbation approach relies on simulating several intermediate states over which Coulomb and Lennard-Jones interactions are slowly decreased. This is controlled via a coupling factor \(\lambda\) (eq. \ref{lambda}). The resulting potential energy \(U\) at state \(\lambda\) is calculated as a sum of the two end states \(U_0\) and \(U_1\) and all interactions that are not decoupled, \(U_{unaffected}\) [16].

$$ \begin{equation} U(\lambda,\vec{q}) = (1-\lambda)~U_{0}(\vec{q}) + \lambda~U_{1}(\vec{q}) + U_{unaffected}(\vec{q}) \label{lambda} \end{equation} $$

The overall free energy change \(\Delta A_{0,1}\) is consequently calculated as the sum over all free energy changes (eq. \ref{TC}).

$$ \begin{equation} \Delta A_{0,1} = \sum^{1}_{\lambda=0} {\Delta A_{\lambda,\Delta \lambda}} \label{TC} \end{equation} $$

Gibbs free energy differences between two λ states can be calculated by equation \ref{Calc}, where the average is taken over the configurations sampled at \(\lambda^{'}\).

$$ \begin{equation} \Delta G~(\lambda^{'} \rightarrow \lambda^{''}) = -k_{B}T~ln~\Bigg \langle exp \left( -\frac{U(\lambda^{''})-U(\lambda^{'})}{k_{B}T} \right) \Bigg \rangle_{\lambda^{'}} \label{Calc} \end{equation} $$
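The exponential averaging over one λ window can be sketched as below. The free energy is the negative logarithm of the averaged Boltzmann factor, scaled by \(k_{B}T\). The function name `fep_exp` and the energy differences are hypothetical; the log-sum-exp shift is only a numerical safeguard and does not change the result.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def fep_exp(delta_u, kT):
    """Free energy difference from energy differences sampled in the initial state.

    delta_u holds U(lambda'') - U(lambda') evaluated on configurations
    that were sampled at lambda'.
    """
    x = -np.asarray(delta_u, dtype=float) / kT
    x_max = x.max()                                   # shift to avoid overflow
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return -kT * log_mean

kT = K_B * 298.0
# Hypothetical energy differences in kcal/mol for one lambda window
samples = np.array([0.8, 1.1, 0.9, 1.4, 1.0])
dg_window = fep_exp(samples, kT)
# Sanity check: identical energy differences return exactly that value
dg_constant = fep_exp(np.full(10, 1.5), kT)
```

Because low energy differences dominate the exponential average, the estimate is never larger than the plain mean of the sampled differences.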

Interactions between molecules are represented by Lennard-Jones and Coulomb interactions. This can lead to problems when Lennard-Jones interactions are decoupled while charge interactions are still active. In this situation the molecules clash, since opposite charges attract each other without any opposing repulsive force. To avoid this problem, position restraints are added, which have to be considered in the final calculation.
In order to evaluate the simulations of the intermediate states and calculate the free energy difference between the end states several algorithms have been developed (e.g., Exponential Averaging (EXP) or Bennett Acceptance Ratio (BAR)). BAR computes the energy difference between two simulations generating the trajectories \( n_i \) and \( n_j \) with the corresponding potential energy functions \(U_i\) and \(U_j\). The free energy difference \(\Delta A_{ij}\) can then be written as in equation \ref{Bennett} where \(f()\) stands for the Fermi function (eq. \ref{Fermi}) [16].

$$ \begin{equation} \Delta A_{ij} = k_B T ~ \left( ln \frac{\sum_j f(U_i - U_j + k_B T~ln \frac{Q_i n_j}{Q_j n_i})}{\sum_i f(U_j - U_i - k_B T~ln \frac{Q_i n_j}{Q_j n_i})} - ln \frac{n_j}{n_i} \right)+ k_B T~ln \frac{Q_i n_j}{Q_j n_i} \label{Bennett} \end{equation} $$

$$ \begin{equation} f(x) = \frac{1}{1 + e^{~\beta x}} \label{Fermi} \end{equation} $$

The term \(k_B T~ln \frac{Q_i n_j}{Q_j n_i}\) has to be approximated since it cannot be computed analytically. Equation \ref{Bennett2} describes the relation used by Bennett to estimate the term. Once it has been determined, the free energy difference can be calculated by equation \ref{Bennett3} [16].

$$ \begin{equation} \sum_j f\left(U_i - U_j + k_B T~ln \frac{Q_i n_j}{Q_j n_i}\right) = \sum_i f\left(U_j - U_i - k_B T~ln \frac{Q_i n_j}{Q_j n_i}\right) \label{Bennett2} \end{equation} $$

$$ \begin{equation} \Delta A_{ij} = - k_B T ~ ln \frac{n_j}{n_i} + k_B T~ln \frac{Q_i n_j}{Q_j n_i} \label{Bennett3} \end{equation} $$
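The self-consistent relation above can be solved numerically. The following sketch finds the shift constant \(C = k_B T~ln \frac{Q_i n_j}{Q_j n_i}\) by bisection and then applies the last equation. It is not the pyMBAR implementation; the function names, the bracketing margin of 50 \(k_B T\) and the toy data are assumptions.

```python
import numpy as np

def fermi(x, beta):
    """Fermi function f(x) = 1 / (1 + exp(beta * x)), clipped against overflow."""
    return 1.0 / (1.0 + np.exp(np.clip(beta * x, -700.0, 700.0)))

def bar(du_ij, du_ji, kT, tol=1e-10):
    """Bennett Acceptance Ratio estimate of the free energy difference.

    du_ij: U_j - U_i evaluated on the n_i samples of state i
    du_ji: U_i - U_j evaluated on the n_j samples of state j
    """
    du_ij = np.asarray(du_ij, dtype=float)
    du_ji = np.asarray(du_ji, dtype=float)
    n_i, n_j = len(du_ij), len(du_ji)
    beta = 1.0 / kT

    def imbalance(c):
        # Difference of the two sums; it increases monotonically with c
        return fermi(du_ij - c, beta).sum() - fermi(du_ji + c, beta).sum()

    # Bracket the root generously, then bisect
    lo = min(du_ij.min(), (-du_ji).min()) - 50.0 * kT
    hi = max(du_ij.max(), (-du_ji).max()) + 50.0 * kT
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return c - kT * np.log(n_j / n_i)

# Toy data: forward differences of +2 and backward differences of -2 (kcal/mol)
da_equal = bar([2.0] * 5, [-2.0] * 5, kT=0.6)
da_unequal = bar([2.0] * 4, [-2.0] * 2, kT=0.6)
```

For this idealized data both calls recover a free energy difference of 2 kcal/mol, independent of the number of samples per state.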

Later on Shirts et al. derived the BAR method by applying maximum likelihood techniques [16].
Since common analysis methods in biochemistry often rely on titration, experiments typically yield only dissociation (\(K_{d}\)) or association constants (\(K_{a}\)). Therefore the corresponding constants were calculated from the free energy difference by using equation \ref{gibbs}.

$$ \begin{equation} K_{a/d} = e^{-\frac{\Delta G}{RT}} \label{gibbs} \end{equation} $$
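As a worked example of this conversion, the sketch below turns a free energy difference into an equilibrium constant. The function name and the example value of -10 kcal/mol are hypothetical. Whether the result is an association or a dissociation constant depends on the direction of the process that \(\Delta G\) describes, and the two are reciprocal to each other.

```python
import math

R_GAS = 0.0019872041  # gas constant in kcal/(mol K)

def equilibrium_constant(delta_g, temperature=298.0):
    """Equilibrium constant K = exp(-delta_g / (R * T)) for delta_g in kcal/mol."""
    return math.exp(-delta_g / (R_GAS * temperature))

# Hypothetical binding free energy of -10 kcal/mol at 298 K
k_a = equilibrium_constant(-10.0)   # association constant for the binding direction
k_d = equilibrium_constant(+10.0)   # dissociation constant for the reverse process
```

A more negative binding free energy therefore corresponds to a larger association constant and a smaller dissociation constant.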

METHODS

Visualisations
Colicin E2

No 3D model of the complete Colicin E2 was found in the Research Collaboration for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) at the Brookhaven National Laboratory or in the Protein Data Bank in Europe (PDBe) at the European Bioinformatics Institute (EMBL-EBI). Only crystallographic structures of the DNase subunit of Colicin E2 and of its bacterial import subunit were available. Therefore we chose homology modeling to obtain a 3D structure of Colicin E2. For homology modeling the Protein Homology/analogY Recognition Engine V 2.0 (PHYRE2) [17] server was used in combination with the amino acid sequence obtained from the Universal Protein Resource (UniProt) Protein Knowledgebase (UniProtKB) entry P04419.
The obtained model was based on the known subunits of Colicin E2 and Colicin E3, a close relative. Afterwards a CHARMm topology was produced via the pdb2gmx module of GROMACS 5.1.3 [6-15], and the model was solvated in TIP3P water and energy minimized using the steepest descent algorithm. Finally we used PyMOL [18] to create molecular images.

Force Field Parametrization

A 3D model of O-methyl-l-tyrosine was created using Avogadro 1.1.1 and energy minimized in vacuum using the steepest descent algorithm. The resulting model was subsequently passed to the SwissParam [19] topology server. The created topology was then used to derive parameters for the CHARMm residue force field [2,3,4], so that a topology could be created for any protein containing O-methyl-l-tyrosine. Since O-methyl-l-tyrosine is very similar to tyrosine, most parameters could be adopted.
CHARMm is an atom type based force field, meaning that the parameters do not depend on the residue but on the atom types assigned to the atoms taking part in the interactions. As a result we had to define several new atom types to account for the changed properties of the methyl ether, since no other benzyl ether atoms could be found in the force field parameter files.

Figure 1: Used atom names and types for the implementation.

To implement the newly derived parameters we had to update several force field parameter files that we gathered from the GROMACS 5.0.4 software suite [6-15], because all parameter files had to be compatible with GROMACS. First we had to add the new amino acid to the aminoacids.rtp file, which basically serves as a register for all known residues. The amino acid entry contains a three letter code by which it is identified, all atoms with name, type and charge, all bonds between these atoms, as well as impropers and CMAPs. Since the last two entries only concern the backbone atoms, we simply copied those entries. Atom charges were copied from tyrosine for all atoms that are not involved in the ether bond. The ether methyl group was represented by a standard methyl group with CT3 carbon and HA hydrogen. These atom classes represent alanine's Cβ and the Cε atom of methionine.
A new atom class named OE was introduced to represent the ether oxygen since there are no ether oxygen parameters for amino acid residues in the CHARMm force field files. We used parameters from the generated topology file to derive parameters for all interactions from the OE atom type. To implement these parameters we updated the bondtypes, angletypes, and dihedrals in the ffbonded.itp file. Similarly, we updated the ffnonbonded.itp file sections atomtypes, and pairtypes. Subsequently we added an OE entry to the atomtypes.atp file.
CHARMm simulates aromaticity through dummy atoms which will be calculated automatically. To achieve this for our new amino acid we updated the aminoacids.vsd file and inserted every bond length and angle. Finally we had to assign a three letter code to represent our amino acid in the topology and structure files. We chose OMT. This resulted in the empirical force field CHARMm 27 OMT.

Molecular Dynamics Simulations

3D structures of the Colicin E2 DNase subunit (miniColicin) and its immunity protein were obtained from the RCSB PDB entry 3U43 [20]. Since the main binding occurs between the DNase subunit and the immunity protein, only this Colicin E2 subunit was simulated. Furthermore, we designed a minimized Colicin E2 in another part of the project and wanted to test its functionality.
To insert mutations into the immunity protein we inserted the energy minimized structure of OMT at the desired position. Positions of backbone atoms were fitted to those of the replaced amino acid so that the backbone integrity was preserved. This was achieved with the implementation of Kabsch's algorithm for structural alignments in the Bio3D package [21-23] for the statistical computing language R [24]. Additionally we used Bio3D's PDB processing functions to separate the Colicin E2 DNase subunit and its immunity protein.
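The backbone fitting step can be illustrated with a compact NumPy version of Kabsch's algorithm. This is a sketch of the general technique and not the Bio3D code we used; the function name `kabsch_fit` and the random test coordinates are assumptions.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Superimpose the (N, 3) array mobile onto target with Kabsch's algorithm."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    center_m = mobile.mean(axis=0)
    center_t = target.mean(axis=0)
    p = mobile - center_m                      # centered mobile coordinates
    q = target - center_t                      # centered target coordinates
    h = p.T @ q                                # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))     # guard against improper rotations
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return p @ rotation.T + center_t

# Toy check: rotate and translate a random structure, then fit it back
rng = np.random.default_rng(0)
target = rng.normal(size=(6, 3))
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
mobile = target @ rot_z.T + np.array([5.0, -2.0, 1.0])
fitted = kabsch_fit(mobile, target)
```

In our setting the mobile coordinates would be the backbone atoms of OMT and the target the backbone atoms of the residue that is replaced.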
All molecular dynamics simulations were performed with the GROningen MAchine for Chemical Simulations (GROMACS) 5.0.3 [6-15] software suite, using the empirical force field CHARMm 27 OMT and an explicit TIP3P water model. The box was filled with water and the system was neutralized by inserting chloride ions. After neutralization the system was energy minimized with the steepest descent algorithm until it converged. To generate velocities and bring the system to the target temperature, a short equilibration run of about 500 ps was performed. The target temperature was set to 298 K, and the Berendsen thermostat and the velocity-Verlet integrator with a step size of 2 fs were used. This equilibration run yielded an NVT ensemble. To obtain an NPT ensemble the equilibration run was repeated with pressure coupling to 1 bar. Subsequently the final MD production run was performed. Every 5000th step was saved, resulting in a trajectory of 10001 conformations covering 100 ns of simulation time.
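The relation between step size, output interval and number of saved conformations can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones stated above.

```python
step_size_fs = 2          # integrator step size in femtoseconds
save_every = 5000         # every 5000th step is written to the trajectory
total_time_ns = 100       # length of the production run in nanoseconds

frame_spacing_ps = step_size_fs * save_every / 1000     # time between saved frames
n_steps = total_time_ns * 1_000_000 // step_size_fs     # 1 ns = 1,000,000 fs
n_frames = n_steps // save_every + 1                    # plus the starting conformation
```

The saved frames are thus 10 ps apart, and the additional conformation beyond 10000 is the starting structure at time zero.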

Molecular Dynamics Simulations Analysis

Simulation analysis was performed using the R [24] package Bio3D [21,22,23]. All plots were created using the R package ggplot2 [25]. For visualization of protein structures and trajectories the PyMOL visualization system [18] was used. To compensate for possible jumps due to translations across the periodic boundary conditions (PBC), a GROMACS internal fitting program was applied (gmx trjconv -center -pbc nojump). To exclude translational and rotational movement of the simulated protein we applied the GROMACS internal fitting algorithm gmx trjconv -fit rot+trans. These trajectory manipulations were performed because several analysis methods like RMSD and RMSF rely on distance calculations between atom positions over time. In these analyses the translational and rotational movements are not of interest, since we only want to visualize the movement of atoms relative to the simulated protein to test for different configurations and structural flexibility. Additionally, the crossing of PBC boundaries would drastically increase the distance between two atom positions, since the atom would be relocated to the opposite side of the simulated system.

Binding Energy Calculations

Binding energies were calculated by using alchemical Free Energy Perturbation (FEP) in combination with the Bennett Acceptance Ratio (BAR). A thermodynamic cycle was constructed (see fig. 2) and accordingly two simulation sets were performed. First, Colicin E2's immunity protein was simulated in TIP3P water with 0.6 nm space between the box and the protein. Lennard-Jones and Coulomb interactions were each decoupled over ten simulations, resulting in 20 simulations in total. The simulations were energy minimized over approximately 300 steps, NVT equilibrated over 10 ps and NPT equilibrated over 100 ps. The production run was performed for 2 ns. Second, the binding complex consisting of Colicin E2 and its immunity protein was simulated under similar conditions, i.e. equilibration times, simulation times and steps were chosen alike. In contrast to the immunity protein simulations, additional restraint decoupling steps were performed over ten simulations before the other steps. All simulations were evaluated using the BAR scripts from the pyMBAR Python library.

RESULTS & CONCLUSION

Molecular Dynamics Simulations

Three amino acid positions were chosen for O-methyl-l-tyrosine (OMT) exchange evaluation (tyrosine 8 (Y8), phenylalanine 13 (F13) and phenylalanine 16 (F16)). These positions were selected because the native residues differ only slightly from OMT and were therefore expected to cause the smallest structural differences. All molecular dynamics steps were performed on these mutation variants, as well as on the wildtype protein for comparison. All simulations were performed over 100 ns, leading to 10001 conformations each. These conformations were evaluated using RMSD, RMSF, SASA and the amount of secondary structures over time.
Figure 2 displays the RMSD in regard to the initial conformation. It can be observed that the mutational variant Y8O exhibits similar curve characteristics as the wildtype variant. The mutational variant F16O on the contrary shows a more severe deviation from the wildtype.

Figure 2: RMSD of the mutation variants Y8O (green), F16O (blue) and the wildtype (red).

The solvent accessible surface area (SASA) was calculated for every simulation step using DSSP and is displayed in figure 3. The little to no fluctuation in the SASA of all simulated variants is an argument for the high structural stability of the wildtype and the mutational variants. The disparity between the mutational variants and the wildtype can be traced back to the DSSP algorithm since it is based solely on natural amino acids. Therefore it cannot evaluate a non-natural amino acid and would cause a discrepancy between surface areas if non-natural amino acids like OMT are involved.

Figure 3: SASA of the mutation variants Y8O (green), F16O (blue) and the wildtype (red).

Additionally we evaluated the RMSF and the amount of secondary structures over time, which exhibited no differences between the mutational variants and the wildtype. Therefore the results of these evaluation methods are not shown on this wiki.
In summary, we conclude that the mutational variant Y8O fits our demands best, since its behavior deviates least from the wildtype in the applied evaluation methods. Therefore Y8O was chosen for further analyses.

Furthermore we evaluated the stability of our designed miniColicin by performing 100 ns of MD simulations with different force fields (CHARMm27, AMBER03, GROMOS56a7). The RMSD of these simulations is displayed in figure 4. Some deviations between the different force fields can be observed. The simulation run with the GROMOS56a7 force field shows the biggest discrepancy towards the AMBER and CHARMm simulations, which display little to no fluctuation over time. This behavior can be traced to the different parametrization approaches of the force fields as well as to the fact that GROMOS56a7 is a united atom force field, in which each CH3 and CH2 group is described as a single particle.

Figure 4: RMSD of miniColicin simulated using the force fields CHARMm27 (red), AMBER03 (blue) and GROMOS56a7 (green).

Binding Energy Calculations

To obtain the binding energy between miniColicin and the immunity protein we simulated 20 λ states per thermodynamic step. The coupling factor λ was varied in the range of 0 to 1 in steps of 0.2 for Coulomb, and 0.05 and 0.1 for the Lennard-Jones interactions, respectively. Unfortunately, a full set of simulations could not be computed. This was due to unknown errors, limited debugging possibilities, malformed error messages received from the GROMACS simulation package [6-15], and limits in time and computing resources. As a result it was only possible to evaluate a scaling of the Lennard-Jones potentials in the range from 0.9 to 1.0 with completely coupled Coulomb interactions. We tested simulations for different immunity protein variants on different computer systems (workstation and server) with the same results. Therefore, it seems appropriate to assume that the unknown error occurred due to systematic problems. For example, instabilities in the simulated systems or badly implemented features could account for such errors. Because the error messages provided insufficient debug information we have not been able to resolve the error and are consequently unable to present calculated data.
It should be evaluated whether this behavior is a singular event or an indication of a major underlying fault within the implementation. If it is the latter, an evaluation of the configuration space seems to be in order. Furthermore, a comparison against alternative simulation frameworks is most likely imperative for future work.

References
  • [1] González, Force fields and molecular dynamics simulations, Les journées de la Diffusion Neutronique (JDN), vol. 12, 2011
  • [2] R. B. Best et al., Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone phi, psi and Side-Chain chi1 and chi2 Dihedral Angles, Journal of Chemical Theory and Computation, 8: 3257-3273., 2012
  • [3] A.D. Jr. MacKerell, M. Feig, C.L. III Brooks, Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations, Journal of Computational Chemistry, 25: 1400-1415., 2004
  • [4] A.D. Jr. MacKerell et al. All-atom empirical potential for molecular modeling and dynamics Studies of proteins, Journal of Physical Chemistry B, 102, 3586-3616.,1998
  • [5] Wolfgang Kabsch and Christian Sander, Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features, Biopolymers, Vol. 22, 2577-2637, 1983
  • [6] Berendsen, et al., GROMACS: A message-passing parallel molecular dynamics implementation, Comp. Phys. Comm. 91: 43-56, 1995
  • [7] Lindahl, et al., GROMACS 3.0: a package for molecular simulation and trajectory analysis, J. Mol. Model. 7: 306-317., 2001
  • [8] van der Spoel, et al., GROMACS: Fast, flexible, and free, J. Comput. Chem. 26: 1701-1718., 2005
  • [9] Hess, et al., GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation, J. Chem. Theory Comput. 4: 435-447., 2008
  • [10] Pronk, et al., GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit, Bioinformatics 29 845-854, 2013
  • [11] Páll, et al., Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS, Proc. of EASC 2015 LNCS, 2015
  • [12] Abraham, et al., GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX 1-2 19-25, 2015
  • [13] Wennberg et al., Lennard-Jones Lattice Summation in Bilayer Simulations Has Critical Effects on Surface Tension and Lipid Properties, J. Chem. Theory Comput., 9 3527–3537, 2013
  • [14] Wennberg et al., Direct-Space Corrections Enable Fast and Accurate Lorentz–Berthelot Combination Rule Lennard-Jones Lattice Summation, J. Chem. Theory Comput., 11(12) 5737–5746, 2015
  • [15] Páll, Hess, A flexible algorithm for calculating pair interactions on SIMD architectures, Comp. Phys. Comm., 184 2641-2650, 2013
  • [16] Gerhard König, Stefan Bruckner, Stefan Boresch, Unorthodox Uses of Bennett’s Acceptance Ratio Method, J. Comput. Chem., vol. 30, 1712-1718, 2009
  • [17] Lawrence A Kelley, Stefans Mezulis, Christopher M Yates, Mark N Wass & Michael J E Sternberg, The Phyre2 web portal for protein modeling, prediction and analysis, Nature Protocols, vol. 10, 845-858, 2015
  • [18] The PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC.
  • [19] V. Zoete, M. A. Cuendet, A. Grosdidier, O. Michielin, SwissParam, a Fast Force Field Generation Tool For Small Organic Molecules, J. Comput. Chem, vol. 32(11), 2359-68., 2011
  • [20] J.A. Wojdyla , S.J. Fleishman, D. Baker, C. Kleanthous, Structure of the ultra-high-affinity colicin E2 DNase--Im2 complex, J. Mol. Biol, vol. 417, 79-94, 2012
  • [21] Grant, Rodrigues, ElSawy, McCammon, Caves, Bio3D: An R package for the comparative analysis of protein structures, Bioinformatics, vol. 22, 2695-2696, 2006
  • [22] Skjærven, Yao, Scarabelli, Grant, Integrating protein structural dynamics and evolutionary analysis with Bio3D, BMC Bioinformatics, vol. 15, 399, 2014
  • [23] Skjærven, Jariwala, Yao, Grant, Online interactive analysis of protein structure ensembles with Bio3D-web., Bioinformatics In press., 2016
  • [24] R Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, URL, 2014
  • [25] H. Wickham, ggplot2: elegant graphics for data analysis, Springer New York, 2009