Difference between revisions of "Team:Bielefeld-CeBiTec/Project/Library/Design"


Revision as of 11:10, 19 October 2016



Design and Construction

Library Construction Area


As the starting point of our directed evolution system for binding proteins, we needed a large number of partly randomized Monobodies and Nanobodies from which our Evobodies could emerge. An Evobody is the result of taking a scaffold, be it a Monobody or a Nanobody, and continuously mutating it and selecting for the best binding protein.
Therefore, we designed two libraries, one for Monobodies and one for Nanobodies. A library is a collection of plasmids that are identical except for their protein coding sequences (CDS). These CDS are designed to contain deliberately randomized subregions with high variability. The result is a wide variety of plasmids with different inserts. One major advantage of the library is that it provides a wide range of different binding proteins as starting material for the evolution process. After transformation into E. coli, a heterogeneous culture is created in which each colony carries a different insert encoding a different binding protein (Osogawa).
By changing several bases in the variable region according to a specific nucleotide scheme, we reach a theoretical variability of up to 1,073,741,824 variants.
An excessively large variability cannot be realized in an in vivo library, and our mutation system will increase the variability further. Therefore, we decided to restrict the set of amino acids used in the CDRs by applying a reduced codon scheme. We regarded a variability of about one billion variants as a realistic library size. To construct the CDS, we used the ambiguity codes of the IUPAC nucleotide code, including fully random bases.
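The size argument can be checked with a little arithmetic: the theoretical diversity of a library is the product of the number of amino acids allowed at each randomized position. The sketch below is only an illustration. The page does not state how many positions are randomized, so the layouts used here (15 four-way positions, 30 two-way positions) are hypothetical; they are chosen only because 1,073,741,824 equals 4^15 and 2^30.

```python
from math import prod


def theoretical_diversity(residues_per_position):
    """Number of distinct protein sequences: product of the choices per position."""
    return prod(residues_per_position)


# Hypothetical layout: 15 randomized positions with 4 amino acids each.
four_way = theoretical_diversity([4] * 15)

# The same total results from 30 positions with 2 amino acids each.
two_way = theoretical_diversity([2] * 30)

# For comparison: all 20 amino acids allowed at the same 15 positions.
unrestricted = theoretical_diversity([20] * 15)

print(f"reduced scheme:  {four_way:,}")
print(f"all amino acids: {unrestricted:.2e}")
```

Under these assumptions the reduced scheme stays near one billion variants, whereas allowing every amino acid at the same positions would exceed that by many orders of magnitude.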
First of all, we avoided stop codons to guarantee synthesis of the complete binding protein. By excluding cysteine-encoding triplets, we ensured the absence of disulfide bonds. Furthermore, we preferred amino acids that favor high binding affinity. Evolution has optimized natural proteins for specific biological functions, and proteins with similar tasks and features tend to show similar structures and contain related amino acids. Accordingly, the binding hot spots of protein-protein interactions are often enriched in tyrosine, tryptophan and arginine [Bogan, 1998]. Tyrosine in particular plays a prominent role in the high-affinity CDRs of functional antibodies: about 40 % of their sequence consists of tyrosine, and another ~30 % is made up of small amino acids such as serine, glycine, alanine and threonine [Mian et al., 1991; Zemlin et al., 2003]. Studies have also shown the importance of tyrosine in the CDS of synthetic binding proteins [Fellouse et al., 2004]. In synthetic libraries of binding proteins, the tyrosine side chain provides most of the contacts necessary for high-affinity antigen recognition. One reason is its large size, which allows it to fill large volumes with only a few torsion angles [Fellouse et al., 2007], while smaller amino acids provide the space the protein needs to adopt its conformation [Koide et al., 2010]. The size of tyrosine also gives rise to many van der Waals and electrostatic interactions that initiate binding [Mian et al., 1991].
Tyrosine has further advantages for a good binding protein. Its side chain is amphipathic, which is helpful in the varying hydrophobic and hydrophilic environments of antibody-antigen complexes [Mian et al., 1991].
In contrast to other amino acids that confer high affinity, tyrosine is not particularly flexible. This leads to another benefit of tyrosine in binding proteins: it gives the binding sites a higher specificity [Koide et al., 2010]. Besides high affinity, specificity is another important requirement for a good Evobody.
On these grounds, we designed the six randomized codons TMY, KMY, RMR, YWY, NWY and WMY.
The following table shows the degenerate IUPAC nucleotide designations we used and the bases they represent:
Degenerate base designation    Actual bases encoded
M                              A/C
Y                              C/T
K                              G/T
W                              A/T
R                              A/G
N                              A/C/G/T
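To make the notation concrete, a degenerate codon can be expanded into every concrete triplet it stands for. The following is a minimal sketch; the `IUPAC` dictionary and the function name `expand` are chosen for illustration and are not taken from the project. The mapping covers the ambiguity codes used on this page plus N (any base), which the codon NWY requires.

```python
from itertools import product

# IUPAC ambiguity codes used on this page; N stands for any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "Y": "CT", "K": "GT", "W": "AT", "R": "AG",
    "N": "ACGT",
}


def expand(degenerate):
    """Return all concrete sequences matched by a degenerate IUPAC sequence."""
    choices = (IUPAC[symbol] for symbol in degenerate)
    return ["".join(bases) for bases in product(*choices)]


print(expand("TMY"))       # the four triplets covered by TMY
print(len(expand("NWY")))  # number of triplets covered by NWY
```

The number of triplets per codon is the product of the number of bases allowed at each of its three positions, so TMY covers 1 x 2 x 2 = 4 triplets and NWY covers 4 x 2 x 2 = 16.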

Table 2 shows our designed randomized triplets and the amino acids they encode:

Designed randomized codon    Actual amino acids encoded
TMY                          Tyrosine, Serine
KMY                          Tyrosine, Serine, Alanine, Aspartic Acid
WMY                          Tyrosine, Serine, Threonine, Asparagine
RMR                          Threonine, Alanine, Lysine, Glutamic Acid
YWY                          Phenylalanine, Leucine, Tyrosine, Histidine
NWY                          Phenylalanine, Leucine, Isoleucine, Valine, Tyrosine, Histidine, Asparagine, Aspartic Acid
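The amino acids behind each randomized codon, and the chance of each one, can be derived by expanding the codon and translating every triplet. The sketch below assumes the standard genetic code and equal representation of all bases at a degenerate position; the helper name `amino_acid_frequencies` is hypothetical. It also checks the two design rules stated above, namely that no codon yields a stop codon (`*`) or cysteine (`C`). Under the standard genetic code, the sketch yields Phe, Leu, Tyr and His for YWY (a Phe/Ser/Ile/Thr set would instead require WYY).

```python
from collections import Counter
from itertools import product

# IUPAC ambiguity codes used on this page; N stands for any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "Y": "CT", "K": "GT", "W": "AT", "R": "AG",
    "N": "ACGT",
}

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def amino_acid_frequencies(degenerate_codon):
    """Map each encoded amino acid (one-letter code) to its share of the triplets."""
    choices = (IUPAC[symbol] for symbol in degenerate_codon)
    codons = ["".join(bases) for bases in product(*choices)]
    counts = Counter(CODON_TABLE[codon] for codon in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


for codon in ("TMY", "KMY", "WMY", "RMR", "YWY", "NWY"):
    freqs = amino_acid_frequencies(codon)
    # Design rules from the text: no stop codons, no cysteine.
    assert "*" not in freqs and "C" not in freqs
    print(codon, dict(sorted(freqs.items())))
```

Every amino acid within one of these codons is covered by exactly two triplets, so the encoded amino acids are equally likely under the stated assumption.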

Finally, we achieved a theoretical variability of 1,073,741,824 different molecules for each of the two libraries, Monobodies and Nanobodies.
Figure 1: Florian Helfer (cultural anthropologist from Goethe University in Frankfurt am Main). iGEM team Bielefeld-CeBiTec in conversation with Florian about synthetic biology, biohacking and public outreach.