Difference between revisions of "Team:Bielefeld-CeBiTec/Project/Library/Design"

 
(13 intermediate revisions by one other user not shown)
<div class="container text">
Similar to the Monobodies, we built a universal framework for Nanobodies, containing rpoZ, the c-Myc linker and a constant Nanobody segment (<a href="parts.igem.org/wiki/index.php?title=Part:BBa_K2082001">BBa_K2082001</a>). In contrast to the Monobody CDS, we designed only one variable insert here, because in the scaffold used we randomized only the third complementarity-determining region (CDR3), which is known to be the main binding CDR (<a href="https://2016.igem.org/Team:Bielefeld-CeBiTec/Project/Library/Design#Krebber">Krebber <i>et al.</i>, 1997</a>). Amplifying <a href="parts.igem.org/wiki/index.php?title=Part:BBa_K2082001">BBa_K2082001</a> with the primers <a href="https://2016.igem.org/Team:Bielefeld-CeBiTec/Experiments/Primers">NB-bb-fw</a> and <a href="https://2016.igem.org/Team:Bielefeld-CeBiTec/Experiments/Primers">NB-bb-rev</a> yields a backbone into which the annealed and filled-in variable fragment F2 = F2-1 (a) + F2-2 (b) was inserted by Gibson assembly.</div>
<br>
<div class="container text_header"><h3>Proof of Concept</h3></div>
<br>
<div class="container text">The core of our system is the combination of the library with a mutation system and a selection system, in order to optimize binding proteins into Evobodies. To enter this directed evolution process, some binding proteins must show at least a marginal affinity for the chosen target protein. Low-affinity binders are then mutated and selected <i>in vivo</i> during plasmid replication and cell growth under rising selective pressure. To make sure that the designed binding proteins work and that the initial libraries contain some useful Monobodies and Nanobodies, we performed a <a href="https://2016.igem.org/Team:Bielefeld-CeBiTec/Results/Library/Phage">phagemid display</a>, using the phagemid pAK100 and M13-derived helper phages.
</div>
<div class="container text_header"><h3>References</h3></div>
<div class="container text">
<ul>
<li id="Bogan">Bogan, A. A.; Thorn, K. S. (1998): Anatomy of hot spots in protein interfaces. In: <i>Journal of Molecular Biology</i> 280 (1), pp. 1–9. DOI: 10.1006/jmbi.1998.1843.</li>
<li id="Fellouse2004">Fellouse, Frederic A.; Wiesmann, Christian; Sidhu, Sachdev S. (2004): Synthetic antibodies from a four-amino-acid code: a dominant role for tyrosine in antigen recognition. In: <i>Proceedings of the National Academy of Sciences of the United States of America</i> 101 (34), pp. 12467–12472. DOI: 10.1073/pnas.0401786101.</li>
<li id="Fellouse2007">Fellouse, Frederic A.; Esaki, Kaori; Birtalan, Sara; Raptis, Demetrios; Cancasci, Vincenzo J.; Koide, Akiko et al. (2007): High-throughput generation of synthetic antibodies from highly functional minimalist phage-displayed libraries. In: <i>Journal of Molecular Biology</i> 373 (4), pp. 924–940. DOI: 10.1016/j.jmb.2007.08.005.</li>
<li id="Koide">Koide, Shohei; Sidhu, Sachdev S. (2009): The importance of being tyrosine: lessons in molecular recognition from minimalist synthetic binding proteins. In: <i>ACS Chemical Biology</i> 4 (5), pp. 325–334. DOI: 10.1021/cb800314v.</li>
<li id="Krebber">Krebber, Anke; Bornhauser, Susanne; Burmester, Jörg; Honegger, Annemarie; Willuda, Jörg; Bosshard, Hans Rudolf; Plückthun, Andreas (1997): Reliable cloning of functional antibody variable domains from hybridomas and spleen cell repertoires employing a reengineered phage display system. In: <i>Journal of Immunological Methods</i> 201 (1), pp. 35–55. DOI: 10.1016/S0022-1759(96)00208-6.</li>
<li id="Mian">Mian, I. Saira; Bradwell, Arthur R.; Olson, Arthur J. (1991): Structure, function and properties of antibody binding sites. In: <i>Journal of Molecular Biology</i> 217 (1), pp. 133–151. DOI: 10.1016/0022-2836(91)90617-F.</li>
<li id="Osoegawa">Osoegawa, K.; Jong, P. J. de; Frengen, E.; Ioannou, P. A. (2001): Construction of bacterial artificial chromosome (BAC/PAC) libraries. In: <i>Current Protocols in Molecular Biology</i> Chapter 5, Unit 5.9. DOI: 10.1002/0471142727.mb0509s55.</li>
<li id="Zemlin">Zemlin, Michael; Klinger, Martin; Link, Jason; Zemlin, Cosima; Bauer, Karl; Engler, Jeffrey A. et al. (2003): Expressed Murine and Human CDR-H3 Intervals of Equal Length Exhibit Distinct Repertoires that Differ in their Amino Acid Composition and Predicted Range of Structures. In: <i>Journal of Molecular Biology</i> 334 (4), pp. 733–749. DOI: 10.1016/j.jmb.2003.10.007.</li>
</ul>
</div>

Latest revision as of 03:27, 20 October 2016



Library Project

Design and Construction

Library Construction Area


As the starting point of our directed evolution system for binding proteins, we needed a large number of partly randomized Monobodies and Nanobodies from which our Evobodies could emerge. An Evobody is a binding protein resulting from the combination of a scaffold, be it Monobody or Nanobody, with a continuous mutation and selection system, and thus represents the fusion of a semi-rational design approach with evolution.
We designed two libraries, one for Monobodies and one for Nanobodies. A library is a collection of plasmids that are identical except for their protein coding sequences (CDS). These CDS are designed to contain deliberately randomized subregions with high variability. The result is a wide variety of plasmids with different inserts. One major advantage of a library is that it provides a wide range of different binding proteins as starting material for the evolution process, which ensures that candidates with limited but evolvable affinity are present from the outset. Transformation into E. coli creates a heterogeneous culture, and after plating, each colony carries a different insert encoding a different binding protein (Osoegawa et al., 2001).
By using defined mixtures of bases at specific positions during oligonucleotide synthesis, we obtain degenerate codons that encode a defined mix of amino acids in the variable regions of our scaffolds. This gives a theoretical diversity of 1,073,741,824 protein sequences for each library.
Since the practical diversity of our library is limited by the transformation efficiency of E. coli, the theoretical library size had to be chosen carefully to maximize the binding potential while avoiding dilution in unproductive (i.e. non-folding or non-binding) sequence space. Furthermore, thanks to our evolution system, only a weak initial affinity is required, which can then mature to high affinity. We decided to optimize the library by using a restricted yet proven set of codons, and thus of amino acids, for the complementarity-determining regions (CDRs). As a realistic library size, we opted for a diversity of about one billion variants. To construct the coding sequence, we used the IUPAC ambiguity codes when ordering our oligonucleotides from a vendor that supports equimolar mixing of nucleotides.
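The trade-off between theoretical library size and the number of transformants can be illustrated with a simple sampling model. The following is only a sketch: it assumes that every variant is equally likely and that each transformant is an independent draw, and the transformant counts are hypothetical values, not measurements from our experiments. The function name `expected_distinct_variants` is illustrative.

```python
import math

def expected_distinct_variants(library_size, transformants):
    """Expected number of distinct variants among independent, uniform draws.

    Computes D * (1 - (1 - 1/D)**N) in a numerically stable way.
    """
    log_miss = math.log1p(-1.0 / library_size)  # log of (1 - 1/D)
    return -library_size * math.expm1(transformants * log_miss)

THEORETICAL_SIZE = 2 ** 30  # 1,073,741,824 variants per library

if __name__ == "__main__":
    # Hypothetical numbers of transformants (assumptions for illustration only).
    for n in (10 ** 6, 10 ** 8, 10 ** 9):
        covered = expected_distinct_variants(THEORETICAL_SIZE, n)
        print(f"{n:>13,d} transformants: {covered / THEORETICAL_SIZE:.2%} of the library")
```

Under this model, coverage saturates only slowly once the number of transformants approaches the theoretical size, which is why a much larger theoretical library would mostly remain unsampled.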

First of all, our library design avoids stop codons to guarantee synthesis of the complete binding proteins. By excluding cysteine-encoding triplets, we ensured the absence of disulfide bonds and compatibility with both the intracellular and the extracellular environment. Furthermore, we preferred amino acids that are beneficial for high binding affinity. Evolution has optimized natural proteins for specific biological functions, and proteins with similar tasks and features also show similar structures and contain related amino acids. Accordingly, the binding hot spots of protein-protein interactions are often enriched in tyrosine, tryptophan and arginine (Bogan and Thorn, 1998). Tyrosine in particular is overrepresented in the high-affinity CDRs of functional antibodies: about 40 % of their sequence consists of tyrosine, and another ~30 % is made up of small amino acids such as serine, glycine, alanine and threonine (Mian et al., 1991; Zemlin et al., 2003). Studies have also shown the importance of tyrosine in the CDRs of synthetic binding proteins (Fellouse et al., 2004). In synthetic libraries of binding proteins, the tyrosine side chain makes most of the contacts necessary for high-affinity antigen recognition. First, its large size allows it to fill large volumes with only a few torsion angles (Fellouse et al., 2007), while smaller amino acids provide the necessary space and conformational flexibility (Koide and Sidhu, 2009). The size of tyrosine also allows many van der Waals contacts, and its -OH group provides electrostatic interactions for binding (Mian et al., 1991). The amphipathic nature of tyrosine is also helpful in the varying hydrophobic and hydrophilic environments of antibody-antigen complexes (Mian et al., 1991).
In contrast to other amino acids that confer high affinity, tyrosine is not particularly flexible. This rigidity is a further benefit of tyrosine in binding proteins, because it contributes to high specificity (Koide and Sidhu, 2009). Besides high affinity, specificity is another important requirement for a good Evobody.

On these grounds, we designed the randomized codons TMY, KMY, RMR, YWY, NWY and WMY.

Table 1 lists the IUPAC ambiguity codes used in the randomized codons and the bases they represent:
Degenerate base designation    Bases encoded
M                              A/C
Y                              C/T
K                              G/T
W                              A/T
R                              A/G
N                              A/C/G/T


Table 2: Randomized codons used and the resulting amino acids:

Designed randomized codon    Amino acids encoded
TMY                          Tyrosine, Serine
KMY                          Tyrosine, Serine, Alanine, Aspartic Acid
WMY                          Tyrosine, Serine, Threonine, Asparagine
RMR                          Threonine, Alanine, Lysine, Glutamic Acid
YWY                          Phenylalanine, Leucine, Tyrosine, Histidine
NWY                          Phenylalanine, Leucine, Isoleucine, Valine, Tyrosine, Histidine, Asparagine, Aspartic Acid

In total, this gives a theoretical diversity of 1,073,741,824 different molecules for each of the two libraries, Monobodies and Nanobodies.
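The amino acid sets of the degenerate codons and the resulting library size can be re-derived computationally. The following is a minimal sketch, assuming the standard genetic code and reading the randomized Nanobody oligonucleotide NB-F2-1 (sequence taken from the oligonucleotide table further below) in frame from its first base. The names `IUPAC`, `CODON_TABLE`, `encoded_amino_acids` and `theoretical_diversity` are illustrative and not part of our project code.

```python
from itertools import product
from math import prod

# IUPAC nucleotide codes needed for the randomized codons, plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "Y": "CT", "K": "GT", "W": "AT", "R": "AG", "N": "ACGT",
}

# Standard genetic code in the usual TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def encoded_amino_acids(degenerate_codon):
    """Return the set of one-letter amino acids a (degenerate) codon can encode."""
    choices = [IUPAC[base] for base in degenerate_codon]
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}

def theoretical_diversity(oligo, frame=0):
    """Multiply the number of possible amino acids over all in-frame codons."""
    codons = [oligo[i:i + 3] for i in range(frame, len(oligo) - 2, 3)]
    return prod(len(encoded_amino_acids(codon)) for codon in codons)

# Randomized Nanobody oligonucleotide NB-F2-1 (102 nt, coding strand).
NB_F2_1 = (
    "GACACCGCTATCTACTACTGCGCTGCT"
    "WMYKMYWMYKMYWMYTMYKMYWMYKMYTMYWMY"
    "KMYWMYKMYWMYKMY"
    "TGGGGTCAGGGTACGCAGGTTACCGTT"
)

if __name__ == "__main__":
    for codon in ("TMY", "KMY", "WMY", "RMR", "YWY", "NWY"):
        amino_acids = encoded_amino_acids(codon)
        # The randomized codons must never yield a stop codon or cysteine.
        assert "*" not in amino_acids and "C" not in amino_acids
        print(codon, "".join(sorted(amino_acids)))
    print("NB-F2-1 diversity:", theoretical_diversity(NB_F2_1))
```

NB-F2-1 contains 14 randomized codons with four amino acids each and two TMY codons with two amino acids each, so the product is 4^14 x 2^2 = 2^30, which matches the stated library size.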

Implementation of the Evobody Libraries


The standard plasmid pSB1K3 was extended with the required parts of the selection system. The respective binding protein was then fused to RpoZ, the omega subunit of RNA polymerase, via a c-Myc linker to enable selection for good target affinity. As a first step, the constant regions of our binding protein scaffolds (Monobody and Nanobody) were ordered as synthesized IDT G-Blocks® and inserted into the vector via Gibson assembly.
After finding the optimal randomized codon scheme for Monobodies and Nanobodies, we had to decide how to generate the library. Most gene synthesis companies do not offer long gene fragments with partly randomized sequences, and where they do, synthesizing a whole library is prohibitively expensive (about 5-10 k€) and time-consuming. Therefore, we decided to perform the gene synthesis of the library ourselves and ordered a small number of long, partly randomized oligonucleotides (from Metabion).
The ordered oligonucleotides had complementary overlaps for annealing. Double-stranded fragments were generated in a single touch-down fill-in reaction to achieve and maintain maximum diversity. The double-stranded library fragments were then cloned into the library vector by Gibson assembly, using the overlaps with the vector that had been incorporated into the fragments.

Monobodies:

Figure 1: Overview of the Monobody construction.
Constant regions are colored green, variable regions are colored blue. Lowercase letters denote the oligonucleotides used for library assembly.


Variable Monobody Oligonucleotides:
Label   Oligonucleotide   Length [bp]   Sequence
a       MB-V1-1           60            TCTTGGGACGCTCCGGCTGTTACCGTTNWYYWYTACNWYATTACTTATGGCGAGACTGGC
b       MB-V1-2           75            GGTAGCGGTAGATTTAGAACCCGGAACYKYGAAYKYCTGRKARKARKMRKARKAGCCAGTCTCGCCATAAGTAAT
c       MB-V2-1           75            GTTCCGGGTTCTAAATCTACCGCTACTATCTCTGGTCTGTCTCCGGGTGTTGACTATACCATCACCGTTTACGCT
d       MB-V2-2           80            GGTACGGTAGTTGATAGAGATCGGAGARKARKARKARKMRKARKARKMRKARKARKAAGCGTAAACGGTGATGGTATAGT

To create our Monobody library, we first constructed a basic framework (BBa_K2082004) composed of the Monobody constant regions with an RFP inserted as a placeholder for the variable regions. For this, we ordered the framework as a synthesized IDT gBlocks® fragment and assembled it by Gibson assembly. Replacing the RFP in our submitted part with the randomized variable regions yields a complete Monobody, and the loss of the red color allows an easy visual control of the exchange. To insert the variable regions, we amplified the backbone, including the Monobody constant regions, the c-Myc linker and rpoZ, using the primers MB-bb-fw and MB-bb-rev. The single-stranded oligonucleotides V1-1 (a) and V1-2 (b) were annealed and filled in to give the double-stranded fragment V1; likewise, V2-1 (c) and V2-2 (d) were annealed and filled in to give V2. A Gibson assembly then combined V1, V2 and the backbone.
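
The annealing step relies on the 3' end of each forward oligonucleotide being the reverse complement of the 3' end of its partner. As a rough illustration, the following sketch checks this for the pair MB-V1-1 and MB-V1-2 from the table above. The helper names are hypothetical, and degenerate bases are compared as literal symbols, which is sufficient here because the overlaps lie in constant regions.

```python
# Complement table covering the four bases and the IUPAC codes used in the oligonucleotides
COMPLEMENT = str.maketrans("ACGTMKRYWN", "TGCAKMYRWN")

MB_V1_1 = "TCTTGGGACGCTCCGGCTGTTACCGTTNWYYWYTACNWYATTACTTATGGCGAGACTGGC"
MB_V1_2 = ("GGTAGCGGTAGATTTAGAACCCGGAACYKYGAAYKYCTGRKARKARKMRKARKA"
           "GCCAGTCTCGCCATAAGTAAT")


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain degenerate bases."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_overlap(forward, reverse):
    """Length of the longest 3' end of `forward` that pairs with the 3' end of `reverse`."""
    rc = reverse_complement(reverse)
    for n in range(min(len(forward), len(rc)), 0, -1):
        if forward[-n:] == rc[:n]:
            return n
    return 0


n = annealing_overlap(MB_V1_1, MB_V1_2)
print(n, MB_V1_1[-n:])  # 21 ATTACTTATGGCGAGACTGGC
```

The same function also shows why the reverse-strand oligonucleotides contain codons such as RKA: `reverse_complement("RKA")` gives TMY, the scheme listed in Table 2.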

Nanobodies:



Figure 2: Overview of the Nanobody construction.
Constant regions are colored green, variable regions are colored blue. Lowercase letters denote the oligonucleotides used for library assembly.
Variable Nanobody Oligonucleotides:
Label   Oligonucleotide   Length [bp]   Sequence
a       NB-F2-1           102           GACACCGCTATCTACTACTGCGCTGCTWMYKMYWMYKMYWMYTMYKMYWMYKMYTMYWMYKMYWMYKMYWMYKMYTGGGGTCAGGGTACGCAGGTTACCGTT
b       NB-F2-2           65            CTGTCAGGGGCGGGGTTTTTTTTTCTCTAGTAAGAAGAAACGGTAACCTGCGTACCCTGACCCCA

Similar to the Monobodies, we built a universal framework for Nanobodies containing rpoZ, the c-Myc linker and a constant Nanobody segment (BBa_K2082001). In contrast to the Monobody CDS, we designed only one variable insert here, because in this scaffold we randomized only the third complementarity-determining region (CDR3), which is known to be the main binding CDR (Krebber et al., 1997). Amplifying BBa_K2082001 with the primers NB-bb-fw and NB-bb-rev yields a backbone into which the annealed and filled-in variable fragment F2 = F2-1 (a) + F2-2 (b) was inserted by Gibson assembly.
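
To see how the randomized CDR3 is laid out within NB-F2-1, the oligonucleotide can be split into codons and the degenerate ones picked out. This is a minimal sketch that assumes the oligonucleotide starts in frame (frame 0), an assumption suggested by the way the degenerate triplets line up; `split_codons` is a hypothetical helper.

```python
from collections import Counter

NB_F2_1 = ("GACACCGCTATCTACTACTGCGCTGCTWMYKMYWMYKMYWMYTMYKMYWMYKMYTMYWMY"
           "KMYWMYKMYWMYKMYTGGGGTCAGGGTACGCAGGTTACCGTT")

# Degenerate IUPAC symbols used in our randomized codons
DEGENERATE = set("MYKWRN")


def split_codons(seq, frame=0):
    """Split a sequence into complete codons, starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


codons = split_codons(NB_F2_1)
# Keep (position, codon) for every codon that contains a degenerate base
randomized = [(i, c) for i, c in enumerate(codons) if DEGENERATE & set(c)]

print(len(codons), "codons in total,", len(randomized), "randomized")
print(Counter(c for _, c in randomized))
```

Under this reading, the randomized codons form one contiguous block flanked by constant codons on both sides, which is what allows the constant 3' end to serve as the annealing overlap with NB-F2-2.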

Proof of Concept


The core of our system is the combination of the library with a mutation system and a selection system, which together optimize binding proteins into Evobodies. To enter this directed evolution process, some binding proteins must show at least a marginal affinity for the chosen target protein. Low-affinity binders are then mutated and selected in vivo during plasmid replication and cell growth under rising selective pressure. To make sure that the designed binding proteins work and that the initial libraries contain some useful Monobodies and Nanobodies, we performed a phagemid display using the phagemid pAK100 and M13-derived helper phages.

References

  • Bogan, A. A.; Thorn, K. S. (1998): Anatomy of hot spots in protein interfaces. In: Journal of Molecular Biology 280 (1), pp. 1–9. DOI: 10.1006/jmbi.1998.1843.
  • Fellouse, Frederic A.; Wiesmann, Christian; Sidhu, Sachdev S. (2004): Synthetic antibodies from a four-amino-acid code: a dominant role for tyrosine in antigen recognition. In: Proceedings of the National Academy of Sciences of the United States of America 101 (34), pp. 12467–12472. DOI: 10.1073/pnas.0401786101.
  • Fellouse, Frederic A.; Esaki, Kaori; Birtalan, Sara; Raptis, Demetrios; Cancasci, Vincenzo J.; Koide, Akiko et al. (2007): High-throughput generation of synthetic antibodies from highly functional minimalist phage-displayed libraries. In: Journal of Molecular Biology 373 (4), pp. 924–940. DOI: 10.1016/j.jmb.2007.08.005.
  • Koide, Shohei; Sidhu, Sachdev S. (2009): The importance of being tyrosine: lessons in molecular recognition from minimalist synthetic binding proteins. In: ACS Chemical Biology 4 (5), pp. 325–334. DOI: 10.1021/cb800314v.
  • Krebber, Anke; Bornhauser, Susanne; Burmester, Jörg; Honegger, Annemarie; Willuda, Jörg; Bosshard, Hans Rudolf; Plückthun, Andreas (1997): Reliable cloning of functional antibody variable domains from hybridomas and spleen cell repertoires employing a reengineered phage display system. In: Journal of Immunological Methods 201 (1), pp. 35–55. DOI: 10.1016/S0022-1759(96)00208-6.
  • Mian, I. Saira; Bradwell, Arthur R.; Olson, Arthur J. (1991): Structure, function and properties of antibody binding sites. In: Journal of Molecular Biology 217 (1), pp. 133–151. DOI: 10.1016/0022-2836(91)90617-F.
  • Osoegawa, K.; Jong, P. J. de; Frengen, E.; Ioannou, P. A. (2001): Construction of bacterial artificial chromosome (BAC/PAC) libraries. In: Current Protocols in Molecular Biology, Chapter 5, Unit 5.9. DOI: 10.1002/0471142727.mb0509s55.
  • Zemlin, Michael; Klinger, Martin; Link, Jason; Zemlin, Cosima; Bauer, Karl; Engler, Jeffrey A. et al. (2003): Expressed Murine and Human CDR-H3 Intervals of Equal Length Exhibit Distinct Repertoires that Differ in their Amino Acid Composition and Predicted Range of Structures. In: Journal of Molecular Biology 334 (4), pp. 733–749. DOI: 10.1016/j.jmb.2003.10.007.