Team:Stanford-Brown/SB16 BioMembrane Collagen

Stanford-Brown 2016

BioMembranes team member Charlie introduces the collagen and elastin subprojects

Making a Collagen Mimetic

Human Collagen

Human collagen is a complex protein to produce recombinantly. The tight left-handed helix structure is facilitated by numerous hydroxyproline residues,1 which must be produced by post-translational modification. The enzyme responsible has a complex quaternary structure,2 and its function has been demonstrated natively in only a single, relatively uncommon yeast species, Ogataea polymorpha (also known as Hansenula polymorpha).3 The traditional collagen motif is Gly-X-Y, with X often representing proline and Y hydroxyproline; these relative locations in the helix are essential for their contributions to its formation.4 Even working in H. polymorpha, which in addition to not being cultured in most labs has finicky codon preferences,5 there are a number of problems.
Collagen genes themselves are ill-suited to genetic manipulation because glycine and proline codons (GGN and CCN, respectively) make up such a large proportion of the sequence. The resulting high GC content and potential self-complementarity raise additional concerns on top of the general problems of repetitive sequences. Even after transformation and successful translation, human collagen undergoes post-translational glycosylation and cleavage before forming a number of different heterotrimers,1 with cross-links depending on the specific positioning of multiple different post-translationally modified amino acids.6 All this is not to say that recombinant synthesis of human collagen is impossible; it has in fact been done. However, replicating that work would have been quite difficult in the time available and would not have represented substantial innovation. Instead, we looked to possible mimetics.
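To make the GC-content and self-complementarity concerns concrete, here is a minimal Python sketch. It computes the lowest possible GC fraction of any glycine (GGN) or proline (CCN) codon and checks a Gly-Pro codon pair for self-complementarity. The example fragment and its codon choices are hypothetical and are not taken from any of our constructs.

```python
def gc_fraction(dna: str) -> float:
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


# Glycine is GGN and proline is CCN, so the first two bases are always G or C.
GLY_CODONS = ["GG" + n for n in "ACGT"]
PRO_CODONS = ["CC" + n for n in "ACGT"]
min_gc = min(gc_fraction(codon) for codon in GLY_CODONS + PRO_CODONS)

# Hypothetical fragment encoding (Gly-Pro-Pro) x 4; codon choice is an assumption.
example_fragment = "GGTCCGCCG" * 4
example_gc = gc_fraction(example_fragment)

# A Gly codon followed by a Pro codon can be its own reverse complement.
gly_pro_pair = "GGGCCC"
is_self_complementary = reverse_complement(gly_pro_pair) == gly_pro_pair

print(f"minimum GC fraction per Gly/Pro codon: {min_gc:.2f}")
print(f"GC fraction of example fragment: {example_gc:.2f}")
print(f"GGGCCC is self-complementary: {is_self_complementary}")
```

Every Gly or Pro codon is at least two-thirds GC, so a sequence dominated by these residues cannot be brought to a moderate GC content by codon choice alone.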

Bacterial Collagen

Fortunately for what was rapidly becoming an Escherichia coli-centric team, there exists a whole class of proteins known as "bacterial collagens". These proteins have the same supersecondary structure and Gly-X-Y repeat as human collagen without relying on hydroxyproline in the Y position,7 making these sequences easier to manipulate and bacterial synthesis possible. Unfortunately, despite the suggestive moniker, bacterial collagen has not been well adapted to provide structural integrity to large regions of extracellular matrix in areas of regular high stress such as arterial walls and skin. As such, it has no natural mechanism to form a network of long, cross-linked fibers, and there is a dearth of data on how such a matrix might perform. Still, it might prove a viable collagen mimetic given successful assembly of a fibrous matrix. Returning to human collagen mimetics for a moment, the length of native collagen has been one factor that makes production difficult, and there has been some interesting work done with relatively short peptides working around this difficulty.
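The Gly-X-Y repeat shared by human and bacterial collagen is simple enough to check programmatically. The sketch below tests whether a one-letter peptide sequence has glycine at every third position; the example sequences are hypothetical and chosen only for illustration.

```python
def is_gly_x_y(peptide: str) -> bool:
    """True if the peptide is a whole number of triplets, each starting with glycine (G)."""
    peptide = peptide.upper()
    if len(peptide) == 0 or len(peptide) % 3 != 0:
        return False
    return all(peptide[i] == "G" for i in range(0, len(peptide), 3))


# Hypothetical sequences: the first keeps the repeat, the second breaks it
# with an extra residue after the second glycine.
in_register = "GPAGPQGPR"
out_of_register = "GPAGAPQGP"

print(is_gly_x_y(in_register))
print(is_gly_x_y(out_of_register))
```

Note that this check says nothing about hydroxyproline, which cannot be read from a plain one-letter sequence; it only verifies the glycine spacing.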

Peptide Tessellation

Some recent work has been done to create a potentially infinitely long self-assembling collagen-like triple helix, made possible by the design and synthesis of short peptides which assemble in a staggered fashion into a triple helix with sticky ends, permitting the continued addition of the peptide monomer indefinitely into a long fiber. Sticky ends are most familiar in DNA applications such as restriction cloning, but sticky-ended DNA fragments can also be created by combining two DNA single strands (as in the work of this team on melanin binding) due to the specificity of base pair interactions. Tanrikulu et al. used lysine/aspartate charge pairs to similar effect, forcing a staggered assembly.8 Cejas et al. chose to use short 30-residue length triple helices which would interact and form longer chains through stacking interactions of aromatic terminal residues, and lateral assembly into wider fibrils potentially by the same mechanism.9 These are just two examples of what is a fairly diverse field of collagen-mimetic peptide exploration,10 but many share a dependence on solid-phase peptide synthesis.

A transition to recombinant synthesis in an organism is difficult. These short, non-native structures would not necessarily be post-translationally modified correctly and would be difficult to purify: even short purification tags cause major disruption of the structure, and even an initial formyl-methionine is potentially destructive. For the same reason, they are difficult to modify further. While the work of a number of authors has shown the potential for peptide-based models to interact with human cells,9 the lack of any cross-linking means that they may not be stable under physical stress, an essential feature of any membrane material. Tanrikulu and Raines have developed a cysteine-homocysteine system for inter-strand disulfide cross-links,11 but this too is best suited to solid-phase peptide synthesis. Our approach works within a bacterial production system.
Figure 1: One of the collagen-mimetic peptides used in the Cejas et al. paper. The aromatic phenylalanine residues at both ends control aggregation of blunt-ended assemblies.

Experimental design

Collagen, Keratin, and Elastin

The bacterially produced collagen mimetic we designed draws from a few additional sources, wading into the uncertain waters of rational design in the process. Yoshizumi et al. were able to demonstrate the utility of a designed coiled-coil homotrimerization domain in assisting the folding of a bacterial collagen domain.12 Coiled-coil domains, the most famous example of which forms the fundamental dimer structure of keratin, are defined by a heptad repeat structure and can govern the association of anywhere between two and four proteins fairly easily;13 the example here was designed to form homotrimers which retained their structure above 90 °C.12 The authors evaluate the effect of different locations of this domain on the refolding of a single bacterial collagen domain, concluding that a single coiled-coil domain at the N-terminus is optimal for the most rapid and complete refolding.12 We created a close analog to this construct (modified for convenient PCR amplification) with a purification tag and submitted it as BBa_K2027007.
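As an illustration of the heptad repeat that defines a coiled coil, the following sketch labels each residue of a sequence with its heptad position (a through g) and pulls out the a and d positions, which conventionally form the hydrophobic core. The sequence used here is a hypothetical idealized heptad, not the domain from Yoshizumi et al., and the starting register is an assumption supplied by the caller.

```python
HEPTAD = "abcdefg"


def heptad_register(sequence, start="a"):
    """Pair each residue with its heptad position, beginning at position `start`."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_residues(sequence, start="a"):
    """Return the residues that fall on the a and d positions of the heptad."""
    return "".join(res for res, pos in heptad_register(sequence, start) if pos in "ad")


# Hypothetical two-heptad sequence with hydrophobic residues at a and d.
example = "IAALKQE" * 2

print(heptad_register(example)[:7])
print(core_residues(example))
```

In practice the register of a real sequence has to be determined from the sequence itself or from structural data; this sketch only shows the bookkeeping once a register is chosen.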

This general approach might be extended to different coiled-coil and target proteins; in particular, the potential applications of an effective heterotrimerization domain are even more expansive. A candidate coiled-coil obligate heterotrimer with a defined orientation was designed in 1995 by Nautiyal et al., with all homotrimers and two-component heterotrimers made unfavorable by charge pair interactions.13 Full heterotrimers have a higher melting point (apparent Tm = 87.5 °C) than the next-most stable arrangement by at least 15 °C,13 potentially allowing thermal cycling to increase specificity. A first step toward using this domain would be attempting to replace the coiled-coil in Yoshizumi et al.'s device with the coiled-coil domains making up the heterotrimer; bricks BBa_K2027044, BBa_K2027037, and BBa_K2027038 represent these constructs. Observing under what conditions, if any, specificity is maintained in the presence of a weaker homotrimerization domain (the bacterial collagen) could give some indication of the utility of these domains in organizing proteins. Collagen-like triple helices generally have melting temperatures below 40 °C, so the coiled-coil domains have a clear thermodynamic advantage and at least one has been shown to nucleate trimerization of an unfolded bacterial collagen trimer, but the kinetics of initial formation of trimers was not studied and in fact could not have been studied in the setting of the first construct. Instead, the bacterial collagen domain was unfolded while leaving the coiled-coil trimers intact;12 it remains possible that at room temperature, interaction of bacterial collagen domains dominates the trimerization process. The relative stability of the heterotrimer will allow for formation of the desired heterotrimeric construct at an elevated temperature regardless of room temperature kinetics, but it would be interesting to study the kinetics nonetheless. 
Still, all constructs so far are short, blunt-ended fibers incapable of assembling into a matrix. However, the modular nature of this assembly suggests the possibility of large-scale cross-linking in a way that peptide tessellation models might have more difficulty supporting.
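The thermal-cycling idea discussed above can be sketched as simple arithmetic on the reported melting temperatures. The code below derives a temperature window in which the full heterotrimer is below its apparent Tm while every competing assembly is at or above its own. It treats Tm as a sharp threshold, which is a deliberate simplification: real melting transitions are gradual and, for trimers, concentration dependent. The function names are hypothetical.

```python
HETEROTRIMER_TM_C = 87.5   # apparent Tm reported for the full heterotrimer
MIN_TM_GAP_C = 15.0        # competing assemblies melt at least this much lower
COLLAGEN_TM_MAX_C = 40.0   # collagen-like triple helices generally melt below this


def selective_window(target_tm, min_gap):
    """Return (low, high): above the highest possible competitor Tm, below the target Tm."""
    return (target_tm - min_gap, target_tm)


def in_window(temp_c, window):
    """True if temp_c lies strictly inside the (low, high) window."""
    low, high = window
    return low < temp_c < high


window = selective_window(HETEROTRIMER_TM_C, MIN_TM_GAP_C)

print(window)
print(in_window(80.0, window))
print(in_window(COLLAGEN_TM_MAX_C, window))
```

Because the 15 °C gap is a lower bound, the true window may be wider than the one computed here.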

Cross-linking is essential for proper functioning of collagen in a structural context; covalent linkages ensure not just thermal but also mechanical stability of quaternary protein associations. Bacterial collagen offers no help here; human collagen is replete with intra- and inter-strand cross-links, but they depend on precise positioning and the action of multiple mammalian enzymes.6 Fortunately, the efficacy of a simple protein fiber cross-linking module has already been demonstrated: exons 21 and 23 from human tropoelastin have been shown to cross-link coacervated elastin-like monomers in the presence of a lysine-oxidizing agent.14 The single relevant enzyme is not even necessary; as is shown in the PQQ aptamer purification project, it might soon be enough to purchase a cofactor supplement at a local drug store and add copper sulfate. To explore this, another set of constructs was created with this cross-linking domain at the N-terminus. The relevant bricks are BBa_K2027003, BBa_K2027004, BBa_K2027005, and BBa_K2027006. These constructs need to be studied to evaluate the effect of the addition on trimerization and the potential for cross-linking. Even if cross-linking is successful, these would still be the short, blunt fibers shown in Figure 2, but governance of their assembly by formation of a heterotrimer opens the door to sticky end use.

Parts BBa_K2027002 and BBa_K2027045 are the conjugations of the three heterotrimer parts (with and without cross-linking domains, respectively). Part BBa_K2027002 is represented in both detailed and simplified schematics in Figure 3. Due to the directionality and specificity of trimer formation, the preferred assembly has sticky ends, leaving open the possibility of substantially longer fibers such as the one shown in simplified form in Figure 4.
Figure 2: A symbolic representation of the short blunt-ended assemblies referenced, with the bacterial collagen portion much shorter than scale.
Figure 3: Simple head-to-tail assembly of the three components allows for specific assembly of sticky-ended triple helices, which is easier to visualize with the much-reduced representation shown at the bottom.
Figure 4: Staggered assembly of the full construct allows for the formation of potentially infinitely long fibers.


We took the collagen constructs (mentioned in the previous section) and transformed them into E. coli. We picked successfully transformed colonies and ran colony polymerase chain reaction (CPCR) using the primers VF2 and VR. We ran these PCR products on agarose gels to check whether the plasmids contained appropriately sized DNA. After sequencing them to confirm that they had the correct domain constructs, we wanted to induce the cells to produce the collagen proteins in order to characterize them. In addition, we wanted to produce constructs without the elastin cross-linking domain. We removed this domain through SapI digestion of the DNA constructs and transformed the products into cells. We once again confirmed the success of the transformation with the digested constructs through gel electrophoresis and sequence verification. Although the verified sequences closely matched our desired constructs, it took three digestion attempts to fully digest the constructs and obtain cells carrying BBa_K2027044, BBa_K2027037, and BBa_K2027038. Although the digestion was ultimately successful, the number of attempts suggests that our SapI digestion with Golden Gate Assembly protocol could be made considerably more efficient.
Figure 5: Gel electrophoresis results of CPCR of colonies transformed with digested constructs. There are 3 replicates picked from S1, S2, and S3 plates.
Note that in our project and in the following figures, constructs that contain elastin cross-linking domains are denoted with a C and a number. For instance, heterotrimeric coiled-coil domain 1 is C1. Constructs without the elastin cross-linking domains are denoted in the same style using the letter S instead.

Figure 5 displays the CPCR results for colonies picked after transformation. We expected digested constructs to be around 900 base pairs (bp). We used the gel to exclude picked colonies that did not contain DNA of the correct size (such as S1.3). This was necessary, considering the inefficiency of the digestion protocol, so that we would only sequence promising colonies.
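The size-based screen described above amounts to keeping only colonies whose band falls near the expected 900 bp. A minimal sketch follows; the band sizes and the 15% tolerance are hypothetical values chosen for illustration, not measurements read from our gel.

```python
EXPECTED_BP = 900  # expected size of a correctly digested construct


def passes_size_screen(observed_bp, expected_bp=EXPECTED_BP, tolerance=0.15):
    """True if the observed band is within a fractional tolerance of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


# Hypothetical band size estimates (bp) for three picked colonies.
bands = {"S1.1": 900, "S1.2": 950, "S1.3": 1400}

to_sequence = sorted(name for name, bp in bands.items() if passes_size_screen(bp))
print(to_sequence)
```

A band estimate from an agarose gel is coarse, so a screen like this only narrows the candidates; sequencing remains the actual confirmation.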
After inducing our liquid cultures, we extracted the proteins expressed from our constructs. We kept all wash fractions to ensure that our desired protein product had not eluted earlier than expected (we expected the protein to be in the final elution buffer washes). After confirming its presence in the final elution washes (shown in Figure 6), we ran the extracts against each other. To verify the size of the protein, we ran our protein extractions on SDS-PAGE gels. We mostly used Ni columns to purify our protein extract. Since our construct has a Lumio-Flag-His tag, we also used Anti-FLAG magnetic beads, with the help of our mentor Kosuke, to examine which protocol was most efficient in extracting protein.
Figure 6: SDS-PAGE gel of all washes of extracted proteins using Ni-NTA resin columns and Lumio staining. AF stands for Anti-FLAG, and represents samples that were extracted using Anti-FLAG magnetic beads.
Figure 7: SDS-PAGE gel of extracted C2 and CN proteins (lanes 2-5 and lanes 6-8 respectively) using Lumio stain
In Figure 7, lanes 2-5 contain the Lumio-stained protein extract of BioBrick BBa_K2027005, or construct 2 (C2). Lanes 2, 3, and 4 are, respectively, the first, second, and third elution buffer washes from the His-tag nickel resin columns. Lane 5 is also C2 protein, but it was extracted using a different protocol, Anti-FLAG magnetic beads. Lanes 6-8 contain the corresponding elution washes of the natural construct (CN), BBa_K2027007. CN was extracted using His-tag nickel resin columns.
Figure 8 shows an SDS-PAGE gel containing the protein extracts of BBa_K2027003 (natural construct without the elastin cross-linking domain, or SN) in lanes 2-4 and BBa_K2027002 in lanes 5-7. BBa_K2027002 is the fully assembled heterotrimeric coiled-coil collagen fiber monomer, consisting of the domains BBa_K2027004, BBa_K2027005, and BBa_K2027006. Because the fully assembled construct contains three domains, we expected the protein produced to be at least three times larger than the proteins from the single-domain constructs. However, as indicated by the gel in Figure 8, BBa_K2027002 did not produce a protein of this size, which suggests that the assembly of all parts may have introduced errors affecting protein synthesis.
Figure 8: SDS-PAGE gel of extracted SN and full construct proteins (lanes 2-4 and lanes 5-7 respectively) using Lumio stain
We also explored the possibility of cross-linking the proteins. Our plan was to cross-link the proteins and compare the results with un-processed protein in an SDS-PAGE. Initially, we wanted to test small samples of cross-linked C1, C2, and C3 and S1, S2, and S3 against non-cross-linked samples. We expected cross-linked proteins to have a higher molecular weight than proteins that weren't. Since this was our first time attempting to cross-link proteins, we wanted to test our protocol on only one sample before proceeding on a large-scale cross-linking experiment.

In this experiment, we used pure pyrroloquinoline quinone (PQQ) as a catalyst for the cross-linking reaction with sample construct CN. PQQ was diluted in a copper(II) solution and mixed with the purified protein. Since the purified protein is in a phosphate buffer (the extract was purified on the Ni-NTA resin columns), we monitored the pH of the reaction solution and kept it close to pH 7 while adding excess copper, because the precipitation of copper(II) phosphate increases the acidity of the buffer. When the solution became basic, we found that it was mostly clear and the precipitate was a light green-blue color, likely copper hydroxide. When the solution was near pH 7 and borderline acidic, it turned a clear blue and produced a fine blue copper(II) phosphate precipitate, but copper remained in solution. We incubated the two different sample solutions at room temperature for 24 hours, after which the solutions had separated into two layers. We were not sure whether the cross-linked protein would be in the precipitate-like layer or the supernatant, so we ran both against the original untreated CN on a gel.
Figure 9: Tubes containing the cross-linking reaction solution. The left tube is at pH 8 and the right tube is at pH 6.8.
Figure 10: SDS-PAGE with Lumio stain of non-cross-linked and cross-linked CN protein. Lane 2 is the untreated CN protein. Lanes 3 and 4 are the pH 6.8 tube's precipitate and supernatant, respectively. Lanes 5 and 6 are the pH 8 tube's precipitate and supernatant, respectively.
The SDS-PAGE results indicated that cross-linking did not occur. The gel is difficult to read because the protein may have run off early. In this experiment we used a different gel percentage, 8%, because we had run out of our usual gel type. This was our third attempt at running this SDS-PAGE gel, as we had trouble determining the optimal run conditions. Even though the gel was run at 200 V for 10 minutes, the samples were mostly at the bottom, which may have contributed to the messy result. Due to time constraints and material limitations, we could not run more gels to further optimize our protocol. Despite the difficulty in reading the gel, we inferred that cross-linking failed from the location of the untreated CN band: the bands in the other lanes were at approximately the same position, suggesting that the treated proteins did not change in size. We found it peculiar that lanes 3 and 4 had additional bands of lower molecular weight, and we are unsure what they are.


Ultimately, this work still needs a much greater degree of characterization. A successful fiber is always going to be a copolymer with non-collagen regions, and it may have properties quite different from collagen in its current formulation. However, this method of assembly might also be used to form human collagen-based heterotrimers in a eukaryotic production organism, not simply bacterial collagen homotrimers. This kind of modular design seems both more tenable and more generally applicable than any de novo work. After all, conjugating GFP or a cellulose-binding domain to a protein is an example of an attempt at rational modular design of a bifunctional protein construct; while such attempts are far from universally successful, a great many of them have been, including our own modified chromoproteins. It is only natural for this approach to expand in scope and refine associated techniques, and we certainly hope to continue to build on this work specifically.


DNA for the four single bacterial collagen domain constructs with attached cross-linking domains was ordered from IDT with overlaps for Gibson assembly into pSB1C3. The resulting plasmids were transformed into T7 Express E. coli cells and sequence verified; 250 mL cultures of the three heterotrimer component-producing strains and a 500 mL culture of the original construct with cross-linking domain were grown up and induced at room temperature with 40 μM IPTG. Protein production was verified with Lumio™ Green, and protein extraction based on hexahistidine tags was performed using 3 mL HisPur Ni-NTA nickel columns. The protein gels used were 4-12% Bis-Tris protein gels. The protein ladder used was BenchMark™ Fluorescent Protein Standard. Constructs without cross-linking domains were produced by SapI digest in the presence of T4 DNA Ligase, taking advantage of sites designed into the original order. Full constructs were produced by Golden Gate assembly of linear DNA fragments produced by PCR of the relevant minipreps using BsaI-HF. All enzymes were purchased from New England Biolabs.
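For anyone repeating the induction step, the volume of IPTG stock to add follows from C1V1 = C2V2. The sketch below works this out for the two culture volumes above at a final concentration of 40 micromolar. The 100 mM stock concentration is an assumption made for illustration; the stock we actually used is not recorded here.

```python
def stock_volume_ul(final_um, culture_ml, stock_mm):
    """Volume of stock (in microliters) needed to reach final_um (micromolar) in culture_ml.

    Uses C1 * V1 = C2 * V2 and ignores the small volume added by the stock itself.
    """
    stock_um = stock_mm * 1000.0                   # mM -> uM
    volume_ml = final_um * culture_ml / stock_um   # V1 = C2 * V2 / C1
    return volume_ml * 1000.0                      # mL -> uL


ASSUMED_STOCK_MM = 100.0  # hypothetical IPTG stock concentration

for culture_ml in (250, 500):
    volume = stock_volume_ul(40.0, culture_ml, ASSUMED_STOCK_MM)
    print(f"{culture_ml} mL culture: add {volume:.0f} uL of stock")
```

With a different stock concentration, only `ASSUMED_STOCK_MM` needs to change.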
  1. Gelse K, Poschl E, Aigner T. Collagens-structure, function, and biosynthesis. Adv. Drug Deliv. Rev. 2003;55:1531–1546.
  2. Vuorela A, Myllyharju J, Nissi R, Pihlajaniemi T, Kivirikko KI. Assembly of human prolyl 4-hydroxylase and type III collagen in the yeast Pichia pastoris: formation of a stable enzyme tetramer requires coexpression with collagen and assembly of a stable collagen requires coexpression with prolyl 4-hydroxylase. EMBO J. 1997;16(22):6702–6712.
  3. de Bruin EC, Werten MWT, Laane C, de Wolf FA. Endogenous prolyl 4-hydroxylation in Hansenula polymorpha and its use for the production of hydroxylated recombinant gelatin. FEMS Yeast Res. 2002;1:291–298.
  4. Holmgren SK, Bretscher LE, Taylor KM, Raines RT. A hyperstable collagen mimic. Chem. Biol. 1999;6:63–70.
  5. Liu C, Yang X, Yao Y, Huang W, Sun W, Ma Y. Diverse expression levels of two codon-optimized genes that encode human papilloma virus type 16 major protein L1 in Hansenula polymorpha. Biotechnol. Lett. 2014;36:937–945. doi:10.1007/s10529-014-1455-z
  6. Yamauchi M, Sricholpech M. Lysine post-translational modifications of collagen. Essays Biochem. 2012;52:113–133. doi:10.1042/bse0520113
  7. Yu Z, An B, Ramshaw JA, Brodsky B. Bacterial collagen-like proteins that form triple-helical structures. J. Struct. Biol. 2014;186:451–461.
  8. Tanrikulu IC, Forticaux A, Jin S, Raines RT. Peptide tessellation yields micrometre-scale collagen triple helices. Nat. Chem. 2016. doi:10.1038/nchem.2556
  9. Cejas MA, Kinney WA, Chen C, Leo GC, Tounge BA, Vinter JG, Joshi PP, Maryanoff BE. Collagen-related peptides: self-assembly of short, single strands into a functional biomaterial of micrometer scale. J. Am. Chem. Soc. 2007;129(8):2202–2203. doi:10.1021/ja066986f
  10. Yu SM, Li Y, Kim D. Collagen mimetic peptides: progress towards functional applications. Soft Matter. 2011;7(18):7927–7938.
  11. Tanrikulu IC, Raines RT. Optimal interstrand bridges for collagen-like biomaterials. J. Am. Chem. Soc. 2014;136:13490–13493.
  12. Yoshizumi A, Fletcher JM, Yu Z, Persikov AV, Bartlett GJ, Boyle AL, Vincent TL, Woolfson DN, Brodsky B. Designed coiled coils promote folding of a recombinant bacterial collagen. J. Biol. Chem. 2011;286:17512–17520.
  13. Nautiyal S, Woolfson DN, King DS, Alber T. Biochemistry 1995;34:11645–11651.
  14. Keeley FW, Bellingham CM, Woodhouse KA. Elastin as a self-organizing biomaterial: use of recombinantly expressed human elastin polypeptides as a model for investigations of structure and self-assembly of elastin. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2002;357:185–189.