To develop a multi-STI sensor based on aptamers, we first needed to study the biomolecules required for the scaffold of the device, and then the interactions between the biomarkers and the nucleic acid sequences. Because we lacked the time and resources required for NMR or X-ray studies, we used structural bioinformatics.
Predicting the 3D structure of the proteins allowed us to enhance the cellulose-binding power of our constructs (RFP_CBD, Streptavidin_CipA…) and to properly display the His-Tag used to purify the Hepatitis B biomarker HBsAg. Moreover, aptamer structure modeling, coupled with modeling of the aptamer/target interactions, allowed us to select the best aptamers to bind HBsAg.
Homology modeling of chimeric proteins with Cellulose Binding Domain
Because we wanted to know the 3D structure of our chimeric proteins (RFP_CBDs, Streptavidin_CBDs and Streptavidin_CipA), we used homology modeling. This required an appropriate template for each domain (RFP, Streptavidin, CBDs or CipA). We obtained these templates by running PSI-BLAST against the Protein Data Bank (PDB) on the NCBI server. Selection was based on sequence identity, alignment quality, E-value, sequence coverage and the resolution of the structure. We chose 2H5Q for the RFP domain, 4BX6 for Streptavidin, and 4JO5 and 1NBC for the various CBDs.
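The template selection criteria listed above can be illustrated with a minimal sketch. It assumes the PSI-BLAST hits have already been parsed into simple records; the `TemplateHit` class, the threshold values and the placeholder identifiers `TPL1` to `TPL3` are hypothetical and are not taken from our actual search results.

```python
from dataclasses import dataclass


@dataclass
class TemplateHit:
    pdb_id: str        # placeholder identifier, not a real PDB entry
    identity: float    # percent sequence identity to the query domain
    coverage: float    # percent of the query covered by the alignment
    e_value: float     # PSI-BLAST E-value (lower is better)
    resolution: float  # resolution of the structure in Å (lower is better)


def passes_filters(hit, min_identity=30.0, min_coverage=70.0,
                   max_e_value=1e-5, max_resolution=3.0):
    """Return True if a hit meets all (assumed) minimum requirements."""
    return (hit.identity >= min_identity
            and hit.coverage >= min_coverage
            and hit.e_value <= max_e_value
            and hit.resolution <= max_resolution)


def rank_templates(hits):
    """Keep acceptable hits and sort them from most to least suitable."""
    kept = [h for h in hits if passes_filters(h)]
    # Higher identity and coverage first, then lower E-value and resolution.
    return sorted(kept, key=lambda h: (-h.identity, -h.coverage,
                                       h.e_value, h.resolution))


# Made-up hits, for illustration only.
hits = [
    TemplateHit("TPL1", 95.0, 98.0, 1e-120, 1.8),
    TemplateHit("TPL2", 95.0, 98.0, 1e-120, 2.6),
    TemplateHit("TPL3", 25.0, 90.0, 1e-3, 1.5),
]
best = rank_templates(hits)[0]
print(best.pdb_id)
```

In this toy data set, the first two hits tie on identity, coverage and E-value, so the better resolution decides between them; the third hit is discarded by the filters.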
We then used the Rosetta software and the I-TASSER and RaptorX web servers, obtaining a total of 106 structures. To validate the quality of the predicted models, we calculated the QMEAN Z-score, which estimates the absolute quality of a model by comparing it to structures solved by X-ray crystallography at a resolution better than 2 Å. From these scores, we chose the best structure.
Finally, we aligned each model with the experimental structures to check whether the fold was conserved, which would suggest that the functions remain unaltered.
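The idea behind a Z-score can be sketched as follows. This is a simplified illustration, not the QMEAN implementation: it assumes each model already has a raw quality score and that a list of raw scores for high-resolution reference structures is available. The function names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def z_score(model_score, reference_scores):
    """Number of standard deviations between a model and the reference mean."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)
    return (model_score - mu) / sigma


def pick_best(model_scores, reference_scores):
    """Return (name, Z-score) of the model with the highest Z-score."""
    z = {name: z_score(s, reference_scores)
         for name, s in model_scores.items()}
    best_name = max(z, key=z.get)
    return best_name, z[best_name]


# Made-up raw scores, for illustration only.
reference = [0.70, 0.75, 0.80, 0.85, 0.90]
models = {"model_a": 0.60, "model_b": 0.72}

name, z = pick_best(models, reference)
print(name, round(z, 2))
```

A Z-score close to 0 means the model scores like a typical experimental structure, while a strongly negative value means it scores far worse than the reference set.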
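The comparison between a model and its template is reported in the results as an RMSD (root-mean-square deviation). A minimal sketch of that calculation is given here, assuming the two structures have already been superposed by an alignment tool and that the atoms (for example the Cα atoms) are listed in matching order. The coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in Å between two equally long lists of (x, y, z) coordinates.

    Assumes both structures are already superposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three atoms, the middle one displaced by 2 Å.
template = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.5, 2.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(template, model))
```

The lower the RMSD, the closer the modeled domain is to its experimental template; an RMSD of 0 Å would mean identical coordinates.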
First, we ran preliminary tests with a construct similar to the chimeric protein of the 2014 iGEM Stanford-Brown-Spelman team (BBa_K1499004). Instead of streptavidin at the center of the CBDs, we used RFP, as the 2015 iGEM Edinburgh team did. If the protein turned red, the RFP domain was functional. This protein was then used to develop purification and fixation protocols on microcrystalline cellulose. Here, we only studied the fold of the protein to see whether it could work.
The best model has a Z-score of -1.75, which indicates fairly good absolute quality of the prediction. The graphic below shows that the predicted structure falls within the range covered by the majority of the experimental structures.
The RFP domain is similar to its template, with an RMSD of 3.17 Å, but the second CBD deviates strongly, with an RMSD of 11.39 Å, and the first CBD has lost its secondary structure.
Nonetheless, our wetlab experiments showed that the protein was red and that it bound to the cellulose.
As the model seemed suitable for the RFP_CBDs, we decided to work on the part designed by the 2014 iGEM Stanford-Brown-Spelman team (BBa_K1499004) and model its 3D structure.
The best model had a Z-score of -3.36, which indicates poor absolute quality of the prediction. The graphic below shows that the predicted structure falls outside the range covered by the majority of the experimental structures.
Here we can see that the domains are not well defined. Only wetlab results can show whether the protein is functional.
As seen previously, the streptavidin with the 2 CBDs was not the best solution for fixing the aptamer onto the paper. Fortunately, the registry contains an RFP protein made by the 2015 iGEM Edinburgh team with another cellulose-binding domain, CipA (BBa_K1615100). We first built a structural model of this chimeric protein to compare it with the previous one and see whether it was more promising.
The best model returned a Z-score of -0.73, which indicates high absolute quality of the prediction. The graphic below shows that the predicted structure lies within one standard deviation of the mean of the experimental structures.
The linker spatially separates the 2 domains, and we obtained an RMSD of 1.84 Å for the streptavidin and 1.37 Å for the RFP.
Because RFP_CipA worked well in the wetlab and the model was good, we decided to combine streptavidin with this CipA.
The best model has a Z-score of -1.63, which indicates fairly good absolute quality of the prediction. The graphic shows that our model falls within the range covered by the majority of the experimental structures.
We obtained an RMSD of 1.84 Å for the streptavidin and 1.37 Å for the CipA. The results also showed that this protein was more efficient at fixation on paper.
HBsAg with His-Tag de novo modeling
HBsAg, the Hepatitis B Surface Antigen, is one of the biomarkers we want to detect with our device. To purify this protein, we needed to add a His-Tag at either the N- or the C-terminus of the protein sequence. We ran PSI-BLAST against the PDB, and no structure was available for homology modeling. We therefore performed de novo modeling with the Rosetta software to model the 3D structure of the protein with the His-Tag at each end.
We generated 20 models for each His-Tag position and selected the best one for each:
Best Z-score for the C-terminus = -3.44
Best Z-score for the N-terminus = -4.23
These low scores are due to the fact that this protein is located on the cell membrane.
Then we compared the 2 positions of the His-tag:
We can see that the C-terminal His-Tag lies near an α-helix, so it is not very accessible. The N-terminal His-Tag is more accessible, so we decided to place the tag there for the purification of HBsAg.
Sadly, we did not succeed in cloning this part into a plasmid, so we could not test the purification with the His-Tag at the N-terminus.
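We judged accessibility by visual inspection of the models. As a rough numerical complement, one could count how many protein atoms crowd around the tag. The sketch below uses this neighbor-count proxy, which is a substitute technique and not the method we applied; the 8 Å cutoff, the function name and the coordinates are all assumptions for illustration.

```python
import math


def mean_neighbor_count(tag_atoms, other_atoms, cutoff=8.0):
    """Average number of non-tag atoms within `cutoff` Å of each tag atom.

    A higher value suggests a more buried, less accessible tag.
    """
    if not tag_atoms:
        raise ValueError("tag_atoms must not be empty")
    total = 0
    for t in tag_atoms:
        total += sum(1 for o in other_atoms if math.dist(t, o) <= cutoff)
    return total / len(tag_atoms)


# Made-up coordinates: a few protein atoms and two candidate tag positions.
protein = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (20.0, 0.0, 0.0)]
tag_near_helix = [(0.0, 0.0, 0.0)]
tag_exposed = [(30.0, 0.0, 0.0)]

print(mean_neighbor_count(tag_near_helix, protein))
print(mean_neighbor_count(tag_exposed, protein))
```

Under this proxy, the tag position with the lower count would be preferred for purification. A solvent-accessible surface area calculation would be a more rigorous alternative.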
Discovering new aptamers (MAWS)
Unfortunately, very few aptamers have been described for STI biomarkers. For example, we could find only one well-described aptamer in the literature against the HIV reverse transcriptase. As we had neither the time nor the resources to run a SELEX procedure, we wanted to use the MAWS software developed by the 2015 iGEM Heidelberg team. However, this software did not give us the expected results because of inconsistencies in the aptamer structure in the output PDB file. When we contacted the team, they told us that it was still under development.
Aptamer structure modeling
For the HBsAg biomarker, 3 aptamers are described in the literature, but we only knew their sequences and secondary structures.
With our homemade software tool, however, we can predict their tertiary structures.
These structures are then used in a docking procedure with the target protein to characterize the interactions.
Thanks to our web server, we obtained the 3D structures of the 3 aptamers against HBsAg. We submitted them to the docking web server NPDock to predict the interaction between the protein and the nucleic acid molecule.
We can see that aptamers A and B bind to the same domain of the protein, whereas aptamer C binds on the opposite side. We therefore chose aptamers A and C for our device.
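The reasoning behind this choice, picking two aptamers that do not compete for the same site, can be sketched as follows. The sketch assumes that the protein residues contacted by each aptamer have been extracted from the docking results; the residue numbers are invented for illustration and do not come from our NPDock output.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two residue sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def least_overlapping_pair(contacts):
    """Return the pair of aptamers whose binding sites overlap the least.

    Ties are broken by alphabetical order of the aptamer names.
    """
    pairs = combinations(sorted(contacts), 2)
    return min(pairs, key=lambda p: jaccard(contacts[p[0]], contacts[p[1]]))


# Hypothetical contact residues on the target protein for each aptamer.
contacts = {
    "A": {10, 11, 12, 13},
    "B": {11, 12, 13, 14},
    "C": {80, 81, 82},
}

print(least_overlapping_pair(contacts))
```

With these toy contacts, A and B share most of their residues while C shares none with either, so both A/C and B/C are valid non-competing pairs; the tie-break rule is what returns A and C here.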
Shu, X., Shaner, N., Yarbrough, C., Tsien, R. and Remington, S. (2006). Novel Chromophores and Buried Charges Control Color in mFruits. Biochemistry, 45(32), pp.9639-9647.
Fairhead, M., Krndija, D., Lowe, E. and Howarth, M. (2014). Plug-and-Play Pairing via Defined Divalent Streptavidins. Journal of Molecular Biology, 426(1), pp.199-214.
Yaniv, O., Morag, E., Borovok, I., Bayer, E., Lamed, R., Frolow, F. and Shimon, L. (2013). Structure of a family 3a carbohydrate-binding module from the cellulosomal scaffoldin CipA of Clostridium thermocellum with flanking linkers: implications for cellulosome structure. Acta Cryst Sect F, 69(7), pp.733-737.
Tormo, J., Lamed, R., Chirino, A.J., Morag, E., Bayer, E.A., Shoham, Y., et al. (1996). Crystal structure of a bacterial family-III cellulose-binding domain: a general mechanism for attachment to cellulose. The EMBO Journal, 15, pp.5739-5751.
Song, Y., DiMaio, F., Wang, R., Kim, D., Miles, C., Brunette, T., Thompson, J. and Baker, D. (2013). High-Resolution Comparative Modeling with RosettaCM. Structure, 21(10), pp.1735-1742.
Zhang, Y. (2008). I-TASSER server for protein 3D structure prediction. BMC Bioinformatics, 9(1), p.40.
Källberg, M., Wang, H., Wang, S., Peng, J., Wang, Z., Lu, H. and Xu, J. (2012). Template-based protein structure modeling using the RaptorX web server. Nature Protocols, 7, pp.1511-1522.
Benkert, P., Tosatto, S. and Schomburg, D. (2008). QMEAN: A comprehensive scoring function for model quality assessment. Proteins: Structure, Function, and Bioinformatics, 71(1), pp.261-277.
Bradley, P. (2005). Toward High-Resolution de Novo Structure Prediction for Small Proteins. Science, 309(5742), pp.1868-1871.
Xi, Z., Huang, R., Li, Z., He, N., Wang, T., Su, E. and Deng, Y. (2015). Selection of HBsAg-Specific DNA Aptamers Based on Carboxylated Magnetic Nanoparticles and Their Application in the Rapid and Simple Detection of Hepatitis B Virus Infection. ACS Appl. Mater. Interfaces, 7(21), pp.11215-11223.
Tuszynska, I., Magnus, M., Jonak, K., Dawson, W. and Bujnicki, J. (2015). NPDock: a web server for protein–nucleic acid docking. Nucleic Acids Res, 43(W1), pp.W425-W430.