A fundamental part of dCas9-based gate design is the guide RNA (gRNA). Although different biochemistries exist for these RNAs (scRNA, sgRNA, crRNA, etc.), their unifying characteristic is a 20-nucleotide sequence that binds to a given target. This binding is sequence dependent and occurs with varying levels of specificity. In our project, the target is always in a promoter sequence. It was thus vital to have guides that bind both strongly and specifically to the target sequence, with very little off-target activity. Designing such gRNAs is not an easy task, as a simple BLAST search is not enough to ensure specificity: the change in binding energy resulting from a mismatch varies with the identity of the mismatch and its position in the target. When we saw that Twist Bioscience and Desktop Genetics, a company specialized in sgRNA design, were offering training in guide design to iGEM teams and sponsoring the synthesis of these guides, we sent our application letter. They contacted us a few days later to inform us that they were interested in our project and that we were one of the 5 lucky teams selected for sponsorship. After this great news, our guide design adventure was ready to start.
DESKGEN© is Desktop Genetics's experiment planning platform. One of its major functions is to find sgRNAs that target a sequence of interest for knock-in and knock-out experiments, which can help scientists understand the function of various genes. Our needs were a little different from those of most users, since we worked with the CYC minimal promoter. We did not just want to target a promoter; we wanted to target it multiple times. Interestingly, this promoter has regions that are known to be changeable without modifying the function of the promoter. We therefore decided to use the CYC promoter multiple times within a circuit, each copy carrying a unique "barcode" sequence. This would allow each CYC promoter to be regulated independently of the others. We therefore required many different sgRNAs with no cross-talk between them, so that we could build many versions of the CYC promoter, each with its corresponding target, and be able to design any complex biological circuit. With the help of Eugenia Petrova, an intern at Desktop Genetics, we found a way to use their platform and its implemented algorithms to find guides suited to our needs.
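The "no cross-talk" requirement can be illustrated with a very rough sketch. The Python below simply counts mismatches between every pair of 20-nt guides and flags pairs that are too similar. This is not the scoring used by DESKGEN: a plain mismatch count ignores the identity and position of each mismatch, and the threshold `min_mismatches=5` is an assumption chosen only for illustration.

```python
from itertools import combinations


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cross_talk_pairs(guides, min_mismatches=5):
    """Return (guide_a, guide_b, n_mismatches) for pairs that are too similar.

    min_mismatches is an illustrative threshold, not a value from the project.
    """
    flagged = []
    for a, b in combinations(guides, 2):
        n = mismatches(a, b)
        if n < min_mismatches:
            flagged.append((a, b, n))
    return flagged
```

An empty result from `cross_talk_pairs` only says that no pair falls under the chosen threshold; it is a crude first filter, not evidence that the guides are orthogonal in the cell.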
Since we did not want any off-target binding in the S. cerevisiae or human genomes, any genome other than those two could serve as a good “mine” in which to search for our guides. We decided to use the C. albicans genome for this purpose. Our mining work started by uploading portions of the FASTA file of the C. albicans genome in the “Sequence Only” section of the knock-in mode experiment design of DESKGEN.
Next we had to set the correct parameters: the organism in which we were working, S. cerevisiae, and the Cas9 variant we were working with, SpCas9. This was necessary because SpCas9, guided by its sgRNA, can only bind a target sequence if a specific PAM sequence, NGG, lies directly next to it.
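To make the PAM constraint concrete, here is a minimal Python sketch that lists every 20-nt site followed directly by an NGG PAM, on both strands of a DNA sequence. It only illustrates the constraint and is not the search DESKGEN performs; the function names are our own, and positions on the "-" strand are given relative to the reverse-complemented sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, guide_len=20):
    """Return (strand, start, protospacer, pam) for each NGG-adjacent site.

    'start' is 0-based and, for the '-' strand, counted on the
    reverse-complemented sequence.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets overlapping candidate sites all be reported.
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits
```

For example, `find_protospacers("A" * 20 + "TGG")` reports a single site on the "+" strand, because the 20 A's are immediately followed by TGG.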
After its calculations, the platform shows all the possible sgRNAs and their targets on the uploaded sequence. According to DESKGEN, an on-target activity score of 50/100 is sufficient for well-performing guides, but we decided to select only those with a score of at least 65/100. This score is calculated using the algorithm described by Doench et al. (2016). For the sgRNAs with high enough on-target activity scores, we used DESKGEN to calculate the off-target score, according to an algorithm first described by Hsu et al. (2013). A higher score means higher specificity and minimal binding to the genome of the host cell. We selected only the sgRNAs with an off-target score of 100 in S. cerevisiae. To verify the safety of these guides, and to allow the same guides to be used in possible future experiments in human cells, we also checked that the few sgRNAs that met the previous conditions had almost no off-target sites in the human genome. Given the enormous size of the human genome, it is nearly impossible to have no off-target activity at all, for probabilistic reasons. It is possible, however, to ensure the safety of these guides by making sure that any binding they exhibit occurs in a non-coding region of the genome. In addition, we only selected guides that had two or more mismatches with their targets in these non-coding regions, thereby reducing their binding capabilities. Theoretically, all these guides could be used safely even in human cells.
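The screening criteria above can be summarised as a simple filter. The Python sketch below assumes the scores have already been obtained from the platform and entered by hand. The `GuideCandidate` structure and its field names are hypothetical and do not correspond to any DESKGEN export format. Only the thresholds (on-target score of at least 65, off-target score of 100 in S. cerevisiae, and human off-target sites that are non-coding with two or more mismatches) come from the text.

```python
from dataclasses import dataclass, field

MIN_ON_TARGET = 65          # stricter than the 50/100 considered sufficient
REQUIRED_YEAST_SCORE = 100  # off-target (specificity) score in S. cerevisiae
MIN_HUMAN_MISMATCHES = 2    # minimum mismatches at any human off-target site


@dataclass
class GuideCandidate:
    """Hypothetical record for one candidate sgRNA."""
    sequence: str
    on_target: float
    off_target_yeast: float
    # One (n_mismatches, in_coding_region) tuple per human off-target site.
    human_sites: list = field(default_factory=list)


def passes_screen(guide):
    """Apply the three screening steps in the order described in the text."""
    if guide.on_target < MIN_ON_TARGET:
        return False
    if guide.off_target_yeast < REQUIRED_YEAST_SCORE:
        return False
    return all(
        n_mismatches >= MIN_HUMAN_MISMATCHES and not in_coding
        for n_mismatches, in_coding in guide.human_sites
    )


def screen(candidates):
    """Keep only the candidates that satisfy every criterion."""
    return [g for g in candidates if passes_screen(g)]
```

Each criterion rejects a candidate on its own, so the pool shrinks quickly as the filters are combined.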
After all of these screening steps, the number of possible sgRNAs decreased drastically. To find a decent number of sgRNAs, we had to analyze almost three entire chromosomes of C. albicans. Below is the final list of targets and their complements that met our stringent criteria.
|Target Sequence||Guide Sequence|