CRISPR/CPF1
The CRISPR-Cas system is a powerful tool for genetic modification that has its roots in bacterial immune systems. We decided to study a newly discovered Cas protein that is promising because of its unique properties.
The CRISPR-Cas System
CRISPR-Cas systems are bacterial adaptive immune systems. Functional CRISPR loci are present in about 45% of known bacteria, and the systems protect them from foreign DNA, for example from bacteriophages, that might harm them. The basis for the systems is that short sequences of foreign DNA are incorporated into the host cell's own genome, where they can later be used to recognize and eliminate the same threat if it recurs. Since the changes are made at the gene level, immunity can be passed on to the offspring and thereby improve the fitness of the whole strain.
The loci that code for the systems contain an array of short palindromic repeat sequences interspaced with unique sequences, so-called spacers, which are obtained from foreign sources. The repeats are commonly called CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats. Upstream of the CRISPR array lie a number of genes encoding CRISPR-associated (Cas) proteins, which serve different functions in the systems.
There are multiple types of CRISPR-Cas systems, often co-existing in the same organism and capable of acting in parallel. They all work on the same principles but with different variations. The commonly used categorization comprises the three types I, II and III, but additional types IV and V have been proposed to accommodate new discoveries. As an example of the differences between systems, the protein responsible for DNA degradation in type II systems is Cas9, whereas in type I systems the same task is done by a complex of multiple Cas proteins called Cascade. In addition to the main types there are also subtypes that further categorize the different kinds of systems.
The work cycle of the CRISPR-Cas systems can be divided into three stages: adaptation, expression and interference. In the adaptation stage, foreign DNA is integrated by Cas proteins into the CRISPR array as a new spacer. Sequences that are to be acquired from invaders are called protospacers. In type I and II systems these sequences are identified by sequence motifs that flank the protospacers, called PAM sites. The PAM sites are not integrated together with the protospacer and thus give the system a way to differentiate between itself and the attacker. Although the spacer in the CRISPR array has the same sequence as the protospacer in the attacker, the system will not target the spacer because it has no PAM site attached to it.
The expression stage involves the transcription and translation of the CRISPR-loci. The CRISPR-array is transcribed as one long piece of CRISPR RNA, pre-crRNA. The strand is then processed into smaller pieces by Cas proteins. Each piece consists of a repeat segment and a spacer.
Finally, in the interference stage, the crRNA forms ribonucleoprotein complexes with Cas proteins, such as Cas9 in type II systems. The Cas proteins then use the crRNA to identify the sequence that is to be neutralized. Type I and II systems look for PAM sites and then check for complementary base pairing between the invader genome and the crRNA. If a match is found, the Cas protein cuts or degrades the DNA and the threat is eliminated.
The CRISPR-Cas system provides prokaryotes with an efficient way of dealing with invaders. In recent years researchers have found ways to adapt elements of the system in order to create efficient tools for genetic engineering. There is still much we do not know about these systems, which might mean that many more important discoveries are yet to come.
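The PAM-dependent target check described above can be sketched in a few lines of Python. This is only a minimal illustration, not a model of any real Cas protein: it assumes a Cas9-style layout in which the protospacer is immediately followed by an NGG PAM, it requires a perfect match, and it scans one strand only. The function names and all sequences are made up for the example.

```python
import re

def pam_to_regex(pam):
    """Translate a PAM such as 'NGG' into a regex, with N matching any base."""
    return "".join("[ACGT]" if base == "N" else base for base in pam)

def find_targets(dna, spacer, pam="NGG"):
    """Return start positions where `spacer` is directly followed by a PAM.

    Assumptions: protospacer-then-PAM layout (Cas9-like), perfect match,
    and only the strand as written is scanned.
    """
    pattern = re.compile("(?=" + spacer + pam_to_regex(pam) + ")")
    return [m.start() for m in pattern.finditer(dna)]

spacer = "ACGTTGCAAGCTTGCATGCC"   # made-up 20-nt spacer
repeat = "GATCCATTAGCAGGTAACTT"   # made-up repeat sequence

invader = "TTGACC" + spacer + "TGG" + "ACCTA"   # protospacer next to a PAM
crispr_array = repeat + spacer + repeat         # same sequence, but no PAM

print("invader hits:", find_targets(invader, spacer))
print("host array hits:", find_targets(crispr_array, spacer))
```

In this toy case the invader sequence is found because the protospacer carries a PAM, while the identical spacer in the host array is skipped because it is flanked by repeats instead.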
CPF1
Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1, or CPF1, was discovered in 2015 by the biologist Feng Zhang and his team at the Broad Institute in Cambridge, Massachusetts. CPF1 is a class 2 type V endonuclease that cleaves phosphodiester bonds in DNA or RNA molecules when guided by a specific RNA molecule called crRNA (CRISPR RNA). CPF1 recognizes TTN PAM sites and cuts downstream of them, producing staggered double-stranded breaks with cohesive ends of four to five nucleotides.
CPF1 differs from Cas9 in three main aspects. First, it cleaves the non-complementary DNA strand after the 18th nucleotide and the complementary strand after the 23rd nucleotide downstream of the PAM site, leaving a 5-nucleotide 5’ overhang. Second, it recognizes TTN PAM sites, whereas Cas9 recognizes sequences adjacent to NGG PAM sites (Figure 1). Lastly, it has two RuvC-like domains and is able to process both DNA and RNA, so that no additional tracrRNA is needed to process the pre-crRNA. Moreover, by leaving sticky ends, CPF1 might allow gene insertion by NHEJ (non-homologous end joining), an exciting new application given that the CRISPR systems currently available permit gene insertion only by means of HR (homologous recombination).
These aspects make CPF1 the most minimalistic CRISPR-Cas system for biotechnological applications. On that basis we decided to explore this promising new tool.
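The staggered cut can be made concrete with a small Python sketch. It takes the two cut positions from the paragraph above (18 and 23 nucleotides downstream of the PAM) and reports the stretch between them, which becomes the single-stranded overhang. The function name and the example sequence are hypothetical, and only the strand as written is scanned.

```python
import re

NON_TARGET_CUT = 18  # nucleotides after the PAM, non-complementary strand
TARGET_CUT = 23      # nucleotides after the PAM, complementary strand

def cpf1_cut_sites(dna, pam_regex="TT[ACGT]"):
    """Yield (pam_start, first_cut, second_cut) for each PAM found in `dna`.

    Positions are 0-based indices into `dna`; each break falls just before
    the given index. PAMs too close to the end of the sequence are skipped.
    """
    for m in re.finditer("(?=" + pam_regex + ")", dna):
        pam_end = m.start() + 3
        first_cut = pam_end + NON_TARGET_CUT
        second_cut = pam_end + TARGET_CUT
        if second_cut <= len(dna):
            yield m.start(), first_cut, second_cut

# Made-up sequence: 4-nt flank, TTA PAM, 25-nt protospacer, 4-nt flank.
dna = "GCGC" + "TTA" + "GACCGGATCGCAGCACGGCAGCCGA" + "GCGC"

for pam_start, first_cut, second_cut in cpf1_cut_sites(dna):
    overhang = dna[first_cut:second_cut]
    print(pam_start, first_cut, second_cut, overhang, len(overhang))
```

The difference between the two cut positions, 23 - 18 = 5, is what gives the five-nucleotide overhang.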
References
Ledford, H., 2015. Alternative CRISPR system could improve genome editing. Nature, 526(7571), pp.17-17.
Kim, D., Kim, J., Hur, J.K., Been, K.W., Yoon, S.H. and Kim, J.S., 2016. Genome-wide analysis reveals specificities of Cpf1 endonucleases in human cells. Nature biotechnology.
Zetsche, B., Gootenberg, J.S., Abudayyeh, O.O., Slaymaker, I.M., Makarova, K.S., Essletzbichler, P., Volz, S.E., Joung, J., van der Oost, J., Regev, A. and Koonin, E.V., 2015. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell, 163(3), pp.759-771.
Planning
The amino acid sequence of CPF1 from the organism Francisella novicida was obtained from the UniProt database and codon optimized for expression in E. coli. The nucleotide sequence was ordered from IDT with the biobrick prefix and suffix. A 6xHis-tag was added to the N-terminus for purification via IMAC (Immobilized Metal ion Affinity Chromatography).
During the summer, we decided to perform a preliminary cleavage test in order to assess directed DNA cleavage by CPF1. We planned to do this by designing two different spacer regions, one targeting a GFP coding sequence and one targeting an ampicillin resistance coding sequence. The expected results of targeting these genes would have been no fluorescent cells, or a reduced number of them, in the case of GFP, and an increase in ampicillin sensitivity in the case of the ampicillin resistance sequence (Figure 1).
To design an appropriate system for CPF1 we had to decide whether to express a mature crRNA or a pre-crRNA. When the CRISPR array is transcribed, the spacer with the complementary sequence for the target DNA is flanked by repetitive regions. These regions form hairpin structures, which have been shown to be pivotal for recognition and processing of the pre-crRNA (Figure 2).
Processing mainly involves two events. First, the repetitive region is cleaved four nucleotides upstream of the start of the hairpin. Second, the 3’ end is processed by host RNases and hence cannot be controlled. Together, these events lead to the formation of a mature crRNA of 42-44 nucleotides (19 nucleotides of the repetitive region and 24-25 nucleotides of the designed spacer region).
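A codon-optimized sequence that is to carry the biobrick prefix and suffix should not itself contain the restriction sites used for assembly. The following Python sketch shows one way such a check could look. It assumes the RFC 10 assembly standard, whose enzymes are EcoRI, XbaI, SpeI, PstI and NotI; the function name and the two test sequences are made up and are not our actual CPF1 sequence.

```python
# Recognition sites assumed to be forbidden inside an RFC 10 part.
# All of them are palindromic, so scanning one strand is sufficient.
ILLEGAL_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}

def find_illegal_sites(seq):
    """Return a list of (enzyme, position) hits, sorted by 0-based position."""
    seq = seq.upper()
    hits = []
    for enzyme, site in ILLEGAL_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])

# Made-up examples: a start codon, six His codons and a short tail ...
clean = "ATGCATCACCATCACCATCACAGCATTTAA"
# ... and a sequence that contains an EcoRI and a PstI site.
dirty = "ATGGAATTCCATCACCTGCAGTAA"

print(find_illegal_sites(clean))
print(find_illegal_sites(dirty))
```

Any hit would have to be removed with a synonymous codon change before the part is ordered.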
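Choosing spacer regions for a target such as GFP amounts to listing the TTN sites in the gene and taking the sequence that follows each of them. The Python sketch below illustrates this under a few assumptions: the PAM sits directly 5’ of the protospacer, the spacer is 24 nucleotides long, and both strands are searched. The helper names and the toy target are hypothetical and unrelated to the real GFP or ampicillin resistance sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def candidate_spacers(target, spacer_len=24):
    """List (strand, pam, protospacer) for every TTN site with enough room."""
    found = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        # Stop early enough that a full-length protospacer still fits.
        for i in range(len(seq) - 3 - spacer_len + 1):
            if seq[i:i + 2] == "TT":
                pam = seq[i:i + 3]
                protospacer = seq[i + 3:i + 3 + spacer_len]
                found.append((strand, pam, protospacer))
    return found

# Made-up target with a single TTC site on the forward strand.
target = "GC" + "TTC" + "GGACCAGCACGGCAGCCGAGCAGC" + "GC"

for strand, pam, protospacer in candidate_spacers(target):
    print(strand, pam, protospacer)
```

A real design would additionally filter the candidates, for instance for uniqueness within the host genome, which this sketch does not attempt.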
Experimental evidence has shown that CPF1 cleavage efficiency is increased when the protein is supplied with the aforementioned pre-crRNA. This might be due to conformational changes in CPF1 upon recognition of the pre-crRNA (Figure 2). We therefore decided to design our pre-crRNA in a Full Length Repeat-Spacer-Full Length Repeat organization.
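The chosen organization and the expected product of processing can be written down as a short Python sketch. It assumes that the mature crRNA keeps the last 19 nucleotides of the upstream repeat followed by the spacer, and it does not model the 3’ trimming by host RNases. The repeat and spacer below are placeholders of the right length, not the real F. novicida sequences, and the function names are hypothetical.

```python
def build_pre_crrna_template(repeat, spacer):
    """DNA template in Full Length Repeat-Spacer-Full Length Repeat order."""
    return repeat + spacer + repeat

def mature_crrna(repeat, spacer, handle_len=19):
    """Idealised mature crRNA: last `handle_len` nt of the repeat, then the spacer.

    Assumption: the retained repeat portion is the 3' end of the upstream
    repeat. Trimming of the crRNA 3' end by host RNases is not modelled.
    """
    dna = repeat[-handle_len:] + spacer
    return dna.replace("T", "U")  # written as RNA

repeat = "GACTACGATCCAGTCGATCGGATACCTAGCTAGGCA"  # placeholder, 36 nt
spacer = "GGACCAGCACGGCAGCCGAGCAGC"              # placeholder, 24 nt

template = build_pre_crrna_template(repeat, spacer)
crrna = mature_crrna(repeat, spacer)

print(len(template), template)
print(len(crrna), crrna)
```

With a 24-nucleotide spacer the idealised product is 19 + 24 = 43 nucleotides long, which falls inside the 42-44 nucleotide range given above.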
References
Bikard, D., Jiang, W., Samai, P., Hochschild, A., Zhang, F., & Marraffini, L. A. (2013). Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic acids research, 41(15), 7429-7437.
Fonfara, I., Richter, H., Bratovič, M., Le Rhun, A., & Charpentier, E. (2016). The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature, 532(7600), 517-521.
Results
Assembly of the composite biobricks was confirmed by Sanger sequencing at UGC (Uppsala Genome Center). Table 1 summarizes the results of the sequencing reactions. The biobrick consisting of pBAD/araC, medium RBS, CPF1 and double terminator was partially correct: the reads over the RBS were unclear. This might be due to the fact that the sequencing primers were designed ~600 nucleotides away from the RBS, which is at the limit of what Sanger sequencing resolves reliably. We therefore strongly recommend re-sequencing this biobrick before use. However, the general pre-crRNA with the double terminator, as well as CPF1 in pSB1C3, were successfully sequenced.
Table 1. Sequencing results for the assembled biobricks.
Biobrick | Parts making up the biobrick | Sequencing result |
---|---|---|
BBa_K2003002 | BBa_I0500 + BBa_B0032 + CPF1 + BBa_B0015 | PARTIAL |
BBa_K2003001 | General spacer + BBa_B0015 | OK |
BBa_K2003000 | CPF1 | OK |
During the summer we had a lot of problems with the 3A assemblies. The main issue was that we failed to attach the RBS to the promoter but did not realise this until we had assembled the whole construct, and thus had to redo the assembly. This delayed our progress to the extent that we never had time to perform our experiments and confirm that the constructs work as intended.