Let it work...let it work...

With the hypothesis outlined and aims clearly defined, we started on simple experiments to see if our ideas worked! :)

To assay the binding affinity of dCas9 to its target, we fused dCas9 to the VP64-p65-Rta (VPR) transcription activation domain and used it to activate a GFP reporter construct co-transfected into HEK293FT cells. A second plasmid expresses the gRNA together with an OFP marker that identifies transfected cells. The cells were then analyzed by flow cytometry to measure the Mean Fluorescence Intensity (MFI) of transfected cells. Gates were set on OFP+ cells, and the histogram of GFP intensity is shown for the OFP+ population.
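The gating step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily the analysis we ran: the event values, the `OFP_THRESHOLD` cutoff and the function name `gfp_mfi_of_transfected` are all hypothetical, and real gates are drawn in the cytometry software against untransfected controls.

```python
from statistics import mean

# Hypothetical cutoff separating OFP+ (transfected) from OFP- events.
OFP_THRESHOLD = 1000.0


def gfp_mfi_of_transfected(events, ofp_threshold=OFP_THRESHOLD):
    """Return the mean GFP intensity of OFP+ events, or None if no event passes the gate.

    `events` is a list of (ofp_intensity, gfp_intensity) pairs, one per cell.
    """
    gated = [gfp for ofp, gfp in events if ofp >= ofp_threshold]
    return mean(gated) if gated else None


# Made-up events for illustration only: (OFP, GFP) per cell.
events = [(50.0, 20.0), (1500.0, 800.0), (2200.0, 1200.0), (300.0, 40.0)]
mfi = gfp_mfi_of_transfected(events)
```

Only the two events above the OFP cutoff contribute to `mfi`; the untransfected cells are ignored no matter how much GFP signal they carry.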

Left: Plasmids used for reporter assay. Right: flow cytometry plot and MFI measurement

As the focus of dCas9 is on its DNA-binding capability, we first deleted the domains of the nuclease (NUC) lobe, namely the three RuvC domains and the HNH domain. In the recognition (REC) lobe, the REC2 domain has been shown to be deletable, albeit with reduced expression. The regions we started with are shown below:

Schematic depicting the truncations of individual whole domains of SpCas9 that were evaluated.

Left: Mean Fluorescence Intensity of ZsGreen activation. Right: Microscope image of HEK293FT cells 24 hours after transfection.

As shown above, the ∆HNH mutant is still able to bind to its target, while the other NUC truncations failed to activate our reporter. We were very excited to find that the HNH domain can be deleted. The HNH and RuvC domains are known for their function as a helicase. Studies on specificity enhancement have shown that excess energy is provided for the separation of the non-targeted strand. We can only hypothesize that, with the HNH domain deleted, strand separation can still be aided by the RuvC domains.

Next, we asked whether we could delete smaller pieces of RuvCIII and still activate gene expression using dSpCas9-VPR. It has been shown that the Cas9-RNA-DNA ternary complex exists in two different states, before and after cleavage of the non-targeted strand (NTS).

Conformations of RuvCIII in state A, before NTS cleavage and state B, after NTS cleavage

As shown above, when Cas9 transitions from state A to state B, the loop connected to the HNH domain switches from an unstructured loop to an alpha helix (red) and is pulled towards the blue-white helix (in RuvCIII-2). This forces the white region within the blue helix to bend in order to accommodate the red helix. In addition, the loop regions highlighted in purple were not resolved in state B, meaning that they may be flexible and not essential for binding interactions. As we know that the HNH domain can be deleted, this region involved in the movement of HNH might not be essential either.

On the other hand, the cyan-highlighted helices (RuvCIII-1) also have minimal interaction with the non-targeted strand of DNA. Hence, we chose these two regions within RuvCIII as candidates for truncation.

Schematic of RuvCIII sub-domain deletions

We tested the RuvCIII deletions on their own and in combination with the HNH deletion to see whether our deletions can be combined.

Activation of a ZsGreen reporter gene using the various RuvCIII truncations

Microscope image of HEK293FT cells 24 hours after transfection.

It has been shown that deleting sub-regions of the REC1 domain in the vicinity of REC2 abolishes binding of Cas9 to the DNA-RNA heteroduplex. From the crystal structures, we saw that these regions are close to the PAM-proximal “seed” region of the heteroduplex.

As specificity studies have shown that mismatches at PAM-distal sites, at the 3' end of the target sequence, can be tolerated (see below), we hypothesized that protein-DNA interactions are weaker at PAM-distal regions and hence may be redundant for binding.

Tolerance of PAM-distal mismatches. Source: DNA targeting specificity of RNA-guided Cas9 nucleases

We examined the parts of the REC1 domain that lie close to the PAM-distal site (within a 10bp range) and found three regions whose two ends are close enough to each other to be bridged by a GGGS linker. We named them REC1-1, REC1-2 and REC1-3.
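Replacing a region with the GGGS linker is a simple sequence operation, sketched below in Python. The function name `replace_with_linker`, the toy sequence and the coordinates are hypothetical and chosen only for illustration; they are not SpCas9 residues or our actual REC1 boundaries.

```python
LINKER = "GGGS"


def replace_with_linker(protein, start, end, linker=LINKER):
    """Delete residues start..end (1-based, inclusive) and insert `linker` in their place."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("start and end must lie within the protein")
    # Keep everything before `start`, add the linker, keep everything after `end`.
    return protein[:start - 1] + linker + protein[end:]


toy = "MKRNYILGLAIGITSV"  # hypothetical 16-residue sequence, not SpCas9
truncated = replace_with_linker(toy, 5, 10)
```

Here six residues are removed and four linker residues are added, so the toy protein shrinks by two residues overall.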

Sub-domains within REC1 at the PAM-distal end

Schematic of different truncations of the REC1 domain tested

Left: Mean Fluorescence Intensity of ZsGreen activation. Right: Microscope image 24 hours after transfection.

The PI domain plays an essential role in binding the target at the PAM sequence and also in gRNA binding. As a result, we realized that deleting the whole PI region was unlikely to work. Yet we believed the PI domain has potential to be truncated, because that is where the main difference lies between SpCas9 (370aa) and the much more compact SaCas9 (144aa). We therefore made a structural comparison of the PI domains of SpCas9 and SaCas9. SpCas9 was found to have 5 extra alpha helices from 1225E to 1318G. We named this sequence PI1-all for labeling in the subsequent experiments.

Alignment of SpCas9 and SaCas9. (Domain for PI1-all deletion is highlighted by red stroke.)

Meanwhile, we also looked at the pattern of evolutionary homology among Cas9 variants from different species and found that 1138T to 1199P is less conserved than the rest of the PI domain. We labeled this region PI2-all as a further candidate for truncation. Hence, we identified two target regions, PI1 and PI2, for the truncation test.

We also made smaller truncations within PI1-all (PI1-1 to PI1-3) and PI2-all (PI-4), as we worried that a large truncation may affect the structural integrity of the protein, as in the case of the RuvCIII truncation.

Truncations made within PI domain

Mean Fluorescence Intensity of each truncated PI structure

From the plot above, unfortunately none of the dCas9 variants truncated in the PI domain exhibits activity as high as what we obtained from the REC1 and HNH truncations.

We then compared all the truncations and their combinations with one another.

Size deleted (aa)   Mutant                        Size (aa)
Single truncations
145                 ∆REC1-1                       1223
48                  ∆REC1-3                       1320
128                 ∆REC2                         1240
134                 ∆HNH                          1234
72                  ∆RuvCIII-2                    1296
Double truncations
182                 ∆REC1-3 ∆HNH                  1186
262                 ∆REC2 ∆HNH                    1106
206                 ∆RuvCIII-2 ∆HNH               1162
Triple truncations
334                 ∆REC2 ∆RuvCIII-2 ∆HNH         1034
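The sizes in the table above follow from simple addition, which the short Python sketch below reproduces. The per-domain deletion sizes are taken from the single-truncation rows; the full length of 1368 is the value implied by every row (size deleted plus mutant size), and the names `DELETION_SIZES` and `truncated_size` are ours for illustration.

```python
# Full-length SpCas9 size implied by the table: size deleted + mutant size = 1368 in every row.
FULL_LENGTH = 1368

# Sizes removed by each single truncation, copied from the table.
DELETION_SIZES = {
    "REC1-1": 145,
    "REC1-3": 48,
    "REC2": 128,
    "HNH": 134,
    "RuvCIII-2": 72,
}


def truncated_size(*domains):
    """Return (total size deleted, remaining mutant size) for a combination of deletions."""
    deleted = sum(DELETION_SIZES[d] for d in domains)
    return deleted, FULL_LENGTH - deleted


double = truncated_size("REC2", "HNH")
triple = truncated_size("REC2", "RuvCIII-2", "HNH")
```

The triple truncation removes 334 of 1368 residues, roughly a quarter of the protein.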

MFI for truncation mutants that worked.

The overall workflow of a T7E experiment is as follows:

We cloned various guide RNA sequences into the Cas9/Cpf1 vector carrying a -P2A-OFP reporter. We transfected Human Embryonic Kidney 293 (HEK293) cells with the plasmids and sorted for transfected (OFP+) cells via FACS. We then extracted DNA from these cells and PCR-amplified the targeted genomic locus.

As DNA cleavage efficiency is reflected in the indel% owing to the dominance of NHEJ in DSB repair, we performed the T7E assay on the amplified products to detect and quantify indels. To measure HDR efficiency, the amplified product was instead digested with the restriction enzyme corresponding to the RE site on the ssODN.
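For readers who want to see how gel band intensities turn into percentages, here is a minimal Python sketch. It uses an estimate that is widely applied to T7E1 gels, indel% = 100 × (1 − √(1 − f)), where f is the cleaved fraction of the total band intensity; we state it as an assumption rather than as the exact calculation behind our plots. For the restriction digest, the HDR% is taken directly as the cleaved fraction. The function names and band intensities are hypothetical.

```python
from math import sqrt


def fraction_cleaved(uncut, cut_bands):
    """Fraction of total band intensity found in the cleavage products."""
    total = uncut + sum(cut_bands)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return sum(cut_bands) / total


def indel_percent(uncut, cut_bands):
    """Estimate indel% from a T7E1 digest (assumed formula, see the text above)."""
    return 100.0 * (1.0 - sqrt(1.0 - fraction_cleaved(uncut, cut_bands)))


def hdr_percent(uncut, cut_bands):
    """Estimate HDR% from a restriction digest of the amplicon."""
    return 100.0 * fraction_cleaved(uncut, cut_bands)


# Made-up band intensities (arbitrary units) for illustration only.
indel = indel_percent(64.0, [18.0, 18.0])
hdr = hdr_percent(75.0, [15.0, 10.0])
```

The square root in the indel estimate accounts for the fact that T7E1 cuts heteroduplexes formed by random re-annealing, so the cleaved fraction is not itself the mutant fraction.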

Workflow for T7E assay

To have a fair comparison among all endonucleases, we cloned them into the same plasmid backbone under two different promoters, CAG and EF1a. We ran one round of the T7E assay for every construct on a genomic locus with known editing efficiency to check that our cloning worked.

Constructs used for Project Evaluation

T7E1 results for each construct. a. SpCas9 b. SaCas9 c. NmCas9 d. AsCpf1 e. LbCpf1

We then checked the expression level of the endonucleases under the two promoters via RT-qPCR to ensure our results are promoter-independent. We found that proteins expressed under the EF1a promoter have 1.5-fold higher expression. However, even with differing expression levels, the NHEJ efficiencies measured with the two promoters correlate well, with an EF1a:CAG ratio of 0.9621 (R² = 0.7025).
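A ratio and R² of this kind can be obtained from a straight-line fit of the EF1a indel% against the CAG indel%. The Python sketch below shows an ordinary least-squares fit with an intercept; whether our reported ratio came from a fit with or without an intercept is not stated here, so treat this as one plausible way to compute it. The data points and the name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Made-up indel% values for the same targets under each promoter (illustration only).
cag_indel = [10.0, 20.0, 30.0, 40.0]
ef1a_indel = [12.0, 18.0, 31.0, 36.0]
slope, intercept, r_squared = fit_line(cag_indel, ef1a_indel)
```

A slope close to 1 means that a target edited at a given efficiency under CAG is edited at about the same efficiency under EF1a.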

a. Expression of endonuclease under CAG and EF1a promoter b. Plot of indel% between two promoters

With all conditions for a fair measurement checked, we proceeded to measure the NHEJ efficiency on 5 genes with increasing expression levels. The endonucleases were targeted to regions with 17bp to 24bp spacers. For a fair comparison, the target sequences for Cpf1 and Cas9 are the same. As the PAM sequences of SaCas9 and NmCas9 cannot be accommodated at the same PAM binding site, we searched for two sets of targets: set A for SaCas9 and the others, and set B for NmCas9 and the others.
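Finding a target that both enzyme families can use means finding a spacer flanked by a Cpf1 PAM on its 5' side and a Cas9 PAM on its 3' side. The Python sketch below searches one strand for the simplest such case, TTTN-spacer-NGG, using TTTN for Cpf1 and NGG for SpCas9. It is an illustration only: the function name `shared_sites` and the example sequence are hypothetical, the reverse strand is not scanned, and the longer SaCas9 and NmCas9 PAMs would need their own patterns.

```python
import re


def shared_sites(seq, spacer_len=20):
    """Return (spacer start, spacer) for every TTTN-[spacer]-NGG site on the given strand.

    Positions are 0-based. A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(r"(?=TTT[ACGT]([ACGT]{%d})[ACGT]GG)" % spacer_len)
    return [(m.start() + 4, m.group(1)) for m in pattern.finditer(seq.upper())]


# Made-up sequence for illustration: CC + TTTA + 20 nt spacer + TGG + AA.
example = "CC" + "TTTA" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
sites = shared_sites(example, spacer_len=20)
```

Sites of this kind are rare in practice because two independent PAM constraints must be satisfied at a fixed distance from each other.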

a. Target sites for each locus with PAM sites for Cas9 and Cpf1. Indel% on b. ALK, c. EGFR d. NF1 e. KDM6A f. STAG2 locus.

In general, we can conclude that:
1. SpCas9 works efficiently even with a spacer as short as 17nt, but not as well with 23nt spacers.
2. NmCas9 does not perform as well most of the time.
3. SaCas9, LbCpf1 and AsCpf1 work well with 19-23nt spacers but are not reliable with 17-18nt spacers.

Due to the scarcity of target sites with PAM sites for both Cas9 and Cpf1, most target sites lie within intronic regions of the genes. We then searched for other Cas9- and Cpf1-compatible target sites situated within protein-coding regions. The results on protein-coding regions parallel those we obtained from intronic regions.

Indel% for protein coding regions on a. APC b. ATM c. KDM5C

Although the spacer sequence is the same, the PAM-proximal sites of Cas9 and Cpf1 still differ. Hence, we shortened the distance between the PAM sites of the Cas9s and Cpf1s to 7bp, so that every Cas9 and Cpf1 has the same seed sequence. The results confirm our conclusion that SpCas9 works well with a 17nt spacer while the others require longer spacers for efficient editing.

Indel% with the same seed on a. CACNA1D protein-coding region b. PPP1R12C protein-coding region

We then compared how efficiently each enzyme promotes HDR with a 100nt ssODN donor carrying an XbaI cut site, using the same target sites with the shared seed region. Interestingly, LbCpf1 and AsCpf1 work efficiently to induce HDR for targeted genome editing.

HDR efficiency with the same seed on a. CACNA1D protein-coding region b. PPP1R12C protein-coding region

A competition assay was conducted to isolate more efficient SpCas9 variants. Both the EZ plasmids and the wild-type::M13F (SpCas9WT::M13F) plasmid were introduced into BL21(DE3) carrying the selection plasmid, BBa_K2130004. The M13F sequence was introduced into the plasmid backbone of SpCas9WT, which allows us to differentiate the EZ plasmids from the WT plasmid via colony PCR.

Two gRNAs (1A and 4B) were used, both designed to target the ampicillin resistance gene on BBa_K2130004. This selection plasmid encodes the toxin gene ccdB under the inducible BAD promoter. In the presence of the gRNA and the appropriate inducers, a functional SpCas9 cleaves the ccdB-ampicillin plasmid; hence the toxin gene is not expressed and the cell survives. Cells that carry a non-functional EZ SpCas9 variant, on the other hand, do not cleave the plasmid. As a result, the ccdB toxin gene is expressed and the cell dies.

BBa_K2130004: ccdB plasmid

To perform the competition assay, we introduced equal amounts (50ng) of the EZ and WT plasmids into BL21(DE3) carrying BBa_K2130004. Colony PCR was performed on 32 randomly selected colonies to determine the EZ::WT ratio. The colony PCR results showed that approximately 50% of the colonies contained the EZ plasmid and the other 50% contained the SpCas9WT::M13F plasmid. As we performed iterative rounds of the competition assay, the ratio of EZ plasmid to SpCas9WT::M13F plasmid gradually increased until, eventually, all the tested cells contained only the EZ plasmid. The theory is that if the EZ SpCas9 variants are more efficient than SpCas9WT::M13F, cells that carry the EZ SpCas9 variants will replicate faster and their plasmids will make up a growing share of the population with each round of the competition assay.
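The expected takeover can be illustrated with a toy selection model in Python. It assumes that, in each round, EZ-carrying cells leave `relative_fitness` times as many descendants as WT-carrying cells; this parameter, its value of 3 and the function names are hypothetical and were not measured in our experiments.

```python
def ez_fraction(ez_colonies, total_colonies):
    """Fraction of screened colonies that carry the EZ plasmid."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    return ez_colonies / total_colonies


def enrich(start_fraction, relative_fitness, rounds):
    """Return the EZ fraction before the first round and after each subsequent round."""
    history = [start_fraction]
    f = start_fraction
    for _ in range(rounds):
        # EZ cells are weighted by their assumed advantage, then fractions are renormalized.
        f = f * relative_fitness / (f * relative_fitness + (1.0 - f))
        history.append(f)
    return history


# Start from the roughly 50:50 split seen in the first screen (16 of 32 colonies).
start = ez_fraction(16, 32)
history = enrich(start, relative_fitness=3.0, rounds=2)
```

Even a modest per-round advantage compounds quickly, which is consistent with WT colonies disappearing from the screen after a few rounds.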

Screen for WT colonies after each round of competition assay.

Upon completing the competition assay, a few colonies were sequenced to determine their Cas9 mutations. The recurring variants were selected as potential candidates for a more efficient SpCas9. These selected SpCas9 variants were recreated using Gibson Assembly in order to examine their cleavage efficiency in human cells. An alternative method employed to recreate the mutants was QuikChange site-directed mutagenesis (Agilent Technologies).

Library winner   Mutations                                                    Part Name
Mutants using gRNA 1A
1                Q894H F897Y E952D S1025G
2                Y812H Y1001H                                                 459 (BBa_K2130002)
3                L727I R884Q A1032T V748I N767D M822I R832L E1026K
4                I814V E952K P1002T A1081E
Mutants using gRNA 4B
1                N869K E904K                                                  462 (BBa_K2130001)
2                Q794H E952K P1002T A1081E
3                A752V G729D K959M
4                K968N W1074R E802D D829H S937F A991G D1017E G1070C V1092I
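Recurring substitutions can be picked out of the table above by simple counting, as in the Python sketch below. The mutation lists are copied from the table; the keys such as `1A-1` (gRNA library plus winner number) and the function name `recurring_mutations` are our own labels for illustration.

```python
from collections import Counter

# Mutations per library winner, copied from the table above.
WINNERS = {
    "1A-1": "Q894H F897Y E952D S1025G",
    "1A-2": "Y812H Y1001H",
    "1A-3": "L727I R884Q A1032T V748I N767D M822I R832L E1026K",
    "1A-4": "I814V E952K P1002T A1081E",
    "4B-1": "N869K E904K",
    "4B-2": "Q794H E952K P1002T A1081E",
    "4B-3": "A752V G729D K959M",
    "4B-4": "K968N W1074R E802D D829H S937F A991G D1017E G1070C V1092I",
}


def recurring_mutations(winners, min_count=2):
    """Return substitutions found in at least `min_count` winners, sorted by residue position."""
    counts = Counter(m for muts in winners.values() for m in set(muts.split()))
    hits = [m for m, c in counts.items() if c >= min_count]
    # A substitution looks like E952K: one letter, the position, one letter.
    return sorted(hits, key=lambda m: int(m[1:-1]))


recurring = recurring_mutations(WINNERS)
```

The three substitutions returned are shared by winner 4 of the gRNA 1A library and winner 2 of the gRNA 4B library.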

We chose one mutant from each gRNA library, mutant 459 (BBa_K2130002) and mutant 462 (BBa_K2130001), each carrying fewer mutations, and tested their cleavage efficiency using the T7E assay as well as an in vitro assay done with help from Team Macquarie.

In vitro assay of WT-Cas9, mutant 462 and mutant 459. The Cas9s are programmed to cut a linear 1200bp DNA fragment into 1000bp and 200bp fragments.