BabblED have created a modular and accessible system to store data in DNA. We have demonstrated that data can be encoded into DNA, the DNA can be assembled using a modular system without the high expense of de novo synthesis, then read back with high fidelity and built-in error correction.
Our multi-step proof of concept:
Data-to-DNA software
We have written software in Python that converts input data into DNA in the form of BabbleBlocks. The data library (for example, an input list of words) is matched to a corresponding list of coding DNA sequences. The data to be encoded is then uploaded or typed in. The software outputs which BabbleBricks are required and the order in which they are to be assembled. For automated labs, a protocol script can be provided.
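The lookup step described above can be sketched as follows. This is a minimal illustration, not the team's actual software: the `LEXICON` contents, the six-base sequences and the `encode` function name are all assumptions made for the example.

```python
# Hypothetical lexicon: word -> coding DNA sequence.
# The sequences are invented for illustration and are not real BabbleBricks.
LEXICON = {
    "good": "ACGTAC",
    "morning": "TGCATG",
    "world": "GATCCA",
}


def encode(text, lexicon):
    """Return the ordered (position, word, sequence) list needed to encode text."""
    plan = []
    for position, word in enumerate(text.lower().split(), start=1):
        if word not in lexicon:
            # A word outside the library cannot be encoded with this lexicon.
            raise KeyError(f"'{word}' is not in the lexicon")
        plan.append((position, word, lexicon[word]))
    return plan


plan = encode("Good morning world", LEXICON)
for position, word, sequence in plan:
    print(position, word, sequence)
```

The ordered list is the piece of information the assembly step needs: which brick to pick up, and at which position in the chain it belongs.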
Creation of BabbleBricks
BabbleBricks are designed to be housed in phytobrick compatible plasmids for storage. The BabbleBrick can be excised from the phytobrick when needed for assembly use.
Our BabbleBricks were created using complementary primers with the correct overhangs for insertion into a plasmid.
To maintain stock levels of the BabbleBricks, the plasmid was replicated by transforming it into competent TOP10 E. coli cells. The plasmid was then extracted and purified by miniprep. This miniprep is ready to use for assembly.
Part No XX is an example of a BabbleBrick. In our Lexicon No. 1 it codes for the word “good”.
Assembly of BabbleBricks
Creating BabbleBlocks (a BioBrick device consisting of multiple BabbleBricks) from BabbleBricks is a demonstration of our new DNA assembly standard for encoding data.
The assembly method is the sequential addition of BabbleBricks onto a growing chain anchored by a magnetic bead. Alternating AB and BA overhangs ensure that only one BabbleBrick is added per ligation step.
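The logic of the alternating overhangs can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: overhangs are reduced to the labels "A" and "B", a brick ligates only when its left label matches the chain's free end, and the function names are hypothetical.

```python
def can_ligate(brick_type, free_end):
    """A brick ('AB' or 'BA') can join only if its left overhang matches the free end."""
    return brick_type[0] == free_end


def assemble(words, anchor_end="A"):
    """Simulate step-wise ligation onto a bead-anchored chain.

    Bricks alternate between AB and BA types. After an AB brick is added the
    free end is B, so a second AB brick cannot join in the same step.
    """
    chain = []
    free_end = anchor_end
    for step, word in enumerate(words):
        brick_type = "AB" if step % 2 == 0 else "BA"
        if not can_ligate(brick_type, free_end):
            raise ValueError(f"{brick_type} brick cannot ligate to free end {free_end}")
        chain.append((brick_type, word))
        free_end = brick_type[1]  # the new brick's right overhang is now exposed
    return chain, free_end


chain, free_end = assemble(["good", "morning", "world"])
print(chain, free_end)
```

In this model, a brick of the same type as the one just added never matches the exposed end, which is the property that limits each ligation step to a single addition.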
Once the construct is complete, the DNA is melted off the magnetic bead. It can be stored as is, or digested and ligated into a phytobrick-compatible plasmid.
Gel electrophoresis is used to confirm that the correct product has been made. Here, we can see BabbleBlocks of varying lengths, corresponding to the number of BabbleBricks joined together.
This is possible because BabbleBricks have defined, regular lengths, so we observe regular step-wise differences in size. BabbleBlocks are made of multiple BabbleBricks, and so are classed as devices.
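The step-wise pattern follows from simple arithmetic: a BabbleBlock of n bricks should be roughly n brick lengths long, plus any fixed anchor sequence. The sketch below uses made-up lengths (50 bp per brick, 20 bp anchor) purely as assumptions, and ignores any bases shared at ligation junctions.

```python
def expected_band_sizes(brick_length_bp, max_bricks, anchor_length_bp=0):
    """Expected product sizes (bp) for BabbleBlocks of 1..max_bricks bricks."""
    return [anchor_length_bp + n * brick_length_bp for n in range(1, max_bricks + 1)]


# Hypothetical values, chosen only to show the regular spacing.
sizes = expected_band_sizes(brick_length_bp=50, max_bricks=4, anchor_length_bp=20)
steps = [b - a for a, b in zip(sizes, sizes[1:])]
print(sizes)  # [70, 120, 170, 220]
print(steps)  # [50, 50, 50]
```

A band that falls between two expected sizes would suggest a product that is not a whole number of bricks.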
DNA-to-data Software
To read the stored data, the DNA is sequenced. In Edinburgh we have been using Sanger sequencing because our DNA constructs contain repetitive sequences, which would be difficult to assemble from the short reads of many next-generation methods. Ideally, however, we would use nanopore technology. The MinION has long read lengths and plugs into a USB port, so it could be used with our software with the greatest convenience for the end user.
The sequence is then read by the decoding software (link again). This uses XYZ to identify the coding region of each BabbleBrick in order, and matches each one to the corresponding data fragment in the correct library.
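The matching step can be pictured as a reverse lookup. The sketch below does not reproduce how the real software locates coding regions; it simply assumes that every coding region has the same fixed length and that the read contains only coding regions, back to back. The lexicon and the `decode` name are hypothetical.

```python
# Hypothetical lexicon: word -> coding DNA sequence (invented for illustration).
LEXICON = {
    "good": "ACGTAC",
    "morning": "TGCATG",
    "world": "GATCCA",
}


def decode(read, lexicon, code_len=6):
    """Split a read into fixed-length coding regions and look each one up."""
    reverse = {sequence: word for word, sequence in lexicon.items()}
    words = []
    for start in range(0, len(read), code_len):
        fragment = read[start:start + code_len]
        # Unrecognised fragments are marked rather than silently dropped.
        words.append(reverse.get(fragment, "?"))
    return words


decoded = decode("ACGTACTGCATGGATCCA", LEXICON)
print(" ".join(decoded))  # good morning world
```

Exact matching like this fails on any read containing a sequencing error, which is why the decoder also needs an error-handling step.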
This software can detect and correct errors introduced by sequencing or mutation. To demonstrate this ability, we modelled it by giving the software various incorrect input sequences to correct.
We can see that the software was able to identify an error X% of the time, and correct it as shown.
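One common way to correct errors of this kind is nearest-match decoding: a damaged fragment is assigned to the library sequence that differs from it at the fewest positions. The sketch below illustrates that general idea only; the source does not state which method the BabblED software uses, and the lexicon, the `max_errors` threshold and the function names are assumptions.

```python
# Hypothetical lexicon; these sequences differ from each other at every position,
# so a single substitution still leaves the original as the unique closest match.
LEXICON = {
    "good": "ACGTAC",
    "morning": "TGCATG",
    "world": "GATCCA",
}


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correct(fragment, lexicon, max_errors=1):
    """Return (word, distance) for the closest library sequence.

    The word is None when even the best match exceeds max_errors, in which
    case the error is detected but not corrected.
    """
    best_word, best_dist = None, None
    for word, sequence in lexicon.items():
        if len(sequence) != len(fragment):
            continue
        dist = hamming(fragment, sequence)
        if best_dist is None or dist < best_dist:
            best_word, best_dist = word, dist
    if best_dist is None or best_dist > max_errors:
        return None, best_dist
    return best_word, best_dist


print(correct("ACGTAA", LEXICON))  # one substitution: ('good', 1)
print(correct("TTTTTT", LEXICON))  # too damaged: (None, 4)
```

How many errors can be corrected depends on how different the library sequences are from one another, so the design of the lexicon matters as much as the decoder.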