Project summary
Protecting data through encryption and storage in bacterial spore DNA.
In 2002, the amount of information stored digitally eclipsed the amount stored in analog format for the first time [1]. Just five years later, only 6% of the world’s data was still analog [1]. In 2015, an estimated 2,500,000,000,000 megabytes (2.5 exabytes) of new data were created every day, and this number is growing at an increasing rate [1]. It is not surprising that data breaches orchestrated by hackers are on the rise as well. Financial and legal records and military and government documents are examples of important information that must be preserved for a long time but could cause great damage in the wrong hands. We have become a civilization dependent on information, and this information must be stored somewhere. As a result, we are faced with two problems: where do we store all of our data, and how do we keep it safe?
Storage of data in DNA was proposed as early as the 1960s but has only recently become a hot topic [2]. This is due in part to the ever-growing demand for data storage, as well as to advancements in DNA synthesis and sequencing technologies. Our goal is to create a system for long-term data storage and data transfer which cannot be hacked by digital means. Digital methods of encrypting information and converting it into binary code are well established, and data storage in DNA has already been demonstrated. Our project combines these two approaches by first converting information into binary code, encrypting it, and then storing it safely in DNA. Additional measures based on molecular biology will prevent unauthorized access, ensuring the safety of the stored information.
Our system will be useful for information that must be stored and transferred very securely but does not have to be accessed quickly (within seconds). It will be possible to obtain the message in about 24-48 hours; however, this timeframe is likely to be reduced as new sequencing technologies are developed. For example, this system could be used to store patent and prototype information, genealogical records, legal and financial records, banking account details, login data, or even top secret government documents. Given the stability and compactness of DNA, our system could also be adapted to serve as a time capsule for human knowledge.
Advantages of data storage in spore DNA
- DNA is a far more stable data storage medium than magnetic and optical media, remaining intact for at least 700,000 years at -4°C [4]. Even in harsh environments, DNA has a half-life of over 500 years [5]. In contrast, current storage technology lasts only up to 30 years [6].
- Spores are extremely resistant to aging, radiation, heat, and chemical damage. A viable spore-forming Bacillus strain was isolated from 250 million year old salt crystals [7].
- The densest data storage medium commercially available today can hold up to 10 GB/mm³. DNA has a data storage density of up to 10⁹ GB/mm³, 8 orders of magnitude higher [6].
- Conservative estimates based on global memory demand predict that the amount of silicon required for flash memory will exceed the silicon supply by 2040 [8]. However, we will never run out of DNA!
- DNA storage will soon become a cheaper alternative for data storage as DNA synthesis and sequencing costs drop. It is estimated to become a cost-effective method for long-term data storage within approximately ten years [9].
- Data storage in DNA is more energy efficient (and environmentally friendly) than currently used digital data storage. In 2015, 416.2 terawatt hours of electricity were used by data centers worldwide. This is higher than the annual power consumption of the entire UK [10], and is responsible for approximately 2% of global greenhouse gas emissions, rivalling the airline industry [11].
- Data stored in DNA cannot be hacked by digital means.
- Data stored in the DNA of bacterial spores is easy to copy, simply by allowing the spores to germinate and grow.
- DNA data storage is an apocalypse-proof technology because DNA will be relevant to future civilizations. As long as intelligent DNA-based life exists, there will be compelling reasons to study and manipulate DNA.
Our approach
Security through layers.
We use a layered approach with a combination of digital and biological security measures to ensure the information can only be accessed by the intended recipient. The first layer is digital encryption. The information is encrypted with the Advanced Encryption Standard (AES) algorithm, converted into a DNA sequence, and integrated into the genomic DNA of Bacillus subtilis, a safe, thoroughly characterized organism capable of sporulation. The binary data obtained after encryption is encoded into DNA according to the following logic: since DNA consists of four nucleotides (A, C, T, and G), each nucleotide represents one pair of bits. A represents 00, C represents 01, T represents 10, and G represents 11. The decryption key and the encrypted message are integrated into two different Bacillus strains and are protected from unauthorized access by additional security layers.
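The bit-pair mapping described above can be illustrated with a minimal Python sketch. The function names `bytes_to_dna` and `dna_to_bytes` are hypothetical, and the AES step is left out: the input bytes simply stand in for an already encrypted message. This is an illustration of the stated mapping only, not the project's actual encoding software.

```python
# Mapping stated in the text: A = 00, C = 01, T = 10, G = 11.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "T", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as nucleotides, two bits per nucleotide (hypothetical helper)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(sequence: str) -> bytes:
    """Decode a nucleotide string back into bytes (hypothetical helper)."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 (4 nucleotides per byte)")
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Stand-in for AES ciphertext; real input would be the encrypted message.
ciphertext = b"Hi"
encoded = bytes_to_dna(ciphertext)
print(encoded)                # CATACTTC
print(dna_to_bytes(encoded))  # b'Hi'
```

At two bits per nucleotide, every byte of encrypted data takes four nucleotides, so the sequence to be synthesized is four times as long as the ciphertext is in bytes.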
Once the message and key are encoded in Bacillus DNA, the cells are cultured in a sporulation-promoting medium. Bacterial spores are among the most resistant biological entities currently known, and thus represent an ideal substrate for long-term data storage. The spores containing the encrypted message and key are freeze-dried and embedded in separate filter papers (or any other porous material) for storage and transfer, along with a spiropyran-ciprofloxacin conjugate [3]. The biological activity of this photoswitchable antibiotic is very low when the spiropyran photoswitch is in its stable closed form, but increases dramatically after irradiation with a specific wavelength of light (in our case, 365 nm), which brings the photoswitch into a less stable, open form. When the light source is removed, the compound slowly reverts to its biologically inactive state. Irradiation with other wavelengths also results in deactivation. The strains carrying the message and key (which possess resistance to the antibiotic) are mixed with numerous decoy spores when brought onto the carrier material. The decoy spores are not resistant, and do not contain any encrypted information.
When the intended recipients want to access the stored data, they place the filter paper with the key-carrying spores and antibiotic in a culture medium and irradiate it with the activating wavelength of light. This wavelength must be known by the recipient beforehand. The activated antibiotic kills the decoys but not the key-carrying strain. After culturing, the DNA of the surviving cells is sequenced and the key is found. The key contains the information necessary to culture the message-carrying strain and to decrypt the message. Without activation, all the spores germinate and grow, including the decoys. This makes it impossible to find the key by sequencing. Once the key is obtained, the message-carrying strain can be cultured. Its DNA is then sequenced and the message can be decrypted.
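The effect of the photoswitchable antibiotic on what ends up in the sequencing sample can be illustrated with a toy Python model. Everything in it is an assumption made for illustration: the ratio of 1 key spore to 999 decoys, the all-or-nothing killing rule, and the equal growth of all survivors. None of these values come from measurements.

```python
def culture(spores, antibiotic_active):
    """Return the spores that germinate and grow out.

    Toy rule: an activated antibiotic kills every non-resistant spore;
    an inactive antibiotic kills none.
    """
    return [s for s in spores if s["resistant"] or not antibiotic_active]


def key_fraction(population):
    """Fraction of the cultured population that is the key-carrying strain."""
    key_cells = sum(1 for s in population if s["strain"] == "key")
    return key_cells / len(population)


# Hypothetical filter paper: 1 resistant key spore among 999 non-resistant decoys.
paper = [{"strain": "key", "resistant": True}] + [
    {"strain": "decoy", "resistant": False} for _ in range(999)
]

with_light = culture(paper, antibiotic_active=True)      # irradiated at the activating wavelength
without_light = culture(paper, antibiotic_active=False)  # antibiotic left in its inactive form

print(key_fraction(with_light))     # 1.0
print(key_fraction(without_light))  # 0.001
```

In this simplified picture, the recipient who knows the activating wavelength sequences a culture that consists only of the key strain, while anyone who cultures the paper without activation sequences a sample dominated by decoy DNA.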
References
- [1] World’s shift from analog to digital is nearly complete — NBC news
- [2] Some fundamental issues of microminiaturization — Radiotekhnika, 1964, No. 1, pp. 3-12
- [3] Ciprofloxacin−Photoswitch Conjugates: A Facile Strategy for Photopharmacology — Bioconjugate Chem. 2015, 26, 2592−2597 DOI: 10.1021/acs.bioconjchem.5b00591
- [4] Ancient DNA: Towards a million-year-old genome — Nature 499, 34–35 (04 July 2013) DOI:10.1038/nature12263
- [5] The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils — Proc Biol Sci. 2012 Dec 7;279(1748):4724-33 DOI: 10.1098/rspb.2012.1745
- [6] A DNA-Based Archival Storage System — ASPLOS 2016 DOI
- [7] Isolation of a 250 million-year-old halotolerant bacterium from a primary salt crystal— Nature 407, 897-900 (19 October 2000) DOI:10.1038/35038060
- [8] Nucleic Acid Memory — Nature Materials 15, 366–370 (2016) DOI:10.1038/nmat4594
- [9] Synthetic double-helix faithfully stores Shakespeare's sonnets — Nature DOI:10.1038/nature.2013.12279
- [10] Global warming: Data centres to consume three times as much energy in next decade, experts warn. — The Independent
- [11] GeSI SMARTer 2020: The Role of ICT in Driving a Sustainable Future — GeSI