Team:Edinburgh UG/Description

Project Description

The last decade has seen an exponential increase in data and information generation, creating a storage demand that will soon outweigh supply. By 2040, global data storage demand will reach 3×10^24 (3 million billion billion) bits^1. Considering the amount of energy required to run data centres (about 2% of global energy consumption^2) and the limited supply of raw materials, such as silicon, for manufacturing memory devices, it is clear that novel storage methods are of the utmost importance in meeting demand and providing a sustainable, long-term solution to the data storage problem. The University of Edinburgh 2016 undergraduate iGEM team kept these considerations in mind when we set out to create a new DNA-based storage system.

Over the course of weeks 1 and 2, our team researched ideas for a project utilising DNA as an information storage device. Our brainstorming evolved through discussions about the advantages and disadvantages of DNA synthesis and of encoding digital information into nucleotide sequences. The major points of debate were the cost, fidelity and efficiency of data storage. Following some constructive feedback from our supervisors, our team focused on developing a method that is accessible, sustainable and fits the iGEM format.

Our project, named BabblED, is based on a simple idea: develop a modular system for encoding text, or any other unit of information, into DNA. We will demonstrate the validity of our concept by encoding Ogden’s Basic English (a collection of 850 to 1,000 words that can be used to express most concepts in the English language). Each encoded word sequence, termed a BabbleBrick, will be stored in a different PhytoBrick. Sentence assembly and directionality are ensured by the stepwise addition of ‘DNA words’ that have alternating types of sticky ends; this also prevents repeats and minimizes the occurrence of missing words. The whole sentence construct can be melted off for easy retrieval and assembled back into a PhytoBrick for storage. Since the value assigned to each BabbleBrick is arbitrary, each one can be reused with any library or language. In this way, our encoding and assembly method can be optimized for many types of data. Furthermore, using tools such as checksums, optimal rectangular codes and, when appropriate, natural language processing techniques, we are able to ensure that each BabbleBrick sentence can be decoded with 100% accuracy.
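The word-to-DNA mapping and checksum idea described above can be illustrated with a minimal Python sketch. This is not the BabblED program itself: the base-4 mapping, the 5-base payload length, the single-base checksum and the "A"/"B" sticky-end labels are all assumptions chosen for illustration, and the four-word vocabulary is a placeholder for the full word list.

```python
# Illustrative sketch only: every name and parameter below is hypothetical,
# not taken from the BabblED software.

BASES = "ACGT"   # assumed mapping: A=0, C=1, G=2, T=3
WORD_LEN = 5     # 4**5 = 1024 codes, enough for a 1,000-word vocabulary


def index_to_dna(index, length=WORD_LEN):
    """Write a vocabulary index as a fixed-length base-4 DNA string."""
    if not 0 <= index < 4 ** length:
        raise ValueError("index does not fit in the payload length")
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))


def dna_to_index(seq):
    """Inverse of index_to_dna."""
    value = 0
    for base in seq:
        value = value * 4 + BASES.index(base)
    return value


def checksum_base(seq):
    """One-base checksum: sum of base values modulo 4.

    It detects any single-base substitution but cannot correct it.
    """
    return BASES[sum(BASES.index(b) for b in seq) % 4]


def encode_word(word, vocabulary):
    """Return payload plus checksum base for one word."""
    payload = index_to_dna(vocabulary.index(word))
    return payload + checksum_base(payload)


def decode_word(brick, vocabulary):
    """Verify the checksum, then look the word up again."""
    payload, check = brick[:-1], brick[-1]
    if checksum_base(payload) != check:
        raise ValueError("checksum mismatch")
    return vocabulary[dna_to_index(payload)]


def encode_sentence(sentence, vocabulary):
    """Pair each word with (left, right) sticky-end labels.

    Words at even positions get ends ("A", "B") and words at odd positions
    get ("B", "A"), so only the next word in order can attach.
    """
    bricks = []
    for position, word in enumerate(sentence.split()):
        ends = ("A", "B") if position % 2 == 0 else ("B", "A")
        bricks.append((encode_word(word, vocabulary), ends))
    return bricks


if __name__ == "__main__":
    vocab = ["a", "able", "about", "account"]  # placeholder vocabulary
    for brick, ends in encode_sentence("about a", vocab):
        print(brick, ends, decode_word(brick, vocab))
```

Because each brick is only an index into a word list, swapping the list changes what the same DNA means, which is the reuse property the paragraph describes. A real design would also need stronger error correction than a single checksum base and would have to respect synthesis constraints that this sketch ignores.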

As of week 4, we have developed the computer program that converts our vocabulary to BabbleBrick sequences. We have designed the DNA sequences for error-correcting codes and researched the benefits of encryption and potential ways to use it in our method. We are in the process of ordering our first BabbleBricks in the form of gBlocks from IDT and testing our assembly method for efficiency. We have commenced the 2016 InterLab study and are pursuing another exciting project on bacterial growth-based logic. We have already had some fascinating discussions with data specialists and librarians; their feedback and expertise are vital to how we are shaping our project. We have also been in touch with other iGEM teams, such as Newcastle and Dundee, and are hosting a Scottish team meet-up at the beginning of July.

References:
1. http://www.nature.com/nmat/journal/v15/n4/full/nmat4594.html#supplementary-information
2. http://www.greenpeace.org/international/Global/international/publications/climate/2011/Cool%20IT/dirty-data-report-greenpeace.pdf/