The main purpose behind our project is to design a modular, more affordable method for storing data in DNA. Our first challenge of the summer was to answer the question:
Who are our consumers?
At the beginning of the summer we met with Dr Szymanski and Dr Schyfter from our University’s Science, Technology and Innovation Studies department to talk about our project. Their main advice to us was this: don’t assume you know what ‘the public’ needs; if you want your project to have applications, talk to your consumers directly. We took their advice and put it into action.
The next week we met with Dr Turner and Dr Carter from EPCC and ARCHER, the University’s supercomputer and big data research centre. Their expertise on data generation, storage and transfer inspired us to refocus our project away from the broader issue of data storage and onto long-term archival data.
This led to our meetings with data librarians at our University Main Library and with the Digital Preservations Officer at the National Library of Scotland. Both libraries were supportive and interested in the idea of storing their archival data in DNA. We discussed the difficulties that they face with storing data over long periods of time and investigated how we could improve our method to meet these needs.
It became clear through these meetings that the answer to the question “Who are our consumers?” is this: organisations or people that need to reliably store data for long periods of time. Who might these people actually be? Libraries are an obvious example, since they archive manuscripts that date back hundreds of years, but we also discovered that UK Parliamentary law requires certain organisations to store their data for anywhere from 6 months to 50 years.
After we had gauged the interest of these librarians, we were faced with the question: how can we optimize our method to suit their needs? This led us to investigate the sustainability, safety, ethics, and security of our project.
From our initial meetings, we were able to pinpoint our target audience as librarians and other organizations that are required to archive data for extended periods of time. This is the use case in which DNA can compete with magnetic tape, as a more sustainable, denser and longer-lasting storage medium. The text below outlines the considerations that came to mind when investigating the practicality of our project, and how we explored each in turn.
After realizing that librarians or archivists would be our target audience, we started to investigate how sustainable our project is. Research into data storage suggests that by 2040 we will exhaust the raw materials necessary for manufacturing flash memory(1). It also suggests that the average life span of a hard drive is 2 years, while magnetic tape lasts only 6(2). This, combined with the startling facts about data centre energy consumption, motivated us to make our BabblED system the most sustainable data storage medium it could be. We did this through our meetings with data storage and supercomputing experts, data librarians and digital preservation experts. Their feedback helped us tailor the BabblED system to archival storage needs.
After deciding on our project idea, we were all keen to tell our friends. A common reaction was “DNA?? What if you make a super cancer!?” For the most part, this was a joke and no one seemed genuinely concerned. However, this reaction, along with our supervisor’s suggestion to include a stop codon region in our design, inspired us to investigate the safety of our method further. To do so, we met with bioethicists, anthropologists and federal and cyber security professionals.
In our meeting with librarians and data librarians from the University of Edinburgh Main Library, we were asked if we had looked into the ethics of what our project proposed. At that point in our summer we had not, but following this suggestion we realized how important it was to investigate the ethical considerations of our project. In the following weeks we met with Dr Tait, a bioethicist at the University, to discuss our project. Her expertise is in the field of GMO policy, and she gave us some useful advice about how to communicate our project to non-scientists. We also had a conversation with Dr Riley-Smith, who pointed us in the direction of FRRIICT, the Framework for Responsible Research and Innovation in ICT. We used this to assess our human practices and analyze whether we had been taking the right steps to ensure our research was responsible.
In terms of security, we investigated two types: security of data retrieval and secure data transfer. One of the main concerns of the librarians at our University Library and at the National Library was the ability to read back the data accurately. If this were not possible, there would be little point in storing the data in DNA. We were also advised by Edward You, a Special Agent at the FBI, that data encryption and integrity are paramount in modern data storage and transfer. We investigated this issue further by speaking with encryption experts at our university and the Director of Information Security at Goldman Sachs.