Design
To build a smooth and efficient DNA-based information storage system, we put considerable effort into the design of its web system.
The optimization of the front-end
Website based
The web is now used throughout the world, and a website can be used without installing any application. Most importantly, a website is compatible with different devices and operating systems such as Windows, macOS, and Linux. We therefore built our software as a website. To ensure stability and efficiency, all computation is performed on our server, and users are given an API to download the final result.
User-friendly interface
The interface of our webpage is concise. It has two main buttons, Encode and Decode, which cover users’ needs for uploading, transforming, and downloading files. Users can quickly become familiar with the software and apply it to their own tasks. To make the software cross-platform, the front end is built with HTML, CSS, Bootstrap, and jQuery. The page is designed to be user-friendly, visually clean, and quick to respond.
Fig. 1. The interface of Bio101.
Clear operation flow
Our website has a clear operation flow, shown in the following figure. To encode a file, the user submits the file together with a code that serves as the token for encrypting it. Once the file is stored on our server, it is compressed and encrypted. It is then encoded into DNA sequences, and the user is taken to the ‘Download’ page when encoding finishes. The user can download the final DNA sequences in txt, fasta, or SBOL-xml format. To decode, the user submits a file of DNA sequences that were encoded by our software, together with the code that was set when the file was encoded. Decoding starts once the file is stored on our server. The file is then decrypted with the token and decompressed. Finally, the website redirects to the ‘Download’ page, where the user can download the decoded file.
Fig. 2. Design process of Bio101.
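The symmetry of the flow above (compress and encrypt with a token on the way in, decrypt with the same token and decompress on the way out) can be illustrated with a minimal Python sketch. This is not the Bio101 implementation: the function names are hypothetical, and the keystream is derived from SHA-256 in counter mode purely as a stand-in for whatever generator the server uses. It is meant to show the round trip, not to be a vetted cipher.

```python
import bz2
import hashlib


def keystream(token: str, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the token.

    Assumption: SHA-256 over token + counter is a stand-in generator,
    chosen only because it is in the standard library.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(
            token.encode("utf-8") + counter.to_bytes(8, "big")
        ).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def xor_with_token(data: bytes, token: str) -> bytes:
    """XOR the data with the token-derived keystream (its own inverse)."""
    ks = keystream(token, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def prepare_for_encoding(raw: bytes, token: str) -> bytes:
    """Encode side: compress first, then encrypt with the token."""
    return xor_with_token(bz2.compress(raw), token)


def recover_after_decoding(payload: bytes, token: str) -> bytes:
    """Decode side: decrypt with the same token, then decompress."""
    return bz2.decompress(xor_with_token(payload, token))


if __name__ == "__main__":
    original = b"Hello, DNA storage! " * 10
    payload = prepare_for_encoding(original, "my-secret-code")
    assert recover_after_decoding(payload, "my-secret-code") == original
```

Because the XOR step is its own inverse, the decode side only needs to apply the same steps in reverse order with the same token, which is why the user must supply the code that was set during encoding.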
The design of back-end
Rigorous process design
Before a file is transformed into DNA sequences, it must be compressed. Compression shortens the DNA sequences to be synthesized and so reduces the cost in money and time. Based on Martin Scharm’s blog post “Comparison of compression”, which systematically analyzes different compression algorithms, we chose bzip2 to compress the file. A good information storage system must also encrypt its messages. After compression, we therefore use a fast cryptographic random number generator (ISAAC) to encrypt the compressed file, which reduces security risks and keeps the information secret. Next, the binary data is transformed into DNA sequences. To store large pieces of information, we split the long DNA sequence into fragments and add an address code and a check code to each fragment, which help to rebuild the original sequence even when errors are present.
Fig. 3. The process of encoding and decoding.
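The binary-to-DNA and fragmentation steps can be sketched in Python as follows. The details here are assumptions made for illustration, since the text does not specify them: a plain 2-bit mapping (00→A, 01→C, 10→G, 11→T), a 2-byte fragment index as the address code, a CRC-32 as the check code, and a 16-byte payload per fragment. The function names are hypothetical, and the real bit2nt step may use a different mapping or extra sequence constraints.

```python
import zlib
from typing import Dict, List

NT = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_nt(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bits first."""
    return "".join(NT[(b >> shift) & 3] for b in data for shift in (6, 4, 2, 0))


def nt_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_nt; the length of seq must be a multiple of 4."""
    vals = [NT.index(c) for c in seq]
    return bytes(
        (vals[i] << 6) | (vals[i + 1] << 4) | (vals[i + 2] << 2) | vals[i + 3]
        for i in range(0, len(vals), 4)
    )


def make_fragments(data: bytes, payload_size: int = 16) -> List[str]:
    """Split data into fragments of the form address + payload + check code."""
    fragments = []
    for index, start in enumerate(range(0, len(data), payload_size)):
        chunk = data[start:start + payload_size]
        address = index.to_bytes(2, "big")                      # address code
        check = zlib.crc32(address + chunk).to_bytes(4, "big")  # check code
        fragments.append(bytes_to_nt(address + chunk + check))
    return fragments


def rebuild(fragments: List[str]) -> bytes:
    """Verify every fragment and reassemble the payloads in address order."""
    chunks: Dict[int, bytes] = {}
    for frag in fragments:
        raw = nt_to_bytes(frag)
        address, chunk, check = raw[:2], raw[2:-4], raw[-4:]
        index = int.from_bytes(address, "big")
        if zlib.crc32(address + chunk).to_bytes(4, "big") != check:
            raise ValueError(f"check code mismatch in fragment {index}")
        chunks[index] = chunk
    return b"".join(chunks[i] for i in sorted(chunks))


if __name__ == "__main__":
    data = b"compressed and encrypted bytes would go here"
    frags = make_fragments(data)
    # Fragments can arrive in any order; the address code restores it.
    assert rebuild(list(reversed(frags))) == data
```

In this sketch the check code only detects a damaged fragment. Correcting the error, as the text suggests Bio101 can do, would need an error-correcting code or redundant copies, which are outside the scope of this example.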
Different file formats supported
Our software can transform files of any format, including jpg, pdf, and mp3, so users can store all kinds of computer files in DNA. We also provide several formats for downloading the resulting DNA sequences, including txt, xml, SBOL, etc., so users can choose the format that best fits their downstream work.
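As an illustration of how the same set of DNA fragments can be offered in more than one download format, the Python sketch below writes a list of sequences as plain text and as FASTA (fasta is one of the formats named in the operation flow). The function names, the record prefix `bio101_fragment`, and the 60-character line width are assumptions for this example, not the names or settings used by Bio101.

```python
from typing import Iterable


def to_txt(fragments: Iterable[str]) -> str:
    """Plain text: one DNA fragment per line."""
    return "\n".join(fragments) + "\n"


def to_fasta(
    fragments: Iterable[str],
    prefix: str = "bio101_fragment",  # hypothetical record name
    width: int = 60,                  # assumed line width
) -> str:
    """FASTA: a '>' header line per record, sequence wrapped at `width`."""
    records = []
    for i, seq in enumerate(fragments):
        lines = [seq[j:j + width] for j in range(0, len(seq), width)]
        records.append(f">{prefix}_{i}\n" + "\n".join(lines))
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    fragments = ["ACGTACGTACGT", "TTGGCCAA"]
    print(to_txt(fragments), end="")
    print(to_fasta(fragments, width=8), end="")
```

Keeping each format behind its own small function means further formats can be added without touching the encoding step.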
C language and Python combined
C offers high execution efficiency and cross-platform portability, while Python offers conciseness, extensibility, and a rich set of libraries. The two are therefore combined in Bio101. The encryption and bit2nt parts are written in C, while the rest is written in Python, which secures both the efficiency and the extensibility of the program.
Fig. 4. Bio101 combines Python and the C programming language.
Framework of our website
Web programming with Django
The front end and back end are separated and connected through the Django web framework. Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. It is also free and open source. When a user uploads a file through the server-side interface, the back end processes it and returns a file of DNA sequences for the user to download. Developers can improve the back-end code without worrying about conflicts with the existing front-end code.