Difference between revisions of "Team:UESTC-software/Design"

Revision as of 09:20, 19 October 2016


Design

To build a smooth and efficient DNA-based information storage system, we put considerable effort into the design of its web system.

The optimization of the front-end

Web-based

The web is now used throughout the world, and a website can be used without installing any application. Most importantly, a website works across different devices and operating systems such as Windows, macOS and Linux. We therefore built our software as a website. To ensure stability and efficiency, all computation runs on our server, and users receive a hyperlink to download the final result.

User-friendly Interface

The interface of our webpage is concise. It has two main buttons, Encode and Decode, which cover everything users need to upload, transform and download files. Users can quickly become familiar with the software and put it to their own uses. To make the software cross-platform, its front-end integrates HTML, CSS, Bootstrap and jQuery. The page is designed to be user-friendly, attractive and quick to respond.


Fig. 1. The interface of Bio101.

Clear operation flow

Our website has a clear operation flow, shown in the following figure. To encode a file, the user submits the file together with a code that serves as the token for encrypting it. Once the file is stored on our server, it is compressed and encrypted. It is then encoded, and when encoding finishes the user is taken to the ‘Download’ page, where the final DNA sequences can be downloaded in txt, fasta or SBOL-xml format. To decode, the user submits a file of DNA sequences that were encoded by our software, together with the code that was set when the file was encoded. Decoding starts once the file is stored on our server. The file is then decrypted with the token and decompressed. Finally, the website moves to the ‘Download’ page, where the user can download the decoded file.


Fig. 2. Design process of Bio101.

The design of the back-end

Rigorously designed process

Before a file is transformed into DNA sequences, it needs to be compressed. Compression shortens the DNA sequences that have to be synthesized, which reduces the cost in money and time. Based on Martin Scharm’s blog post “Comparison of compression”, which systematically analyzes different compression algorithms, we chose bzip2 to compress the file. Encrypting the message is also essential for a good information storage system. So, after compression, we use a fast cryptographic random number generator (ISAAC64) to encrypt the compressed file and keep the information secret. Then we transform the binary data into DNA sequences. To store large pieces of information, we fragment the long DNA sequence into pieces and add an address code and a check code to each piece, which help to rebuild the full sequence without errors.
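The compress, encrypt and transform steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Bio101 implementation: the standard library has no ISAAC64, so a SHA-256 counter-mode keystream stands in for it, and the two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) is a guess at what a bit2nt step could look like. All function names are hypothetical.

```python
import bz2
import hashlib

# Assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T (not confirmed by the source).
NUCLEOTIDES = "ACGT"


def keystream(token: str, length: int) -> bytes:
    """Stand-in keystream (SHA-256 in counter mode); Bio101 itself uses ISAAC64."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{token}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_token(data: bytes, token: str) -> bytes:
    """XOR data with the token-derived keystream; applying it twice restores data."""
    return bytes(a ^ b for a, b in zip(data, keystream(token, len(data))))


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, two bits per base, high bits first."""
    return "".join(
        NUCLEOTIDES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna; expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | NUCLEOTIDES.index(base)
        out.append(byte)
    return bytes(out)


def encode(data: bytes, token: str) -> str:
    """Compress, encrypt, then transform to a DNA string."""
    return bytes_to_dna(xor_with_token(bz2.compress(data), token))


def decode(seq: str, token: str) -> bytes:
    """Reverse the pipeline: DNA string to bytes, decrypt, then decompress."""
    return bz2.decompress(xor_with_token(dna_to_bytes(seq), token))
```

Compressing before encrypting matters here: encrypted data looks random and no longer compresses, so the order in the text is the one that actually shortens the DNA.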


Fig. 3. The process of encoding and decoding.
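The fragmenting step in this process, where each piece receives an address code and a check code, could look roughly like the sketch below. The layout is an assumption made for illustration: an 8-nt address in front of the payload and a 16-nt CRC-32 check behind it. The source does not specify the lengths or the checksum that Bio101 uses, and all names here are hypothetical.

```python
import zlib

NUCLEOTIDES = "ACGT"
ADDRESS_LEN = 8   # hypothetical: 8 nt can address up to 4**8 = 65536 fragments
CHECK_LEN = 16    # hypothetical: 16 nt hold one 32-bit CRC


def int_to_dna(value: int, width: int) -> str:
    """Write a non-negative integer as a fixed-width base-4 nucleotide string."""
    if not 0 <= value < 4 ** width:
        raise ValueError("value does not fit in the requested width")
    bases = []
    for _ in range(width):
        bases.append(NUCLEOTIDES[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


def dna_to_int(seq: str) -> int:
    """Inverse of int_to_dna."""
    value = 0
    for base in seq:
        value = (value << 2) | NUCLEOTIDES.index(base)
    return value


def fragment(seq: str, payload_len: int = 100) -> list:
    """Split seq into pieces of the form address + payload + check code."""
    pieces = []
    for index, start in enumerate(range(0, len(seq), payload_len)):
        body = int_to_dna(index, ADDRESS_LEN) + seq[start:start + payload_len]
        check = int_to_dna(zlib.crc32(body.encode()), CHECK_LEN)
        pieces.append(body + check)
    return pieces


def reassemble(pieces: list) -> str:
    """Verify each check code, then join the payloads in address order."""
    payloads = {}
    for piece in pieces:
        body, check = piece[:-CHECK_LEN], piece[-CHECK_LEN:]
        if dna_to_int(check) != zlib.crc32(body.encode()):
            raise ValueError("check code mismatch")
        payloads[dna_to_int(body[:ADDRESS_LEN])] = body[ADDRESS_LEN:]
    return "".join(payloads[i] for i in sorted(payloads))
```

Because every piece carries its own address, the pieces can be read back in any order, and a piece whose check code does not match is rejected rather than silently corrupting the rebuilt sequence.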

Different file formats supported

Our software can transform files of any format, including jpg, pdf and mp3, so users can store all kinds of computer files in DNA. In turn, we provide several formats for users to download the resulting DNA sequences, including txt, xml and SBOL, which makes the sequences easy to reuse in other tools.
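As an illustration of one such output, the sketch below writes a list of DNA sequences in FASTA, one of the formats offered on the ‘Download’ page. The record names and the 60-character line width are assumptions, not the values Bio101 uses.

```python
def to_fasta(sequences, prefix="Bio101_fragment", width=60):
    """Format sequences as FASTA text: a '>' header line, then wrapped sequence lines.

    The prefix and width defaults are hypothetical choices for this example.
    """
    lines = []
    for number, seq in enumerate(sequences, start=1):
        lines.append(f">{prefix}_{number}")
        # Wrap long sequences so every line holds at most `width` bases.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

For example, `to_fasta(["ACGT" * 20, "GG"])` yields two records, the first wrapped over two lines.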

C and Python combined

C offers high execution efficiency and cross-platform portability, whereas Python offers conciseness, extensibility and a rich set of libraries. Bio101 therefore combines the two: the encryption and bit2nt parts are written in C, while the rest is written in Python, which keeps the program both efficient and extensible.


Fig. 4. Bio101 combines the Python and C programming languages.

Special design for DNA editing

Editing information with the CRISPR-Cas9 system

The ability to edit arbitrary parts of the information stored in DNA sequences is important. Although this is hard to achieve fully at present, we propose an idea that allows information to be edited to a certain extent.

Our idea is simply to replace the old sequence with the new sequence. The difficulty lies in finding one specific sequence within the whole storage system. The CRISPR-Cas9 system, a recent development in biotechnology, makes this possible.

Overview of the CRISPR-Cas9 system

Cas9 (CRISPR associated protein 9) is an RNA-guided DNA endonuclease enzyme associated with the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) type II adaptive immunity system in Streptococcus pyogenes, among other bacteria. S. pyogenes utilizes Cas9 to interrogate and cleave foreign DNA,[1] such as invading bacteriophage DNA or plasmid DNA.[2] Cas9 performs this interrogation by unwinding foreign DNA and checking whether it is complementary to the 20 basepair spacer region of the guide RNA. If the DNA substrate is complementary to the guide RNA, Cas9 cleaves the invading DNA.

Editing DNA information

Users input the edited file, and our software finds the modified part and the corresponding DNA sequence, and locates a PAM site. It then generates a new sequence to replace the old one and designs an sgRNA based on the sequence upstream of the PAM site. With this sgRNA, the CRISPR-Cas9 system can target and knock out the old sequence. Users can then add the new sequence to the storage system, so that the old sequence, and with it the old information, is replaced by the new.
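The sgRNA design step can be illustrated with a short Python sketch. It assumes the S. pyogenes Cas9 PAM (NGG) and a 20-nt guide taken from the sequence immediately upstream of the PAM, and it scans only the strand it is given. The function names are hypothetical, and the sketch leaves out everything a real design tool would add, such as off-target checks.

```python
import re


def find_guides(target: str, guide_len: int = 20) -> list:
    """Return (guide, pam, pam_start) for each NGG PAM with enough upstream sequence."""
    guides = []
    # A lookahead keeps overlapping PAMs (for example inside "AGGG") from being skipped.
    for match in re.finditer(r"(?=([ACGT]GG))", target):
        pam_start = match.start()
        if pam_start >= guide_len:
            guide = target[pam_start - guide_len:pam_start]
            guides.append((guide, match.group(1), pam_start))
    return guides


def guides_for_region(target: str, start: int, end: int, guide_len: int = 20) -> list:
    """Keep only guides whose 20-nt target site overlaps the modified region [start, end)."""
    return [
        (guide, pam, pam_start)
        for guide, pam, pam_start in find_guides(target, guide_len)
        if pam_start - guide_len < end and pam_start > start
    ]


def to_rna(guide: str) -> str:
    """Write the guide as RNA by replacing T with U."""
    return guide.replace("T", "U")
```

Here `start` and `end` would be the coordinates of the old sequence that the edited file replaces, so only guides that actually cut inside that stretch are kept.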

Framework of our website

Web Programming with Django

The front-end and back-end are separated and connected through the Django web framework. Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. It is also free and open source. When a user uploads a file through the server-side interface, the back-end processes it and returns a file of DNA sequences for the user to download. Developers can improve the back-end code without worrying about conflicts with the existing front-end code.


Fig.5.

References


  • [1] Heler R, Samai P, Modell JW, Weiner C, Goldberg GW, Bikard D, Marraffini LA (Mar 2015). "Cas9 specifies functional viral targets during CRISPR-Cas adaptation". Nature. 519 (7542): 199–202. Bibcode:2015Natur.519..199H. doi:10.1038/nature14245
  • [2] Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E (Aug 2012). "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity". Science. 337 (6096): 816–21. Bibcode:2012Sci...337..816J. doi:10.1126/science.1225829