Difference between revisions of "Team:UESTC-software/Features"

Line 66: Line 66:
 
<p class="title">Features</p>
 
<p class="title">Features</p>
 
</div>
 
</div>
<div class="detail-content">
+
<div class="detail-content">  
<h2 id="Why do we choose DNA?">Why do we choose DNA? </h2>  
+
                <p>DNA information storage is a promising direction in synthetic biology, and earlier scientists and iGEM teams opened the way for us. One was <a href="https://2010.igem.org/Team:Hong_Kong-CUHK" target="_blank">CUHK’s project</a><sup>[1]</sup> in 2010; the other was a paper<sup>[2]</sup> published in Nature in 2013. Our system aims to do best in compatibility, fault tolerance, encryption, and usage mode. In addition, we make DNA editing possible through the CRISPR/Cas9 system.</p>
                <p>As an information storage medium, DNA has several notable features: high density, large capacity, high stability, easy access, and no need for maintenance.</p>
+
                 <h2 id="Various file formats supported">Various file formats supported</h2>
                 <strong>High-density and massive</strong>  
+
                 <p>Our software can convert files of any format, including jpg, pdf, and mp3, so users can store all kinds of computer files in DNA. The resulting DNA sequences can be downloaded in several formats, including txt, FASTA, and SBOL, so users can carry them into further work.</p>
                 <p>DNA information storage will be a landmark among future-oriented storage technologies. We believe that DNA is an exceptionally dense, high-capacity storage medium. At the theoretical maximum, DNA can encode two bits per nucleotide (nt), or 455 exabytes per gram of ssDNA<sup>[1]</sup>. Bio101 can currently convert files of up to 200 MB at a time, a limit set by the length of the indexes.</p>
+
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/5/5b/Uestc_software-figure1.png" /><br/><B>Fig.1.</B> The history of the data storage.</p>  
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/5/5b/Uestc_software-figure1.png" /><br/><B>Fig.1.</B> The history of the data storage.</p>  
                 <strong>High-stability</strong>  
+
                 <h2 id="Higher fault tolerance">Higher fault tolerance</h2>  
                 <p>DNA is a highly stable molecule with a remarkably long life span even in suboptimal environments, making it an ideal storage material. Indeed, more than 80% of the woolly mammoth (Mammuthus primigenius) genome, comprising 3.3 billion nt, remains readable even though the species disappeared from the planet at the end of the Pleistocene (10,000 years ago).</p>
+
                 <p>Our system uses reads 200 bp long, each shifted by 50 bp from the previous one, which gives four-fold<sup>[2]</sup> coverage of the sequence, so the accurate information can be recovered from the redundant copies. We also add an index to each sequence, containing an address code and a check code. The index tells us where the sequence belongs in the file and whether it was corrupted during synthesis, storage, or sequencing. In short, the four-fold redundancy and the even-odd (parity) check we designed let us locate errors, and inaccurate bases can be corrected by majority vote.</p>
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/d/d9/Uestc_software-figure2.png" /><br/><B>Fig.2.</B> Extracting and reading DNA from a mammoth fossil.</p>
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/d/d9/Uestc_software-figure2.png" /><br/><B>Fig.2.</B> Extracting and reading DNA from a mammoth fossil.</p>
                 <strong>Easy-access and free-maintenance</strong>  
+
                 <h2 id="Securer encryption">Securer encryption</h2>  
                 <p>Molecular biology now provides us with the tools to cut (restriction endonucleases), paste (DNA ligase), and copy (PCR) DNA, much as we might edit text in a word-processing document. DNA also does not require frequent maintenance, and reading it does not run into compatibility problems.</p>
+
                 <p>We use ISAAC64<sup>[3]</sup>, a fast cryptographic random number generator that also serves as an encryption algorithm, so that the bases in the resulting DNA sequence are almost random. This protects users' privacy and, because of how the algorithm works, also reduces homopolymers and biologically functional sequences.</p>
                 <h2 id="What do Bio101 develop or improve as a DNA information storage system?">What does Bio101 develop or improve as a DNA information storage system? </h2>
+
                 <h2 id="Usage mode">Usage mode </h2>  
                 <p>While working on Bio101, we found that CUHK<sup>[2]</sup> had developed a similar project in 2010, so we compared the two projects. The results are shown in Table 1:</p>
+
                 <p>For convenience, we designed a web page through which users can upload a file of any format to encode or decode quickly and easily, and then download the generated DNA sequence files or the original files. Our tests show that it can be used on any device and platform, including smartphones, iPads, and computers.</p>
 +
 
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/3/3e/Uestc_software-table1.png" /><br/><B>Tab.1.</B> The comparison of two projects.</p>  
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/3/3e/Uestc_software-table1.png" /><br/><B>Tab.1.</B> The comparison of two projects.</p>  
 
                 <p>More details about the features of Bio101 follow:</p>
 
                 <p>More details about the features of Bio101 follow:</p>
                 <strong>1. Higher compression</strong>  
+
                 <h2 id="DNA editing">DNA editing</h2>
                 <p>We compress the file with the bzip2 algorithm, which speeds up encoding enough to meet the demands of a web app. Table 2<sup>[3]</sup> shows that bzip2 has a higher compression ratio than the other compression algorithms, which means less storage space and fewer bases, and therefore a lower DNA synthesis cost.</p>
+
                 <p>The biggest improvement over earlier projects is that we make DNA editing possible. You specify the modification you need, and Bio101 reports the sgRNA and the new DNA fragment for the CRISPR-Cas9 system, which can then edit the physical DNA.</p>
 +
                <p>Our project has its own features; we compare it with the other two in Table 1<sup>[1]</sup>.</p>
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/b/bf/Uestc_software-table2.png" /><br/><B>Tab.2.</B> Comparison of several kinds of compression software.</p>  
 
                 <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/b/bf/Uestc_software-table2.png" /><br/><B>Tab.2.</B> Comparison of several kinds of compression software.</p>  
                 <strong>2. Securer encryption</strong>
+
                  
                 <p>We use ISAAC<sup>[4]</sup>, an encryption algorithm as well as a fast cryptographic random number generator, to ensure that the bases in the resulting DNA sequence are almost random and to reduce homopolymers.</p>
+
                 <p>More details about the features of Bio101 follow:</p>
                <strong>3. New conversion for bit-to-nt</strong>
+
                  
                <p>We convert each byte (eight bits) into four nucleotide characters using A (00), T (11), C (01), and G (10), which greatly improves the coding efficiency of our system. The conversion rules are shown in Table 3.</p>
+
                <p class="img-p" style="width:60% !important;margin-left:160px;font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/b/b5/Uestc_software-feature-encoding_rules.png" /><br/><B>Tab.3.</B> Encoding rules.</p>  
+
                 <strong>4. Higher fault tolerance</strong>
+
                <p>Our system uses reads 200 bp long, each shifted by 50 bp from the previous one, which gives four-fold<sup>[5]</sup> coverage of the sequence, so the accurate information can be recovered from the redundant copies. We also add an index to each sequence, containing an address code and a check code. The index tells us where the sequence belongs in the file and whether it was corrupted during synthesis, storage, or sequencing.</p>
+
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/a/aa/Uestc_software-figure3.jpg" /><br/><B>Fig.3.</B> Fourfold redundancy and index to improve fault tolerance.</p>
+
                <strong>5. User-friendly design</strong>
+
                <p><em>Interface: </em>We designed a web page through which users can upload a file of any format to encode, or a file of DNA sequences to decode, quickly and easily, and then download the generated DNA sequence files or the original files.</p>
+
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/c/c1/Uestc_software-feature-fig.5.png" /><br/><B>Fig.4.</B> User-friendly interface of Bio101.</p>
+
                <p><em>Compatibility: </em>Bio101 runs stably on a number of multitasking operating systems without frequent crashes. Users can choose any file they want and then focus on synthesizing the DNA that Bio101 designs. The software is accessible from any device and platform.</p>
+
                <p><em>Extendable: </em>A program should be judged in part by its portability. Our code is open source, and we provide four APIs so that developers can reuse our software's functions: the ISAAC64 random encryption algorithm, bit-to-nt conversion, nt-to-bit conversion, and BLAST.</p>
+
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/2/25/Uestc_software-figure5.png" /><br/><B>Fig.5.</B> Converting any file with Bio101 on different devices and platforms.</p>
+
 
                 <h2 id="References">References</h2>
 
                 <h2 id="References">References</h2>
 
                 <br>     
 
                 <br>     
 
                 <ul>
 
                 <ul>
                 <li style="font-size:13px;">[1] George M. Church, Yuan Gao, Sriram Kosuri. Next-Generation Digital Information Storage in DNA. Science, online August 16, 2012.</li>
+
                 <li style="font-size:13px;">[1] <a href="https://2010.igem.org/Team:Hong_Kong-CUHK" target="_blank">https://2010.igem.org/Team:Hong_Kong-CUHK</a></li>  
                <li style="font-size:13px;">[2] https://2010.igem.org/Team:Hong_Kong-CUHK.</li>
+
                 <li style="font-size:13px;">[2] Goldman N, Bertone P, Chen S, et al. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature, 2013, 494(7435):77-80.</li>
                <li style="font-size:13px;">[3] http://www.cnblogs.com/langzou/p/5823285.html.</li>  
+
                <li style="font-size:13px;">[3] <a href="http://burtleburtle.net/bob/rand/isaacafa.html" target="_blank">http://burtleburtle.net/bob/rand/isaacafa.html </a></li> 
                <li style="font-size:13px;">[4] http://burtleburtle.net/bob/rand/isaacafa.html. </li>  
+
                 <li style="font-size:13px;">[5] Goldman N, Bertone P, Chen S, et al. Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature, 2013, 494(7435):77-80.</li>
+
 
                 </ul>
 
                 </ul>
  

Revision as of 21:09, 19 October 2016

Third-level page

Features

DNA information storage is a promising direction in synthetic biology, and earlier scientists and iGEM teams opened the way for us. One was CUHK’s project[1] in 2010; the other was a paper[2] published in Nature in 2013. Our system aims to do best in compatibility, fault tolerance, encryption, and usage mode. In addition, we make DNA editing possible through the CRISPR/Cas9 system.

Various file formats supported

Our software can convert files of any format, including jpg, pdf, and mp3, so users can store all kinds of computer files in DNA. The resulting DNA sequences can be downloaded in several formats, including txt, FASTA, and SBOL, so users can carry them into further work.
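
As an illustration of the FASTA download option, here is a minimal sketch of how encoded fragments could be written out as FASTA text. The function name `to_fasta`, the record names, and the 60-character line width are assumptions made for this example; the page does not say how Bio101 names or wraps its records.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Each record gets a header line starting with '>' followed by the
    sequence wrapped to `width` characters per line. The record names
    are hypothetical; Bio101's own naming is not documented here.
    """
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        # Slice the sequence into fixed-width lines.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Calling `to_fasta([("frag_0", "ACGT" * 20)])` yields a header line, one 60-base line, and one 20-base line.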


Fig.1. The history of the data storage.

Higher fault tolerance

Our system uses reads 200 bp long, each shifted by 50 bp from the previous one, which gives four-fold[2] coverage of the sequence, so the accurate information can be recovered from the redundant copies. We also add an index to each sequence, containing an address code and a check code. The index tells us where the sequence belongs in the file and whether it was corrupted during synthesis, storage, or sequencing. In short, the four-fold redundancy and the even-odd (parity) check we designed let us locate errors, and inaccurate bases can be corrected by majority vote.
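
The overlapping reads and the majority vote can be sketched as follows. This is a simplified illustration, not Bio101's actual code: the function names are hypothetical, the address and check codes are left out, and each fragment's offset is passed around directly instead of being decoded from an index.

```python
from collections import Counter

FRAGMENT_LEN = 200  # read length in bp, as stated in the text
STEP = 50           # shift between consecutive reads, as stated in the text


def split_into_fragments(seq, length=FRAGMENT_LEN, step=STEP):
    """Return (offset, fragment) pairs from overlapping windows over seq."""
    last_start = max(len(seq) - length, 0)
    return [(start, seq[start:start + length])
            for start in range(0, last_start + 1, step)]


def majority_vote(fragments, total_len):
    """Rebuild the sequence by taking the most common base at each position.

    Positions that no fragment covers are reported as 'N'.
    """
    votes = [Counter() for _ in range(total_len)]
    for offset, frag in fragments:
        for i, base in enumerate(frag):
            votes[offset + i][base] += 1
    return "".join(v.most_common(1)[0][0] if v else "N" for v in votes)
```

With a 400 nt sequence this produces five fragments, and an interior position such as 200 is covered by four of them, so a single wrong base in one fragment is outvoted three to one. In this simple sketch the first and last 150 nt have fewer than four copies, so errors there may not be correctable by voting alone.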


Fig.2. Extracting and reading DNA from a mammoth fossil.

Securer encryption

We use ISAAC64[3], a fast cryptographic random number generator that also serves as an encryption algorithm, so that the bases in the resulting DNA sequence are almost random. This protects users' privacy and, because of how the algorithm works, also reduces homopolymers and biologically functional sequences.
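
To see why scrambling with a keystream reduces homopolymers, consider the sketch below. It makes several assumptions. The keystream is a stand-in built from SHA-256 in counter mode, not ISAAC64, which is not in the Python standard library. Combining data and keystream by XOR is also assumed, since the page does not state how Bio101 applies the generator. The 2-bit mapping A (00), C (01), G (10), T (11) is the one given in the earlier revision of this page. All function names are hypothetical.

```python
import hashlib

BASES = "ACGT"  # index 0..3 -> A (00), C (01), G (10), T (11)


def keystream(key, n):
    """Stand-in keystream of n bytes: SHA-256 in counter mode (NOT ISAAC64)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def scramble(data, key):
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


def bytes_to_nt(data):
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base in seq."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best
```

A file of 64 zero bytes maps directly to a run of 256 A's, which is hard to synthesize and sequence. After `scramble(data, key)` the same file maps to a sequence that looks random, so long runs of one base become unlikely, and the receiver recovers the original bytes by calling `scramble` again with the same key.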

Usage mode

For convenience, we designed a web page through which users can upload a file of any format to encode or decode quickly and easily, and then download the generated DNA sequence files or the original files. Our tests show that it can be used on any device and platform, including smartphones, iPads, and computers.


Tab.1. The comparison of two projects.

More details about the features of Bio101 follow:

DNA editing

The biggest improvement over earlier projects is that we make DNA editing possible. You specify the modification you need, and Bio101 reports the sgRNA and the new DNA fragment for the CRISPR-Cas9 system, which can then edit the physical DNA.

Our project has its own features; we compare it with the other two in Table 1[1].


Tab.2. Comparison of several kinds of compression software.


References

