Difference between revisions of "Team:UESTC-software/Proof"

 
(9 intermediate revisions by 2 users not shown)
Line 25: Line 25:
 
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Design">Design</a></li>
 
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Design">Design</a></li>
 
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Features">Features</a></li>
 
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Features">Features</a></li>
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Model">Model</a></li>
+
                         <li><a href="https://2016.igem.org/Team:UESTC-software/Model">Modeling</a></li>
 
                          
 
                          
 
                         <li class="three-nav"><a href="https://2016.igem.org/Team:UESTC-software/Proof">Proof</a></li>
 
                         <li class="three-nav"><a href="https://2016.igem.org/Team:UESTC-software/Proof">Proof</a></li>
Line 68: Line 68:
 
</div>
 
</div>
 
<div class="detail-content">
 
<div class="detail-content">
    <p>Having developed our DNA information storage system, it is important to validate our software performs its intended function. We did thorough dry-lab testing and wet-lab validation, in which we tested the efficiency and safety of our system and successfully restored our file in synthesized DNA sequences.</p>  
+
              <p>We tested Bio101 both internally and externally, with help from our collaborators.</p>  
 
                 <h2 id="Dry-lab testing">Dry-lab testing</h2>  
 
                 <h2 id="Dry-lab testing">Dry-lab testing</h2>  
 
            
 
            
               <p>To examine the usability and stability of the software Bio101, we design the software testing model. We test the software in the aspects of errors (deletion, insert and substitute) and the distribution of the errors (discrete or successive). We control the conditions artificially so that we can compare the different situations obviously.</p>
+
               <p>To examine the usability and stability of Bio101, we designed a software testing model. We tested three kinds of errors (deletion, insertion and substitution) under two error distributions (discrete and successive).</p>
 
               <strong>Fault tolerance</strong>
 
               <strong>Fault tolerance</strong>
               <br>
+
               <p>We chose n files, with large files (&gt;1M) and small files (&lt;1M) in a 1:1 ratio. We classify an error that runs for more than 5% of the sequence length as the successive type; all other errors are the discrete type. When testing either distribution, we considered all three kinds of errors: deletion, insertion and substitution. For example, when testing the discrete type, we applied discrete deletions, discrete insertions and discrete substitutions in equal proportion and then calculated the encoding success rate. The successive distribution was tested in the same way.</p>
              <strong style="font-style: italic;">The distribution of errors</strong>
+
              <p>Here we list the result:</p>
              <p>We choose n files including large files (>1M) and small files (<1M), distributing them on 1:1. </p>
+
             
              <p>We recognize under the equal errors, successive errors which is more than 5% of the sequences length is successive type. The other is discrete type. Especially when we test one type of the distributions, we consider all kinds of errors. For example, when we test the discrete type, we have discrete deletion, discrete insert and discrete substitution equally, then calculate the successful encoding rate. The test of successive distribution is the same.</p>
+
              <p>Here we list the result.</p>
+
 
               <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/0/08/Uestc-software-test1.png"/><br/><B>Tab.1.</B>Discrete distribution.</p>
 
               <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/0/08/Uestc-software-test1.png"/><br/><B>Tab.1.</B>Discrete distribution.</p>
 
               <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/6/66/Uestc-software-test2.png"/><br/><B>Tab.2.</B>Successive distribution.</p>
 
               <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/6/66/Uestc-software-test2.png"/><br/><B>Tab.2.</B>Successive distribution.</p>
 
             <p style="text-align:right;font-size:13px;">* √: successful    ×: failed</p>
 
             <p style="text-align:right;font-size:13px;">* √: successful    ×: failed</p>
            <p>The conclusion is that the fault tolerance of discrete distribution is better than successive distribution’s.</p>
 
 
      
 
      
             <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/0/08/Uestc-software-test3.png"/><br/><B>Fig.1.</B>The result comparison.</p>
+
 
             <strong style="font-style: italic;">Error types</strong>
+
             <p class="img-p" style="font-size:13px; width:70%; margin-left:150px;"><img src="https://static.igem.org/mediawiki/2016/6/69/Uestc-software-proof-1.png"/><br/><B>Fig.1.</B>The percentage of different distributions of error.</p>
            <p>In a similar approach, we deal with the files in the same way to test errors one by one. We choose n files including large files (>1M) and small files (<1M), distributing them on 1:1. The following is the result.</p>
+
             <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/f/f2/Uestc-software-test7.png"/><br/><B>Fig.2.</B>The percentage of different types of error</p>
   
+
            <p>We conclude that Bio101 tolerates discretely distributed errors better than successive ones, and that the encoding success rate under insertion errors is higher than under deletion or substitution errors.</p>
          <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/e/e7/Uestc-software-test4.png"/><br/><B>Tab.3.</B>Deletion</p>
+
            <strong>Randomness</strong>
          <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/9/98/Uestc-software-test5.png"/><br/><B>Tab.4.</B>Insert</p>
+
            <p>We should produce sequences in which A, T, C and G are distributed sufficiently randomly, and we use the number of consecutive identical bases as the measure of randomness. The distributions of A, G, C, T and of higher-order combinations of the nucleotides are similar, and homopolymers are rare. For biological safety, we also used a large amount of data to test the limit on homopolymer length and the longest runs of repeated bases. Part of our test results are shown here.</p>
          <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/b/be/Uestc-software-test6.png"/><br/><B>Tab.5.</B>Substitution</p>
+
            <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/2/29/Uestc-software-proof-3.png
                <p style="text-align:right;font-size:13px;">* √: successful    ×: failed</p>
+
"/></br><B>Fig.3.</B>The distribution of three repeated bases</p>
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/f/f2/Uestc-software-test7.png"/><br/><B>Fig.2.</B>The result of three tests.</p>
+
              <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/a/ab/Uestc-software-proof-4.png"/></br><B>Fig.4.</B>The distribution of four repeated bases</p>
              <strong>Randomness</strong>
+
            <p class="img-p" style="font-size:13px; width:120%; margin-left:-80px;"><img src="https://static.igem.org/mediawiki/2016/0/07/Uestc-software-proof-5.png"/></br><B>Fig.5.</B>The distribution of the longest repeated bases</p>
              <p> In consideration of biological safety, we should produce sequences with sufficient random distributed A, T, C, G. We recognize the successive number of the same bases as the standard to test randomness.</p>
+
 
              <strong style="font-style: italic;">The percentage of normal length of successive bases & the length of the sequence</strong>
+
 
   
+
            <h2 id="Wet-lab validation">Wet-lab validation</h2>
              <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/7/7e/Uestc-software-test8.png"/></p>
+
            <p>To confirm our workflow’s feasibility, we went through the whole process of DNA information storage.</p>
              <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/a/af/Uestc-software-test9.png"/></p>
+
            <p>We used our software to encode a file, and the DNA sequences were synthesized by a specialized company (General Biosystems Company, Anhui, China). After a week of storage, we took out the sample for sequencing. To improve sequencing accuracy, we first used PCR amplification to generate more copies of the sequences, then used E. coli to replicate them for high-throughput sequencing. Finally, Bio101 was used to decode the sequenced DNA, and we recovered our original file.</p>
              <strong style="font-style: italic;">the longest successive bases & the sequence length</strong>
+
            <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/f/f9/Uestc_software-test-7.jpg"/></br><B>Fig.6.</B>Wet lab validation flow chart.</p>
              <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/c/c7/Uestc-software-test10.png"/></p>
+
            <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/2/24/Uestc_software-test-8.jpg"/></br><B>Fig.7.</B>Plasmids transformation.</p>
              <p>As all these we test Bio101, we have the conclusion that our software has great usability and stability. There may be some unexpected situations happening when users encode the files, you can contact us to solve the problem through our iGEM wiki. We desire to improve Bio101 with users and we are looking forward your feedback!</p>
+
            <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/b/b9/Uestc_software-nndn-4.jpg"/></p>
                <p style="text-align:right;font-size:13px;"><a href="https://2016.igem.org/Team:UESTC-software">*Our wiki: https://2016.igem.org/Team:UESTC-software</a></p>
+
            <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/c/c2/Uestc_software-njik-4.jpg"/></br><B>Fig.8.</B>The sample DNA we synthesized.</p>
               
+
            <p>Based on all these tests, we conclude that Bio101 has great usability and stability. If you run into an unexpected situation when encoding a file, you can contact us through our iGEM wiki. We want to develop Bio101 together with our users, and we look forward to your feedback!</p>
                <h2 id="Wet lab validation">Wet lab validation</h2>
+
 
                <p>Besides software testing, wet lab validation is also needed.</p>               
+
 
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/f/f9/Uestc_software-test-7.jpg"/></br><B>Fig.3.</B>Wet lab validation flow chart.</p>
+
           
                <p>We transformed the chosen file, “sSBOLv.svg”, to DNA sequence file. Going through thorough data analysis and safety confirmation, we connected a biotech company to help us synthesize the DNA sequences.</p>
+
                <p>DNA sequences carrying our file information should be stored in host cells. We chose E.coli TOP10 to store plasmids. The synthesized DNA sequences were transformed into pUC47.</p>
+
                <p class="img-p" style="font-size:13px;"><img src="https://static.igem.org/mediawiki/2016/2/24/Uestc_software-test-8.jpg"/></br><B>Fig.4.</B>Plasmids transformation.</p>
+
                <p> After a week of storage, we took out the sample for sequencing. In order to improve the accuracy of sequencing, we used PCR amplification and high-throughput sequencing to accomplish our work. With regard to the sample, we used PCR amplification to generate more sequences at first. Then used E. coli to copy the sequences for high-throughput sequencing. </p>
+
                <p>In the end, we uploaded the DNA sequences file to our software and decoded them. At last, we achieved the original file perfectly.  </p>
+
 
                  
 
                  
 
</div>
 
</div>
Line 116: Line 108:
 
     <div class="footer-top">
 
     <div class="footer-top">
 
         <p>FOLLOW US:
 
         <p>FOLLOW US:
             <a href="https://github.com/IGEM-UESTC-software" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/0/06/Uestc_software-github.png" /></a>
+
             <a href="https://github.com/igemsoftware2016/UESTC-Software-2016" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/0/06/Uestc_software-github.png" /></a>
 
             <a href="http://www.uestc.edu.cn/" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/a/a4/Uestc_software-school.png" /></a>
 
             <a href="http://www.uestc.edu.cn/" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/a/a4/Uestc_software-school.png" /></a>
 
             <a href="http://weibo.com/u/5621240588?refer_flag=1001030101_&is_hot=1" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/b/b1/Uestc_software-weibo.png" /></a>
 
             <a href="http://weibo.com/u/5621240588?refer_flag=1001030101_&is_hot=1" target="_blank"><img src="https://static.igem.org/mediawiki/igem.org/b/b1/Uestc_software-weibo.png" /></a>
Line 139: Line 131:
 
         </li>
 
         </li>
 
         <li>
 
         <li>
             <a href="#Wet lab validation">
+
             <a href="#Wet-lab validation">
 
                 <span></span>
 
                 <span></span>
                 Wet lab validation
+
                 Wet-lab validation
 
             </a>
 
             </a>
 
         </li>
 
         </li>

Latest revision as of 01:41, 20 October 2016


Proof

We tested Bio101 both internally and externally, with help from our collaborators.

Dry-lab testing

To examine the usability and stability of Bio101, we designed a software testing model. We tested three kinds of errors (deletion, insertion and substitution) under two error distributions (discrete and successive).

Fault tolerance

We chose n files, with large files (>1M) and small files (<1M) in a 1:1 ratio. We classify an error that runs for more than 5% of the sequence length as the successive type; all other errors are the discrete type. When testing either distribution, we considered all three kinds of errors: deletion, insertion and substitution. For example, when testing the discrete type, we applied discrete deletions, discrete insertions and discrete substitutions in equal proportion and then calculated the encoding success rate. The successive distribution was tested in the same way.

Here we list the result:


Tab.1. Discrete distribution.


Tab.2. Successive distribution.

* √: successful ×: failed


Fig.1. The percentage of different distributions of error.


Fig.2. The percentage of different types of error.

We conclude that Bio101 tolerates discretely distributed errors better than successive ones, and that the encoding success rate under insertion errors is higher than under deletion or substitution errors.
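
The fault-tolerance test described in this section can be illustrated with a short Python sketch. It is not the Bio101 test code: the function names, the constant and the choice to model each error as one contiguous run are assumptions made for illustration. Only the 5% threshold and the three error kinds come from the text.

```python
import random

BASES = "ATCG"
SUCCESSIVE_THRESHOLD_PERCENT = 5  # from the text: more than 5% of the sequence length


def classify_error_run(run_length: int, seq_length: int) -> str:
    """Label one contiguous error run as 'successive' or 'discrete'."""
    # Integer arithmetic avoids floating-point rounding at the 5% boundary.
    if run_length * 100 > SUCCESSIVE_THRESHOLD_PERCENT * seq_length:
        return "successive"
    return "discrete"


def apply_error(seq: str, kind: str, start: int, length: int, rng: random.Random) -> str:
    """Apply one contiguous error of the given kind (hypothetical helper)."""
    if kind == "deletion":
        return seq[:start] + seq[start + length:]
    if kind == "insertion":
        extra = "".join(rng.choice(BASES) for _ in range(length))
        return seq[:start] + extra + seq[start:]
    if kind == "substitution":
        # Pick a different base each time so every position really changes.
        changed = "".join(
            rng.choice(BASES.replace(base, "")) for base in seq[start:start + length]
        )
        return seq[:start] + changed + seq[start + length:]
    raise ValueError(f"unknown error kind: {kind}")


def success_rate(outcomes: list[bool]) -> float:
    """Fraction of test files that were still handled successfully."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "".join(rng.choice(BASES) for _ in range(200))
    damaged = apply_error(seq, "deletion", start=50, length=12, rng=rng)
    print(classify_error_run(12, len(seq)), len(damaged))
```

In a full test, each damaged sequence would be passed back through the software and the per-file outcomes collected into `success_rate`, once for each combination of error kind and distribution.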

Randomness

We should produce sequences in which A, T, C and G are distributed sufficiently randomly, and we use the number of consecutive identical bases as the measure of randomness. The distributions of A, G, C, T and of higher-order combinations of the nucleotides are similar, and homopolymers are rare. For biological safety, we also used a large amount of data to test the limit on homopolymer length and the longest runs of repeated bases. Part of our test results are shown here.


Fig.3. The distribution of three repeated bases.


Fig.4. The distribution of four repeated bases.


Fig.5. The distribution of the longest repeated bases.
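
The randomness measure used in this section, the number of consecutive identical bases, can be computed with a few lines of Python. This is a minimal sketch under our own assumptions, not the analysis code behind the figures; all function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def homopolymer_runs(seq: str) -> list[tuple[str, int]]:
    """Split a sequence into (base, run_length) pairs of consecutive identical bases."""
    return [(base, sum(1 for _ in group)) for base, group in groupby(seq)]


def longest_run(seq: str) -> int:
    """Length of the longest homopolymer in the sequence (0 for an empty sequence)."""
    return max((length for _, length in homopolymer_runs(seq)), default=0)


def run_length_histogram(seq: str) -> Counter:
    """How many runs of each length occur, e.g. runs of three or four repeated bases."""
    return Counter(length for _, length in homopolymer_runs(seq))


def base_frequencies(seq: str) -> dict[str, float]:
    """Share of each of A, T, C and G in the sequence."""
    if not seq:
        return {}
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ATCG"}


if __name__ == "__main__":
    example = "AATCGGGGTA"
    print(homopolymer_runs(example))
    print(longest_run(example))
    print(run_length_histogram(example))
    print(base_frequencies(example))
```

For a well-randomized sequence, `base_frequencies` should be close to 0.25 for each base, and `run_length_histogram` should be dominated by short runs.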

Wet-lab validation

To confirm our workflow’s feasibility, we went through the whole process of DNA information storage.

We used our software to encode a file, and the DNA sequences were synthesized by a specialized company (General Biosystems Company, Anhui, China). After a week of storage, we took out the sample for sequencing. To improve sequencing accuracy, we first used PCR amplification to generate more copies of the sequences, then used E. coli to replicate them for high-throughput sequencing. Finally, Bio101 was used to decode the sequenced DNA, and we recovered our original file.


Fig.6. Wet-lab validation flow chart.


Fig.7. Plasmid transformation.


Fig.8. The sample DNA we synthesized.

Based on all these tests, we conclude that Bio101 has great usability and stability. If you run into an unexpected situation when encoding a file, you can contact us through our iGEM wiki. We want to develop Bio101 together with our users, and we look forward to your feedback!
