Team:UESTC-software/Proof


Proof

Having developed our DNA information storage system, we needed to validate that the software performs its intended function. We carried out thorough dry-lab testing and wet-lab validation, in which we tested the efficiency and safety of the system and successfully restored our file from synthesized DNA sequences.

Dry-lab testing

We chose files of different formats as test cases to test the efficiency and fault tolerance of our system. We had to confirm that, even if certain bases in a sequence change, the sequence can still be restored to the original file. We use the text file “Bio101 description” to illustrate our validation.


Fig. 1. Software test flow chart.

Step 1

We encoded the file “Bio101 description.txt” and obtained DNA sequences.
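To make the idea of turning a file into bases concrete, here is a minimal sketch of a byte-to-base mapping. It is an assumption for illustration only and is not the encoding that Bio101 uses: the function names `encode_bytes` and `decode_bases` are hypothetical, and a plain 2-bit mapping like this one gives neither fault tolerance nor any control over repeated bases.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
# This is NOT Bio101's actual encoding scheme.
BASES = "ACGT"  # A=00, C=01, G=10, T=11 (assumed mapping)


def encode_bytes(data: bytes) -> str:
    """Map every byte to four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode_bases(seq: str) -> bytes:
    """Invert encode_bytes: every group of four bases becomes one byte."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

With this mapping, `decode_bases(encode_bytes(data))` returns `data` unchanged for any byte string, but a single inserted or deleted base would shift every following byte, which is why a real storage system needs additional redundancy.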

Step 2

We modified the DNA sequences by adding bases, deleting bases, and moving bases to different positions, and then tested whether the software could still restore the original file.

For example, we deleted line 10 and added three AGCT repeats at line 15.
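The example modification can be reproduced with a short script. The sketch below assumes that the encoded file is a list of sequence lines, that line numbers are 1-based and refer to the unmodified file, and that the AGCT repeats are appended to the end of line 15; the function name `apply_test_edits` is hypothetical and not part of Bio101.

```python
def apply_test_edits(lines, delete_line=10, insert_line=15,
                     motif="AGCT", copies=3):
    """Return a modified copy of `lines` for a fault-tolerance test.

    Line numbers are 1-based and refer to the original, unmodified file
    (an assumption; the source does not say how lines are counted).
    """
    if len(lines) < max(delete_line, insert_line):
        raise ValueError("not enough lines for the requested edits")
    edited = list(lines)  # leave the caller's list untouched
    # Append the repeated motif first, so that deleting a line afterwards
    # does not shift the position of the line we want to extend.
    edited[insert_line - 1] = edited[insert_line - 1] + motif * copies
    del edited[delete_line - 1]
    return edited
```

Feeding the returned lines to the decoder and comparing the result with the original file gives one fault-tolerance test case; other modifications can be added in the same way.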

Step 3

We decoded the modified DNA sequences and recovered the original file, which demonstrates the fault tolerance of our software.

Afterwards, we used BLAST to verify the safety of the generated DNA sequences.

The results show that none of the encoded DNA sequences matches any DNA sequence in BLAST. We therefore consider the sequences generated by Bio101 to be safe.

After safety validation, we analyzed the generated DNA sequence files, including the content of each base, the length of repeated nucleotides (such as the run of A in TAAAAAC), and the length of repeated nucleotide fragments (such as ACGT ACGT ACGT).


Fig. 2. Analysis report of encoding sequences.

The report shows that the four bases occur in substantially equal numbers in the sequences, and that continuous repeats are almost non-existent.
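The three quantities in the report can be computed with a few lines of Python. This is a minimal sketch under our own assumptions, not the analysis code of Bio101: the function names are hypothetical, sequences are assumed to contain only A, C, G and T, and repeated fragments are searched only for unit lengths of 2 to 6 bases.

```python
import re
from collections import Counter


def base_fractions(seq):
    """Fraction of each of the four bases in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {base: counts.get(base, 0) / len(seq) for base in "ACGT"}


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (0 if seq is empty)."""
    longest = 0
    for match in re.finditer(r"(.)\1*", seq):
        longest = max(longest, len(match.group(0)))
    return longest


def longest_tandem_repeat(seq, min_unit=2, max_unit=6):
    """Return (unit, copies) for the fragment repeated back to back the
    most times, or None if no fragment occurs at least twice in a row."""
    best = None
    for k in range(min_unit, max_unit + 1):
        # The lookahead lets matches start at every position.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2+))" % k)
        for match in pattern.finditer(seq):
            copies = len(match.group(1)) // k
            if best is None or copies > best[1]:
                best = (match.group(2), copies)
    return best
```

For the two examples mentioned above, `longest_homopolymer("TAAAAAC")` is 5 and `longest_tandem_repeat("ACGTACGTACGT")` is `("ACGT", 3)`. Note that a long single-base run is also reported as a tandem repeat of a two-base unit, so the two measures overlap.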

Wet-lab validation

Besides software testing, wet-lab validation is also needed.


Fig. 3. Wet-lab validation flow chart.

We converted the chosen file, “sSBOLv.svg”, into a DNA sequence file. After thorough data analysis and safety confirmation, we contacted a biotech company to synthesize the DNA sequences for us.

DNA sequences carrying our file information should be stored in host cells. We chose E. coli TOP10 to store the plasmids. The synthesized DNA sequences were inserted into pUC47.


Fig. 4. Plasmid transformation.

After a week of storage, we took out the sample for sequencing. To improve sequencing accuracy, we combined PCR amplification with high-throughput sequencing. We first used PCR to amplify the sequences in the sample, and then used E. coli to copy the sequences for high-throughput sequencing.

Finally, we uploaded the DNA sequence file to our software and decoded it, recovering the original file exactly.
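One way to confirm that a decoded file is identical to the original is to compare cryptographic digests of the two files. The sketch below is our own illustration and not part of Bio101; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib


def sha256_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original_path, restored_path):
    """True if both files have the same SHA-256 digest."""
    return sha256_file(original_path) == sha256_file(restored_path)
```

Calling `same_content` with the path of the original file and the path of the decoded output returns `True` only if the two files have matching digests, which in practice means they are byte-for-byte identical.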
