Difference between revisions of "Team:Edinburgh UG/Error Correction"

 
(13 intermediate revisions by 2 users not shown)
Line 11: Line 11:
 
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>
 
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>
 
     </head>
 
     </head>
 +
 +
<style type="text/css">
 +
 +
    #contentSub, #footer-box, #catlinks, #search-controls, #p-logo, .printfooter, .firstHeading,.visualClear
 +
 +
    {
 +
 +
    display:  none;
 +
 +
    }
 +
 +
    #top_menu_under
 +
    {
 +
    height: 0;
 +
    }
 +
 +
    #top-section {
 +
    height: 0px;
 +
    border-top: 0;
 +
    border-left: none;
 +
    border-right: none;
 +
    }
 +
      #globalWrapper
 +
        {
 +
        width: 100%;
 +
        height: 100%;
 +
        border: 0px;
 +
        background-color: #ffffff;
 +
        margin: 0px;
 +
        padding: 0px;
 +
        font-size: 100%
 +
        }
 +
 +
        #content
 +
        {
 +
          width: 100%;
 +
          height: 100%;
 +
          margin-left: auto;
 +
          margin-right: auto;
 +
          background-color: #ffffff;
 +
          padding: 0px;
 +
          font-size: 100%;
 +
        }
 +
 +
 +
    }
 +
 +
    section {
 +
        padding: 75px 0;
 +
    }
 +
 +
    .section-heading {
 +
        margin: 30px 0;
 +
        font-size: 4em;
 +
    }
 +
 +
    .section-lead {
 +
        margin: 30px 0;
 +
    }
 +
 +
    .section-paragraph {
 +
        margin: 30px 0;
 +
    }
 +
 +
    .headline {
 +
        padding: 120px 0;
 +
    }
 +
 +
    @media(max-width:768px) {
 +
        .container {
 +
            margin: 0 15px;
 +
        }
 +
      }
 +
 +
</style>
  
 
<div id="custom-bootstrap-menu" class="navbar navbar-default navbar-fixed-top" role="navigation">
 
<div id="custom-bootstrap-menu" class="navbar navbar-default navbar-fixed-top" role="navigation">
     <div class="container-fluid">
+
     <div class= "container-fluid">
        <div class="navbar-header">
+
      <!-- Navigation -->
            <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-menubuilder"><span class="sr-only">Toggle navigation</span><span class="icon-bar"></span><span class="icon-bar"></span><span class="icon-bar"></span>
+
      <nav class="navbar navbar-inverse navbar-fixed-top" role="navigation">
            </button>
+
          <div class="container">
        </div>
+
              <!-- Brand and toggle get grouped for better mobile display -->
        <div class="collapse navbar-collapse navbar-menubuilder">
+
              <div class="navbar-header">
            <ul class="nav navbar-nav navbar-right">
+
                  <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
              <li class="active">
+
                      <span class="sr-only">Toggle navigation</span>
                  <a href="https://2016.igem.org/Team:Edinburgh_UG">Home</a></li>
+
                      <span class="icon-bar"></span>
                  <li class="dropdown">
+
                      <span class="icon-bar"></span>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Team<span class="caret"></span></a>
+
                      <span class="icon-bar"></span>
                    <ul class="dropdown-menu" role="menu">
+
                  </button>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Team">Team</a></li>
+
                  <a class="navbar-brand" href="https://2016.igem.org/Team:Edinburgh_UG">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Collaborations">Collaboration</a></li>
+
                      <img src="https://static.igem.org/mediawiki/2016/9/92/Edinburgh_logo2_MINI.png" alt="">
                    </ul>
+
                  </a>
                  </li>
+
              </div>
                  <li class="dropdown">
+
 
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Project<span class="caret"></span></a>
+
              <!-- Collect the nav links, forms, and other content for toggling -->
                    <ul class="dropdown-menu" role="menu">
+
              <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Description">Project</a></li>
+
                  <ul class="nav navbar-nav">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Description">Description</a></li>
+
                      <li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Design">Design</a></li>
+
                          <a href="https://2016.igem.org/Team:Edinburgh_UG">Home</a>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Experiments">Experiments</a></li>
+
                      </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Proof">Proof of Concept</a></li>
+
                      <li class="dropdown">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Demonstrate">Demonstrate</a></li>
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Team<span class="caret"></span></a>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Results">Results</a></li>
+
                        <ul class="dropdown-menu" role="menu">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Notebook">Notebook</a></li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Team">Team</a></li>
                    </ul>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Attribution">Attribution</a></li>
                  </li>
+
                        </ul>
                  <li class="dropdown">
+
                      </li>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Parts<span class="caret"></span></a>
+
                      <li class="dropdown">
                    <ul class="dropdown-menu" role="menu">
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Human Practices<span class="caret"></span></a>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Parts">Team Parts</a></li>
+
                        <ul class="dropdown-menu" role="menu">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Basic_Part">Basic Parts</a></li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Overview">Overview</a> </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Composite_Part">Composite Parts</a></li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/HP/Silver">Silver</a> </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Part_Collection">Part Collection</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/HP/Gold">Gold</a> </li>
                    </ul>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Medal_Criteria">Medal Criteria</a> </li>
                  </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Integrated_Practices">Integrated Practices</a> </li>
                  <li class="dropdown">
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Engagement">Engagement</a> </li>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Dry Lab<span class="caret"></span></a>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Ethics">Ethics</a> </li>
                    <ul class="dropdown-menu" role="menu">
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Mary_Queen_of _Scots">Mary Queen of Scots</a> </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Lexicon_Encoding">Lexicon Encoding</a></li>
+
                        </ul>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Error_Correction">Error Correction</a></li>
+
                      </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Encryption">Encryption</a></li>
+
                      <li class="dropdown">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Files">Files</a></li>
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Project<span class="caret"></span></a>
                    </ul>
+
                        <ul class="dropdown-menu" role="menu">
                  </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Description">Description</a></li>
                  <li class="dropdown">
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Design">Design</a></li>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Safety<span class="caret"></span></a>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Proof">Proof of Concept</a></li>
                    <ul class="dropdown-menu" role="menu">
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Demonstrate">Demonstrate</a></li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Safety">Safety</a></li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Notebook">Notebook</a></li>
                    </ul>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Protocols">Protocols</a></li>
                  </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Limitations">Advantages and Limitations</a></li>
                  <li class="dropdown">
+
                        </ul>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Human Practices<span class="caret"></span></a>
+
                      </li>
                    <ul class="dropdown-menu" role="menu">
+
                      <li class="dropdown">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Human_Practices">Human Practices</a></li>
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Informatics<span class="caret"></span></a>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Silver">Silver</a> </li>
+
                        <ul class="dropdown-menu" role="menu">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Gold">Gold</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Lexicon_Encoding">Lexicon Encoding</a></li>
                       <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Integrated_Practices">Integrated Practices</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Error_Correction">Error Correction</a></li>
                       <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Engagement">Engagement</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Encryption">Encryption</a></li>
                    </ul>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Files">Files</a></li>
                  </li>
+
                        </ul>
                  <li class="dropdown">
+
                      </li>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Awards<span class="caret"></span></a>
+
                      <li class="dropdown">
                    <ul class="dropdown-menu" role="menu">
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Parts<span class="caret"></span></a>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Entrepreneurship">Entrepreneurship</a></li>
+
                        <ul class="dropdown-menu" role="menu">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Hardware">Hardware</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Basic_Part">Basic Parts</a></li>
                       <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Software">Software</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Composite_Part">Composite Parts</a></li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Measurement">Measurement</a> </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Part_Collection">Part Collection</a> </li>
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Model">Model</a> </li>
+
                        </ul>
                    </ul>
+
                       </li>
                  </li>
+
                       <li>
                  <li class="dropdown">
+
                          <a href="https://2016.igem.org/Team:Edinburgh_UG/Collaboration">Collaboration</a>
                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Interlab<span class="caret"></span></a>
+
                      </li>
                    <ul class="dropdown-menu" role="menu">
+
                      <li class="dropdown">
                      <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Plate_Reader">Plate Reader</a></li>
+
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Awards<span class="caret"></span></a>
                    </ul>
+
                        <ul class="dropdown-menu" role="menu">
                  </li>
+
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Software">Software</a> </li>
            </ul>
+
                        </ul>
        </div>
+
                      </li>
 +
                       <li class="dropdown">
 +
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Safety<span class="caret"></span></a>
 +
                        <ul class="dropdown-menu" role="menu">
 +
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Safety/Biological Safety">Biological Safety</a></li>
 +
                        </ul>
 +
                      </li>
 +
                      <li class="dropdown">
 +
                        <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">Interlab<span class="caret"></span></a>
 +
                        <ul class="dropdown-menu" role="menu">
 +
                          <li><a href="https://2016.igem.org/Team:Edinburgh_UG/Plate_Reader">Plate Reader</a></li>
 +
                        </ul>
 +
                      </li>
 +
                  </ul>
 +
              </div>
 +
              <!-- /.navbar-collapse -->
 +
          </div>
 +
          <!-- /.container -->
 +
      </nav>
 
     </div>
 
     </div>
 
</div>
 
</div>
 +
    <!-- End of menu  -->
  
 
   <header>
 
   <header>
       <img class="img-responsive img-center" src="https://static.igem.org/mediawiki/2016/e/ee/EdiGEM16ug_errorCorrectionResize.jpeg" alt="">
+
       <img class="img-responsive img-center" src="https://static.igem.org/mediawiki/2016/1/17/EdiGEM16ug_errorCorrectionWhiteBack.jpeg" alt="">
 
   </header>
 
   </header>
 +
 +
  <br>
 +
  <br>
  
 
   <div class="container-fluid">
 
   <div class="container-fluid">
Line 114: Line 211:
 
     </div>
 
     </div>
 
   </div>
 
   </div>
   
+
 
 
   <div class="container-fluid">
 
   <div class="container-fluid">
 
     <div class="col-cm-2"></div>
 
     <div class="col-cm-2"></div>
Line 133: Line 230:
 
           </div>
 
           </div>
 
         </div>
 
         </div>
 +
 +
 +
 +
 
         <div class="panel panel-default">
 
         <div class="panel panel-default">
 
           <div class="panel-heading">
 
           <div class="panel-heading">
 
               <h4 class="panel-title">
 
               <h4 class="panel-title">
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseTwo">Optimal Rectangular Code</a>
+
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseTwo">Improved Checksum - Collaboration with Exeter University</a>
 
               </h4>
 
               </h4>
 
           </div>
 
           </div>
 
           <div id="collapseTwo" class="panel-collapse collapse">
 
           <div id="collapseTwo" class="panel-collapse collapse">
 +
              <div class="panel-body">
 +
                <div class="row">
 +
                <div class="col-sm-6">Fig1</div>
 +
                <div class="col-sm-6">Fig2</div>
 +
                </div>
 +
                <div class="row">
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/4/48/T--Exeter--Collaboration_Edinb_1.png"></div>
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/0/0b/T--Exeter--Collaboration_Edinb_2.png"></div>
 +
                </div>
 +
                <p>Currently a checksum utilizes only a small percentage of the values that can be stored.
 +
A BabbleBrick contains 5 base 4 digits meaning that 4^5B unique bits of
 +
information share one of 15B checksums where B is the amount of BabbleBricks in one
 +
BabbleBlock. This data has been plotted for BabbleBlocks containing 2 and 3 BabbleBricks
 +
in Fig.1 and Fig.2 respectively. Assuming that between the time of writing and reading
 +
any number of mutations can occur, the maximum probability of a mutation event resulting
 +
in the same checksum can be calculated by comparing the frequency of one checksum to the
 +
total frequency of unique bits of information.</p>
 +
                <img src="https://static.igem.org/mediawiki/2016/4/4c/EdiGEM16UGeq1.jpeg">
 +
                <p>
 +
Therefore, it can be predicted that for an average sentence containing 9 words the maximum
 +
probability of the same checksum occurring will be of the magnitude of 1%. The probability
 +
should decrease marginally when adding BabbleBricks due to the slightly increased range of
 +
checksums that become available. This value can be optimized by altering the method of the
 +
checksum to utilize a greater range of values and to spread out the frequency more evenly as
 +
to reduce the maximum probability of the same checksum occurring.
 +
</p>
 +
<p>
 +
Currently one BabbleBlock has 4 BabbleBricks dedicated to storing the checksum, giving a maximum
 +
10^4 possible values. The first step in determining a ‘CheckMethod’ is to ensure that all checksums
 +
for a suitable amount of BabbleBricks can be stored without going over 10^4. It is also important
 +
to not use operators that will result in negative numbers or decimals, therefore limiting the
 +
possible checksum values to integers up to but not including 10^4, this rules out operators such
 +
as subtract and divide. For this example, a suitable number of words in a sentence and therefore
 +
BabbleBricks in a BabbleBlock shall be 20. All simulations will be carried out on 3 BabbleBrick
 +
systems due to computing limitations.
 +
</p>
 +
<p>
 +
Checksums are non-directional, for example a BabbleBrick of bases [2,2,2,2,2] would have the
 +
same checksum as [2,1,3,2,2].  To alter this a checkmethod will incorporate the position
 +
of the base in to the calculation. At each point the digit is multiplied by its position
 +
in the BabbleBlock, where the first BabbleBrick has digit positions 1 to 5 and the last
 +
BabbleBrick (20th}) has positions 96 to 100. A scaler alpha has been included to
 +
increase the range of results. To ensure multiplications don’t result in a null result
 +
the value of each base had a value of 1 added to it. The first checkmethod of one
 +
BabbleBlock can be defined as:
 +
</p>
 +
                <img src="https://static.igem.org/mediawiki/2016/5/5c/EdiGEM16UGeq2.jpeg">
 +
                <div class="row">
 +
                <div class="col-sm-6">Fig3</div>
 +
                <div class="col-sm-6">Fig4</div>
 +
                </div>
 +
                <div class="row">
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/4/4c/T--Exeter--Collaboration_Edinb_3.png"></div>
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/d/d7/T--Exeter--Collaboration_Edinb_4.png"></div>
 +
                </div>
 +
                </div>
 +
                <p>
 +
This method results in Fig.3 and Fig.4 for a 2 and 3 BabbleBlock system respectively,
 +
which shows a large improvement over the original checksum method. The maximum frequency
 +
of a single checksum has been significantly decreased which will lower the probability of
 +
a false positive occurring; this is largely due to the large range of results available to
 +
the method. However, there is still room for improvement as the shaded area of the graph
 +
indicates that on a smaller scale the frequency of checkmethod 1 varies between high and low
 +
values. Eliminating this fluctuation would allow for the data to be spread out more evenly.
 +
To improve this
 +
method a second layer of multiplication will be implemented, each digit will
 +
now be multiplied by a constant depending on its relative position in the BabbleBrick.
 +
</p>
 +
                <img src="https://static.igem.org/mediawiki/2016/d/d1/EdiGEM16UGeq3.jpeg">
 +
                <div class="row">
 +
                <div class="col-sm-6">Fig5</div>
 +
                <div class="col-sm-6">Fig6</div>
 +
                </div>
 +
                <div class="row">
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/6/6f/T--Exeter--Collaboration_Edinb_5.png"></div>
 +
                <div class="col-sm-6"><img src="https://static.igem.org/mediawiki/2016/0/06/T--Exeter--Collaboration_Edinb_6.png"></div>
 +
                </div>
 +
                <img src="https://static.igem.org/mediawiki/2016/d/d1/EdiGEM16UGeq4.jpeg">
 +
                <p>
 +
This has been plotted for a 2 and 3 BabbleBlock system in Fig.5 and Fig.6 respectively.
 +
When comparing checksum to checkmethod 2 the frequency peak is approximately 20 to 30
 +
times smaller in both cases whilst utilizing more values. In Fig.5 and Fig.6 the largest
 +
improvement using the second iteration of the checkmethod is the utilization of every
 +
integer value, checkmethod 1 appears shaded as the frequency varies frequently. The last
 +
step is to test checkmethod 2 when used in a babbleBlock containing 20 BabbleBricks; the
 +
largest value possible assuming a BabbleBlock containing the value ‘3’ in each digit will
 +
grant a value of 60600 which falls out of the current limit of 10^4 values. Therefore,
 +
it is recommended that one more BabbleBrick is added to the end of the BabbleBlock in order
 +
to store 10^5 values.
 +
</p>
 +
<p>
 +
To improve this method  further more complex multiplications could be added, it would be
 +
a decision based on optimising efficiency of calculations and minimising false positives.
 +
In a 2 and 3 BabbleBrick system the probability of a false positives occurring was reduced by
 +
approximately 20 and 30 times respectively, although the numbers are too large to compute,
 +
this new method has the possibility of lowering the maximum false positive error of the previously
 +
used checksum by one or more orders of magnitude.
 +
</p>
 +
        </div>
 +
          </div>
 +
        </div>
 +
 +
 +
 +
 +
 +
 +
 +
 +
 +
        <div class="panel panel-default">
 +
          <div class="panel-heading">
 +
              <h4 class="panel-title">
 +
                <a data-toggle="collapse" data-parent="#accordion" href="#collapseThree">Optimal Rectangular Code</a>
 +
              </h4>
 +
          </div>
 +
          <div id="collapseThree" class="panel-collapse collapse">
 
               <div class="panel-body">
 
               <div class="panel-body">
 
                 <p>The Optimal Rectangular Code (ORC) is a type of error correcting code that is present in all of our BabbleBricks. When the checksum detects an error the next action is to checks the ORCs in each word and use them to correct any errors that have taken place as follows:</p>
 
                 <p>The Optimal Rectangular Code (ORC) is a type of error correcting code that is present in all of our BabbleBricks. When the checksum detects an error the next action is to checks the ORCs in each word and use them to correct any errors that have taken place as follows:</p>
Line 150: Line 368:
 
           <div class="panel-heading">
 
           <div class="panel-heading">
 
               <h4 class="panel-title">
 
               <h4 class="panel-title">
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseThree">Beyond Optimal Rectangular Codes - The SuperFixer Funtion and Levenshtein Distance</a>
+
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseFour">Beyond Optimal Rectangular Codes - The SuperFixer Function and Levenshtein Distance</a>
 
               </h4>
 
               </h4>
 
           </div>
 
           </div>
           <div id="collapseThree" class="panel-collapse collapse">
+
           <div id="collapseFour" class="panel-collapse collapse">
 
               <div class="panel-body">
 
               <div class="panel-body">
 
                 <p>If the Optimal Rectangular Code (ORC) detects more than the single base pair error it can fix using its normal actions the SuperFixer function is called. The SuperFixer uses the ORC and works out all the possible words in the lexicon that fit this pattern. For any given ORC this is a maximum of 4 words and indeed there are many words that have unique ORC’s. Once the SuperFixer has narrowed down the possible words the list of possibilities is compared to what appears in the actual DNA and the sequence with the lowest Levenshtein distance is used in the translation. Levenshtein distance is a metric which shows the number of deletions, insertions or substitutions it will take to turn one string (or sequence) into another.</p>
 
                 <p>If the Optimal Rectangular Code (ORC) detects more than the single base pair error it can fix using its normal actions the SuperFixer function is called. The SuperFixer uses the ORC and works out all the possible words in the lexicon that fit this pattern. For any given ORC this is a maximum of 4 words and indeed there are many words that have unique ORC’s. Once the SuperFixer has narrowed down the possible words the list of possibilities is compared to what appears in the actual DNA and the sequence with the lowest Levenshtein distance is used in the translation. Levenshtein distance is a metric which shows the number of deletions, insertions or substitutions it will take to turn one string (or sequence) into another.</p>
Line 164: Line 382:
 
           <div class="panel-heading">
 
           <div class="panel-heading">
 
               <h4 class="panel-title">
 
               <h4 class="panel-title">
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseFour">Overall Performance</a>
+
                 <a data-toggle="collapse" data-parent="#accordion" href="#collapseFive">Overall Performance</a>
 
               </h4>
 
               </h4>
 
           </div>
 
           </div>
           <div id="collapseFour" class="panel-collapse collapse">
+
           <div id="collapseFive" class="panel-collapse collapse">
 
               <div class="panel-body">
 
               <div class="panel-body">
 
                 <p>Finally, this graph represents the performance of our entire error correction system on errors of varying magnitudes:</p>
 
                 <p>Finally, this graph represents the performance of our entire error correction system on errors of varying magnitudes:</p>
Line 179: Line 397:
 
   </div>
 
   </div>
  
 +
  <br>
 +
  <br>
 
   <div class="row">
 
   <div class="row">
 
           <div class="box">
 
           <div class="box">
 
               <div class="col-lg-12">
 
               <div class="col-lg-12">
 
                   <hr>
 
                   <hr>
                   <h2 class="intro-text text-center">Contact
+
                   <h2 class="intro-text text-center">Follow
 
                       <strong>Us</strong>
 
                       <strong>Us</strong>
 
                   </h2>
 
                   </h2>
 
                   <hr>
 
                   <hr>
 
                   <div class="intro-text text-center">
 
                   <div class="intro-text text-center">
                      <p><a href="mailto:edinburgh.igem2016@gmail.com">edinburgh.igem2016@gmail.com</a>
 
                      </p>
 
 
                       <ul class="list-inline banner-social-buttons">
 
                       <ul class="list-inline banner-social-buttons">
 
                           <li>
 
                           <li>
                               <a href="https://twitter.com/EdiGEM2016"><img src="https://static.igem.org/mediawiki/2016/0/04/Twitterlogo_ed2016.png"></img></a>
+
                               <a href="https://twitter.com/EdiGEM2016"><img src="https://static.igem.org/mediawiki/2016/9/94/Edinburgh2_t2.jpg"></img></a>
 
                           </li>
 
                           </li>
 
                           <li>
 
                           <li>
                               <a href="https://www.facebook.com/EdiGEM2016"><img src="https://static.igem.org/mediawiki/2016/c/c2/Facebook_ed2016.png"></img></a>
+
                               <a href="https://www.facebook.com/EdiGEM2016"><img src="https://static.igem.org/mediawiki/2016/c/ce/Edinburgh2_f2.png"></img></a>
 
                           </li>
 
                           </li>
 
                           <li>
 
                           <li>
                               <a href="https://www.instagram.com/edigem2016/"><img src="https://static.igem.org/mediawiki/2016/5/5d/Instagram_ed2016.png"></img></a>
+
                               <a href="https://www.instagram.com/edigem2016/"><img src="https://static.igem.org/mediawiki/2016/6/64/Edinburgh2_insta2.png"></img></a>
 
                           </li>
 
                           </li>
 
                       </ul>
 
                       </ul>
Line 205: Line 423:
 
           </div>
 
           </div>
 
       </div>
 
       </div>
 +
<br>
 +
<br>
  
 
</body>
 
</body>
 
</html>
 
</html>

Latest revision as of 00:42, 20 October 2016



In any information storage system, the primary issue is being able to read back data with as high a fidelity as possible. Storage of archival data would be futile if the medium used could not guarantee this. BabblED utilises a powerful system of error detection and correction using checksums, optimal rectangular codes and Levenshtein distance to ensure fidelity of data even when a DNA sequence is damaged or there are mistakes in sequencing. Using these and the incredible properties of DNA we can be confident in accurately storing archival data for 10s or even 100s of years. You can read more about all these mechanisms below or jump straight into the code at our Github

A checksum is a commonly used technique in network engineering that we’ve adapted for use in DNA storage. In this format the checksum allows the detection of errors in a BabbleBlock (DNA sentence). The checksum holds the sum of the word coding regions of BabbleBricks (DNA words) in a BabbleBlock in a 4 BabbleBrick region that can be found towards the end of a BabbleBlock.

When a BabbleBlock is being translated the checksum is looked at first to see if it matches the sums of the word coding regions in the DNA we get back from sequencing. If this is the case, no errors have appeared and the translation program can proceed as normal however if there is no match an error has occurred somewhere in the BabbleBlock and other error correcting mechanisms will have to be used to make sure the stored information can still be read back.

Fig1
Fig2

Currently a checksum utilizes only a small percentage of the values that can be stored. A BabbleBrick contains 5 base 4 digits meaning that 4^5B unique bits of information share one of 15B checksums where B is the amount of BabbleBricks in one BabbleBlock. This data has been plotted for BabbleBlocks containing 2 and 3 BabbleBricks in Fig.1 and Fig.2 respectively. Assuming that between the time of writing and reading any number of mutations can occur, the maximum probability of a mutation event resulting in the same checksum can be calculated by comparing the frequency of one checksum to the total frequency of unique bits of information.

Therefore, it can be predicted that for an average sentence containing 9 words the maximum probability of the same checksum occurring will be of the magnitude of 1%. The probability should decrease marginally when adding BabbleBricks due to the slightly increased range of checksums that become available. This value can be optimized by altering the method of the checksum to utilize a greater range of values and to spread out the frequency more evenly as to reduce the maximum probability of the same checksum occurring.

Currently one BabbleBlock has 4 BabbleBricks dedicated to storing the checksum, giving a maximum 10^4 possible values. The first step in determining a ‘CheckMethod’ is to ensure that all checksums for a suitable amount of BabbleBricks can be stored without going over 10^4. It is also important to not use operators that will result in negative numbers or decimals, therefore limiting the possible checksum values to integers up to but not including 10^4, this rules out operators such as subtract and divide. For this example, a suitable number of words in a sentence and therefore BabbleBricks in a BabbleBlock shall be 20. All simulations will be carried out on 3 BabbleBrick systems due to computing limitations.

Checksums are non-directional, for example a BabbleBrick of bases [2,2,2,2,2] would have the same checksum as [2,1,3,2,2]. To alter this a checkmethod will incorporate the position of the base in to the calculation. At each point the digit is multiplied by its position in the BabbleBlock, where the first BabbleBrick has digit positions 1 to 5 and the last BabbleBrick (20th}) has positions 96 to 100. A scaler alpha has been included to increase the range of results. To ensure multiplications don’t result in a null result the value of each base had a value of 1 added to it. The first checkmethod of one BabbleBlock can be defined as:

Fig3
Fig4

This method results in Fig.3 and Fig.4 for a 2 and 3 BabbleBlock system respectively, which shows a large improvement over the original checksum method. The maximum frequency of a single checksum has been significantly decreased which will lower the probability of a false positive occurring; this is largely due to the large range of results available to the method. However, there is still room for improvement as the shaded area of the graph indicates that on a smaller scale the frequency of checkmethod 1 varies between high and low values. Eliminating this fluctuation would allow for the data to be spread out more evenly. To improve this method a second layer of multiplication will be implemented, each digit will now be multiplied by a constant depending on its relative position in the BabbleBrick.

Fig5
Fig6

This has been plotted for a 2 and 3 BabbleBlock system in Fig.5 and Fig.6 respectively. When comparing checksum to checkmethod 2 the frequency peak is approximately 20 to 30 times smaller in both cases whilst utilizing more values. In Fig.5 and Fig.6 the largest improvement using the second iteration of the checkmethod is the utilization of every integer value, checkmethod 1 appears shaded as the frequency varies frequently. The last step is to test checkmethod 2 when used in a babbleBlock containing 20 BabbleBricks; the largest value possible assuming a BabbleBlock containing the value ‘3’ in each digit will grant a value of 60600 which falls out of the current limit of 10^4 values. Therefore, it is recommended that one more BabbleBrick is added to the end of the BabbleBlock in order to store 10^5 values.

To improve this method further more complex multiplications could be added, it would be a decision based on optimising efficiency of calculations and minimising false positives. In a 2 and 3 BabbleBrick system the probability of a false positives occurring was reduced by approximately 20 and 30 times respectively, although the numbers are too large to compute, this new method has the possibility of lowering the maximum false positive error of the previously used checksum by one or more orders of magnitude.

The Optimal Rectangular Code (ORC) is a type of error correcting code that is present in all of our BabbleBricks. When the checksum detects an error the next action is to checks the ORCs in each word and use them to correct any errors that have taken place as follows:

Unfortunately, the ORC can only locate and fix single base pair errors however the properties of the ORC facilitate another added level of error correction.

If the Optimal Rectangular Code (ORC) detects more than the single base pair error it can fix using its normal actions the SuperFixer function is called. The SuperFixer uses the ORC and works out all the possible words in the lexicon that fit this pattern. For any given ORC this is a maximum of 4 words and indeed there are many words that have unique ORC’s. Once the SuperFixer has narrowed down the possible words the list of possibilities is compared to what appears in the actual DNA and the sequence with the lowest Levenshtein distance is used in the translation. Levenshtein distance is a metric which shows the number of deletions, insertions or substitutions it will take to turn one string (or sequence) into another.

By taking the minimum Levenshtein distance we always work under the assumption that the smallest number of errors is most probable. This assumption gives the best performance across errors of different magnitudes due to the robustness of DNA as a molecule making small errors considerably more likely than larger ones.

Finally, this graph represents the performance of our entire error correction system on errors of varying magnitudes:




Follow Us