Difference between revisions of "Team:Exeter/Collaborations"

Line 438: Line 438:
 
background:#e8e8e8;
 
background:#e8e8e8;
 
border-style:none;
 
border-style:none;
 +
}
 +
.equation{
 +
font-size:120%;
 +
margin:auto;
 +
display:block;
 +
text-align:left;
 +
margin-left:30%;
 +
}
 +
.equation_key{
 +
font-size:80%;
 +
margin-left:40px;
 +
display:block;
 +
}
 +
.equation_ref{
 +
float:right;
 +
font-size:60%;
 +
text-align:right;
 +
}
 +
.graph_box{
 +
max-width:30%;
 +
margin:20px 10%;
 +
display:block;
 +
}
 +
.graph_box_single{
 +
max-width:30%;
 +
margin:20px 35%;
 +
display:block;
 +
}
 +
.graph_box img, .graph_box_single img{
 +
padding:0;
 +
margin:0 auto;
 +
max-width:100%;
 +
display:block;
 +
border-style:solid;
 +
border-color:#4e4e4e;
 +
border-width:0.5px;
 
}
 
}
 
@media (min-width: 767px){
 
@media (min-width: 767px){
Line 472: Line 508:
 
padding:10px 0;
 
padding:10px 0;
 
margin-left:0.85vw;
 
margin-left:0.85vw;
 +
}
 +
.graph_box{
 +
max-width:90%;
 
}
 
}
 
}
 
}
Line 663: Line 702:
 
<div id="section_2" class="link_fix"></div>
 
<div id="section_2" class="link_fix"></div>
 
<div id="contentTitle">
 
<div id="contentTitle">
Software:Purdue </div>
+
Software </div>
 
<div>
 
<div>
 
+
<h3>Purdue Collaboration</h3>
  
 
<p id="pp">Our team helped Purdue with this by logging data for the 260  
 
<p id="pp">Our team helped Purdue with this by logging data for the 260  
Line 675: Line 714:
 
<p id="pp">We gave Purdue feedback on the design, layout and how  
 
<p id="pp">We gave Purdue feedback on the design, layout and how  
 
easy this database was to use to help them improve on what they had done so far.</p>  
 
easy this database was to use to help them improve on what they had done so far.</p>  
 +
<br />
 +
<h3>Edinburgh Collaboration</h3>
 +
<h6>Optimising methods of data mutation detection in BabbleBlocks</h6>
 +
<p id="pp">
 +
Storing information in DNA offers many advantages over current methods; however, mutations
 +
need to be carefully monitored to ensure that corrupted data is not accepted as valid (a false positive).
 +
Currently, for information stored on a BabbleBrick, a ‘CheckSum’ is calculated by taking the
 +
sum of the values of each base of DNA. If the checksum of a BabbleBlock has changed between
 +
the time of writing and reading, the data is considered to be corrupt.
 +
</p>
 +
<p id="pp">
 +
<span class="equation">$C = \sum^{bp}_{n=1} bp_n$</span><br />
 +
<span class="equation_key">
 +
$C$: Value of the checksum<br />
 +
$n$: The integer address of the base pair<br />
 +
$bp$: Number of base pairs (5 times the number of BabbleBricks)<br />
 +
$bp_n$: The value of the $n^{th}$ base pair
 +
</span>
 +
</p>
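<p id="pp">
As a minimal sketch of the checksum described above, the following Python assumes each base is encoded as an integer from 0 to 3 (consistent with the base-4 digits mentioned in the text). The function name and the example data are illustrative only and are not taken from the BabbleBlock software.
</p>

```python
def checksum(bases):
    """Return the plain checksum: the sum of all base values in a BabbleBlock.

    Assumption: every base is encoded as an integer 0, 1, 2 or 3.
    """
    if any(b not in (0, 1, 2, 3) for b in bases):
        raise ValueError("each base value must be 0, 1, 2 or 3")
    return sum(bases)


# Two BabbleBricks (10 bases) as written, and the same data with one mutation.
written = [2, 2, 2, 2, 2, 0, 1, 3, 3, 1]
read_back = [2, 2, 2, 2, 2, 0, 1, 3, 2, 1]  # ninth base mutated from 3 to 2

# The data is flagged as corrupt when the two checksums differ.
corrupt = checksum(written) != checksum(read_back)
```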
 +
<div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;">
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/4/48/T--Exeter--Collaboration_Edinb_1.png">
 +
<span>Fig. 1. The frequency of all checksums in a BabbleBlock system containing two BabbleBricks.</span>
 +
</div>
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/0/0b/T--Exeter--Collaboration_Edinb_2.png">
 +
<span>Fig. 2. The frequency of all checksums in a BabbleBlock system containing three BabbleBricks.</span>
 +
</div>
 +
</div>
 +
<p id="pp">
 +
Currently a checksum utilizes only a small percentage of the values that can be stored.
 +
A BabbleBrick contains 5 base-4 digits, meaning that 4$^{\text{5}B}$ unique bits of
 +
information share one of 15$B$ + 1 checksums (0 to 15$B$), where $B$ is the number of BabbleBricks in one
 +
BabbleBlock. This data has been plotted for BabbleBlocks containing 2 and 3 BabbleBricks
 +
in Fig.1 and Fig.2 respectively. Assuming that between the time of writing and reading
 +
any number of mutations can occur, the maximum probability of a mutation event resulting
 +
in the same checksum can be calculated by comparing the frequency of one checksum to the
 +
total frequency of unique bits of information.
 +
</p>
 +
<p id="pp">
 +
<span class="equation">$P_C = \left(\frac{C_{max}}{F}\right) \approx \left(\frac{1.2 \times 10^5}{4^{10}}\right) = 11$% in a 2 BabbleBrick system</span><br />
 +
<span class="equation">$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\approx \left(\frac{10^8}{4^{15}}\right) = 9$% in a 3 BabbleBrick system</span>
 +
<span class="equation_key">
 +
$P_C$: Maximum probability of the same checksum occurring after any number of mutations<br />
 +
$C_{max}$: Frequency of most common checksum<br />
 +
$F$: Frequency of possible unique bits of information
 +
</span>
 +
</p>
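<p id="pp">
The frequencies behind $P_C$ can be reproduced without listing every possible BabbleBlock. The sketch below builds the checksum histogram one digit at a time (a dynamic-programming shortcut chosen here for illustration, not necessarily how the figures were generated) and assumes base values 0 to 3 and 5 digits per BabbleBrick. All function names are hypothetical.
</p>

```python
from collections import Counter


def checksum_frequencies(num_bricks, digits_per_brick=5, base=4):
    """Count how many distinct digit strings produce each plain checksum."""
    freq = Counter({0: 1})  # before any digit is added, only checksum 0 exists
    for _ in range(num_bricks * digits_per_brick):
        nxt = Counter()
        for total, count in freq.items():
            for digit in range(base):  # each digit can take the values 0..3
                nxt[total + digit] += count
        freq = nxt
    return freq


def max_collision_probability(freq):
    """Frequency of the most common checksum divided by all possible strings."""
    return max(freq.values()) / sum(freq.values())


p2 = max_collision_probability(checksum_frequencies(2))  # roughly 0.11
p3 = max_collision_probability(checksum_frequencies(3))  # roughly 0.09
```

Because the histogram grows by one digit per step, the same function also runs quickly for larger BabbleBlocks, which may be useful where full enumeration is impractical.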
 +
<p id="pp">
 +
Therefore, it can be predicted that for an average sentence containing 9 words the maximum
 +
probability of the same checksum occurring will be of the order of 1%. The probability
 +
should decrease marginally when adding BabbleBricks due to the slightly increased range of
 +
checksums that become available. This value can be optimized by altering the method of the
 +
checksum to utilize a greater range of values and to spread out the frequency more evenly, so as
 +
to reduce the maximum probability of the same checksum occurring.
 +
</p>
 +
<p id="pp">
 +
Currently one BabbleBlock has 4 BabbleBricks dedicated to storing the checksum, giving a maximum
 +
of 10$^4$ possible values. The first step in determining a ‘CheckMethod’ is to ensure that all checksums
 +
for a suitable number of BabbleBricks can be stored without going over 10$^4$. It is also important
 +
not to use operators that will result in negative numbers or decimals, which limits the
 +
possible checksum values to integers up to but not including 10$^4$; this rules out operators such
 +
as subtraction and division. For this example, a suitable number of words in a sentence, and therefore
 +
BabbleBricks in a BabbleBlock, shall be 20. All simulations will be carried out on 3 BabbleBrick
 +
systems due to computing limitations.
 +
</p>
 +
<p id="pp">
 +
Checksums are non-directional: for example, a BabbleBrick of bases [2,2,2,2,2] would have the
 +
same checksum as [2,1,3,2,2]. To alter this, a checkmethod will incorporate the position
 +
of the base into the calculation. At each point the digit is multiplied by its position
 +
in the BabbleBlock, where the first BabbleBrick has digit positions 1 to 5 and the last
 +
BabbleBrick (20$^{th}$) has positions 96 to 100. A scalar $\alpha$ has been included to
 +
increase the range of results. To ensure that multiplications don’t give a null result,
 +
each base value has 1 added to it. The first checkmethod of one
 +
BabbleBlock can be defined as:
 +
</p>
 +
<p id="pp">
 +
<span class="equation">$M_1 = \sum_{n=1}^{bp}(bp_n + 1) \cdot \alpha \cdot n$</span>
 +
<span class="equation_key">
 +
$M_1$: Value of CheckMethod 1<br />
 +
$\alpha$: Scalar ($\alpha = 5$ in this example)
 +
</span>
 +
</p>
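<p id="pp">
A short sketch of checkmethod 1 as described in the text: each base value has 1 added, is multiplied by the scalar $\alpha$, and is then multiplied by its 1-based position in the BabbleBlock. The function names are illustrative, and the 0 to 3 encoding of bases is an assumption. The two example BabbleBricks are the ones quoted above.
</p>

```python
def checksum(bases):
    """Plain checksum: position-independent sum of the base values."""
    return sum(bases)


def check_method_1(bases, alpha=5):
    """Position-weighted check value: sum of (value + 1) * alpha * position."""
    return sum((value + 1) * alpha * position
               for position, value in enumerate(bases, start=1))


brick_a = [2, 2, 2, 2, 2]
brick_b = [2, 1, 3, 2, 2]

# The plain checksum cannot tell the two bricks apart...
same_checksum = checksum(brick_a) == checksum(brick_b)
# ...whereas the position-weighted value differs (225 versus 230).
different_m1 = check_method_1(brick_a) != check_method_1(brick_b)
```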
 +
<div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;">
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/4/4c/T--Exeter--Collaboration_Edinb_3.png">
 +
<span>Fig. 3. The frequency of checkmethod 1 for all possible bits of information in a BabbleBlock system containing two BabbleBricks.</span>
 +
</div>
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/d/d7/T--Exeter--Collaboration_Edinb_4.png">
 +
<span>Fig. 4. The frequency of checkmethod 1 for all possible bits of information in a BabbleBlock system containing three BabbleBricks.</span>
 +
</div>
 +
</div>
 +
<p id="pp">
 +
This method results in Fig.3 and Fig.4 for a 2 and 3 BabbleBrick system respectively,
 +
which show a large improvement over the original checksum method. The maximum frequency
 +
of a single checksum has been significantly decreased, which will lower the probability of
 +
a false positive occurring; this is largely due to the larger range of results available to
 +
the method. However, there is still room for improvement, as the shaded area of the graph
 +
indicates that on a smaller scale the frequency of checkmethod 1 varies between high and low
 +
values. Eliminating this fluctuation would allow the data to be spread out more evenly.
 +
To improve this method, a second layer of multiplication will be implemented: each digit will
 +
now be multiplied by a constant depending on its relative position in the BabbleBrick.
 +
</p>
 +
<p id="pp">
 +
<span class="equation">$M_2 = \sum_{p=1}^B \sum_{q=1}^{5}(bp_{(5(p-1) + q)} + 1) \cdot ((q \text{ % } 5) + 1) \cdot (5(p-1) + q)$</span><br />
 +
<span class="equation" style="font-size:60%;">Or, equivalently, in terms of the global address $n$ and the remainder operator '%'</span><br />
 +
<span class="equation">$M_2 = \sum_{n=1}^{bp} (bp_n + 1) \cdot ((n \text{ % } 5) + 1) \cdot n$</span>
 +
<span class="equation_key">
 +
$M_2$: Value of CheckMethod 2<br />
 +
$B$: Number of BabbleBricks in the BabbleBlock<br />
 +
$p$: Local integer address of BabbleBrick<br />
 +
$q$: Local integer address of base pair in BabbleBrick<br />
 +
$n$: Global integer address of base pair, $n = 5(p-1) + q$
 +
</span>
 +
</p>
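<p id="pp">
The modulo form of checkmethod 2 can be sketched as follows. This is an illustrative reading rather than the original implementation: it takes the address $n$ as 1-based, uses $(n \text{ % } 5) + 1$ as the within-brick constant, and uses $n$ itself as the position factor. The function name and example data are hypothetical.
</p>

```python
def check_method_2(bases, digits_per_brick=5):
    """Check value with two layers of multiplication.

    Each (value + 1) is multiplied by a constant that depends on the digit's
    place inside its BabbleBrick, and by its 1-based position in the block.
    """
    return sum((value + 1) * ((n % digits_per_brick) + 1) * n
               for n, value in enumerate(bases, start=1))


# Two BabbleBlocks holding the same two BabbleBricks in opposite order.
block = [2, 2, 2, 2, 2, 2, 1, 3, 2, 2]
swapped = [2, 1, 3, 2, 2, 2, 2, 2, 2, 2]

# Both have a plain checksum of 20, but their checkmethod 2 values differ.
m2_block = check_method_2(block)
m2_swapped = check_method_2(swapped)
```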
 +
<div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;">
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/6/6f/T--Exeter--Collaboration_Edinb_5.png">
 +
<span>Fig. 5. The frequency of checkmethod 2 for all possible bits of information in a BabbleBlock system containing two BabbleBricks.</span>
 +
</div>
 +
<div class="graph_box col-xs-12">
 +
<img src="https://static.igem.org/mediawiki/2016/0/06/T--Exeter--Collaboration_Edinb_6.png">
 +
<span>Fig. 6. The frequency of checkmethod 2 for all possible bits of information in a BabbleBlock system containing three BabbleBricks.</span>
 +
</div>
 +
</div>
 +
<p id="pp">
 +
<span class="equation">$P_{M_2} = \left(\frac{M_{2\:max}}{F}\right) \approx \left(\frac{6 \times 10^3}{4^{10}}\right) = 0.6$% in a 2 BabbleBrick system ($11$% for checksum)</span><br />
 +
<span class="equation">$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\approx \left(\frac{3 \times 10^6}{4^{15}}\right) = 0.3$% in a 3 BabbleBrick system ($9$% for checksum)</span>
 +
<span class="equation_key">
 +
$P_{M_2}$: Maximum probability of the same checkmethod 2 value occurring after any number of mutations<br />
 +
$M_{2\:max}$: Frequency of most common checkmethod 2 value<br />
 +
$F$: Frequency of possible unique bits of information
 +
</span>
 +
</p>
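<p id="pp">
The estimate of $P_{M_2}$ can be checked with a digit-by-digit histogram of checkmethod 2 values. As before, this is a sketch under stated assumptions: bases take values 0 to 3, the address $n$ is 1-based, and the weight of each digit is $((n \text{ % } 5) + 1) \cdot n$. The function name is hypothetical and the histogram approach is used here in place of enumerating every BabbleBlock.
</p>

```python
from collections import Counter


def check_method_2_frequencies(num_bricks, digits_per_brick=5, base=4):
    """Count how many digit strings produce each checkmethod 2 value."""
    freq = Counter({0: 1})
    for n in range(1, num_bricks * digits_per_brick + 1):
        weight = ((n % digits_per_brick) + 1) * n  # within-brick constant times position
        nxt = Counter()
        for total, count in freq.items():
            for digit in range(base):
                # Each base value has 1 added before it is weighted.
                nxt[total + (digit + 1) * weight] += count
        freq = nxt
    return freq


freq = check_method_2_frequencies(2)
# Share of all 4**10 strings that map to the single most common value.
p_m2 = max(freq.values()) / sum(freq.values())
```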
 +
<p id="pp">
 +
This has been plotted for a 2 and 3 BabbleBrick system in Fig.5 and Fig.6 respectively.
 +
When comparing checksum to checkmethod 2, the frequency peak is approximately 20 to 30
 +
times smaller in both cases whilst utilizing more values. In Fig.5 and Fig.6 the largest
 +
improvement from the second iteration of the checkmethod is the utilization of every
 +
integer value; checkmethod 1 appears shaded because its frequency fluctuates between neighbouring values. The last
 +
step is to test checkmethod 2 when used in a BabbleBlock containing 20 BabbleBricks; the
 +
largest possible value, obtained when every digit of the BabbleBlock is ‘3’, is
 +
60600, which exceeds the current limit of 10$^4$ values. Therefore,
 +
it is recommended that one more BabbleBrick is added to the end of the BabbleBlock in order
 +
to store 10$^5$ values.
 +
</p>
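<p id="pp">
The 60600 figure can be reproduced with a few lines of Python, assuming the modulo form of checkmethod 2 with a 1-based address $n$ and a largest base value of 3. The function name is illustrative.
</p>

```python
def max_check_method_2(num_bricks, digits_per_brick=5, max_digit=3):
    """Largest checkmethod 2 value: every digit in the block set to max_digit."""
    total_digits = num_bricks * digits_per_brick
    return sum((max_digit + 1) * ((n % digits_per_brick) + 1) * n
               for n in range(1, total_digits + 1))


largest = max_check_method_2(20)            # 60600 for a 20 BabbleBrick block
decimal_digits_needed = len(str(largest))   # 5, so 10**4 values are not enough
```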
 +
<p id="pp">
 +
To improve this method further, more complex multiplications could be added; this would be
 +
a decision based on optimising the efficiency of calculations and minimising false positives.
 +
In a 2 and 3 BabbleBrick system the probability of a false positive occurring was reduced by
 +
approximately 20 and 30 times respectively. Although the numbers for larger systems are too large to compute,
 +
this new method has the possibility of lowering the maximum false positive error of the previously
 +
used checksum by one or more orders of magnitude.
 +
If continued further, research should also be done into the reconstruction of data after it has been lost.
 +
</p>
 +
<div>
 +
<a id="Section_link" href="#section_3" style="display:block;margin:20px auto 0 auto;width:14px;"><span style="color:#47BCC2;font-size: 25px;" class="glyphicon glyphicon-menu-down" aria-hidden="true"></span></a>
 +
</div>
  
  
  
  
 +
</div>
  
<p id="pp"></p>
 
 
 
 
 
 
 
<a id="Section_link" href="#section_3" style="display:block;margin:20px auto 0 auto;width:14px;"><span style="color:#47BCC2;font-size: 25px;" class="glyphicon glyphicon-menu-down" aria-hidden="true"></span></a>
 
</div>
 
</div>
 
 
<div class="col-xs-12 div_content">
 
<div class="col-xs-12 div_content">
 
<div id="section_3" class="link_fix"></div>
 
<div id="section_3" class="link_fix"></div>

Revision as of 16:20, 13 October 2016