AndrewWild (Talk | contribs) |
|||
Line 685: | Line 685: | ||
<!--Give span class "oneline" or "twoline" depending on how long the section text is--> | <!--Give span class "oneline" or "twoline" depending on how long the section text is--> | ||
− | <a href="#section_1" class="banner_link col-xs-6 col-sm-3"><span class="oneline"> | + | <a href="#section_1" class="banner_link col-xs-6 col-sm-3"><span class="oneline">Newcastle</span></a> |
− | <a href="#section_2" class="banner_link col-xs-6 col-sm-3"><span class="oneline"> | + | <a href="#section_2" class="banner_link col-xs-6 col-sm-3"><span class="oneline">Purdue</span></a> |
− | <a href="#section_3" class="banner_link col-xs-6 col-sm-3"><span class="oneline"> | + | <a href="#section_3" class="banner_link col-xs-6 col-sm-3"><span class="oneline">Glasgow</span></a> |
− | <a href="#section_4" class="banner_link col-xs-6 col-sm-3"><span class=" | + | <a href="#section_4" class="banner_link col-xs-6 col-sm-3"><span class="oneline">Edinburgh</span></a> |
</div> | </div> | ||
<!--Left picture (the teal line on left)--> | <!--Left picture (the teal line on left)--> | ||
Line 735: | Line 735: | ||
<div id="section_2" class="link_fix"></div> | <div id="section_2" class="link_fix"></div> | ||
<div id="contentTitle"> | <div id="contentTitle"> | ||
− | Software </div> | + | Software: Newcastle </div> |
<div> | <div> | ||
− | |||
− | |||
<p id="pp">Our team helped Purdue with this by logging data for the 260 | <p id="pp">Our team helped Purdue with this by logging data for the 260 | ||
iGEM teams of 2015 and critiquing ease of use and effectiveness of the database. For each team | iGEM teams of 2015 and critiquing ease of use and effectiveness of the database. For each team | ||
Line 748: | Line 746: | ||
easy this database was to use to help them improve on what they had done so far.</p> | easy this database was to use to help them improve on what they had done so far.</p> | ||
<br /> | <br /> | ||
<div> | <div> | ||
<a id="Section_link" href="#section_3" style="display:block;margin:20px auto 0 auto;width:14px;"><span style="color:#47BCC2;font-size: 25px;" class="glyphicon glyphicon-menu-down" aria-hidden="true"></span></a> | <a id="Section_link" href="#section_3" style="display:block;margin:20px auto 0 auto;width:14px;"><span style="color:#47BCC2;font-size: 25px;" class="glyphicon glyphicon-menu-down" aria-hidden="true"></span></a> | ||
Line 1,091: | Line 934: | ||
<div id="section_4" class="link_fix"></div> | <div id="section_4" class="link_fix"></div> | ||
<div id="contentTitle"> | <div id="contentTitle"> | ||
− | + | Theory: Edinburgh | |
</div> | </div> | ||
+ | <h6>Optimising methods of data mutation detection in BabbleBlocks</h6> | ||
+ | <p id="pp"> | ||
+ | Storing information on DNA offers many advantages over current methods. However, mutations | ||
+ | need to be carefully monitored to ensure that corrupted data is not accepted as valid (a false positive). | ||
+ | Currently, for information stored on a BabbleBrick, a ‘CheckSum’ is calculated by taking the | ||
+ | sum of the values of each base of DNA. If the checksum of a BabbleBlock has changed between | ||
+ | the time of writing and reading, the data is considered to be corrupt. | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | <span class="equation">$C = \sum^{bp}_{n=1} bp_n$</span><br /> | ||
+ | <span class="equation_key"> | ||
+ | $C$: Checksum of the BabbleBlock<br /> | ||
+ | $n$: The integer address of a base pair<br /> | ||
+ | $bp$: Number of base pairs (5 times the number of BabbleBricks)<br /> | ||
+ | $bp_n$: The value of the $n^{th}$ base pair | ||
+ | </span> | ||
+ | </p> | ||
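+ | <p id="pp"> | ||
+ | The checksum above can be sketched in a few lines of Python. This is only an illustration and not part of the BabbleBlock software: the function name <code>checksum</code> and the encoding of each base as an integer from 0 to 3 are assumptions. | ||
+ | </p> | ||

```python
def checksum(bases):
    """Return the sum of the base values in a BabbleBlock.

    Assumption: each base is encoded as an integer 0, 1, 2 or 3.
    """
    if any(b not in (0, 1, 2, 3) for b in bases):
        raise ValueError("each base value must be 0, 1, 2 or 3")
    return sum(bases)


# One BabbleBrick (5 bases) before and after two compensating mutations.
written = [2, 2, 2, 2, 2]
mutated = [2, 1, 3, 2, 2]
print(checksum(written), checksum(mutated))  # 10 10
```

+ | <p id="pp"> | ||
+ | Both sequences give the same checksum, so this pair of mutations would not be flagged as corruption. | ||
+ | </p> | ||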
+ | <div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;"> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/4/48/T--Exeter--Collaboration_Edinb_1.png"> | ||
+ | <span>Fig. 1. The frequency of all checksums in a BabbleBlock system containing two BabbleBricks.</span> | ||
+ | </div> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/0/0b/T--Exeter--Collaboration_Edinb_2.png"> | ||
+ | <span>Fig. 2. The frequency of all checksums in a BabbleBlock system containing three BabbleBricks.</span> | ||
+ | </div> | ||
+ | </div> | ||
+ | <p id="pp"> | ||
+ | Currently a checksum utilizes only a small percentage of the values that can be stored. | ||
+ | A BabbleBrick contains 5 base-4 digits, meaning that 4$^{\text{5}B}$ unique pieces of | ||
+ | information share one of 15$B$ + 1 checksums (0 to 15$B$), where $B$ is the number of BabbleBricks in one | ||
+ | BabbleBlock. This data has been plotted for BabbleBlocks containing 2 and 3 BabbleBricks | ||
+ | in Fig.1 and Fig.2 respectively. Assuming that between the time of writing and reading | ||
+ | any number of mutations can occur, the maximum probability of a mutation event resulting | ||
+ | in the same checksum can be calculated by dividing the frequency of the most common checksum by the | ||
+ | total number of unique pieces of information. | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | <span class="equation">$P_C = \left(\frac{C_{max}}{F}\right) \approx \left(\frac{1.2 \times 10^5}{4^{10}}\right) = 11$% in a 2 BabbleBrick system</span><br /> | ||
+ | <span class="equation">$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\approx \left(\frac{10^8}{4^{15}}\right) = 9$% in a 3 BabbleBrick system</span> | ||
+ | <span class="equation_key"> | ||
+ | $P_C$: Maximum probability of the same checksum occurring after any number of mutations<br /> | ||
+ | $C_{max}$: Frequency of most common checksum<br /> | ||
+ | $F$: Frequency of possible unique bits of information | ||
+ | </span> | ||
+ | </p> | ||
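+ | <p id="pp"> | ||
+ | The ratio $C_{max}/F$ can be reproduced without enumerating every sequence by building the checksum distribution one digit at a time. The Python below is a sketch under the assumption that each digit takes the values 0 to 3 with equal likelihood; the function names are hypothetical. | ||
+ | </p> | ||

```python
def checksum_frequencies(num_bricks):
    """Count how many digit sequences produce each checksum.

    Assumption: a BabbleBrick holds 5 digits, each with value 0-3.
    Index s of the returned list holds the number of sequences whose sum is s.
    """
    freq = [1]  # with no digits there is exactly one way to reach sum 0
    for _ in range(5 * num_bricks):
        new = [0] * (len(freq) + 3)  # one more digit raises the maximum sum by 3
        for s, count in enumerate(freq):
            for digit in range(4):
                new[s + digit] += count
        freq = new
    return freq


def max_collision_probability(num_bricks):
    """Frequency of the most common checksum divided by all sequences."""
    freq = checksum_frequencies(num_bricks)
    return max(freq) / sum(freq)


for bricks in (2, 3):
    print(bricks, f"{100 * max_collision_probability(bricks):.1f}%")
```

+ | <p id="pp"> | ||
+ | Because the list grows by only three entries per digit, the same calculation remains cheap for BabbleBlocks that are far too large to enumerate directly. | ||
+ | </p> | ||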
+ | <p id="pp"> | ||
+ | Therefore, it can be predicted that for an average sentence containing 9 words the maximum | ||
+ | probability of the same checksum occurring will be of the order of 1%. The probability | ||
+ | should decrease marginally when adding BabbleBricks due to the slightly increased range of | ||
+ | checksums that become available. This value can be optimized by altering the method of the | ||
+ | checksum to utilize a greater range of values and to spread out the frequency more evenly, so | ||
+ | as to reduce the maximum probability of the same checksum occurring. | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | Currently one BabbleBlock has 4 BabbleBricks dedicated to storing the checksum, giving a maximum of | ||
+ | 10$^4$ possible values. The first step in determining a ‘CheckMethod’ is to ensure that all checksums | ||
+ | for a suitable number of BabbleBricks can be stored without going over 10$^4$. It is also important | ||
+ | not to use operators that will result in negative numbers or decimals, so that the | ||
+ | possible checksum values are limited to integers up to but not including 10$^4$. This rules out operators such | ||
+ | as subtraction and division. For this example, a suitable number of words in a sentence, and therefore | ||
+ | BabbleBricks in a BabbleBlock, shall be 20. All simulations will be carried out on 3 BabbleBrick | ||
+ | systems due to computing limitations. | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | Checksums are non-directional: for example, a BabbleBrick of bases [2,2,2,2,2] would have the | ||
+ | same checksum as [2,1,3,2,2]. To alter this, a checkmethod will incorporate the position | ||
+ | of the base into the calculation. At each point the digit is multiplied by its position | ||
+ | in the BabbleBlock, where the first BabbleBrick has digit positions 1 to 5 and the last | ||
+ | BabbleBrick (20$^{th}$) has positions 96 to 100. A scalar $\alpha$ has been included to | ||
+ | increase the range of results. To ensure multiplications don’t result in a null result, | ||
+ | the value of each base has 1 added to it. The first checkmethod of one | ||
+ | BabbleBlock can be defined as: | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | <span class="equation">$M_1 = \sum_{n=1}^{bp}(bp_n + 1) \cdot \alpha \cdot n$</span> | ||
+ | <span class="equation_key"> | ||
+ | $M_1$: Value of CheckMethod 1<br /> | ||
+ | $\alpha$: Scalar ($\alpha = 5$ in this example) | ||
+ | </span> | ||
+ | </p> | ||
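+ | <p id="pp"> | ||
+ | A minimal Python sketch of checkmethod 1, following the description that each digit (plus 1) is multiplied by $\alpha$ and by its 1-indexed position in the BabbleBlock. The function name and the 0 to 3 base encoding are assumptions for illustration. | ||
+ | </p> | ||

```python
def check_method_1(bases, alpha=5):
    """Position-weighted check value for a BabbleBlock.

    Assumptions: bases are integers 0-3 and positions are counted from 1,
    so the first BabbleBrick occupies positions 1 to 5.
    """
    return sum((b + 1) * alpha * n for n, b in enumerate(bases, start=1))


# The two BabbleBricks that share a plain checksum of 10 are now told apart.
print(check_method_1([2, 2, 2, 2, 2]))  # 225
print(check_method_1([2, 1, 3, 2, 2]))  # 230
```

+ | <p id="pp"> | ||
+ | Adding 1 to every base means a base of value 0 still contributes to the total, so a mutation to or from 0 changes the result. | ||
+ | </p> | ||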
+ | <div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;"> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/4/4c/T--Exeter--Collaboration_Edinb_3.png"> | ||
+ | <span>Fig. 3. The frequency of checkmethod 1 for all possible bits of information in a BabbleBlock system containing two BabbleBricks.</span> | ||
+ | </div> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/d/d7/T--Exeter--Collaboration_Edinb_4.png"> | ||
+ | <span>Fig. 4. The frequency of checkmethod 1 for all possible bits of information in a BabbleBlock system containing three BabbleBricks.</span> | ||
+ | </div> | ||
+ | </div> | ||
+ | <p id="pp"> | ||
+ | This method results in Fig.3 and Fig.4 for a 2 and 3 BabbleBrick system respectively, | ||
+ | which show a large improvement over the original checksum method. The maximum frequency | ||
+ | of a single check value has been significantly decreased, which will lower the probability of | ||
+ | a false positive occurring; this is largely due to the larger range of results available to | ||
+ | the method. However, there is still room for improvement, as the shaded area of the graph | ||
+ | indicates that on a smaller scale the frequency of checkmethod 1 varies between high and low | ||
+ | values. Eliminating this fluctuation would allow the data to be spread out more evenly. | ||
+ | To improve this method a second layer of multiplication will be implemented: each digit will | ||
+ | now also be multiplied by a constant depending on its relative position in the BabbleBrick. | ||
+ | </p> | ||
+ | <p id="pp"> | ||
+ | <span class="equation">$M_2 = \sum_{p=1}^B \sum_{q=1}^{5}(bp_{(5B_p + q)} + 1) \cdot ((q \text{ % } 5) + 1) \cdot (5B_p + q)$</span><br /> | ||
+ | <span class="equation" style="font-size:60%;">Or using the remainder modulo '%'</span><br /> | ||
+ | <span class="equation">$M_2 = \sum_{n=1}^{bp} (bp_n + 1) \cdot ((n \text{ % } 5) + 1) \cdot n$</span> | ||
+ | <span class="equation_key"> | ||
+ | $M_2$: Value of CheckMethod 2<br /> | ||
+ | $B$: Number of BabbleBricks in the BabbleBlock<br /> | ||
+ | $p$: Local integer address of BabbleBrick<br /> | ||
+ | $q$: Local integer address of base pair in BabbleBrick<br /> | ||
+ | $B_p$: Number of BabbleBricks preceding the $p^{th}$ BabbleBrick in the BabbleBlock ($B_p = p - 1$) | ||
+ | </span> | ||
+ | </p> | ||
+ | <div class="col-xs-12" style="width:100%;position:relative;margin:auto;padding:0;"> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/6/6f/T--Exeter--Collaboration_Edinb_5.png"> | ||
+ | <span>Fig. 5. The frequency of checkmethod 2 for all possible bits of information in a BabbleBlock system containing two BabbleBricks.</span> | ||
+ | </div> | ||
+ | <div class="graph_box col-xs-12"> | ||
+ | <img src="https://static.igem.org/mediawiki/2016/0/06/T--Exeter--Collaboration_Edinb_6.png"> | ||
+ | <span>Fig. 6. The frequency of checkmethod 2 for all possible bits of information in a BabbleBlock system containing three BabbleBricks.</span> | ||
+ | </div> | ||
+ | </div> | ||
+ | <p id="pp"> | ||
+ | <span class="equation">$P_{M_2} = \left(\frac{M_{2\:max}}{F}\right) \approx \left(\frac{6 \times 10^3}{4^{10}}\right) = 0.6$% in a 2 BabbleBrick system ($11$% for checksum)</span><br /> | ||
+ | <span class="equation">$\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\approx \left(\frac{3 \times 10^6}{4^{15}}\right) = 0.3$% in a 3 BabbleBrick system ($9$% for checksum)</span> | ||
+ | <span class="equation_key"> | ||
+ | $P_{M_2}$: Maximum probability of the same checkmethod 2 value occurring after any number of mutations<br /> | ||
+ | $M_{2\:max}$: Frequency of most common checkmethod 2 value<br /> | ||
+ | $F$: Frequency of possible unique bits of information | ||
+ | </span> | ||
+ | </p> | ||
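+ | <p id="pp"> | ||
+ | The frequency of each checkmethod 2 value can be counted in the same digit-by-digit way as for the plain checksum. The Python sketch below assumes the modulo form of $M_2$ is read with the 1-indexed position $n$ of each digit, that digits take the values 0 to 3, and that all sequences are equally likely; the function names are hypothetical. | ||
+ | </p> | ||

```python
from collections import Counter


def position_weight(n):
    """Weight applied to the digit at 1-indexed position n (assumed form)."""
    return ((n % 5) + 1) * n


def check_method_2_frequencies(num_bricks):
    """Map each checkmethod 2 value to the number of sequences producing it."""
    freq = Counter({0: 1})  # one empty sequence with value 0
    for n in range(1, 5 * num_bricks + 1):
        weight = position_weight(n)
        new = Counter()
        for value, count in freq.items():
            for digit in range(4):
                new[value + (digit + 1) * weight] += count
        freq = new
    return freq


def max_collision_probability(num_bricks):
    """Frequency of the most common value divided by all sequences."""
    freq = check_method_2_frequencies(num_bricks)
    return max(freq.values()) / sum(freq.values())


for bricks in (2, 3):
    print(bricks, f"{100 * max_collision_probability(bricks):.2f}%")
```

+ | <p id="pp"> | ||
+ | A dictionary is used instead of a list because the position weights leave some small values unreachable. | ||
+ | </p> | ||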
+ | <p id="pp"> | ||
+ | This has been plotted for a 2 and 3 BabbleBrick system in Fig.5 and Fig.6 respectively. | ||
+ | When comparing checksum to checkmethod 2, the frequency peak is approximately 20 to 30 | ||
+ | times smaller in both cases whilst utilizing more values. In Fig.5 and Fig.6 the largest | ||
+ | improvement using the second iteration of the checkmethod is the utilization of every | ||
+ | integer value; checkmethod 1, by contrast, appears shaded because its frequency alternates | ||
+ | rapidly between high and low values. The last | ||
+ | step is to test checkmethod 2 when used in a BabbleBlock containing 20 BabbleBricks. The | ||
+ | largest possible value, obtained when every digit of the BabbleBlock holds the value ‘3’, is | ||
+ | 60600, which exceeds the current limit of 10$^4$ values. Therefore, | ||
+ | it is recommended that one more BabbleBrick is added to the end of the BabbleBlock in order | ||
+ | to store 10$^5$ values. | ||
+ | </p> | ||
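+ | <p id="pp"> | ||
+ | The storage requirement for 20 BabbleBricks can be checked directly. The Python below is a sketch that assumes the modulo form of checkmethod 2 with 1-indexed positions and base values 0 to 3; the function name is hypothetical. | ||
+ | </p> | ||

```python
def check_method_2(bases):
    """Checkmethod 2 value of a BabbleBlock (assumed modulo form).

    Each digit (plus 1) is multiplied by ((n % 5) + 1) and by its
    1-indexed position n in the BabbleBlock.
    """
    return sum((b + 1) * ((n % 5) + 1) * n for n, b in enumerate(bases, start=1))


# 20 BabbleBricks of 5 digits each, every digit holding the largest value 3.
largest = check_method_2([3] * 100)
print(largest)            # 60600
print(largest < 10 ** 4)  # False: four storage BabbleBricks are not enough
print(largest < 10 ** 5)  # True: a fifth storage BabbleBrick is sufficient
```
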
+ | <p id="pp"> | ||
+ | To improve this method further, more complex multiplications could be added; this would be | ||
+ | a decision based on optimising the efficiency of calculations and minimising false positives. | ||
+ | In a 2 and 3 BabbleBrick system the probability of a false positive occurring was reduced by | ||
+ | approximately 20 and 30 times respectively. Although the numbers for larger systems are too large to compute, | ||
+ | this new method has the possibility of lowering the maximum false positive error of the previously | ||
+ | used checksum by one or more orders of magnitude. | ||
+ | If continued further, research should also be done into the reconstruction of data after it has been lost. | ||
+ | </p> | ||
</div> | </div> | ||
</div> | </div> |
Revision as of 20:35, 17 October 2016