Difference between revisions of "Team:NCTU Formosa/Model"

Line 192: Line 192:
 
}
 
}
  
 
+
.content-2{
 +
    font-size:15pt !important;
 +
    text-align:justify;
 +
    color:#F3F7F7;
 +
    padding-left:20px;
 +
    padding-top:15px;
 +
    padding-bottom:15px;
 +
}
 
.content-image{
 
.content-image{
 
     color:#F3F7F7 !important;
 
     color:#F3F7F7 !important;
Line 289: Line 296:
 
             <li class="list">The number of disulfides bonds</li>
 
             <li class="list">The number of disulfides bonds</li>
 
             <li class="list">Propeptide & signal peptide—If the proteins have an N-terminal signal peptide and propeptide, a part of protein will be cleaved during maturation or activation.</li>
 
             <li class="list">Propeptide & signal peptide—If the proteins have an N-terminal signal peptide and propeptide, a part of protein will be cleaved during maturation or activation.</li>
             <li class="list">Uniprot entry & Arachnoserver id—the accession number of protein in UniProtKB and ArachnoServer.</li>
+
             <li class="list">Uniprot entry & Arachnoserver id—the accession number of protein in UniProtKB and ArachnoServer*.</li>
 
</ul>
 
</ul>
 +
            <p class="content-2">*ArachnoServer is a manually curated database for protein toxins derived from spider venom.(<a href="http://www.arachnoserver.org/" style="color:#44E287;">http://www.arachnoserver.org/</a>).</p>
 +
            <p class="content">We also crawled other seven entries of protein toxicity recorded by Arachnoserver—molecular target, taxon, ED50, LD50, PD50, qualitative information, protein sequence from Arachnoserver. The term, Molecular target, is the effect site of toxin peptides, such as voltage-gated ion channels, GABA receptors and so on. Taxon, ED50, LD50, PD50, and the qualitative information are the toxicity against taxon that had been tested by experiments. The protein sequence from two databases is entirely the same.</p>
 +
            <p class="content">We utilized BeautifulSoup 4.4.0, sqlite3 and gevent modules in Python 3.5 to develop our crawler. Moreover, we have submitted the code to GitHub.<br>(Link:<a href="https://github.com/chengchingwen/iGEM/blob/master/crawler.py." style="color:#44E287;">https://github.com/chengchingwen/iGEM/blob/master/crawler.py.</a>)</p>
 +
 
         </div>
 
         </div>
 
     </div>
 
     </div>

Revision as of 09:43, 15 October 2016

Software—Toxin selection

I. Purpose

To prove the concept of Pantide, we wanted to select three distinct, existing spider toxin peptides with probable oral toxicity against our test insect, Spodoptera litura (tobacco cutworm). For the practical application of Pantide, we also needed a broader knowledge base of peptides: peptides with different molecular targets, so that Pantide can be applied to other orders of insects, and peptides with different toxic mechanisms, so that they can be alternated regularly to avoid resistance.

To date, about 1500 toxin peptides from 97 spider species have been studied, although the total number of spider toxin peptides is conservatively estimated at up to 10 million [1]. Our purpose is therefore to establish a database that collects information on these peptides, such as molecular target, taxon, toxicity, and sequence. With this database, once a target insect has been chosen, we can easily find groups of peptides suitable for use as Pantide. We therefore also need a method for selecting peptides from the database.

II. Method

The method of toxin selection can be separated into three parts: crawler (toxin collection), filter (toxin filtering), and selection (toxin processing).

  • Toxin Collection: we planned to collect information on toxin peptides from protein databases, together with research results such as taxon and toxicity from published papers, to establish our own database for Pantide.
  • Toxin Filtering: based on background knowledge of toxin peptides, we set conditions to filter out peptides that are unsuitable for use as Pantide.
  • Toxin Processing: we used online protein analysis tools to classify the remaining peptides into groups by similarity. Finally, we selected three distinct peptides from different groups to prove the concept of Pantide.

III. Step 1: Crawler

We began by searching UniProtKB/Swiss-Prot, the manually annotated and reviewed section of UniProtKB, a freely accessible database of protein sequence and functional information (http://www.uniprot.org/). We searched with the query “insecticidal NOT crystal” to find all proteins with insecticidal activity while excluding the crystal proteins of Bacillus thuringiensis. The search returned 216 proteins.

Using these results, we established our Pantide database by crawling 11 entries of protein information from UniProt. The entries are as follows.

  • The name of the protein
  • The description of protein function
  • The source organism of the protein sequence
  • The length of the amino acid sequence
  • The number of disulfide bonds
  • Propeptide & signal peptide: if a protein has an N-terminal signal peptide or propeptide, that part of the protein is cleaved during maturation or activation.
  • UniProt entry & ArachnoServer ID: the accession numbers of the protein in UniProtKB and ArachnoServer*.

*ArachnoServer is a manually curated database of protein toxins derived from spider venom (http://www.arachnoserver.org/).

We also crawled seven further entries on protein toxicity recorded by ArachnoServer: molecular target, taxon, ED50, LD50, PD50, qualitative information, and protein sequence. The molecular target is the site at which a toxin peptide acts, such as voltage-gated ion channels or GABA receptors. Taxon, ED50, LD50, PD50, and the qualitative information describe the toxicity against each taxon that has been tested experimentally. The protein sequences in the two databases are identical.
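
The entries listed above could be held in a single SQLite table. The following is a minimal sketch using Python's sqlite3 module. The table name `toxin`, the column names, the helper functions, and any sample values are illustrative assumptions and are not taken from the actual Pantide database.

```python
import sqlite3

# Hypothetical schema: one row per peptide, one column per crawled entry.
SCHEMA = """
CREATE TABLE IF NOT EXISTS toxin (
    uniprot_entry     TEXT PRIMARY KEY,  -- UniProtKB accession number
    arachnoserver_id  TEXT,              -- ArachnoServer accession number
    name              TEXT,
    function          TEXT,
    organism          TEXT,
    length            INTEGER,           -- number of amino acids
    disulfide_bonds   INTEGER,
    signal_peptide    TEXT,
    propeptide        TEXT,
    molecular_target  TEXT,
    taxon             TEXT,
    ed50              TEXT,
    ld50              TEXT,
    pd50              TEXT,
    qualitative_info  TEXT,
    sequence          TEXT
)
"""


def create_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_toxin(conn, record):
    """Insert one peptide; `record` maps column names to values.

    Column names come from our own code, never from crawled text,
    so building the column list with str.format is acceptable here.
    """
    columns = ", ".join(record)
    placeholders = ", ".join(":" + key for key in record)
    sql = "INSERT OR REPLACE INTO toxin ({}) VALUES ({})".format(columns, placeholders)
    conn.execute(sql, record)
    conn.commit()


def find_by_taxon(conn, taxon):
    """Return (accession, name, molecular target) for peptides tested on a taxon."""
    cursor = conn.execute(
        "SELECT uniprot_entry, name, molecular_target FROM toxin "
        "WHERE taxon LIKE ? ORDER BY uniprot_entry",
        ("%" + taxon + "%",),
    )
    return cursor.fetchall()
```

With a table of this shape, choosing a target insect first and then looking up candidate peptides becomes a single call such as `find_by_taxon(conn, "Lepidoptera")`.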

We used the BeautifulSoup 4.4.0, sqlite3, and gevent modules in Python 3.5 to develop our crawler, and we have published the code on GitHub.
(Link: https://github.com/chengchingwen/iGEM/blob/master/crawler.py)
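
The parsing step of such a crawler can be sketched as follows. To stay runnable without third-party packages, this sketch uses the standard library's html.parser in place of BeautifulSoup and leaves out the gevent-based concurrent fetching. The HTML snippet, its two-column table layout, and the names `TableCellParser` and `parse_entry_table` are assumptions for illustration. They do not reflect the actual markup of UniProt or ArachnoServer, nor the code in crawler.py.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a <td> (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_entry_table(html):
    """Turn a two-column label/value table into a dictionary."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) == 2}


# Hypothetical page fragment, used only to exercise the parser.
SAMPLE_HTML = """
<table>
  <tr><td>Molecular target</td><td>voltage-gated ion channel</td></tr>
  <tr><td>Taxon</td><td>Lepidoptera</td></tr>
</table>
"""
```

In a full crawler, each page would first be downloaded, passed through a function like `parse_entry_table`, and the resulting dictionary written to the database.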

IV. Step 2: Filter

V. Step 3: Selection

VI. Future