Difference between revisions of "Team:EPFL/Software CELLO"

(first draft software - cello)
 
Line 1: Line 1:
 
{{RISE_head}}
 
{{RISE_head}}
 
<html>
 
<html>
 +
<div class="spacer h50"></div>
  
        <div class="spacer h50"></div>
+
         <div class="simple-page">
 
+
         <div class="simple-page centered-page">
+
 
             <section class="">
 
             <section class="">
 
                 <div class="container">
 
                 <div class="container">
Line 12: Line 11:
 
                         <hr class="animate-box"/>
 
                         <hr class="animate-box"/>
 
                         <div class="spacer h20"></div>
 
                         <div class="spacer h20"></div>
                         <p class="sub-lead justified animate-box">
+
                         <p class="sub-lead text-justify animate-box">
                             Over the course of this summer, we developed a suite of tools intended to help users design their own biological circuits for their target systems. Our journey into the realm of bioinformatics began when we discovered Cello. <a href="#TOLINK">Published on March 31, 2016 in Nature</a>, Cello takes user inputs in the form of a circuit described using <a class="#TOLINK">Verilog</a> code, and a user constraint file (UCF) that contains biological data pertaining to the system chosen by the user and a list of gates that the user can put in that system. By combining these two inputs with a series of algorithms, the program returns one or more plasmids that, once put in a host cell, will recreate the circuit the user has defined. By choosing the input promoters, the user can decide what molecules or transcription factors to use as inputs to the logical circuit. This circuit can allow the cell to make novel decisions and calculations based on its environment, metabolism, and health.
+
                             Over the course of this summer, we developed a suite of tools intended to help users  
 +
                            design their own biological circuits for their target systems. Our journey into the  
 +
                            realm of bioinformatics began when we discovered Cello. Published on April 1<sup>st</sup>, 2016
 +
                            in <a href="http://science.sciencemag.org/content/352/6281/aac7341">
 +
                                Science</a>, Cello takes user inputs in the form of a circuit described using  
 +
                                <a href="https://en.wikipedia.org/wiki/Verilog">Verilog</a>  
 +
                            code, and a user constraint file (UCF) that contains biological data pertaining to the  
 +
                            system chosen by the user and a list of gates that the user can put in that system. By  
 +
                            combining these two inputs with a series of algorithms, the program returns one or more
 +
                            plasmids that, once put in a host cell, will recreate the circuit the user has defined.
 +
                            By choosing the input promoters, the user can decide what molecules or transcription  
 +
                            factors to use as inputs to the logical circuit. This circuit can allow the cell to make
 +
                            novel decisions and calculations based on its environment, metabolism, and health.  
 +
 
 
                         </p>
 
                         </p>
 
                         <div class="spacer h50"></div>
 
                         <div class="spacer h50"></div>
Line 20: Line 32:
 
                         </p>
 
                         </p>
 
                         <div class="spacer h50"></div>
 
                         <div class="spacer h50"></div>
                         <p class="sub-lead justified animate-box">
+
                         <p class="sub-lead text-justify">
                             This tool is immensely powerful, as well as accurate. In its original round of testing, the authors found that 71% of the tested circuits acted as expected on the first try. The program has certain limitations, however. Cello is modular, which makes it adaptable to the user’s requirements. Currently, the only publically available UCFs are defined in E. Coli, meaning that it is not immediately usable in other systems. This brings up an even more important point: currently, UCFs are treated as completely local files, meaning that for one researcher to use information obtained by another, that information has to be passed directly from one to the other. This is inefficient, of course, since it is likely that at any given point a user could obtain or build more gates than they are actually aware of. Finally, using Cello requires a basic understanding of how Verilog code works, meaning that there is a slight learning code before you can start to use the program.  
+
                             Cello is immensely powerful, as well as accurate. In its original round of testing,  
 +
                            the authors found that 75% of the 60 circuits tested acted exactly as expected for all
 +
                            input combinations. Cello is modular, which makes it adaptable to user requirements.  
 +
                            These requirements are passed to it using a User Constraint File, which also contains
 +
                            information specific to the biological system that is being used, such as gates that are  
 +
                            available to use.  
 
                         </p>
 
                         </p>
                         <p class="sub-lead justified animate-box">
+
                        <br>
                            We also felt that we could make Cello even more powerful by introducing new biochemistries to it. It is currently only built to work with transcription factors and CRISPRi, but the future in gate-building technology may very well lie with dCas9-based gates. This is because dCas9 allows for the creation of “programmable” transcription factors. Finding actual transcription factors that work well in the host, but also don’t interfere with its genome is a real challenge. With dCas9, guide RNAs can be used to guide the dCas9 to specific 23-nucleotide long targets. In addition, transcriptional effectors can be attached to the dCas9 directly, or attached to the RNA directly using an architecture first described by <a href="#TOLINK">Zalatan et al. (2015)</a>. See our <a href="#TOLINK">Biological Project Page</a> for more details! Using this structure, we can have multiple effectors in the same synthetic system, with varying degrees of repression and activation, effectively fine-tuning the transcriptional response to each promoter. This system is made even more powerful by the discovery of “programmable” promoters, which have certain known internal regions in which the code can be changed without affecting the strength of the promoter. By modifying the code in these regions, it is possible to give each promoter its own barcode, which can be targeted by a unique guide RNA.  
+
                       
 +
                         <p class="sub-lead text-justify">
 +
                        Cello does not work with all biochemistries, unfortunately. Although it works well with CRISPRi,
 +
                        it does not accept newer dCas9-based gate structures and technologies. We felt that we could make
 +
                        Cello even more powerful by introducing these new biochemistries, and scaffold RNAs (scRNAs)
 +
                        specifically. This is because dCas9 allows for the creation of “programmable” transcription factors. 
 +
                        Finding actual transcription factors that work well in the host, but also don’t interfere with its
 +
                        genome is a real challenge. With dCas9, guide RNAs can be used to guide the dCas9 to specific  
 +
                        23-nucleotide long targets. In addition, transcriptional effectors can be attached to the dCas9  
 +
                        directly, or attached to the RNA directly using an architecture first described by  
 +
                        <a href="https://www.ncbi.nlm.nih.gov/pubmed/25533786">Zalatan et al.  
 +
                            (2015)</a>. See our <a href="https://2016.igem.org/Team:EPFL/Description">Project Description</a> for more details! Using this structure, we can have  
 +
                        multiple effectors in the same synthetic system, with varying degrees of repression and activation,  
 +
                        effectively fine-tuning the transcriptional response to each promoter. This system is made even more  
 +
                        powerful by the discovery of “programmable” promoters, which have certain known internal regions in  
 +
                        which the code can be changed without affecting the strength of the promoter. By modifying the code  
 +
                        in these regions, it is possible to give each promoter its own ‘barcode’, which can be targeted by  
 +
                        a unique guide RNA. <a href="https://2016.igem.org/Team:EPFL/Deskgen">Click here to see the barcodes we designed.</a></p>
 +
                        <br>
 +
                        <p class="sub-lead text-justify">
 +
                            The current organization of information in the program could also be improved,
 +
                            to the benefit of the program’s usability. The current implementation of the UCF
 +
                            has certain limitations. In particular, the only publicly available UCFs are
 +
                            defined in <em>E. coli</em>, meaning that it is not immediately usable in other systems.
 +
                            This brings up an even more important point: UCFs are treated as completely local
 +
                            files, meaning that for one researcher to use information obtained by another,
 +
                            that information has to be passed directly from one to the other. This is inefficient
 +
                            since it is likely that at any given point more gate structures would have been
 +
                            invented than someone would be aware of, since those structures might be locked away
 +
                            in another person’s local file. Finally, using Cello requires a basic understanding
 +
                            of how Verilog code works, meaning that understanding how to use a coding language
 +
                            is necessary before you can start to use the program.  
 
                         </p>
 
                         </p>
 
                     </div>
 
                     </div>
Line 30: Line 78:
 
             </section>
 
             </section>
 
         </div>
 
         </div>
         <div class="simple-page centered-page">
+
         <div class="simple-page">
 
             <section class="">
 
             <section class="">
 
                 <div class="container">
 
                 <div class="container">
 
                     <div class="col-md-10 col-md-offset-1 text-center">
 
                     <div class="col-md-10 col-md-offset-1 text-center">
                         <h2 class="lead animate-box">Our modifications</h2>
+
                         <h2 class="lead">Our modifications</h2>
 
                         <div class="spacer h20"></div>
 
                         <div class="spacer h20"></div>
                         <hr class="animate-box"/>
+
                         <hr />
 
                         <div class="spacer h20"></div>
 
                         <div class="spacer h20"></div>
                         <p class="sub-lead justified animate-box">
+
                        <h3>Graphical User Interface</h3>
                             Cello is made to function with the UCF as its only source of external data regarding the gate library available. We therefore also modified the program to include new external inputs. In our version, the collection of gates taken from the UCF and from the database are combined into a single collection within the program. Since multiple dCas9-based gates could depend on common features – the dCas9 protein, activators, and repressors, for example – these “associated parts” are also taken into account by the program and built into a separate plasmid.  
+
                         <p class="sub-lead text-justify">
 +
                             The first modification we made was to eliminate the need to understand how to
 +
                            code in Verilog to be able to use Cello. Verilog is not a difficult programming
 +
                            language, but learning any programming language can be challenging for beginners,
 +
                            and we wanted to open up Cello to people who had no other programming experience,
 +
                            and eliminate the initial learning curve due to this technological barrier. The
 +
                            user interface we developed is based on an intuitive drag-and-drop interface. This
 +
                            interface is based on a template by <a href="https://github.com/edwardball/academo.org">
 +
                            edwardball</a>, and uses the jsPlumb library. Some users may feel more at home
 +
                            using Verilog code, so the original input forms are still available on their original page.  
 
                         </p>
 
                         </p>
 
                     </div>
 
                     </div>
Line 51: Line 108:
 
                         <div class="spacer h40"></div>
 
                         <div class="spacer h40"></div>
 
                         <!--NTH: qualità -->
 
                         <!--NTH: qualità -->
                         <img class="animate-box video" src="img/videoGIF/cello_designer_demo_and.gif" alt="">
+
                         <img class="video" src="img/videoGIF/cello_designer_demo_and.gif" alt="">
 
                         <div class="spacer h40"></div>
 
                         <div class="spacer h40"></div>
 
                     </div>
 
                     </div>
Line 57: Line 114:
 
             </section>
 
             </section>
 
         </div>
 
         </div>
         <div class="simple-page centered-page">
+
        <div class="spacer h20"></div>
 +
 
 +
         <div class="simple-page">
 
             <section class="">
 
             <section class="">
 
                 <div class="container">
 
                 <div class="container">
 
                     <div class="col-md-10 col-md-offset-1 text-center">
 
                     <div class="col-md-10 col-md-offset-1 text-center">
                         <p class="sub-lead justified animate-box">
+
                        <h3>Accepting External Inputs</h3>
                            Just as our databases our user-friendly, we also created a refined user interface for circuit creation in Cello. This user interface relies on an intuitive drag-and-drop system, eliminating the need to understand structural code in Verilog. For those circuits best described using case statements, the original input forms still exist.  
+
                         <p class="sub-lead text-justify">
 +
                        In order to add the new biochemistries to Cello, we first had to delve into its
 +
                        inner workings. A large part of our time was spent trying to understand what each
 +
                        part of the program did, and how it interacted with its libraries. It was during
 +
                        this point that we identified two bugs in the program which we reported to the
 +
                        development team of Cello, through Prashant Vaidyanathan at Boston University.  
 +
                        This led to an updated version of the NetSynth library being made to deal with
 +
                        gates that were not NOT, NOR, or AND.
 +
                        </p>
 +
                        <br>
 +
                        <p class="sub-lead text-justify">
 +
                            We felt that the best way to integrate the aforementioned biochemistries and  
 +
                            to also make the program more open-sourced would be to move the storage of
 +
                            information about dCas9-based gates to the web. Since we did not find an extant database
 +
                            for parts with this biochemistry that returns data that is usable by Cello,  
 +
                            we decided to make an open and free database for these parts.
 +
                            <a href="https://2016.igem.org/Team:EPFL/software_database">You can find out more about
 +
                                our databases here!</a> Cello had to be modified, however, to accept inputs from
 +
                            this third source. As the UCF is written in <a href="http://www.json.org/">JSON</a> (JavaScript Object Notation),
 +
                            we decided that we would also pass information to Cello through JSON files.
 +
                            These files also allow dCas9-based gates to work correctly in the target system, by appreciating
 +
                            the particularities specific to dCas9-based gates. For example, all these gates would require
 +
                            dCas9 to be expressed in the host cells to function, and many gates would have overlapping effector
 +
                            molecules. Recopying these "associated sequences" multiple times into the plasmids would constitute a massive waste of
 +
                            space, so these sequences are copied <em>just once</em> into a separate plasmid. 
 +
                        </p>
 +
                        <figure>
 +
                            <img id="associated-plasmid-gif" src="img/cello_associated_plasmid_output.gif">
 +
                            <figcaption id='associated-plasmid-caption'>The "associated sequences" are visible in Cello's output as part of a separate plasmid.</figcaption>
 +
                           
 +
                        </figure>
 +
                        <p class="sub-lead text-justify">
 +
                            To know how to use these new gates, Cello requires their response functions.
 +
                            Since many of these gates would be created at the moment they are needed,
 +
                            they might not have experimentally confirmed response functions, but it would
 +
                            still be important to know how they react. We worked on a model to help
 +
                            explain their behavior, which you can find <a href="https://2016.igem.org/Team:EPFL/Model">here</a>.
 +
                        </p>
 +
                        <br>
 +
                        <p class="sub-lead text-justify">
 +
                            Cello serves to make the design of synthetic biological genetic circuits easier than ever; our
 +
                            modifications to Cello continue in the same vein, by making the software more usable, and the
 +
                            information it uses more transferable. Although the implementation of these new features is not
 +
                            perfect yet – the integration of new genetic material from json files is subject to bugs, and
 +
                            Cello is not yet connected with intelligene.plus – we are working diligently past wiki freeze
 +
                            to give the synthetic biology community the best tools possible!
 
                         </p>
 
                         </p>
 
                     </div>
 
                     </div>

Revision as of 22:21, 19 October 2016

iGEM EPFL 2016

Cello


Over the course of this summer, we developed a suite of tools intended to help users design their own biological circuits for their target systems. Our journey into the realm of bioinformatics began when we discovered Cello. Published in Science on April 1st, 2016, Cello takes two user inputs: a circuit described in Verilog code, and a user constraint file (UCF) that contains biological data about the system chosen by the user and a list of gates that can be used in that system. By combining these two inputs with a series of algorithms, the program returns one or more plasmids that, once put in a host cell, will recreate the circuit the user has defined. By choosing the input promoters, the user can decide which molecules or transcription factors to use as inputs to the logical circuit. This circuit can allow the cell to make novel decisions and calculations based on its environment, metabolism, and health.

Cello is immensely powerful, as well as accurate. In its original round of testing, the authors found that 75% of the 60 circuits tested acted exactly as expected for all input combinations. Cello is modular, which makes it adaptable to user requirements. These requirements are passed to it using a User Constraint File, which also contains information specific to the biological system being used, such as the gates that are available.


Unfortunately, Cello does not work with all biochemistries. Although it works well with CRISPRi, it does not accept newer dCas9-based gate structures and technologies. We felt that we could make Cello even more powerful by introducing these new biochemistries, and scaffold RNAs (scRNAs) specifically. This is because dCas9 allows for the creation of “programmable” transcription factors. Finding natural transcription factors that work well in the host but do not interfere with its genome is a real challenge. With dCas9, guide RNAs can be used to direct the dCas9 to specific 23-nucleotide-long targets. In addition, transcriptional effectors can be attached to the dCas9 directly, or attached to the RNA using an architecture first described by Zalatan et al. (2015). See our Project Description for more details! Using this structure, we can have multiple effectors in the same synthetic system, with varying degrees of repression and activation, effectively fine-tuning the transcriptional response to each promoter. This system is made even more powerful by the discovery of “programmable” promoters, which have certain known internal regions in which the sequence can be changed without affecting the strength of the promoter. By modifying the sequence in these regions, it is possible to give each promoter its own ‘barcode’, which can be targeted by a unique guide RNA. Click here to see the barcodes we designed.
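As a rough illustration of what a 23-nucleotide target looks like, the sketch below scans a sequence for candidate sites. It assumes the commonly used S. pyogenes dCas9, whose target is a 20-nucleotide protospacer followed by a 3-nucleotide NGG PAM; the function name and the example sequence are hypothetical, and only the forward strand is searched.

```python
import re


def find_target_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (start, site) for each 23-nt forward-strand window ending in NGG.

    Assumption: 20-nt protospacer + 3-nt NGG PAM (S. pyogenes dCas9).
    """
    seq = sequence.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{20}[ACGT]GG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


# Made-up example sequence: flanking bases, a 20-nt barcode region, then a PAM.
promoter = "TTGACA" + "ACGTACGTACGTACGTACGT" + "AGG" + "TATAAT"
sites = find_target_sites(promoter)
for start, site in sites:
    print(start, site)
```

A real design tool would also scan the reverse complement and check each candidate against the host genome for off-target matches, which this sketch does not attempt.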


The organization of information in the program could also be improved, to the benefit of its usability. The current implementation of the UCF has certain limitations. In particular, the only publicly available UCFs are defined for E. coli, meaning that Cello is not immediately usable in other systems. This brings up an even more important point: UCFs are treated as completely local files, meaning that for one researcher to use information obtained by another, that information has to be passed directly from one to the other. This is inefficient, since at any given point more gate structures are likely to have been invented than any one person is aware of, with those structures locked away in other people’s local files. Finally, using Cello requires a basic understanding of Verilog, meaning that users must learn a coding language before they can start to use the program.

Our modifications


Graphical User Interface

The first modification we made was to eliminate the need to know Verilog in order to use Cello. Verilog is not a difficult language, but learning any programming language can be challenging for beginners, and we wanted to open up Cello to people with no programming experience by removing this initial learning curve. The user interface we developed is an intuitive drag-and-drop circuit designer. It is based on a template by edwardball and uses the jsPlumb library. Some users may feel more at home using Verilog code, so the original input forms are still available on their original page.

Accepting External Inputs

In order to add the new biochemistries to Cello, we first had to delve into its inner workings. A large part of our time was spent trying to understand what each part of the program did, and how it interacted with its libraries. During this work we identified two bugs in the program, which we reported to the Cello development team through Prashant Vaidyanathan at Boston University. This led to an updated version of the NetSynth library being made to deal with gates other than NOT, NOR, and AND.


We felt that the best way to integrate the aforementioned biochemistries, and also to make the program more open, would be to move the storage of information about dCas9-based gates to the web. Since we did not find an existing database for parts with this biochemistry that returns data usable by Cello, we decided to make an open and free database for these parts. You can find out more about our databases here! Cello had to be modified, however, to accept inputs from this third source. As the UCF is written in JSON (JavaScript Object Notation), we decided that we would also pass information to Cello through JSON files. These files also allow dCas9-based gates to work correctly in the target system by accounting for the particularities of these gates. For example, all of these gates require dCas9 to be expressed in the host cells in order to function, and many gates share effector molecules. Copying these "associated sequences" into the plasmids once per gate would be a massive waste of space, so these sequences are copied just once into a separate plasmid.

The "associated sequences" are visible in Cello's output as part of a separate plasmid.
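The idea of combining gates from two sources and emitting each associated sequence only once can be sketched as follows. This is a minimal illustration, not our actual modification to Cello: the JSON field names (`name`, `associated_parts`, `sequence`), the gate names, and the short sequences are all hypothetical placeholders.

```python
import json


def merge_gate_sources(ucf_gates, database_gates):
    """Combine both gate lists; a UCF gate wins if two gates share a name."""
    merged = {gate["name"]: gate for gate in ucf_gates}
    for gate in database_gates:
        merged.setdefault(gate["name"], gate)
    return list(merged.values())


def collect_associated_parts(gates):
    """Map each associated part name to its sequence, keeping one copy only."""
    parts = {}
    for gate in gates:
        for part in gate.get("associated_parts", []):
            parts.setdefault(part["name"], part["sequence"])
    return parts


# Hypothetical data: one gate from a UCF, two dCas9-based gates from a database.
ucf_gates = [{"name": "UCF_NOR_1"}]
database_json = """
[
  {"name": "dCas9_NOT_1",
   "associated_parts": [{"name": "dCas9", "sequence": "ATGGAC"},
                        {"name": "repressor", "sequence": "ATGCCC"}]},
  {"name": "dCas9_NOT_2",
   "associated_parts": [{"name": "dCas9", "sequence": "ATGGAC"}]}
]
"""

gates = merge_gate_sources(ucf_gates, json.loads(database_json))
helper_plasmid = collect_associated_parts(gates)
print(list(helper_plasmid))
```

Here both dCas9-based gates depend on dCas9, but it appears only once in `helper_plasmid`, which stands in for the separate plasmid described above.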

To know how to use these new gates, Cello requires their response functions. Since many of these gates would be created at the moment they are needed, they might not have experimentally confirmed response functions, but it would still be important to know how they react. We worked on a model to help explain their behavior, which you can find here.
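For readers unfamiliar with response functions, the sketch below evaluates a Hill-type repression curve, the general form commonly used to describe repressor-based gates. Whether dCas9-based gates follow the same form is an assumption made here for illustration, and the parameter values are invented; see the model page for our actual treatment.

```python
def repressor_response(x, ymin, ymax, K, n):
    """Hill-type repression curve: output promoter activity for input x.

    ymin and ymax are the minimum and maximum outputs, K is the input at
    which the output is halfway between them, and n sets the steepness.
    """
    return ymin + (ymax - ymin) * K**n / (K**n + x**n)


# Invented parameters, for illustration only.
params = dict(ymin=0.1, ymax=5.0, K=1.0, n=2.0)
for x in (0.0, 0.5, 1.0, 2.0, 10.0):
    print(x, round(repressor_response(x, **params), 3))
```

With no input the gate sits at `ymax`, and as the input rises well above `K` the output approaches `ymin`, which is the inverting behavior expected of a NOT gate.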


Cello serves to make the design of synthetic genetic circuits easier than ever; our modifications to Cello continue in the same vein, by making the software more usable and the information it uses more transferable. The implementation of these new features is not perfect yet (the integration of new genetic material from JSON files is subject to bugs, and Cello is not yet connected with intelligene.plus), but we are working diligently past the wiki freeze to give the synthetic biology community the best tools possible!