Team:EPFL/Software CELLO

iGEM EPFL 2016


Over the course of this summer, we developed a suite of tools intended to help users design their own biological circuits for their target systems. Our journey into the realm of bioinformatics began when we discovered Cello. Published on April 1st, 2016 in Science, Cello takes user inputs in the form of a circuit described using Verilog code, and a user constraint file (UCF) that contains biological data pertaining to the system chosen by the user and a list of gates that the user can put in that system. By combining these two inputs with a series of algorithms, the program returns one or more plasmids that, once put in a host cell, will recreate the circuit the user has defined. By choosing the input promoters, the user can decide what molecules or transcription factors to use as inputs to the logical circuit. This circuit can allow the cell to make novel decisions and calculations based on its environment, metabolism, and health.

Cello is immensely powerful, as well as accurate. In its original round of testing, the authors found that 75% of the 60 circuits tested behaved exactly as expected for all input combinations. Cello is also modular, which makes it adaptable to user requirements. These requirements are passed to it through the User Constraint File, which also contains information specific to the biological system being used, such as the gates that are available.

Unfortunately, Cello does not work with all biochemistries. Although it works well with CRISPRi, it does not accept newer dCas9-based gate structures and technologies. We felt that we could make Cello even more powerful by introducing these new biochemistries, and scRNA specifically (Zalatan et al., 2015), because dCas9 allows for the creation of “programmable” transcription factors. Finding natural transcription factors that work well in the host but do not interfere with its genome is a real challenge. With dCas9, guide RNAs can be used to direct the protein to specific 23-nucleotide targets. In addition, transcriptional effectors can be attached to the dCas9 directly, or to the RNA using an architecture first described by Zalatan et al. (2015). See our Project Description for more details! Using this structure, we can have multiple effectors in the same synthetic system, with varying degrees of repression and activation, effectively fine-tuning the transcriptional response at each promoter. This system is made even more powerful by the discovery of “programmable” promoters, which have known internal regions whose sequence can be changed without affecting the strength of the promoter. By modifying the sequence in these regions, it is possible to give each promoter its own “barcode”, which can be targeted by a unique guide RNA. Click here to see the barcodes we designed.
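As a rough illustration of the targeting idea above, the sketch below scans a promoter sequence for 23-nucleotide candidate sites and checks that a set of barcodes differ from one another. It assumes the commonly used S. pyogenes dCas9, whose 23-nucleotide target is a 20-nucleotide protospacer followed by an NGG PAM; the function names and the `min_mismatches` threshold are hypothetical and are not taken from our pipeline. Only the forward strand is scanned.

```python
import re
from itertools import combinations


def find_target_sites(promoter):
    """Return (start, site) for each 23-nt window of the form N20 + NGG.

    Assumption: S. pyogenes dCas9 (20-nt protospacer followed by an NGG PAM).
    Only the forward strand is scanned.
    """
    seq = promoter.upper()
    # The lookahead makes the match zero-width, so overlapping sites are found.
    pattern = re.compile(r"(?=([ACGT]{21}GG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def barcodes_are_distinct(barcodes, min_mismatches=3):
    """True if every pair of barcodes differs in at least min_mismatches positions.

    min_mismatches is an illustrative threshold, not a measured value.
    """
    return all(hamming(a, b) >= min_mismatches
               for a, b in combinations(barcodes, 2))


if __name__ == "__main__":
    promoter = "C" + "A" * 20 + "TGG"
    print(find_target_sites(promoter))
```

A real design would also scan the reverse strand and check each guide against the host genome for off-target matches; both are left out here for brevity.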

The current organization of information in the program could also be improved, to the benefit of the program’s usability. The current implementation of the UCF has certain limitations. In particular, the only publicly available UCFs are defined for E. coli, meaning that Cello is not immediately usable in other systems. This brings up an even more important point: UCFs are treated as completely local files, so for one researcher to use information obtained by another, that information has to be passed directly from one to the other. This is inefficient, because at any given time more gate structures are likely to exist than any one researcher is aware of, with some of them locked away in another person’s local file. Finally, using Cello requires a basic understanding of Verilog, so users must learn a coding language before they can start to use the program.

Our modifications

Graphical User Interface

The first modification we made was to eliminate the need to know Verilog in order to use Cello. Verilog is not a difficult language, but learning any programming language can be challenging for beginners. We wanted to open up Cello to people with no programming experience and remove the initial learning curve created by this technological barrier. The user interface we developed is built around intuitive drag-and-drop. It is based on a template by edwardball and uses the jsPlumb library. Some users may feel more at home writing Verilog code, so the original input forms are still available on their original page.
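To give a feel for what a graphical editor has to produce behind the scenes, here is a minimal sketch that turns a drawn circuit, represented as a list of gates and wires, into structural Verilog. The netlist representation and the function name `netlist_to_verilog` are hypothetical; this does not describe how our interface is actually implemented.

```python
def netlist_to_verilog(module_name, inputs, output, gates):
    """Build structural Verilog text from a simple netlist.

    gates is a list of (gate_type, out_wire, in_wires) tuples, for example
    ("nor", "w1", ["a", "b"]). This layout is a hypothetical stand-in for
    whatever a drag-and-drop editor stores internally.
    """
    allowed = {"not", "nor", "and", "or", "nand"}
    # Every gate output other than the module output is an internal wire.
    internal = sorted({out for _, out, _ in gates} - {output})

    lines = [f"module {module_name}(output {output}, input {', '.join(inputs)});"]
    if internal:
        lines.append(f"  wire {', '.join(internal)};")
    for gate_type, out, ins in gates:
        if gate_type not in allowed:
            raise ValueError(f"unsupported gate type: {gate_type}")
        # Verilog gate primitives list the output first, then the inputs.
        lines.append(f"  {gate_type} ({', '.join([out] + list(ins))});")
    lines.append("endmodule")
    return "\n".join(lines)


if __name__ == "__main__":
    # An OR function drawn as a NOR gate followed by a NOT gate.
    print(netlist_to_verilog(
        "circuit", ["a", "b"], "out",
        [("nor", "w1", ["a", "b"]), ("not", "out", ["w1"])],
    ))
```

The user only ever sees boxes and connections; text of this kind is what would be handed to the rest of the program.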

Accepting External Inputs

In order to add the new biochemistries to Cello, we first had to delve into its inner workings. A large part of our time was spent trying to understand what each part of the program did and how it interacted with its libraries. During this work we identified two bugs in the program, which we reported to the Cello development team through Prashant Vaidyanathan at Boston University. This led to an updated version of the NetSynth library that can handle gates other than NOT, NOR, and AND.

We felt that the best way to integrate the aforementioned biochemistries, and to make the program more open, would be to move the storage of information about dCas9-based gates to the web. Since we did not find an existing database for parts with this biochemistry that returns data usable by Cello, we decided to make an open and free database for these parts. You can find out more about our databases here! Cello had to be modified, however, to accept inputs from this third source. As the UCF is written in JSON (JavaScript Object Notation), we decided to pass this information to Cello through JSON files as well. These files also allow dCas9-based gates to work correctly in the target system, by accounting for the particularities of such gates. For example, all of these gates require dCas9 to be expressed in the host cells in order to function, and many gates share the same effector molecules. Copying these "associated sequences" into the plasmids once per gate would be a massive waste of space, so they are copied just once into a separate plasmid.

The "associated sequences" are visible in Cello's output as part of a separate plasmid.
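The de-duplication of "associated sequences" described above can be sketched as follows. The JSON layout, the field names (`name`, `type`, `associated_sequences`) and the part names are hypothetical and chosen only for illustration; they are not the schema of our database or of the UCF.

```python
import json

# Hypothetical gate records, as they might arrive from an external JSON file.
GATES_JSON = """
[
  {"name": "gate_A", "type": "NOR", "associated_sequences": ["dCas9", "effector_1"]},
  {"name": "gate_B", "type": "NOT", "associated_sequences": ["dCas9", "effector_2"]},
  {"name": "gate_C", "type": "NOR", "associated_sequences": ["dCas9", "effector_1"]}
]
"""


def collect_associated_sequences(gates):
    """Return each associated sequence once, in order of first appearance."""
    seen = set()
    ordered = []
    for gate in gates:
        for name in gate.get("associated_sequences", []):
            if name not in seen:
                seen.add(name)
                ordered.append(name)
    return ordered


if __name__ == "__main__":
    gates = json.loads(GATES_JSON)
    # These parts would go on the separate plasmid, one copy each.
    print(collect_associated_sequences(gates))
    # prints ['dCas9', 'effector_1', 'effector_2']
```

Here three gates request dCas9 three times and one effector twice, yet the separate plasmid carries only three parts in total.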

To know how to use these new gates, Cello requires their response functions. Since many of these gates would be created at the moment they are needed, they might not have experimentally confirmed response functions, but it would still be important to know how they react. We worked on a model to help explain their behavior, which you can find here.
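For readers unfamiliar with response functions, the sketch below shows the Hill-type repression curve that Cello uses for its repressor gates, with parameters `ymin`, `ymax`, `K` and `n`. Whether dCas9-based gates follow the same functional form is an assumption made here for illustration only, and the parameter values are invented; our own model is described on its own page.

```python
def hill_repression(x, ymin, ymax, K, n):
    """Steady-state output of a repressor gate for input promoter activity x.

    y = ymin + (ymax - ymin) * K**n / (K**n + x**n), with x >= 0.
    Output is ymax with no input and falls toward ymin as x grows.
    """
    return ymin + (ymax - ymin) * K**n / (K**n + x**n)


def nor_gate_output(x1, x2, ymin, ymax, K, n):
    """NOR gate modeled as one repressor driven by two promoters in tandem,
    so the effective input is the sum of the two promoter activities."""
    return hill_repression(x1 + x2, ymin, ymax, K, n)


if __name__ == "__main__":
    # Illustrative parameter values only; not measured data.
    params = dict(ymin=0.1, ymax=2.1, K=1.0, n=2.0)
    for x in (0.0, 0.5, 1.0, 5.0):
        print(x, round(hill_repression(x, **params), 3))
```

With a function of this kind for every gate, the output of one gate can be fed as the input of the next to predict how a whole circuit behaves.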

Cello serves to make the design of synthetic genetic circuits easier than ever; our modifications to Cello continue in the same vein, by making the software more usable and the information it uses more transferable. Although the implementation of these new features is not perfect yet – the integration of new genetic material from JSON files is subject to bugs, and Cello is not yet connected with – we are working diligently past wiki freeze to give the synthetic biology community the best tools possible!