# Abstract

When information flows through gene-regulatory networks, noise is introduced and fidelity suffers. A cell that cannot correctly infer environmental signals from noisy inputs may fail to mount the right responses. What factors can alter a circuit's ability to sense extrinsic information?

Gene-regulatory networks contain many “parallel circuits” in which independently transcribed monomers assemble into functional complexes for downstream regulation. We wondered whether these frequent dimerization designs fundamentally improve circuit properties, and whether higher affinity between monomers boosts performance further.

Inspired by these questions, we constructed a synthetic biology circuit using split fluorescent proteins, and added inteins to the split halves to make the binding tighter. We then quantitatively measured the capacity of these information channels with the aid of information theory. Computation and wet-lab work were combined to refine our understanding of such systems and to interpret the potential biological significance of the parallel designs that recur in nature.

# Introductory Story

Let us tell a simple but interesting story to help you understand what our project is about, and to let you relax before you look into the details. You probably know of the binaural effect: with both of your ears, you can judge where a sound comes from with smaller deviations than with one ear alone. Our story is similar.

Li Lei is an undergraduate at Tsinghua University and Han Meimei is an undergraduate at Peking University. They met at the iGEM Jamboree, kept in touch through WeChat, and fell in love. Li Lei often called Han Meimei to ask how she was doing and to cheer her up. But he had a problem: the telephone line was very noisy, and Han Meimei could not hear him clearly whenever he called. This depressed him, because he worried that Han Meimei would get angry with him. One day he came up with a good idea: he bought two amplifiers and used them both whenever he called her. Now Han Meimei can hear him clearly. The moral is that when you transmit your message along two paths instead of one, the noise is reduced noticeably. With that inspiration in mind, come and see our project.

# Description

# From information theory to our project

As we can learn from Wikipedia, information theory studies the quantification, storage, and communication of information; it was originally proposed by Claude E. Shannon in 1948. The theory has developed remarkably and has found applications in many areas. It is no exaggeration to say that we can see the power of information theory everywhere.

Building on this, we set out to explore biological pathways through the lens of information theory.

Our project is closely connected with two terms from information theory: mutual information and channel capacity. What are they?

In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between them. More specifically, it quantifies the "amount of information" (in units such as bits) obtained about one random variable by observing the other. The concept of mutual information is intricately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the "amount of information" held in a random variable. You can understand it from the figure below.

Formally, the mutual information of two discrete random variables X and Y can be defined as:\[I\left( {X;Y} \right) = \sum\limits_{y \in Y} {\sum\limits_{x \in X} {p\left( {x,y} \right)\log \left( {\frac{{p\left( {x,y} \right)}}{{p\left( x \right)p\left( y \right)}}} \right)} } \]

It can also be proved that:\[I\left( {X;Y} \right) = H\left( X \right) - H\left( {X\left| Y \right.} \right) = H\left( Y \right) - H\left( {Y\left| X \right.} \right)\]

From this formula you can see that mutual information is the reduction of uncertainty in $X$ when you know $Y$.
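As a concrete sketch, the discrete MI formula and the entropy identity above can be checked numerically. The joint distribution below is purely illustrative, not data from our system:

```python
import numpy as np

# Hypothetical joint distribution p(x, y) of a two-state input and a
# two-state output (rows: x, columns: y); the numbers are illustrative only.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1)  # marginal p(x)
p_y = p_xy.sum(axis=0)  # marginal p(y)

# I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) p(y)) )
mi = sum(p_xy[i, j] * np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
         for i in range(2) for j in range(2) if p_xy[i, j] > 0)

# Cross-check via the identity I(X;Y) = H(X) - H(X|Y)
H_x = -sum(p * np.log2(p) for p in p_x if p > 0)
H_x_given_y = -sum(p_xy[i, j] * np.log2(p_xy[i, j] / p_y[j])
                   for i in range(2) for j in range(2) if p_xy[i, j] > 0)

print(mi, H_x - H_x_given_y)  # both are about 0.278 bits
```

Knowing $Y$ here removes roughly 0.28 of the 1 bit of uncertainty in $X$, and both routes to $I(X;Y)$ give the same number.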

Once you understand mutual information, you can easily understand channel capacity.

In electrical engineering, computer science, and information theory, channel capacity is the tight upper bound on the rate at which information can be reliably transmitted over a communications channel. By the noisy-channel coding theorem, the channel capacity of a given channel is the limiting information rate (in units of information per unit time) that can be achieved with arbitrarily small error probability. You can also see the figure below.

The channel capacity is defined as:\[C = \mathop {\sup }\limits_{{p_X}\left( x \right)} I\left( {X;Y} \right)\]

where the supremum is taken over all possible choices of ${p_X}\left( x \right)$.
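For a discrete channel, that supremum can be computed numerically with the classic Blahut-Arimoto iteration. Here is a minimal sketch; the binary symmetric channel used as the example is illustrative, not our biological channel:

```python
import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) in bits for input distribution p_x and channel matrix p(y|x)."""
    q = p_x[:, None] * p_y_given_x        # joint p(x, y)
    p_y = q.sum(axis=0)
    mask = q > 0
    return float(np.sum(q[mask] * np.log2(q[mask] / np.outer(p_x, p_y)[mask])))

def channel_capacity(p_y_given_x, iters=200):
    """Blahut-Arimoto iteration: climb toward the capacity-achieving p(x)."""
    n_x = p_y_given_x.shape[0]
    p_x = np.full(n_x, 1.0 / n_x)         # start from the uniform input
    for _ in range(iters):
        p_y = p_x @ p_y_given_x
        # d[i] = KL divergence D( p(y|x_i) || p(y) ), in nats
        d = np.array([np.sum(row[row > 0] * np.log(row[row > 0] / p_y[row > 0]))
                      for row in p_y_given_x])
        p_x = p_x * np.exp(d)             # reweight inputs toward informative ones
        p_x /= p_x.sum()
    return mutual_information(p_x, p_y_given_x)

# Illustrative check: a binary symmetric channel with crossover probability 0.2
# has the known analytic capacity 1 - H(0.2).
eps = 0.2
bsc = np.array([[1 - eps, eps], [eps, 1 - eps]])
C = channel_capacity(bsc)
H = -(eps * np.log2(eps) + (1 - eps) * np.log2(1 - eps))
print(C, 1 - H)  # the two values agree closely
```

The same routine works for any channel matrix estimated from data, which is how one can turn measured input-output distributions into a capacity number.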

Now that you know all the concepts above, we are glad to tell you that you can easily follow along and discover many interesting and inspiring things in our project. Congratulations!

# Protein Dimerization and Splitting Up

Dimerization is ubiquitous in cells. Monomers assemble into dimers to carry out further functions all the time, some interactions strong, some weak. Newly synthesized, function-less peptides piece together into quaternary structure and get to work; activated kinases reach each other and mutually phosphorylate; transcription factors, forming homo- or hetero-dimers with different stoichiometries, lead to varied downstream responses and distinct cellular fates…

Previous research has underlined important advantages of dimerization, including differential regulation, specificity, facilitated proximity, and more. But this cannot be the end of the list. What is the influence of dimerization on noise propagation? The question has hardly been touched, owing to the difficulty of controlling experimental variables. Synthetic biology provides powerful tools to carry out, in designed systems, experiments that would otherwise be impossible.

For synthetic biologists, constructing an AND gate is crucial and challenging. Split a regulatory protein such as a transcription factor, express the two halves independently, and an AND gate is born.

Nonetheless, the act of splitting can bring unexpected side effects. Gene-regulatory circuits depend strongly on quantitative properties, and their complexity and nonlinearity make the behavior of biological systems hard to predict. Once an important part of the system is chopped up, who knows what will happen next?

These are the thrilling challenges and opportunities we face in our project. We split a fluorescent protein and tune the affinity between the halves by adding intein sequences. When the two parts bind together, graceful blue light is emitted from EBFP2. These constructs mimic the effect of dimerization in living cells and remain under fine control because they are induced by the drug Dox. The Dox concentration serves as the input to our system and steers the quantitative properties involved in dimerization.
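A back-of-the-envelope model shows how input (Dox) level and binding affinity together set the amount of assembled reporter. This is only a sketch: the Hill parameters and dissociation constants below are hypothetical, not measured values from our constructs:

```python
import numpy as np

def hill(dox, k=1.0, n=2.0):
    """Dox-induced expression of each monomer (arbitrary units).
    k and n are hypothetical parameters, not measured values."""
    return dox ** n / (k ** n + dox ** n)

def dimer(m, kd):
    """Equilibrium dimer level when both monomers have total concentration m
    and bind with dissociation constant kd: solve D*kd = (m - D)^2, i.e.
    D^2 - (2m + kd)*D + m^2 = 0, taking the physical root (0 <= D <= m)."""
    b = 2 * m + kd
    return (b - np.sqrt(b * b - 4 * m * m)) / 2

dox = 2.0
m = hill(dox)  # each half is expressed to the same level
# A tighter interaction (smaller kd, as with the intein fusion) converts
# more of the monomer pool into assembled reporter at the same input.
weak, tight = dimer(m, kd=1.0), dimer(m, kd=0.01)
print(weak, tight)  # the tight-binding pair assembles much more dimer
```

Under this toy equilibrium, lowering the dissociation constant two orders of magnitude roughly triples the assembled fraction at the same monomer level, which is the intuition behind tuning affinity with inteins.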

# Cellular Information Inference

Sense the environment, or die. The intricate world is full of chaotic information, knowledge of which is vital for survival. But all that a cell “knows” is the concentrations of downstream products. It has to infer the right input and try to eliminate the uncertainty caused by noise.

For each input level, the cell exhibits a specific output probability distribution. If these distributions overlap too heavily, the cell will have difficulty guessing the right answer; the circuit is then prone to error, and its ability to transmit information accurately is low.

Traditionally, we evaluate the impact of noise with variance-related statistics such as the coefficient of variation. These quantities only describe how concentrated the output is around its mean; they cannot tell us how well one of two correlated random variables can be inferred from the other. Channel capacity is a better criterion because it depicts the information-transmission process more faithfully.
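To see why the coefficient of variation alone can be misleading, consider a toy binary-input channel: two cases can share the same CV yet transmit very different amounts of information, depending on how much the output distributions overlap. This sketch uses a simple histogram MI estimator on simulated Gaussian outputs; all numbers are illustrative, not measured data:

```python
import numpy as np

rng = np.random.default_rng(0)

def mi_binary_input(mu_low, mu_high, cv, n=100_000, bins=100):
    """Histogram estimate of I(X;Y) in bits for a uniform binary input X whose
    output Y is Gaussian with mean mu_x and a fixed coefficient of variation cv.
    (A rough sketch, not our analysis pipeline.)"""
    samples = [rng.normal(mu, cv * mu, n) for mu in (mu_low, mu_high)]
    lo = min(s.min() for s in samples)
    hi = max(s.max() for s in samples)
    cond = [np.histogram(s, bins=bins, range=(lo, hi))[0] / n for s in samples]
    p_y = 0.5 * (cond[0] + cond[1])       # output marginal
    mi = 0.0
    for p in cond:                        # sum over the two input states
        m = p > 0
        mi += 0.5 * np.sum(p[m] * np.log2(p[m] / p_y[m]))
    return mi

# Same noise level (CV = 0.2) in both cases, but very different information:
mi_overlap = mi_binary_input(10, 12, 0.2)   # outputs overlap heavily
mi_separate = mi_binary_input(10, 40, 0.2)  # outputs well separated
print(mi_overlap, mi_separate)              # the second is much larger
```

Both channels look equally "noisy" by CV, yet the well-separated one carries nearly the full 1 bit while the overlapping one carries only a small fraction of it; this is exactly the distinction that channel capacity captures and CV misses.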