Team:LMU-TUM Munich/Model


Modeling

Since our approach to tissue printing is supposed to work without a supporting scaffold, the cohesiveness of the printed medium is crucial. Biotin binds strongly to Streptavidin as well as Avidin. Thus we are not worried about the individual bonding itself but rather about the interconnectedness of the individual cells and binding .
Before diving into the details we introduce a few abbreviations to make reading and writing more easy and efficient. Since it is yet to decide weather Streptavidin/Avidin or Biotin will be attached to the cells or to the networking (NP) we refer to the binding sites of cells as CBS and to the ones of NP as NPBS.


In the worst case all CBS we dispense from the printhead would get immediately occupied with otherwise loose NP, which would leave us with lots of individual NP-coated cells. The other extreme case would be that that the cell hardly finds any to bond to. Again resulting in individual cells that are unconnected.

So we assumed that the problem is most likely a typical question of polymerization. While the bonds in our case are not covalent we decided it should be safe to assume they were, since they are very strong with a dissociation constant of 10-15 M. So our most promising discoveries after browsing pertinent literature were the Carothers equation as well as the Flory-Stockmayer theory, which is a generalization of the former.

[describe Flory-Stockmayer theory]

[describe our assumtions to make it fit]

[describe results and limitations]

After we found that the results provided are not sufficient to optimize our process we were talking to some Professors in the field of polymer chemistry, biochemistry and biological modelling. [Most Answers were discouraging because ...] Ultimately we decided to pursue the advice of Dr. Hasenauer, who suggested to simulate the problem to understand it better. First with strong assumptions, later on reducing them stepwise to get more and more accurate results.

So our first approach assumed:

  • Every cell has x CBS.
  • Every NP has y PBS.
  • There are 100 cells and z NP.
  • Every pair of CBS and PBS have the same probability to bind. Regardless of spacial distance, obstruction, [some more fancy chemical words for this?]
  • At some point either all or all cell binding sites will have bond.

In a first experiment we fixed x to 30, compromising between a way higher number of binding sites, which could “consume” , and a much lower amount of cells that could actually surround it due to spacial limitations. Furthermore we fixed y to 8 to get a first impression and run experiments with logarithmically varying z. To analyze the result we analyzed what we call a cell-graph. The cell-graph is obtained by drawing a node for each cell and connecting each node with an edge to the nodes of the respective other cells that this cell is connected to via a . On this graph we evaluated a couple of metrics from graph theory, resulting in the plots in Figure 1

We were happy that most metrics were pointing to roughly the same sweet spot of relative polymer/cell concentration. Soon we discovered however that the solution is a rather trivial one. Since either all CBS or all PBS are bond before the simulation stops it’s quite obvious that the sweet spot is where the total combined PBS equal the combined CBS. So next we tried to incorporate spatial effects into binding probability.

Again we assume:

  • Every cell has x CBS.
  • Every NP has y PBS.
  • There are 100 cells and z NP.

But instead of assuming equal probability for each CBS/PBS pair to bind we assume:

  • A one-dimensional space, where all cells as well as NP are placed in equidistant manner.
  • For now the position of individual CBS is assumed equal to the respective cell. Likewise for PBS and NP.
  • The size of the space is defined by a constant c, where 1 means that the space is of size 1 and 0.01 means the space is of size 100. This constant corresponds to the concentration of active components in a medium. The lower the concentration the higher the distance between two components.
  • At every timestep a constant fraction of free PBS tries to bind. (reactivity)
  • When a PBS is trying to react, each free CBS gets assigned a certain probability depending on two principles:
    • The more near a CBS is to the PBS, the more likely it is to bind to this one. [zero mean gaussian of the distance with a certain varaiance \sigma]
    • The more cells are bound to a certain NP already, the less likely it is, that it will bind to one that it didn’t bind to yet. [limited space, big cells, small NP] [which distribution?]
  • With increasing time, the variance \sigma is increasing to modell that cells and NP can move freely in the medium and reach more distant partners. [neglecting that they only move in one direction; let’s say we don’t know which one]
  • After a defined numebrr of timesteps the experiment is stipped and the result is assessed the same way described under [section above]


References


Code

simulation

Setup

So first we need to import some libraries

In [1]:
import numpy as np
import matplotlib.pylab as plt
%matplotlib inline

The following library is one that we wrote ourselves. It just contains the basic code that makes up our model of cells and proteins as monomers with binding sites

In [126]:
from monomers import *

Now we define a couple of functions to set up an experiment.

functions to initialize an experiment

In [3]:
def init_monomers(n_cells, n_proteins, n_cell_binding = 30, n_protein_binding = 4):
    '''instantiates a certain number of monomers'''
    Cell.n_cells=0
    cells=[]
    for i in range(n_cells):
        cells.append(Cell(n_cell_binding))

    Protein.n_proteins=0
    proteins=[]
    for i in range(n_proteins):
        proteins.append(Protein(n_protein_binding))
    return cells, proteins

After we set our monomers in place we need to modell in some way how they interact with each other. The following "polymerize" function does so in a very basic fashion:

  • make a list of free binding sites of TP as well as cells
  • while there are still free binding sites of either kind choose two at random and let them bind together
In [4]:
def polymerize(cells, proteins):
    '''polymerizes cells and proteins in a simple way'''
    free_cell_bindings = []
    for c in cells:
        free_cell_bindings += c.free_bindings

    free_protein_bindings = []
    for p in proteins:
        free_protein_bindings += p.free_bindings

    while len(free_protein_bindings)>0 and len(free_cell_bindings)>0:
        p_binding_idx = np.random.randint(len(free_protein_bindings))
        p_binding = free_protein_bindings.pop(p_binding_idx)
        c_binding_idx = np.random.randint(len(free_cell_bindings))
        c_binding = free_cell_bindings.pop(c_binding_idx)
        p_binding.bind(c_binding)
    return len(free_protein_bindings) , len(free_cell_bindings)

As described in the text above this simulation of polymerization was too simplified. The results we got from this modell were not very surprising and didn't help us much with optimizing our process.

So next we implemented a more sophisticated way of simulating the polymerization process. The following function takes some basic spacial and temporal effects into account.

In [49]:
def polymerize_spacial(cells, proteins, n_protein_binding, concentration=0.01, reactivity=0.2, mobility=0.05, max_time=10.):
    '''simulates polymerization in a more sophisticated way'''
    time = 0.
    position_variance = 1.
    distance_binding_probability = lambda x: np.exp(-x ** 2 / position_variance ** 2) / (
    2 * np.pi * position_variance ** 2)

    free_protein_bindings = []
    for p in proteins:
        free_protein_bindings += p.free_bindings

    while (time < max_time):
        # at each timestep a certain fraction of the free binding sites will have a chance to react
        # this fraction is depending on the defined "reactivity" parameter
        for i in range(int(np.ceil(reactivity * len(free_protein_bindings)))):
            # choose a random NPBS
            p_binding_idx = np.random.randint(len(free_protein_bindings))
            p_binding = free_protein_bindings[p_binding_idx]
            
            # do tests on parent protein
            p = p_binding.parent
            # all monomers get assigned an individual increasing id at creation
            # we use this id to calculate a position by assuming
            # that at creation the cells are put on a line with equal distance to echa other
            
            # the lower to concentration, the higher the distance
            p_pos = float(p.id) / (len(proteins) * concentration)
            # (float) idx of nearest cell
            c_id_mean = p_pos * len(cells) * concentration
            # max distance of reachable cells from nearest cell
            max_range = position_variance * 3.2
            c_id_range = max_range * len(cells) * concentration
            c_id_min = max(int(c_id_mean - c_id_range), 0)
            c_id_max = min(int(c_id_mean + c_id_range), len(cells))
            # now we can calculate the cells in range
            candidate_cells = cells[c_id_min:c_id_max]

            if len(candidate_cells) == 0:
                #if there are no cells in reach we are done here and pic the next random NPBS to try again
                continue
            # otherwise we go on
            # now depending on the individual cells distance to the NP
            # we calculate the probability of each candidate cell to bind as a gaussian pdf of the distance
            min_distance = max(p_pos - position_variance * 3.2, 0) - p_pos
            max_distance = min(-p_pos + position_variance * 3.2,
                               len(proteins) - 1 / len(proteins) / concentration) + p_pos
            probabilities = np.arange(min_distance, max_distance,
                                      (max_distance - min_distance) / len(candidate_cells))[:len(candidate_cells)]
            probabilities = distance_binding_probability(probabilities)
            probabilities /= sum(probabilities) #normalization
            
            # this call picks a random cell based on the calculated probabilities
            cell = np.random.choice(candidate_cells, p=probabilities)
            
            # check if the chosen cell has actually still free binding sites
            try:
                cell.free_bindings.next()
            except StopIteration:
                # no free bindings left in this cell. back to the top.
                continue
                
            # now we find out which cells the TP in question is already bound to
            protein_partners = list(set([b.partner.parent for b in p.bindings if b.partner is not None]))
            #this way we know how many different cells it's bound to
            if len(protein_partners) == 1:
                # if a protein is already attatched to a cell it will most likely not bind to other cells
                # but there is a chance of 10%
                if cell not in protein_partners and np.random.rand()<0.9:
                    continue
            if len(protein_partners)>1 and cell not in protein_partners:
                # if a protein was binding to two cells already, it will not bind to any other
                continue
            # if we made it this far without running into any of the "continue"s we are ready to bind!
            # so the next free binding of the cell will react to our NPBS
            cell.free_bindings.next().bind(p_binding)
            # and the TPBS is removed from our binding sites dating database
            free_protein_bindings.pop(p_binding_idx)
        
        # after all reactive NPBS tired to react we move everything a bit
        # depending on the mobility parameter and increment the time
        position_variance += mobility
        time += 0.01

    # in the end we calculate the number of free CBS. Just to know about it.
    free_cell_bindings = []
    for c in cells:
        free_cell_bindings += c.free_bindings
    return len(free_protein_bindings), len(free_cell_bindings)

Analysis

Now for analyzing the generated polymer networks we import another library called graph_tool.

In [11]:
from graph_tool.all import *

from graph_tool import clustering,stats

The following two functions translate the network (or graph) that we build before into graph_tool-graphs. The first one draws a node for each cell. Then it draws a connecting edge betwee two nodes if the respective cells are connected via any NP.

The second one draws a big node for each cell and a small one for each NP. Edges represent a binding between a NP and a connected cell.

We will clarify the difference between the two graphing methods graphically after the two function definitions.

In [127]:
def cell_graph(cells,proteins,draw=False):
    '''vertices are cells, cells connected via a protein are connected in the graph with an edge'''
    g = Graph(directed=False)

    vertices = list(g.add_vertex(len(cells)))

    for c_id in range(len(cells)):
        for c_b in cells[c_id].bindings:
            try:
                for p_b in c_b.partner.parent.bindings:
                    try:
                        c2_id = p_b.partner.parent.id
                        if not c_id==c2_id:
                            g.add_edge(vertices[c_id],vertices[c2_id])
                    except AttributeError:
                        pass
            except AttributeError:
                pass
    if draw:
        graph_draw(g, vertex_text=g.vertex_index, edge_pen_width=2 , vertex_font_size=30, output="/tmp/graph.png",
            output_size=(2000, 2000))
    return g
In [128]:
def pc_graph(cells, proteins, draw=False):
    g = Graph(directed=False)

    v_size = g.new_vertex_property("int")

    for cell in cells:
        cell.vertex = g.add_vertex()
        v_size[cell.vertex] = 120

    for p in proteins:
        p.vertex=g.add_vertex()
        v_size[p.vertex] = 40
        for b in p.bindings:
            #cell connections
            if b.partner is not None:
                cell_vertex = b.partner.parent.vertex
                if g.edge(p.vertex,cell_vertex) is None: #only one connection per pair
                    g.add_edge(p.vertex,cell_vertex)
    if draw:
        pos=fruchterman_reingold_layout(g)
        graph_draw(g, edge_pen_width=2, vertex_text=g.vertex_index, vertex_size=v_size, vertex_font_size=15, output_size=(2000, 2000))
    return g

To make the difference more clear lets just run an experiment and visualize the result in both ways

In [133]:
cells, proteins = init_monomers(n_cells=30, n_proteins=200, n_cell_binding = 300, n_protein_binding = 4)
polymerize_spacial(cells, proteins, n_protein_binding=4, 
                   concentration=0.02, reactivity=0.1, mobility=0.05, max_time=3.0)
Out[133]:
(2, 8202)

The experiment is done. Now let's visualize it

In [141]:
cell_graph(cells,proteins,draw=True)
Out[141]:
<Graph object, undirected, with 30 vertices and 492 edges at 0x7f6ef3edaed0>

And now the same experimental result with the other visualization

In [142]:
pc_graph(cells,proteins,draw=True)
Out[142]:
<Graph object, undirected, with 230 vertices and 276 edges at 0x7f6ef30f4110>

The second graph shows not only how the interconnection between the idnividual cells is but also visualizes how many NP are "lost" by just sticking to a single cell without forming an connection to others. Furthermore one can see how many differnt proteins connect two cells.

More experiments

For a first feeling of how the parameters of the polymerization function influence the result it's helpful just to try out a few combinations and look at the visualization of the graph

To make this task a bit easier we defien a small helper function that initializes the monomers, polymerizes them and generates the graph.

In [144]:
def random_graph(graph_method, n_cells, n_proteins, n_cell_binding = 30, n_protein_binding = 4,concentration=0.02,reactivity=0.01,mobility=0.1,max_time=2.):
#    cells,proteins=init_monomers(n_cells, n_proteins, n_cell_binding, n_protein_binding)
#    polymerize(cells,proteins)
    cells,proteins=init_monomers(n_cells, n_proteins, n_cell_binding = n_cell_binding, n_protein_binding = n_protein_binding)
    polymerize_spacial(cells, proteins, n_protein_binding=n_protein_binding, concentration=concentration, reactivity=reactivity, mobility=mobility, max_time=max_time)
    return graph_method(cells,proteins,draw),cells,proteins

As long as we want to see a rendered result we just switch the draw variable to True

In [145]:
draw = True

Now we can build a random graph by plugging in some numbers and runnign a single line of code. (The "_=" just suppresses text output)

up button Back to top

LMU & TUM Munich

Technische Universität MünchenLudwig-Maximilians-Universität München

United team from Munich's universities

Contact us:

Address

iGEM Team TU-Munich
Emil-Erlenmeyer-Forum 5
85354 Freising, Germany