Modeling
crosslinking
Since our approach to tissue printing is supposed to work without a supporting scaffold, the cohesiveness of the printed medium is crucial. Biotin binds strongly to Streptavidin as well as Avidin. Thus we are not worried about the individual bond itself but rather about how well the individual cells and their binding partners become interconnected.
Before diving into the details we introduce a few abbreviations to make reading and writing easier. Since it is yet to be decided whether Streptavidin/Avidin or Biotin will be attached to the cells or to the networking protein (NP), we refer to the binding sites of cells as CBS and to those of the NP as NPBS.
In the worst case, all CBS we dispense from the printhead would immediately be occupied by otherwise loose NP, which would leave us with lots of individual NP-coated cells. The other extreme would be that a cell hardly finds any NP to bond to, again resulting in individual, unconnected cells.
So we assumed that the problem is most likely a typical question of polymerization. Although the bonds in our case are not covalent, we decided it should be safe to treat them as if they were, since they are very strong, with a dissociation constant of 10^-15 M. Our most promising discoveries after browsing the pertinent literature were the Carothers equation and the Flory-Stockmayer theory. The latter is a generalization of the former and gets closer to what we need, since the Carothers equation assumes that all monomers are bivalent.[1] The Flory-Stockmayer theory assumes there are three different types of monomers:
- bivalent units of type A
- multivalent units of type A and
- bivalent units of type B
Further, it assumes that B units can only react with A units and vice versa. So by setting the concentration of bivalent A units to 0 we get quite close to our setup. However, the Flory-Stockmayer theory only assesses whether gelation occurs or not, without providing information on how fast it happens or how well the resulting network turns out.[2]
After we found that these results are not sufficient to optimize our process, we talked to several professors in the fields of polymer chemistry, biochemistry and biological modelling. Most answers were discouraging, because to our surprise none of them had ever heard of an existing model that would solve (part of) our problem. Ultimately we decided to follow the advice of Dr. Hasenauer, who suggested programming a simulation of the problem in order to understand it better: first a simple simulation with strong assumptions, then relaxing them stepwise to get more and more accurate results.
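To make the gelation criterion concrete, here is a minimal sketch of the classical Flory-Stockmayer gel point for the three monomer types listed above. The function name and parameter names are our own choice for illustration. Note that it treats the B units as strictly bivalent, whereas our cells carry many binding sites, so it can only serve as a rough orientation for our setup.

```python
import math

def critical_conversion(f, r=1.0, rho=1.0):
    """Fraction of A groups that must have reacted at the gel point.

    f   -- number of binding sites on a multivalent A unit (f > 2)
    r   -- ratio of the number of A groups to the number of B groups
    rho -- fraction of all A groups that sit on multivalent units
           (rho = 1 corresponds to no bivalent A units at all)
    """
    return 1.0 / math.sqrt(r * (1.0 + rho * (f - 2)))

# Setting the concentration of bivalent A units to 0 means rho = 1.
p_c_trivalent = critical_conversion(f=3)
p_c_octavalent = critical_conversion(f=8)
print(p_c_trivalent, p_c_octavalent)
```

The more binding sites a multivalent unit has, the smaller the fraction of sites that has to react before a network spanning the whole sample can form. The formula says nothing about the speed or the quality of that network.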
So our first approach assumed:
- Every cell has x CBS.
- Every NP has y NPBS.
- There are 100 cells and z NP.
- Every pair of CBS and NPBS has the same probability to bind, regardless of spatial distance, obstruction or timing.
- At some point either all NPBS or all CBS will be bound.
In a first experiment we fixed x to 30, compromising between the much higher number of binding sites, which could “consume” NP, and the much lower number of cells that could actually surround a NP due to spatial limitations. Furthermore, we fixed y to 8 to get a first impression and ran experiments with logarithmically varying z (number of NP).
To assess the result we analyzed what we call a cell-graph. The cell-graph is obtained by drawing a node for each cell and connecting each node by an edge to the nodes of all other cells that this cell is connected to via a NP.
On this graph we evaluated a couple of metrics from graph theory, resulting in the plots in Figure 1.
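As an illustration of how a cell-graph can be derived from a list of bonds and evaluated, here is a small sketch in plain Python. The input format (pairs of cell id and NP id) and the chosen metric (size of the largest connected component relative to the number of cells) are assumptions made for this example. They are not necessarily the metrics shown in the figure.

```python
from collections import defaultdict

def cell_edges(bonds):
    """bonds: iterable of (cell_id, np_id) pairs, one per formed bond."""
    cells_per_np = defaultdict(set)
    for cell_id, np_id in bonds:
        cells_per_np[np_id].add(cell_id)
    edges = set()
    for cell_ids in cells_per_np.values():
        ordered = sorted(cell_ids)
        # every pair of cells sharing this NP gets an edge
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                edges.add((a, b))
    return edges

def largest_component_fraction(n_cells, edges):
    """Share of all cells that belong to the biggest connected cluster."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    largest = 0
    for start in range(n_cells):
        if start in seen:
            continue
        # depth-first search over one component
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        largest = max(largest, size)
    return largest / float(n_cells)

# NP 0 links cells 0 and 1, NP 1 links cells 1 and 2, NP 2 only sticks to cell 3
bonds = [(0, 0), (1, 0), (1, 1), (2, 1), (3, 2)]
edges = cell_edges(bonds)
fraction = largest_component_fraction(4, edges)
print(sorted(edges), fraction)
```

A value close to 1 means that almost all cells hang together in one cluster, which is what a printed tissue without scaffold needs.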
We were happy to see that most metrics pointed to roughly the same sweet spot of relative polymer/cell concentration. Soon, however, we discovered that the solution is a rather trivial one: since either all CBS or all NPBS are bound before the simulation stops, it is quite obvious that the sweet spot is where the total number of NPBS equals the total number of CBS. So next we tried to incorporate spatial effects into the binding probability.
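The trivial sweet spot of the first model follows directly from the parameters. A short sketch of the arithmetic, with a helper name that we made up for this example:

```python
def balanced_np_count(n_cells, cbs_per_cell, npbs_per_np):
    """Number of NP at which the total NPBS equal the total CBS."""
    return n_cells * cbs_per_cell / float(npbs_per_np)

# values of the first experiment: 100 cells, x = 30 CBS, y = 8 NPBS
z_balanced = balanced_np_count(100, 30, 8)
print(z_balanced)
```

With the values of the first experiment the two kinds of binding sites balance at 375 NP, that is 3.75 NP per cell.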
Again we assume:
- Every cell has x CBS.
- Every NP has y NPBS.
- There are 100 cells and z NP.
But instead of assuming equal probability for each CBS/NPBS pair to bind we assume:
- A one-dimensional space in which all cells as well as all NP are placed equidistantly.
- For now, the position of each individual CBS is assumed to equal that of its cell. Likewise for NPBS and NP.
- The size of the space is defined by a constant c, where 1 means that the space is of size 1 and 0.01 means the space is of size 100. This constant corresponds to the concentration of active components in a medium. The lower the concentration, the higher the distance between two components.
- At every timestep a constant fraction of the free NPBS tries to bind (reactivity).
- When a NPBS is trying to react, each free CBS gets assigned a certain probability depending on two principles:
  - The closer a CBS is to the NPBS, the more likely it is to bind to it. So the binding probability is defined by the pdf of a zero-mean Gaussian of the distance, with a variance that represents the spatial range of a NP.
  - The more cells are already bound to a certain NP, the less likely it is that it will bind to one it has not bound to yet. Since the cells are so big compared to the NP, a NP can connect to at most two cells. When it is already bound to one cell, we consider a second binding to this cell 10 times more likely than a binding to any other cell.
- With increasing time, the variance modeling the range of a NP increases, in order to model that cells and NP can move freely in the medium and reach more distant partners. We neglect that a cell will move in one particular direction and just consider its actual position more and more uncertain, so we increase the variance.
- After a defined number of timesteps the experiment is stopped and the result is assessed in the same way as described for the first experimental setup.
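The distance rule from the list above can be sketched in a few lines. This is only an illustration under simplifying assumptions: it uses the textbook Gaussian shape with a standard deviation called spread, all names are made up for the example, and the constants differ slightly from the ones in our actual simulation code.

```python
import math

def binding_probabilities(np_position, cell_positions, spread):
    """Probability for each cell to be picked by a NP at np_position."""
    weights = [math.exp(-((c - np_position) ** 2) / (2.0 * spread ** 2))
               for c in cell_positions]
    total = sum(weights)
    # normalize so that the weights sum to 1
    return [w / total for w in weights]

cell_positions = [0.0, 1.0, 2.0, 3.0, 4.0]  # equidistant cells on a line
early = binding_probabilities(1.2, cell_positions, spread=0.5)
late = binding_probabilities(1.2, cell_positions, spread=2.0)
print(early)
print(late)
```

With a small spread nearly all probability mass sits on the nearest cell. As the spread grows over time the distribution flattens, which is how increasing positional uncertainty lets a NP reach more distant cells.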
Code
Setup
So first we need to import some libraries:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
The following library is one that we wrote ourselves. It just contains the basic code that makes up our model of cells and proteins as monomers with binding sites.
from monomers import *
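The monomers module itself is not reproduced on this page. For readers who want to follow along, here is a hypothetical stand-in that offers only the attributes and methods used by the code on this page (Cell, Protein, id, bindings, free_bindings, parent, partner, bind). It is inferred from that usage and is an assumption, not the original module.

```python
class Binding(object):
    """One binding site; parent is the monomer that carries it."""

    def __init__(self, parent):
        self.parent = parent
        self.partner = None  # the Binding this site is bound to, if any

    def bind(self, other):
        """Bind this site and the other site to each other."""
        self.partner = other
        other.partner = self


class Monomer(object):
    def __init__(self, n_bindings, monomer_id):
        self.id = monomer_id
        self.bindings = [Binding(self) for _ in range(n_bindings)]

    @property
    def free_bindings(self):
        """A fresh iterator over the sites that are still unbound."""
        return (b for b in self.bindings if b.partner is None)


class Cell(Monomer):
    n_cells = 0  # class-level counter, also used as the next id

    def __init__(self, n_bindings):
        super(Cell, self).__init__(n_bindings, Cell.n_cells)
        Cell.n_cells += 1


class Protein(Monomer):
    n_proteins = 0  # class-level counter, also used as the next id

    def __init__(self, n_bindings):
        super(Protein, self).__init__(n_bindings, Protein.n_proteins)
        Protein.n_proteins += 1


# one cell with two CBS and one protein with a single NPBS
cell = Cell(2)
protein = Protein(1)
next(protein.free_bindings).bind(next(cell.free_bindings))
free_cbs = len(list(cell.free_bindings))
print(free_cbs)
```

In this sketch free_bindings is a property that builds a new iterator on every access, so it can be appended to a list as well as queried for the next free site.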
Now we define a couple of functions to set up an experiment.
Functions to initialize an experiment
def init_monomers(n_cells, n_proteins, n_cell_binding = 30, n_protein_binding = 4):
    '''instantiates a certain number of monomers'''
    # reset the counter so that the cell ids start at 0 again
    Cell.n_cells=0
    cells=[]
    for i in range(n_cells):
        cells.append(Cell(n_cell_binding))
    # same for the proteins
    Protein.n_proteins=0
    proteins=[]
    for i in range(n_proteins):
        proteins.append(Protein(n_protein_binding))
    return cells, proteins
After we set our monomers in place we need to model in some way how they interact with each other. The following "polymerize" function does so in a very basic fashion:
- make a list of the free binding sites of the NP as well as of the cells
- while there are still free binding sites of both kinds, choose one of each at random and let them bind together
def polymerize(cells, proteins):
    '''polymerizes cells and proteins in a simple way'''
    # collect all free binding sites of both kinds
    free_cell_bindings = []
    for c in cells:
        free_cell_bindings += c.free_bindings
    free_protein_bindings = []
    for p in proteins:
        free_protein_bindings += p.free_bindings
    # bind random pairs until one kind of binding site is used up
    while len(free_protein_bindings)>0 and len(free_cell_bindings)>0:
        p_binding_idx = np.random.randint(len(free_protein_bindings))
        p_binding = free_protein_bindings.pop(p_binding_idx)
        c_binding_idx = np.random.randint(len(free_cell_bindings))
        c_binding = free_cell_bindings.pop(c_binding_idx)
        p_binding.bind(c_binding)
    return len(free_protein_bindings) , len(free_cell_bindings)
As described in the text above, this simulation of polymerization was too simplified. The results we got from this model were not very surprising and didn't help us much with optimizing our process.
So next we implemented a more sophisticated way of simulating the polymerization process. The following function takes some basic spatial and temporal effects into account.
def polymerize_spacial(cells, proteins, n_protein_binding, concentration=0.01, reactivity=0.2, mobility=0.05, max_time=10.):
    '''simulates polymerization in a more sophisticated way'''
    time = 0.
    position_variance = 1.
    # gaussian-shaped weight of a distance x; it always uses the current value
    # of position_variance and gets normalized further below
    distance_binding_probability = lambda x: np.exp(-x ** 2 / position_variance ** 2) / (
        2 * np.pi * position_variance ** 2)
    free_protein_bindings = []
    for p in proteins:
        free_protein_bindings += p.free_bindings
    while (time < max_time):
        # at each timestep a certain fraction of the free binding sites will have a chance to react
        # this fraction is depending on the defined "reactivity" parameter
        for i in range(int(np.ceil(reactivity * len(free_protein_bindings)))):
            # choose a random NPBS
            p_binding_idx = np.random.randint(len(free_protein_bindings))
            p_binding = free_protein_bindings[p_binding_idx]
            # do tests on parent protein
            p = p_binding.parent
            # all monomers get assigned an individual increasing id at creation
            # we use this id to calculate a position by assuming
            # that at creation the cells are put on a line with equal distance to each other
            # the lower the concentration, the higher the distance
            p_pos = float(p.id) / (len(proteins) * concentration)
            # (float) idx of nearest cell
            c_id_mean = p_pos * len(cells) * concentration
            # max distance of reachable cells from nearest cell
            max_range = position_variance * 3.2
            c_id_range = max_range * len(cells) * concentration
            c_id_min = max(int(c_id_mean - c_id_range), 0)
            c_id_max = min(int(c_id_mean + c_id_range), len(cells))
            # now we can calculate the cells in range
            candidate_cells = cells[c_id_min:c_id_max]
            if len(candidate_cells) == 0:
                # if there are no cells in reach we are done here and pick the next random NPBS to try again
                continue
            # otherwise we go on
            # now depending on the individual cells distance to the NP
            # we calculate the probability of each candidate cell to bind as a gaussian pdf of the distance
            min_distance = max(p_pos - position_variance * 3.2, 0) - p_pos
            max_distance = min(-p_pos + position_variance * 3.2,
                               len(proteins) - 1 / len(proteins) / concentration) + p_pos
            probabilities = np.arange(min_distance, max_distance,
                                      (max_distance - min_distance) / len(candidate_cells))[:len(candidate_cells)]
            probabilities = distance_binding_probability(probabilities)
            probabilities /= sum(probabilities)  # normalization
            # this call picks a random cell based on the calculated probabilities
            cell = np.random.choice(candidate_cells, p=probabilities)
            # check if the chosen cell has actually still free binding sites
            # (the builtin next() works with Python 2.6+ and Python 3)
            try:
                next(cell.free_bindings)
            except StopIteration:
                # no free bindings left in this cell. back to the top.
                continue
            # now we find out which cells the NP in question is already bound to
            protein_partners = list(set([b.partner.parent for b in p.bindings if b.partner is not None]))
            # this way we know how many different cells it's bound to
            if len(protein_partners) == 1:
                # if a protein is already attached to a cell it will most likely not bind to other cells
                # but there is a chance of 10%
                if cell not in protein_partners and np.random.rand()<0.9:
                    continue
            if len(protein_partners)>1 and cell not in protein_partners:
                # if a protein was binding to two cells already, it will not bind to any other
                continue
            # if we made it this far without running into any of the "continue"s we are ready to bind!
            # so the next free binding of the cell will react to our NPBS
            next(cell.free_bindings).bind(p_binding)
            # and the NPBS is removed from our binding sites dating database
            free_protein_bindings.pop(p_binding_idx)
        # after all reactive NPBS tried to react we move everything a bit
        # depending on the mobility parameter and increment the time
        position_variance += mobility
        time += 0.01
    # in the end we calculate the number of free CBS. Just to know about it.
    free_cell_bindings = []
    for c in cells:
        free_cell_bindings += c.free_bindings
    return len(free_protein_bindings), len(free_cell_bindings)
Analysis
Now for analyzing the generated polymer networks we import another library called graph_tool.
from graph_tool.all import *
from graph_tool import clustering,stats
The following two functions translate the network (or graph) that we built before into graph_tool graphs. The first one draws a node for each cell. Then it draws a connecting edge between two nodes if the respective cells are connected via any NP.
The second one draws a big node for each cell and a small one for each NP. Edges represent a binding between a NP and a connected cell.
We will clarify the difference between the two graphing methods graphically after the two function definitions.
def cell_graph(cells,proteins,draw=False):
    '''vertices are cells, cells connected via a protein are connected in the graph with an edge'''
    g = Graph(directed=False)
    vertices = list(g.add_vertex(len(cells)))
    for c_id in range(len(cells)):
        for c_b in cells[c_id].bindings:
            try:
                # walk over all binding sites of the protein bound to this CBS
                for p_b in c_b.partner.parent.bindings:
                    try:
                        c2_id = p_b.partner.parent.id
                        if not c_id==c2_id:
                            g.add_edge(vertices[c_id],vertices[c2_id])
                    except AttributeError:
                        # this binding site of the protein is still free
                        pass
            except AttributeError:
                # this CBS is still free
                pass
    if draw:
        graph_draw(g, vertex_text=g.vertex_index, edge_pen_width=2 , vertex_font_size=30, output="/tmp/graph.png",
                   output_size=(2000, 2000))
    return g
def pc_graph(cells, proteins, draw=False):
    '''vertices are cells (big) and proteins (small), edges are bindings between them'''
    g = Graph(directed=False)
    v_size = g.new_vertex_property("int")
    for cell in cells:
        cell.vertex = g.add_vertex()
        v_size[cell.vertex] = 120
    for p in proteins:
        p.vertex=g.add_vertex()
        v_size[p.vertex] = 40
        for b in p.bindings:
            # cell connections
            if b.partner is not None:
                cell_vertex = b.partner.parent.vertex
                if g.edge(p.vertex,cell_vertex) is None:  # only one connection per pair
                    g.add_edge(p.vertex,cell_vertex)
    if draw:
        # note: this layout is computed but not passed on to graph_draw
        pos=fruchterman_reingold_layout(g)
        graph_draw(g, edge_pen_width=2, vertex_text=g.vertex_index, vertex_size=v_size, vertex_font_size=15, output_size=(2000, 2000))
    return g
To make the difference clearer, let's just run an experiment and visualize the result in both ways.
cells, proteins = init_monomers(n_cells=30, n_proteins=200, n_cell_binding = 300, n_protein_binding = 4)
polymerize_spacial(cells, proteins, n_protein_binding=4,
concentration=0.02, reactivity=0.1, mobility=0.05, max_time=3.0)
The experiment is done. Now let's visualize it
cell_graph(cells,proteins,draw=True)
And now the same experimental result with the other visualization
pc_graph(cells,proteins,draw=True)
The second graph shows not only how the individual cells are interconnected but also how many NP are "lost" by just sticking to a single cell without forming a connection to others. Furthermore, one can see how many different proteins connect two cells.
More experiments
For a first feeling of how the parameters of the polymerization function influence the result, it's helpful just to try out a few combinations and look at the visualization of the graph.
To make this task a bit easier we define a small helper function that initializes the monomers, polymerizes them and generates the graph.
def random_graph(graph_method, n_cells, n_proteins, n_cell_binding = 30, n_protein_binding = 4,concentration=0.02,reactivity=0.01,mobility=0.1,max_time=2.):
    # cells,proteins=init_monomers(n_cells, n_proteins, n_cell_binding, n_protein_binding)
    # polymerize(cells,proteins)
    cells,proteins=init_monomers(n_cells, n_proteins, n_cell_binding = n_cell_binding, n_protein_binding = n_protein_binding)
    polymerize_spacial(cells, proteins, n_protein_binding=n_protein_binding, concentration=concentration, reactivity=reactivity, mobility=mobility, max_time=max_time)
    # "draw" is read from the global variable defined below
    return graph_method(cells,proteins,draw),cells,proteins
As long as we want to see a rendered result we just switch the draw variable to True:
draw = True
Now we can build a random graph by plugging in some numbers and running a single line of code. (The "_=" just suppresses text output)