Inference of spatiotemporal effects on cellular state transitions from time-lapse microscopy

Background: Time-lapse microscopy allows one to monitor cell state transitions in a spatiotemporal context. Combined with single-cell tracking and appropriate cell state markers, transition events can be observed within the genealogical relationships of a proliferating population. However, inferring the correlations between the spatiotemporal context and cell state transitions requires statistical analysis with an appropriately large number of samples. Results: Here, we present a method to infer spatiotemporal features predictive of the state transition events observed in time-lapse microscopy data. We first formulate a generative model, simulate different scenarios, such as time-dependent or local cell-density-dependent transitions, and illustrate how to estimate univariate transition rates. Second, we formulate the problem in a machine-learning language using regularized linear models. This allows for a multivariate analysis and disentangles indirect dependencies via feature selection. We find that our method can accurately recover the relevant features and reconstruct the underlying interaction kernels if a critical number of samples is available. Finally, we explicitly use the tree structure of the data to validate whether the estimated model is sufficient to explain correlated transition events of sister cells. Conclusions: Using synthetic cellular genealogies, we show that our method correctly identifies features predictive of state transitions, and we moreover validate the chosen model. Our approach allows one to estimate the number of cellular genealogies required for the proposed spatiotemporal statistical analysis; we thus provide an important tool for the experimental design of challenging single-cell time-lapse microscopy assays. Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0208-5) contains supplementary material, which is available to authorized users.


The master equations of an interacting population of dividing cells
We consider a population of cells, each defined via its position x ∈ ℝ², its state Y ∈ {0, 1} and its age τ ∈ ℝ⁺. Cells change state Y with rate λ(F), where F are the cell's features (cell density, age, etc.), and divide with age-dependent rate γ(τ). First, let us describe how the distribution of a single cell evolves:

Ṗ(x, Y, τ, t) = ∇²P(x, Y, τ, t) − (∂/∂τ)P(x, Y, τ, t) − δ_{Y,0} λ(F) P(x, Y, τ, t) + δ_{Y,1} λ(F) P(x, 0, τ, t) − γ(τ) P(x, Y, τ, t),

where δ_{n,m} is the Kronecker delta and ∇² = ∂²/∂x₁² + ∂²/∂x₂². The first term on the right-hand side accounts for spatial diffusion of the cell in and out of location x, the second term accounts for aging, the third term accounts for cells transitioning out of state A (Y = 0) and the fourth term describes cells transitioning into state B (Y = 1). The last term accounts for the loss of the single cell due to cell division giving rise to a pair of cells.
When considering pairs of cells, we must describe the evolution of their joint distribution P₂(x₁, Y₁, τ₁, x₂, Y₂, τ₂, t):

Ṗ₂ = (∇₁² + ∇₂²)P₂ − (∂/∂τ₁ + ∂/∂τ₂)P₂
  − [δ_{Y₁,0} λ(F₁) + δ_{Y₂,0} λ(F₂)] P₂
  + δ_{Y₁,1} λ(F₁) P₂(x₁, 0, τ₁, x₂, Y₂, τ₂, t)
  + δ_{Y₂,1} λ(F₂) P₂(x₁, Y₁, τ₁, x₂, 0, τ₂, t)
  + δ(x₁ − x₂) δ(τ₁) δ(τ₂) δ_{Y₁,Y₂} ∫ γ(τ) P(x₁, Y₁, τ, t) dτ
  − [γ(τ₁) + γ(τ₂)] P₂.

The first line represents diffusion in space and drift in age. The second line corresponds to loss due to a state change out of state A, where the rate λ depends on the features F_i of cell i, which are functions of potentially all system variables (e.g. cell locations when representing cell density). The third line accounts for a cell in state A (Y₁ = 0) at position x₁ that transitions into state B (the fourth line is analogous for the second cell). The fifth line includes the gain of probability due to a division event in the single-cell equation creating two cells of age 0. This term gives rise to the coupling of the equations. For simplicity we assumed that a division event at location x gives rise to two cells both located at x as well. Finally, the last term models loss due to either cell 1 or cell 2 dividing. The equations for higher numbers of cells (triplets, etc.) become more and more complex, as one has to deal with the arising symmetries (see e.g. Dodd and Ferguson [1]).
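The generative model described by these equations can be illustrated with a minimal agent-based sketch. The following Python fragment (all parameter choices — the rates γ and λ, the time step, the neighbourhood radius — are hypothetical, chosen only for illustration) simulates cells that diffuse, age, divide with an age-dependent rate γ(τ) into two daughters at the mother's position, and switch from state A to state B with a local-density-dependent rate λ(F):

```python
import math, random

random.seed(0)
dt = 0.1                                         # time step of the discrete approximation
gamma = lambda tau: 0.3 if tau > 1.0 else 0.0    # hypothetical age-dependent division rate γ(τ)
lam = lambda dens: 0.02 + 0.05 * dens            # hypothetical density-dependent transition rate λ(F)

cells = [[0.0, 0.0, 0, 0.0]]                     # each cell: [x, y, state Y (0 = A, 1 = B), age τ]
for _ in range(200):
    new = []
    for c in cells:
        x, y, s, tau = c
        # feature F: local cell density, i.e. number of neighbours within unit radius
        dens = sum(1 for o in cells if o is not c and (o[0] - x) ** 2 + (o[1] - y) ** 2 < 1.0)
        if s == 0 and random.random() < 1 - math.exp(-lam(dens) * dt):
            s = 1                                # state transition A -> B with rate λ(F)
        if random.random() < 1 - math.exp(-gamma(tau) * dt):
            # division: two daughters of age 0, both located at the mother's position
            new += [[x, y, s, 0.0], [x, y, s, 0.0]]
        else:
            step = math.sqrt(2 * dt)             # diffusion step for unit diffusion constant
            new.append([x + random.gauss(0, step), y + random.gauss(0, step), s, tau + dt])
    cells = new
```

Running such a simulation forward and recording division and transition events yields exactly the kind of synthetic cellular genealogies analysed in the main text.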

Credibility intervals for proportions from counting
When estimating the probability P(t, t + dt | F_i(t)) in Equation (9) of the main text, we in fact estimate the parameter of a binomial distribution: we observed that k successes (events) occurred out of n trials, and we are interested in the success probability p. Clearly, k follows a binomial distribution with parameters n and p:

P(k | n, p) = (n choose k) · p^k (1 − p)^(n−k).

Having observed a particular k and n, we want to infer the parameter p of the underlying binomial distribution. In a maximum likelihood setting, one can show that this is just p̂ = k/n. Confidence intervals for this maximum likelihood estimator can be constructed according to various methods (Wald, Wilson, or Clopper–Pearson confidence intervals [3]).
We invoke a Bayesian approach instead: using an uninformative conjugate prior for p (the Beta(1/2, 1/2) distribution known as Jeffreys prior, see [2]), we get the following posterior distribution for p:

p | k, n ∼ Beta(k + 1/2, n − k + 1/2).   (1)

If we consider the posterior mean p̂, we find

p̂ = (k + 1/2) / (n + 1),

which is approximately the same as the maximum likelihood estimate k/n if k and n are large. We obtain credibility intervals by calculating the 5% and 95% quantiles of the posterior in Equation (1). A particularly appealing property of Bayesian credibility intervals is that they lie strictly within [0, 1], unlike their frequentist counterparts, which are not constrained to the [0, 1] domain of probabilities. This is especially relevant if the estimated probability itself is close to 0 or 1.
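As an illustration, the quantiles of the posterior in Equation (1) can be computed numerically. The sketch below (plain Python; the counts k and n are hypothetical, and a grid-based integration of the Beta density stands in for a library quantile function) returns the 90% credibility interval:

```python
import math

def jeffreys_interval(k, n, lo_q=0.05, hi_q=0.95, grid=200000):
    """Quantiles of the Beta(k + 1/2, n - k + 1/2) posterior via numerical integration."""
    a, b = k + 0.5, n - k + 0.5
    # un-normalized posterior density evaluated at grid midpoints
    ps = [(i + 0.5) / grid for i in range(grid)]
    w = [math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p)) for p in ps]
    total = sum(w)
    cdf, lo, hi = 0.0, None, None
    for p, wi in zip(ps, w):
        cdf += wi / total
        if lo is None and cdf >= lo_q:
            lo = p
        if hi is None and cdf >= hi_q:
            hi = p
            break
    return lo, hi

# hypothetical counts: 3 transition events among 50 observed cell intervals
lo, hi = jeffreys_interval(k=3, n=50)
post_mean = (3 + 0.5) / (50 + 1)   # posterior mean from Equation (1)
```

Note that both interval endpoints are strictly inside (0, 1) by construction, in contrast to e.g. a Wald interval, which for small k can extend below 0.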

Independence of samples
For the machine-learning analysis, we rely on the fact that the individual samples are independent. In the following we show why this is indeed the case. Intuitively, one might think that, for example, two samples generated from the same cell but at different timepoints are not independent: not only are their features strongly correlated (e.g. the spatial position changes only slightly with time, so the later sample will be at a position similar to the earlier sample), but the class labels also appear dependent. For example, knowing that the later sample is undifferentiated immediately implies that the earlier sample is undifferentiated as well. However, this is not a problem. Consider, for example, a single cell that is created at time t₀ and differentiates at time t_N. The likelihood of this whole cell in terms of the hazard λ is

L = λ(t_N) · exp(−∫_{t₀}^{t_N} λ(s) ds).   (2)

If we however decide to split this cell into N individual observations O_i on the intervals [t_{i−1}, t_i] (as we do in the main text), we get the following likelihood for each observation:

L_i = exp(−∫_{t_{i−1}}^{t_i} λ(s) ds)  for i < N (no event),  L_N = λ(t_N) · exp(−∫_{t_{N−1}}^{t_N} λ(s) ds)  (event).

Assuming independence, the overall likelihood is

∏_{i=1}^{N} L_i = λ(t_N) · exp(−∫_{t₀}^{t_N} λ(s) ds).

Comparing this expression to Equation (2), we find that they are actually the same. Therefore, treating the samples as independent leaves the likelihood unchanged, and the assumption of independence is justified.
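This factorization can be checked numerically. In the sketch below (plain Python; the time-dependent hazard λ and the observation grid are hypothetical), the whole-cell likelihood of Equation (2) and the product of the per-observation likelihoods agree to numerical precision:

```python
import math

def integral(f, a, b, n=2000):
    """Trapezoidal approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

lam = lambda s: 0.1 + 0.05 * s          # hypothetical time-dependent hazard λ(s)
t = [0.0, 0.7, 1.5, 2.2, 3.0]           # observation times t_0 .. t_N; the event occurs at t_N

# whole-cell likelihood, Equation (2): λ(t_N) · exp(-∫_{t_0}^{t_N} λ(s) ds)
L_whole = lam(t[-1]) * math.exp(-integral(lam, t[0], t[-1]))

# product of per-observation likelihoods: survival on each interval, event on the last one
L_split = lam(t[-1])
for a, b in zip(t[:-1], t[1:]):
    L_split *= math.exp(-integral(lam, a, b))
```

Because the exponents add up over the subintervals, the two quantities are identical regardless of how finely the cell's lifetime is split into observations.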

Log-binomial regression and its approximation via Poisson regression
In our simulations, a single sample (Y^(i), F^(i)) is created according to the hazard model: given the features F^(i), no transition occurs within [t, t + Δt] with probability exp(−λ(F^(i))Δt), where λ(F) = exp(w^T F). If we switch class labels such that Y^(i) = 1 denotes the (rare) transition event, we see that the generative model corresponds to a GLM with a Bernoulli distribution and an exponential mean function: for small Δt,

P(Y^(i) = 1 | F^(i)) = 1 − exp(−λ(F^(i))Δt) ≈ λ(F^(i))Δt = exp(w^T F^(i) + log Δt).   (3)

This GLM is known as the log-binomial model and suffers from convergence issues, as it allows probabilities greater than 1 (if the exponent in Equation (3) becomes positive). Hence, we choose to approximate the Bernoulli distribution by a Poisson distribution and keep the desired exponential mean function. This GLM is also known as Poisson regression [4]. Since the probability mass functions of the Bernoulli and the Poisson distribution are very similar if p ≪ 1, this provides a good approximation to the log-binomial model in the case of rare events. The log-likelihood of the model (Equation (11) in the main text) for data (Y, F) and weights w is obtained by plugging in the mass function of a Poisson distribution with mean μ_i = exp(w^T F^(i))Δt:

ℓ(w) = Σ_i [ Y^(i) log μ_i − μ_i − log(Y^(i)!) ].

Since Y^(i) ∈ {0, 1}, the factorial term vanishes and this simplifies to

ℓ(w) = Σ_i [ Y^(i) (w^T F^(i) + log Δt) − exp(w^T F^(i))Δt ].

Note that in the main text (Equation 11), for simplicity the Δt-dependent constants are absorbed into the weights.
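The Poisson-regression approximation can be sketched as follows (plain Python; the weights, the single scalar feature plus bias, and the sample size are all hypothetical, and the Δt constant is taken as absorbed into the bias weight). Newton–Raphson maximizes the simplified log-likelihood and approximately recovers the generating weights:

```python
import math, random

random.seed(0)
w_true = [-4.0, 1.5]                       # hypothetical generating weights [bias, slope]
data = []
for _ in range(10000):
    f = [1.0, random.random()]             # bias feature and one scalar feature
    p = math.exp(sum(wt * fi for wt, fi in zip(w_true, f)))   # rare-event probability
    data.append((1 if random.random() < p else 0, f))

# Newton-Raphson on the Poisson log-likelihood  l(w) = sum_i [Y_i w.F_i - exp(w.F_i)]
w = [0.0, 0.0]
for _ in range(25):
    g = [0.0, 0.0]                          # gradient:  sum_i (Y_i - mu_i) F_i
    H = [[0.0, 0.0], [0.0, 0.0]]            # Hessian:  -sum_i mu_i F_i F_i^T
    for y, f in data:
        mu = math.exp(w[0] * f[0] + w[1] * f[1])
        for j in range(2):
            g[j] += (y - mu) * f[j]
            for k in range(2):
                H[j][k] -= mu * f[j] * f[k]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Newton step: w <- w - H^{-1} g  (2x2 solve in closed form)
    w = [w[0] - (H[1][1] * g[0] - H[0][1] * g[1]) / det,
         w[1] - (H[0][0] * g[1] - H[1][0] * g[0]) / det]
```

Because the log-likelihood is concave in w, the iteration converges to the unique maximum; unlike the log-binomial fit, it never stalls at the boundary of the admissible parameter region.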

Event probabilities and rates
Given a transition rate λ(t, F (t)) which depends on other variables F (t), what is the probability (referred to as P (t) in the main text) of a state transition happening in the interval [t, t + τ ]?
The number N_t^{t+τ} of transition events happening in a single cell in [t, t + τ] obeys an inhomogeneous Poisson process with time-dependent rate λ(t, F(t)), i.e. it has probability mass function

P(N_t^{t+τ} = k) = (m(t, τ))^k / k! · exp(−m(t, τ)),  with  m(t, τ) = ∫_t^{t+τ} λ(s, F(s)) ds.

From this it follows that the probability (termed P(t) in the main text) of one or more events happening in [t, t + τ] is

P(N_t^{t+τ} ≥ 1) = 1 − exp(−m(t, τ)).

Note that in our model only one state transition can occur (i.e. once the cell has changed its state from I to II, it cannot undergo the transition ever again), but the above formula remains valid, since the probability of more than one event happening in [t, t + τ] is of order m²(t, τ) and hence negligible for small τ.
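The relation P(t) = 1 − exp(−m(t, τ)) can be verified by simulation. The sketch below (plain Python; the sinusoidal rate is hypothetical) draws event times from the inhomogeneous Poisson process by thinning and compares the empirical frequency of "one or more events" with the closed-form probability:

```python
import math, random

random.seed(1)
lam = lambda s: 0.5 + 0.4 * math.sin(s)   # hypothetical time-dependent rate λ(s)
t, tau = 0.0, 2.0
lam_max = 0.9                             # upper bound on λ over [t, t + τ]

def one_or_more_events():
    """Thinning: candidate points at rate lam_max, each accepted with prob λ(s)/lam_max."""
    s = t
    while True:
        s += random.expovariate(lam_max)
        if s > t + tau:
            return False
        if random.random() < lam(s) / lam_max:
            return True

trials = 50000
freq = sum(one_or_more_events() for _ in range(trials)) / trials

# m(t, τ) = ∫_t^{t+τ} λ(s) ds, here available in closed form
m = 0.5 * tau + 0.4 * (math.cos(t) - math.cos(t + tau))
p = 1 - math.exp(-m)                      # P(N ≥ 1) = 1 - exp(-m(t, τ))
```

The empirical frequency agrees with 1 − exp(−m(t, τ)) up to Monte-Carlo error, confirming that only m(t, τ), not the detailed shape of λ, determines the event probability.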
[Figure legend fragment; axes: time vs. time to next division; cf. Figure 4 of the main text.] C) The inclusion probabilities for each feature show that despite the strongly nonlinear transition rate, the method can identify all relevant density features and includes only one irrelevant feature (φ₇) in the model.