 Research Article
 Open Access
 Published:
Solution of the chemical master equation by radial basis functions approximation with interface tracking
BMC Systems Biology volume 9, Article number: 67 (2015)
Abstract
Background
The chemical master equation is the fundamental equation of stochastic chemical kinetics. This differentialdifference equation describes temporal evolution of the probability density function for states of a chemical system. A state of the system, usually encoded as a vector, represents the number of entities or copy numbers of interacting species, which are changing according to a list of possible reactions. It is often the case, especially when the state vector is highdimensional, that the number of possible states the system may occupy is too large to be handled computationally. One way to get around this problem is to consider only those states that are associated with probabilities that are greater than a certain threshold level.
Results
We introduce an algorithm that significantly reduces computational resources and is especially powerful when dealing with multimodal distributions. The algorithm is built according to two key principles. Firstly, when performing time integration, the algorithm keeps track of the subset of states with significant probabilities (essential support). Secondly, the probability distribution that solves the equation is parametrised with a small number of coefficients using collocation on Gaussian radial basis functions. The system of basis functions is chosen in such a way that the solution is approximated only on the essential support instead of the whole state space.
Discussion
In order to demonstrate the effectiveness of the method, we consider four application examples: a) the selfregulating gene model, b) the 2dimensional bistable toggle switch, c) a generalisation of the bistable switch to a 3dimensional tristable problem, and d) a 3dimensional cell differentiation model that, depending on parameter values, may operate in bistable or tristable modes. In all multidimensional examples the manifold containing the system states with significant probabilities undergoes drastic transformations over time. This fact makes the examples especially challenging for numerical methods.
Conclusions
The proposed method is a new numerical approach permitting to approximately solve a wide range of problems that have been hard to tackle until now. A full representation of multidimensional distributions is recovered. The method is especially attractive when dealing with models that yield solutions of a complex structure, for instance, featuring multistability.
Background
Temporal evolution of biological systems is often driven by the interaction between different types of particles which, depending on the applications, can represent molecules, bacterias, animals, or other discrete units. In nature, in nearly every process, particle numbers are subject to random fluctuations caused by inherent stochastic noise. Simulations of such systems are usually based on Monte Carlo (MC) simulations of the underlying Markov jump processes, such as Gillespie’s famous stochastic simulation algorithm (SSA) [1]. These methods share some common disadvantages: there is always a sampling error that, in general, is difficult to estimate; the convergence can be quite slow too. Even computing single realisations can be quite costly if many fast reactions are present; therefore approximate MC methods like τleaping [2], averaging approaches [3, 4], and deterministicstochastic hybrid formulations [5–7] have been introduced. The applicability of these approaches depends on the existence of a permanent timescale gap that allows to clearly distinguish between fast and slow reactions.
An alternative approach is to directly compute the probability density function (PDF) as a solution of the chemical master equation (CME). Solving the CME numerically on a large state space with a huge number of unknowns is known to be difficult [8]. Various hybrid methods were proposed to cope with the curse of dimensionality, a phenomenon that refers to the rapidly increasing number of unknowns when parametrising a multidimensional system [9–11].
However, in many cases the probability distribution has ‘significant’ values only on a very small portion of the whole state space. Here, ‘significant’ means being distinguishable from zero and refers to a value that is larger than a predefined small tolerance. This fact has motivated an exploration towards special numerical methods that exploit this feature. For example, Deuflhard et al. (twodimensional case) [12] and Cotter et al. (threedimensional case) [13] applied sophisticated adaptive finite element methods to solve the CME, but their approach is limited to lowdimensional problems. To cope with multidimensional problems, methods based on truncation of the CME to finite state space have been developed, such as the Finite State Projection (FSP) method [14–16] or the finite buffer discrete chemical master equation (dCME) [17, 18]. Based on the FSP, Kazeev et al. used Quantized Tensor Trains for a direct solution of the CME [19].
A very different approach has been taken by Wolf et al. [20], who suggested an algorithm defining a rectangular window in the state space, enclosing the essential part of the distribution that allows to perform parametrisation of the distribution with a small number of parameters. In the current paper we will try to take the latter idea even further. Firstly, we consider cases that are not restricted to one or two dimensions but try to develop a general approach. Secondly, we allow an arbitrary shape of the ‘window’ by considering a manifold that contains system states with probabilities greater than a predefined threshold. Thirdly, we employ the projection on Gaussian basis functions (GBF) to further reduce the computational costs.
The concept of GBF approximation has formerly been applied to various problems in polymer chemistry [21–23] and colloidal physics [24]. In this paper it is extended to the CME. To account for the fact that at a specific time point only a small part of the system states has to be considered, the system of basis functions is adapted on every time step. The idea behind the adaptation is that the unknown distribution is parametrised using only those GBFs that contribute to probability values that are significantly greater than zero. This procedure allows for a smart approximation with a very low number of approximation parameters even in the case of multidimensional distributions. The total number of parameters is not constant in time but changes according the distribution’s complexity. This approach also allows to capture multimodal cases where a timedependent process leads to splitting/merging of a few disjoint parts of the distribution. One example for such a process is the genetic toggle switch model that typically leads to multistable solutions [25]. Even a more comprehensive behaviour can be observed in CMEs that model cell differentiation.
Mesenchymal stem cells (MSCs) are multipotent stromal cells that can differentiate into a variety of cell types, including osteoblasts (bone cells) and chondrocytes (cartilage cells). When derived form adults, one of the applications is related to transplantation, namely either to promote regeneration of diseased or damaged tissue or to rescue defective genes [26]. Foster et al. developed a mathematical model for cell differentiation that predicts presence of multiple stable states for differentiated cells, bifurcations and switchlike transitions [27, 28]. Later, Schittler et al. expanded the model to include the progenitor state and studied the system of binary differentiation with respect to various stimuli [29]. Despite an advanced level of the mathematical description, models presented in Refs. [26, 29] recover single trajectories for the evolution of biological systems, while realistic systems of that kind are known to be composed of a whole population of cells. In theory, the transition from an ordinary differential equation model, that results in single trajectories, to a CME, that describes the evolution for the whole population of cells, is a matter of pure formalism. However, it is the complexity of algorithms one has to cope with when solving the equations numerically, that kept researchers out from the full, threedimensional solution to the CME problem until now.
Methods
Suppose that the evolution of d species interacting via K reaction channels is described by a Markov jump process on the state space \(\Omega =\mathbb {N}^{d}\), whereby \(\mathbb {N}\) denotes the set of all nonnegative integers including zero. The entry \(X_{i}(t) \in \mathbb {N}\) of a realisation \(X(t)\in \mathbb {N}^{d}\) is the number of particles of species i at time t. Our goal is to compute the distribution u(x), x∈Ω, the probability that there are exactly x _{ i } particles of the i ^{th} species, i=1,…,d. In particular we are interested in the time dynamics of the distribution u(x,t),
The distribution u evolves according to the CME [8]
where \(\mathcal {L}: L \rightarrow L\) denotes the operator,
In Eq. (3), \(\nu _{i} \in \mathbb {Z}^{d}\) denotes the stoichiometric vector that defines jumps to new states x+ν _{ i } via the i ^{th} reaction channel. The xdependent coefficients a _{ i }(x) indicate the i ^{th} propensity function.
Let \(T_{k},T_{k}^{\text { }}:L \rightarrow L\) be shift operators that act along the k ^{th} dimension of L,
Let us define the n ^{th}power for operators (4) as an nfolded composition
where I is the identity operator. Now, the CME operator (3) can be rewritten as
Here, ν _{ i,k } denotes the k ^{th} component of the stoichiometric vector ν _{ i }, and W _{ a }:L→L is a multiplicative operator that takes the probability distribution u(x) to its weighted form a(x)u(x). The representation (6) is especially convenient when implementing the approximation technique.
Results
Let S⊂Ω be a fixed and enumerated system of points \(\{x^{i}\}_{i=1,\ldots,n},\; x^{i}=\left ({x_{1}^{i}},\ldots,{x_{d}^{i}}\right)\). Here the lower index denotes dimension, the upper index is a counter. Each point x ^{i}∈S has one radial basis function, ϕ _{ i }(x), associated,
where the choice of the connectivity parameters \({\sigma ^{i}_{k}}\) is a tradeoff between coverage of the whole Ω and sufficiently small condition number of the following matrix,
The approximate solution \(\tilde u(x,t)\) to the CME (3) is searched in the form
The matrix A is composed in such a way that the value of approximation sum (9) at points x ^{i} can be simply written as a column vector: \(\tilde u(x^{i},t) = A\boldsymbol \alpha \). Multiplying the last expression with A ^{−1} on the left yields, \(\boldsymbol \alpha = A^{1}\tilde u(x^{i},t).\) Thus A is an interpolation matrix. It is easy to see that the CME is a linear differential equation and the discretization \(\tilde {\mathcal {L}}:\mathbf {R}^{n}\rightarrow \mathbf {R}^{n}\) to the CME operator (6) can be directly used to implement a collocation scheme on nodes x ^{i},
Here, \(\tilde {T_{k}}\) is an approximated shift operator that, together with its powers, is defined analogously to (5) using \(\tilde {T},\,\tilde {T}^{}\),
and \(\tilde {W}_{a}\) is the approximation to multiplicative operator
Even though. the matrix exponentiation is used in (10), there are sufficiently fast algorithms that compute a matrix exponent with up to machine precision in comparably short time, e.g. [30]. Consequently, the error is predominantly introduced by the choice of the discretization nodes x ^{i}.
The essential support is defined as those states x∈Ω for which u(x)≥0 is greater than a certain threshold,
It is often a case that the probability density u(x,t) at time t has a small essential support not only when comparing with the whole state space Ω, but also when comparing with the union of all essential supports over the whole period of time,
where \(\mu :\mathbb {R}^{d}\rightarrow [\!0,\infty)\) is a Lebesgue measure. Although for a fixed system of basis functions the matrix exponentiation (10) provides the exact solution avoiding time discretisation at all, the condition (14) motivates to consider the CME on sequential time steps, not as means of time approximation, but as a way of economy of computational resources. Indeed, if the system of basis functions (7) is chosen to correspond to the current location of the essential support, the total number of discretisation coefficients α _{ i }(t) will be small.
In order to associate a set of essential supports with a boundary line, the concept of the αhull presented in Ref. [31] is employed. The alpha hull S _{ α } is a generalisation of the convex hull of a finite set of points S. It provides a possibility to associate the signed distance function with a set of points as illustrated in Fig. 1. A distance operator D:S→L takes a set of points from Ω to the signed distance function
where x−y is an Euclidean distance. The operation (15) is reversible: by knowing a signed distance function d(x)=D(S) one may recover the set of points S,
Equations (15) and (16) allow applying the usual numerical toolbox, that is well defined on functions, to essential support sets. Various transformations of the signed distance function lead to changes in the set of points. For instance, let d _{1}(x) and d _{2}(x) be signed distance functions, then d(x)= min(d _{1}(x),d _{2}(x)) is the signed distance function representing the union of the interior regions [32].
Let us assume that the essential support evolves continuously in time. This means that there exists a finite vector valued function v(x,t)<∞ that defines the normal direction speed for the essential support boundary. Moreover, v(x,t) also defines the evolution of the corresponding timecontinuous distance function d(x,t) in terms of the levelset equation [32],
where · is an Euclidian norm. Suppose that the time interval [ 0,t _{ end }] is divided into subintervals with k ordered time points
As the pairwise distances between three sequential time points t _{ i−1},t _{ i },t _{ i+1} approaches zero, the time variation of the speed v(x,t) vanishes as well,
Writing the levelset Eq. (17) for time point t _{ i } using left and right finite differences to replace the time differentiation yields,
The last equation can be directly used to extrapolate the essential support essup{u(x,t _{ i+1})}, provided the probability density u(x,t) is known in time points t _{ i−1} and t _{ i },
The estimation provides a possibility to approximate the actual density u(x,t _{ i+1}) with the numerical scheme (710) involving nodes exclusively from
Here the parameter γ>0 extends the subdomain as there should be a sufficient layer of basis functions around the essential support. This is necessary to interpolate the density u(x,t) on the boundary. Having the approximation \(\tilde {u}(x,t_{i+1})\) in turn permits to compute \(\text {essup}\{\tilde {u}(x,t_{i+1})\}\) and to evaluate the reliability of the prior estimate S ^{′}. On the basis of this information we can accept results at t _{ i+1} or decide to use a smaller time step.
The complete numerical strategy shapes as a sequence of the following steps:

compute the essential support S _{0}=essup{u _{0}(x)} for the initial condition u _{0}(x); choose a system of basis functions with centres x ^{i}∈D ^{−1}(D S _{0}−γ) that provides a sufficient approximation to \(\\tilde u_{0}(x)u_{0}(x)\<p_{\text {threshold}}\);

using (10) perform integration of the approximation to u(x,t) on a small interval [0,t _{1}] and compute the new essential support S _{1}; set i=1;

if t _{ i }<t _{ end }, choose t _{ i+1}=t _{ i }+h; using (22), extrapolate the value for S _{ i+1} utilising (22) and compute the corresponding basis; integrate the system up to t _{ i+1}; validate S _{ i+1} by computing \(\D(\text {essup}\,{\tilde u(x, t_{i+1})}) D(S_{i+1})\\); in the case of satisfactory choice for S _{ i+1}, increase i by one and repeat the step, otherwise repeat the step with a smaller value for h.
Here, essential support threshold p _{threshold}, initial time step h, and density of the basis coverage γ are parameters of the method. The parameter of αhull has grid step h as the lower bound and is chosen to be 2h in the numerical examples that follow.
Discussion
Selfregulating gene
One of the simplest models of a gene regulatory network consists of a single gene regulated by a selfgenerated proteomic atmosphere. In this model the gene is represented by a master equation governing evolution of two probability distributions: u _{on}(x,t) corresponding to situations when the DNA is free (on state), u _{off}(x,t) corresponds to cases when DNA has a repressing protein bound to it (off state). Here x denotes copy numbers of proteins, t represents time. The master equations for the selfregulating gene model read as [33],
Here, g _{on},g _{off} are rates of protein production in the free and bound states, k is a rate of protein degradation, and f is the rate of the repressor protein releasing from the repressor site. In case of monomers, the net binding rate is simply proportional to the number of proteins, h(x)=h x. In a biologically more relevant case  the dimerisation upon binding, the rate is defined as h(x)=h/2x(x−1) [34]. The master equations presented in (23) can be reformulated in the operator form (6),
Here W _{ x=0} acts as unity operator for x=0 and zero operator for x>0. The values of the probability distributions that contradict the physical nature of the process are defined to vanish, that is u _{on}(x,t)=0,x<0, and u _{off} (x,t)=0,x<1. Since extra terms are present, i.e W _{ h(x)}, this important system of two onedimensional master equations does not fall into the class of CMEs; nevertheless it can be discretized by the proposed numerical toolbox. In order to compare the numerical results with the previous findings it is convenient introduce the following unitless parameters:
As can be concluded form the stationary solutions presented in the top panels in Fig. 2, small ω provokes emergence of two distinct peaks in the overall probability distribution, g _{on}+g _{off}, for both cases: monomer and dimer binding. One peak corresponds to the repressed protein production, the other to a much higher protein production (due to free DNA). As protein binding/unbinding becomes faster (i.e. augmented ω) the peaks tend to fuse. In the case of monomer binding the exact analytical solution is known [33]. This gives the possibility to test the method’s convergence on the selfregulatory protein problem. The convergence diagram of relative error of the approximation measured in the l _{2}norm is presented in the lower panel in Fig. 2. In this example the approximation grid has been kept constant due to a small number of system states. As will be shown in the next case studies a much bigger computational challenge can be encountered when treating problems of dimensionality greater then one.
Bistable toggle switch
A classical example of a bistable genetic toggle switch has been both, a comprehensive model explaining experimental data, for instance on E. coli as in Ref. [25], and a challenging test for various numerical methods to solve the CME [8, 12, 35, 36]. In the context of the current paper, we consider the model for its peculiar tendency to form probability landscapes with two maxima. The model describes a pair of mutually repressing genes A and B. Each of the species inhibits the production of the competing repressor by binding to the corresponding genetic control sequences of the promoter. If A becomes abundant the production of B is inhibited and the system is in a stable state of high A and low B. If due to stochastic fluctuations the amount of A decreases or the amount of B is sufficiently high, the switch might flip and B becomes abundant and A repressed. More formally, at every point in time the state of the system is characterised by a 2dimensional vector (x,y), where x represents the copy numbers of A and y represents the copy numbers of B. The model consist of the following reaction channels,
where the propensities are defined as follows,
Let’s denote u(x,y,t) the probability of the system to be in a state (x,y) at time t. Rephrasing mechanisms (25) in terms of the CME presented in the operator form (6) yields
The probability distribution that solves the CME (25) equipped with the parameter set
is known to develop two peaks that correspond to the two semistable states [12, 36]. In principle, Eq. (27) can be considered on the state space (x,y)∈ [ 0,300]× [ 0,300] and integrated in time up to a very high precision by numerical exponentiation. This approach, however, employs matrices of a size 90601×90601, and is far from being optimal if an objective is to recover only probabilities that are significantly greater than zero. The stationary solution u(x,y), depicted in Fig. 3 a, demonstrates that the essential support corresponding to p _{threshold}=0.01· maxu(x,y) occupies only a small fraction of states.
In contrast to the fullstatespace approach, the numerical algorithm proposed in the previous section employs a much smaller number of the degrees of freedom. Let the approximation basis consist of radial basis functions (7) centred in grid points x ^{i}∈ [0,h,2h,…300]^{2}, where h is the grid step. Running the algorithm starting with initial condition
provides us with a succession of domains that support only the part of the solution that has values greater than p _{threshold}. As illustrated in the bottom panels of Fig. 4, the threshold can be chosen so as to make the final essential support twoconnected (see also Additional file 1). The degrees of freedom required for the parametrisation of the solution remain relatively small; the number gradually increases as the distribution progresses from its initial state, but decreases after the essential support evolves into a twoconnected domain, see the top panel in Fig. 4. Another example regards asymmetrical initial conditions
Such initial conditions force the distribution to evolve in time into one mode of the bistable solution at first, and eventually, equilibrating into the bistable solution as depicted in Fig. 5. In this case setting p _{threshold} to a large value might lead to discovering only one mode of the solution. Therefore, it is important to ensure the deviation between total probabilities of the initial conditions and the approximate solution is kept small,
Shall the deviation increase over a certain level, the value of p _{threshold} should be lowered. As depicted in the right panel of Fig. 3, both p _{ threshold } and number of basis functions, n, have a direct influence on the error of the approximation. As can be seen in Fig. 3, decreasing the value of p _{ threshold } lowers the approximation error up to a certain saturation point. The further improvements are possible only by increasing the number of basis functions.
Comparison with Gillespie SSA
Since the proposed method deals with a CME, that produces a complete probability distribution, it cannot be thought as an alternative to Monte Carlo algorithms, which simulate a single trajectory of a stochastic process. That said, it is possible to extract the frequency of visiting each of the system states by a stochastic process (e.g. generated by Gillespie SSA), and eventually relate it asymptotically to the probability distribution providing the trajectory is long. On practice, this might be a formidable task: producing a long enough trajectory may consume considerable cputime, even more, in cases of multi stability the convergence of the SSA may be hard to estimate. For example, an error diagram of the probability distribution extracted from SSA simulations for the toggle switch problem is given in Fig. 6 a. Initially, the error decreases with increasing number of SSA steps, but as soon as the system switches the mode for the first time (as indicated by the vertical bars), the error decrease slows down considerably. The result extracted form a trajectory with 10^{7} SSA steps deviates form the exact solution by 0.02 in l _{2} norm, and even after being smoothed out by a rectangular 3×3 window the level lines of the probability distribution contain considerable artefacts, see Fig. 6 b. The trajectory itself is depicted in Fig. 6 c. The probability distribution obtained by the approximation on Gaussian basis functions with interface tracking demonstrates a much better accuracy for even lower cputime. As can be seen in Fig. 6 d, the error of two magnitudes lower then the one provided by the SSA is obtained for the same cputime, 200 sec. It is important to note that comparison of cputime might be weakly biased by a particular implementation of the algorithms, that have such a different nature.
Tristable toggle switch
Analogously to the bistable toggle switch a theoretical tristable system describing three mutually competing species A,B,C is considered. In this case the state of the system is characterised by three copy numbers (x,y,z), and the model consist of the following reactions channels,
where the propensities are defined as follows,
The distribution u(x,y,z,t) associates a probability to each system state defined by the vector of copy numbers (x,y,z) at time t. The distribution obeys the following CME,
Similarly to the previous example, we employ a parameter set that yields a symmetric solution,
The stationary solution to Eq. (33) constitutes an interesting example of how complex the essential support might be. Indeed, as can be seen in Fig. 7, depending on the value of the significance level p _{threshold}, the three dimensional domain is exhibiting various types of topology: it is nonconvex, either simple connected or threeconnected. In both cases a highly adaptive parametrisation is essential for saving computational resources. Fig. 8 shows the inner structure of enclosed isosurfaces on the left panel, while the basis function centres used for the stationary solution are presented on the right panel. See also Additional file 2.
Stem cell differentiation problem
Let us consider the osteochondro switch (OCS) model introduced by Schittler et al. [29]. The model represents a particular example of two mutually inhibiting key transcriptional regulators (TRs), which determine the cell fate of osteochondro progenitor cells, see Fig. 9. The system state is represented by the three state variables x,y,z, corresponding to the progenitor (z), osteogenic (x), and chondrogenic (y) TRs. Relating to experimental data, these would be (a rough measure) of mRNA levels, or a more precise measure of transcription factor activated from reporter genes [29].
In stimulifree setup Schittler et al. considered the set of ordinary differential equations
where the meaning and values of the coefficients are given in Table 1. A detailed bifurcation analyses of (35) performed in [29] reveals that depending on the parameter set the system may operate in bistable or tristable regimes. The solution to the differential equation system (35), however, does not provide an appropriate description since a single trajectory can only converge to one of the stationary states, whereas in practice, the system is flipping between stable states due to chemical or thermal noise. Hence, the switching behaviour can only be reproduced employing a distributional description of the system state.
Rewriting the cell differentiation mechanism expressed in (35) in terms of the CME yields the same differential equations as in (33) equipped with a specific set of asymmetric propensities a _{ i }(x,y,z),
Equation 33 is linear, hence any initial condition leads to a unique stationary solution. Various parametric sets for the OCS model may yield solutions with very different essential supports. This is demonstrated in Fig. 10 where isosurfaces of the threedimentional probability distribution u(x,y,z,t _{ end }) are presented (see also Additional file 3). A bistable solution corresponding to the basic parameter set as used in [29] is depicted in Fig. 10 a. Choosing a smaller value for the inflection point m _{ p }=8 forces the solution to become tristable, Fig. 10 b. Scaling up the autoactivation and basal activity parameters a _{ p,o,c }, b _{ p,o,c } by a factor 2.5 produces more segregated maxima in the solution, additionally, the absolute values of the copy numbers, x,y,z, are higher as depicted in Fig. 10 c. Finally, an effect of biased differentiation is modelled by modifying the Hill function associated with osteogenic cells to include a proosteogenic stimulus z _{ o }>0,
Figure 10 d illustrates the three dimensional probability density corresponding to stimulus z _{ o }=6; the remaining parameters are identical to those in Fig. 10 c. An effect of proosteogenic stimuli is further explored in Fig. 11, where the twodimensional marginal distribution \(\sum \limits _{z}u(x,y,z,t_{\textit {end}})\) is plotted for two cases: z _{ o }=0 (panel a), and z _{ o }=6 (panel b).
When considering the time evolution of the solution to the OCS problem, it is natural to expect that the number of degrees of freedom required by the approximation scheme is not constant. For instance, if the simulation for the case study presented in Fig. 10 c is started with the initial condition
the essential support undergoes complex transformation before the solution reaches the stationary solution, as depicted in Fig. 12. The top panel of Fig. 12 shows degrees of freedom as a function of time. Even though the degrees of freedom increase initially, reach maximum, and decrease before the system arrives to the steady state, only a small fraction of the full grid nodes is employed at each point of time. In fact, the maximum number of approximation parameters, 2701, is only 7 % of the overall grid nodes, and 1 % of the total number of states, assuming x,y,z<60. In the previous case study, these values are even more dramatic: at most 3887 basis functions were used for the approximations, which is 2 % of the full grid and 0.05 % of the total number of states assuming the upper bound x,y,z<200.
Conclusions
We proposed a numerical method for the approximation of the solution to a wide range of CME based problems that have been hard to tackle until now. The fact that the method recovers a full representation of multidimensional distributions makes it especially attractive for cases of multistability.
In order to reduce the amount of computational resources, the unknown distribution is searched as a linear combination of Gaussian radial basis functions. The efficiency of the method is improved even further by predicting a manifold containing states with probabilities that are greater than a certain significance threshold in every time step. The prediction is done on the basis of information available form previous time steps. It allows to keep the degrees of freedom of the approximation very close to the optimal value corresponding to the significance threshold.
The method has been applied to the following examples of two and threedimensional CMEs leading to multistable solutions:

A bistable genetic toggle switch describing two competing species. This problem constitutes an important case: when the normal distribution is taken as initial condition, the manifold containing highly probable states undergoes drastic transformations. Its topology transits from simple connected to a twoconnected domain. Since the exact solution is known for this problem, the approximation error can be evaluated.

A tristable toggle switch, yielding a threedimensional symmetric solution, is introduced as a generalisation of the previous problem. Although this case demonstrates a possible mechanism for three competing species and constitutes an interesting test for the algorithm, it remains a theoretical problem.

A cell differentiation model, describing cell fate determination of osteochondro progenitor cells. The model considers two final cell types, osteoblast and chondrocytes, and is a special case of the previous example. It has been shown how variations of some important parameters affect the stationary solution. It has also been studied how a proosteogenic stimulus leads to a nonsymmetrical solution.
Besides CME, the method has been additionally applied to a system of master equations describing a selfregulatory gene.
We expect that the method can be applied to other CME problems including those that have no a priori information available on the shape, location, or upper bound of the domain that contains states with significant probabilities. The domain is constructed and tracked in time using ideas from level set methods. The advantage of the level set approach is that one can perform numerical computations involving surfaces on a fixed Cartesian grid without having to parameterise these objects. In addition, the level set method makes it very easy to follow shapes that change topology, for example when a shape splits into two, develops holes, or the reverse of these operations.
Although the method features many advantages for multistable systems or systems where rare events are important, highdimensional cases (d >4) are hard to tackle with the current implementation. In future work, we plan to relax the condition that radial basis function centres are selected form a predefined grid in order to reach the optimal number of degrees of freedom in the approximation and extend the algorithm to highdimensional cases.
Abbreviations
 CME:

Chemical master equation
 dCME:

Discrete chemical master equation
 DNA:

Deoxyribonucleic acid
 FSP:

Finite state projection
 GBF:

Gaussian basis function
 MC:

Monte Carlo
 mRNA:

Messenger ribonucleic acid
 MSC:

Mesenchymal stem cell
 OCS:

Osteochondro switch
 PDF:

Probability density function
 TR:

Transcriptional regulator
 SSA:

Stochastic simulation algorithm
References
Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977; 81:2340–361.
Gillespie DT. Approximate accelerated stochastic simulation of chemically reacting systems. J Phys Chem. 2001; 115:1716–1733.
Rao CV, Arkin AP. Stochastic chemical kinetics and the quasisteady state assumption: Application to the Gillespie algorithm. J Chem Phys. 2003; 118:4999–5010.
W E, Liu D, VandenEijnden E. Nested stochastic simulation algorithms for chemical kinetic systems with multiple time scales. J Comput Phys. 2007; 221:158–80.
Haseltine EL, Rawlings JB. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys. 2002; 117:6959–969.
Takahashi K, Kaizu K, Hu B, Tomita M. A multialgorithm, multitimescale method for cell simulation. Bioinformatics. 2004; 20:538–46.
Alfonsi A, Cancès E, Turinici G, Ventura BD, Huisinga W. Adaptive simulation of hybrid stochastic and deterministic models for biochemical systems. ESAIM Proc. 2005; 14:1–13.
Jahnke T, Udrescu T. Solving chemical master equations by adaptive wavelet compression. J Comput Phys. 2010; 229(16):5724–741.
Hellander A, Lötstedt P. Hybrid method for the chemical master equation. J Comput Phys. 2007; 227(1):100–22.
Erban R, Chapman SJ, Kevrekidis IG, Vejchodský T. Analysis of a stochastic chemical system close to a sniper bifurcation of its meanfield model. SIAM J Appl Math. 2009; 70(3):984–1016.
Menz S, Latorre J, Schütte C, Huisinga W. Hybrid stochastic–deterministic solution of the chemical master equation. Multiscale Model Simul. 2012; 10(4):1232–1262.
Deuflhard P, Huisinga W, Jahnke T, Wulkow M. Adaptive discrete galerkin methods applied to the chemical master equation. SIAM J Sci Comput. 2008; 30(6):2990–3011.
Cotter SL, Vejchodsky T, Erban R. Adaptive finite element method assisted by stochastic simulation of chemical systems. SIAM J Sci Comput. 2013; 35(1):107–31.
Munsky B, Khammash M. The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys. 2006; 124(4):044104.
Peleš S, Munsky B, Khammash M. Reduction and solution of the chemical master equation using time scale separation and finite state projection. J Chem Phys. 2006; 125(20):204104.
Munsky B, Khammash M. Transient analysis of stochastic switches and trajectories with applications to gene regulatory networks. IET Syst Biol. 2008; 2(5):323–33.
Cao Y, Liang J. Optimal enumeration of state space of finitely buffered stochastic molecular networks and exact computation of steady state landscape probability. BMC Syst Biol. 2008; 2(1):30.
Cao Y, Lu HM, Liang J. Probability landscape of heritable and robust epigenetic state of lysogeny in phage lambda. Proc Natl Acad Sci. 2010; 107(43):18445–18450.
Kazeev V, Khammash M, Nip M, Schwab C. Direct solution of the chemical master equation using quantized tensor trains. PLoS Comput Biol. 2014; 10(3):1003359.
Wolf V, Goel R, Mateescu M, Henzinger TA. Solving the chemical master equation using sliding windows. BMC Syst Biol. 2010; 4(1):42.
Kryven I, Iedema PD. Transition into the gel regime for free radical crosslinking polymerisation in a batch reactor. Polymer. 2014; 55(16):3475–489.
Kryven I, Iedema PD. Topology evolution in polymer modification. Macromol Theory Simul. 2014; 23(1):7–14.
Kryven I, Iedema PD. Deterministic modelling of copolymer microstructure: composition drift and sequence patterns. Macromol React Eng. 2015; 9:285–306.
Kryven I, Lazzari S, Storti G. Population balance modeling of aggregation and coalescence in colloidal systems. Macromol Theory Simul. 2014; 23(3):170–81.
Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in escherichia coli. Nature. 2000; 403(6767):339–42.
Baksh D, Song L, Tuan R. Adult mesenchymal stem cells: characterization, differentiation, and application in cell and gene therapy. J Cell Mol Med. 2004; 8(3):301–16.
Roeder I, Glauche I. Towards an understanding of lineage specification in hematopoietic stem cells: a mathematical model for the interaction of transcription factors gata1 and pu. 1. J Theor Biol. 2006; 241(4):852–65.
Foster DV, Foster JG, Huang S, Kauffman SA. A model of sequential branching in hierarchical cell fate determination. J Theor Biol. 2009; 260(4):589–97.
Schittler D, Hasenauer J, Allgöwer F, Waldherr S. Cell differentiation modeled via a coupled twoswitch regulatory network. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2010; 20(4):045121.
B.Sidje R. EXPOKIT: A software package for computing matrix exponentials. ACM Trans Math Softw. 1998; 24(1):130–56.
Edelsbrunner H, Kirkpatrick D, Seidel R. On the shape of a set of points in the plane. IEEE Trans Inf Theory. 1983; 29(4):551–9.
Osher S, Fedkiw R. Level set methods and dynamic implicit surfaces. Applied Mathematical Sciences. Vol. 153. New York: Springer; 2006.
Hornos J, Schültz D, Innocentini G, Wang J, Walczak A, Onuchic J, et al. Selfregulating gene: an exact solution. Phys Rev E. 2005; 72(5):051907.
Schultz D, Onuchic JN, Wolynes PG. Understanding stochastic simulations of the smallest genetic networks. J Chem Phys. 2007; 126(24):245102.
Sjöberg P, Lötstedt P, Elf J. Fokker–planck approximation of the master equation in molecular biology. Comput Vis Sci. 2009; 12(1):37–50.
Engblom S. Galerkin spectral method applied to the chemical master equation. Commun Comput Phys. 2009; 5:871–96.
Acknowledgements
The authors would like to thank to Steffen Waldherr for helpful discussion and suggestions regarding cell differentiation modelling. IK acknowledge the financial support from the Marie Curie Actions (PITNGA2009238700).
Author information
Affiliations
Corresponding author
Additional information
Authors’ information
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
ChS and IK designed the research. IK developed the algorithm and the implementation was carried out by SR and IK. SR and IK wrote the manuscript, which has been read and approved by all authors.
Additional files
Additional file 1
An animation illustrating the usage of the numerical method for integration of the bistable toggle switch problem. (MPEG 1352 kb)
Additional file 2
An animation depicting time evolution of the solution to the tristable toggle switch problem. The solution is represented by a series of isosurfaces. The set of basis function centres used for approximation of the steady state solution is also given. (MPEG 15872 kb)
Additional file 3
An animation depicting the time evolution of the solution to cell differentiation problem with stimulifree setup. The solution is represented by a series of isosurfaces. The set of basis function centres used for approximation of the steady state solution is also given. (MP4 9902 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Kryven, I., Röblitz, S. & Schütte, C. Solution of the chemical master equation by radial basis functions approximation with interface tracking. BMC Syst Biol 9, 67 (2015). https://doi.org/10.1186/s129180150210y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s129180150210y
Keywords
 CME
 Adaptivity
 Toggle switch
 Multistability
 Cell differentiation