Markov State Models of gene regulatory networks
 Brian K. Chu^{1},
 Margaret J. Tse^{1},
 Royce R. Sato^{1} and
 Elizabeth L. Read^{1, 2}Email author
DOI: 10.1186/s1291801703944
© The Author(s). 2017
Received: 27 September 2016
Accepted: 13 January 2017
Published: 6 February 2017
Abstract
Background
Gene regulatory networks with dynamics characterized by multiple stable states underlie cell fatedecisions. Quantitative models that can link molecularlevel knowledge of gene regulation to a global understanding of network dynamics have the potential to guide cellreprogramming strategies. Networks are often modeled by the stochastic Chemical Master Equation, but methods for systematic identification of key properties of the global dynamics are currently lacking.
Results
The method identifies the number, phenotypes, and lifetimes of longlived states for a set of common gene regulatory network models. Application of transition path theory to the constructed Markov State Model decomposes global dynamics into a set of dominant transition paths and associated relative probabilities for stochastic stateswitching.
Conclusions
In this proofofconcept study, we found that the Markov State Model provides a general framework for analyzing and visualizing stochastic multistability and statetransitions in gene networks. Our results suggest that this framework—adopted from the field of atomistic Molecular Dynamics—can be a useful tool for quantitative Systems Biology at the network scale.
Keywords
Multistable systems Stochastic processes Gene regulatory networks Markov State Models Cluster analysisBackground
Gene regulatory networks (GRNs) often have dynamics characterized by multiple attractor states. This multistability is thought to underlie cell fatedecisions. According to this view, each attractor state accessible to a gene network corresponds to a particular pattern of gene expression, i.e., a cell phenotype. Bistable network motifs with two possible outcomes have been linked to binary cell fatedecisions, including the lysis/lysogeny decision of bacteriophage lambda [1], the maturation of frog oocytes [2] and a cascade of branchpoint decisions in mammalian cell development (reviewed in [3]). Multistable networks with three or more attractors have been proposed to govern diverse cell fatedecisions in tumorigenesis [4], stem cell differentiation and reprogramming [5–7], and helper T cell differentiation [8]. More generally, the concept of a rugged, highdimensional epigenetic landscape connecting every possible cell type has emerged [9–11]. Quantitative models that can link molecularlevel knowledge of gene regulation to a global understanding of network behavior have the potential to guide rational cellreprogramming strategies. As such, there has been growing interest in the development of theory and computational methods to analyze global dynamics of multistable gene regulatory networks.
Gene expression is inherently stochastic [1, 12–14], and fluctuations in expression levels can measurably impact cell phenotypes and behavior. Numerous examples of stochastic phenotype transitions have been discovered, which diversify otherwise identical cellpopulations. This spontaneous stateswitching has been found to promote survival of microorganisms or cancer cells in fluctuating environments [15–17], prime cells to follow alternate developmental fates in higher eukaryotes [18, 19], and generate sustained heterogeneity (mosaicism) in a homeostatic mammalian cell population [20]. These findings have motivated theoretical studies of stochastic stateswitching in gene networks, which have shed light on network parameters and topologies that promote the stability (or instability) of a given network state [20]. Characterizing the global stability of states accessible to a network is akin to quantification of the “potential energy” landscape of a network. Particularly, with the advent of stemcell reprogramming techniques, there has been renewed interest in a quantitative reinterpretation of Waddington’s classic epigenetic landscape [21], in terms of underlying regulatory mechanisms [10, 22].
A number of mathematical frameworks exist for modeling and analysis of stochastic gene regulatory network (GRN) dynamics (reviewed in [23, 24]), including probabilistic Boolean Networks, Stochastic Differential Equations, and stochastic biochemical reaction networks (i.e., Chemical Master Equations). Of these, the Chemical Master Equation (CME) approach is the most complete, in that it treats all biomolecules in the system as discrete entities, fully accounts for stochasticity due to molecularlevel fluctuations, and propagates dynamics according to chemical rate laws. The CME is analytically intractable for GRNs except in some simplified model systems [25–29], but trajectories can be simulated by Monte Carlo methods such as the Stochastic Simulation Algorithm (SSA) [30]. Alternatively, methods for reducing the dimensionality of the CME, enabling numerical approximation of network behavior by matrix methods, have been developed [31–35].
Analysis of multistability and global dynamics of discrete, stochastic GRN models remains challenging. In this study, we define multistability in stochastic systems as the existence of multiple peaks in the stationary probability distribution. In such systems, the GRN dynamics can be considered somewhat analogous to that of a particle in a multiwell potential [3]. (Peaks in the probability distribution—or alternatively, basins in the potential—may or may not correspond to stable fixed points of a corresponding ODE model, as discussed in more detail further on.) Stochastic multistability is often assessed by plotting multipeaked steadystate probability distributions (obtained either from long stochastic simulations [5, 36, 37] or from approximate CME solutions [35, 38, 39]), projected onto one or two userspecified system coordinates. However, even small networks generally have more than two dimensions along which dynamics may be projected, meaning that inspection of steadystate distributions for a given projection may underestimate multistability in a network. For example, the statespace of a GRN may comprise different activitystates of promoters and regulatory sites on DNA, the copynumber of mRNA transcripts and encoded proteins, and the activity or multimerstates of multiple regulatory molecules or proteins. Furthermore, while steadystate distributions give a global view of system behavior, they do not directly yield dynamic information of interest, such as the lifetimes of attractor states.
In this paper, we present an approach for analyzing multistable dynamics in stochastic GRNs based on a spectral clustering method widely applied in Molecular Dynamics [40, 41]. The output of the approach is a Markov State Model (MSM)—a coarsegrained model of system dynamics, in which a large number of system states (i.e., “microstates”) is clustered into a small number of metastable (that is, relatively longlived) “macrostates”, together with the conditional probabilities for transitioning from one macrostate to another on a given timescale. The MSM approach identifies clusters based on separation of timescales, i.e., systems with multistability exhibit relatively fast transitions among microstates within basins and relatively slow interbasin transitions. By neglecting fast transitions, the size of the system is vastly reduced. Based on its utility for visualization and analysis of Molecular Dynamics, the potential application of the MSM framework to diverse dynamical systems, including biochemical networks, has been discussed [42].
Biochemical reaction networks present an unexplored opportunity for the MSM approach. Herein, we applied the method to small GRN motifs and analyzed their global dynamics using two frameworks: the quasipotential landscape (based on the logtransformed stationary probability distribution), and the MSM. The MSM approach distilled network dynamics down to the essential stationary and dynamic properties, including the number and identities of stable phenotypes encoded by the network, the global probability of the network to adopt a given phenotype, and the likelihoods of all possible stochastic phenotype transitions. The method revealed the existence of network states and processes not readily apparent from inspection of quasipotential landscapes. Our results demonstrate how MSMs can yield insight into regulation of cell phenotype stability and reprogramming. Furthermore, our results suggest that, by delivering systematic coarsegraining of highdimensional (i.e., manyspecies) dynamics, MSMs could find more general applications in Systems Biology, such as in signaltransduction, evolution, and population dynamics. In our implementation, the MSM framework is applied to the CME, thus mapping all enumerated molecular states onto longlived system macrostates. We anticipate that the method could in future studies be used to analyze more complex systems where enumeration of the CME is intractable, if implemented in combination with stochastic simulation or other model reduction approaches.
Methods
Gene regulatory network motifs
We studied two common GRN motifs that are thought to control cell fatedecisions. The full lists of reactions and associated rate parameters for each network are given in the Additional file 1. Both motifs consist of two mutuallyinhibiting genes, denoted by A and B. In the Exclusive Toggle Switch (ETS) motif, each gene encodes a transcription factor protein; the protein forms homodimers, which are capable of binding to the promoter of the competing gene, thereby repressing its expression. One DNApromoter region controls the expression of both genes; when a repressor is bound, it excludes the possibility of binding by the repressor encoded by the competing gene. Therefore, the promoter can exist in three possible binding configurations, P _{00}, P _{10}, and P _{01}, denoting the unbound, a _{ 2 }bound, or b _{ 2 }bound states, respectively. Production of new protein molecules (including all processes involved in transcription, translation, and protein synthesis) occurs at a constant rate, which depends on the state of the promoter. When the gene is repressed, the encoded protein is produced at a low rate, denoted g _{ 0 }. When the gene is not repressed, protein is produced at a high rate, g _{ 1 }. For example, when the promoter state is P _{10} the a protein is produced at rate g _{ 1 }, and the b protein is produced at g _{ 0 }. When the promoter is unbound, neither gene is repressed, causing both proteins to be produced at rate g _{ 1 }.
In the Mutual Inhibition/SelfActivation (MISA) motif, each homodimeric transcription factor also activates its own expression, in addition to repressing the other gene. The A and B genes are controlled by separate promoters, and each promoter can be bound by repressor and activator simultaneously. Therefore, the Apromoter can exist in four possible states, A _{00}, A _{10}, A _{01} and A _{11}, denoting unbound, a _{ 2 }activator bound, b _{ 2 }repressor bound, and both transcription factors bound, respectively (and similarly for the Bpromoter). Proteins are produced at rate g _{ 1 } only when the activator is bound and the repressor is unbound. For example, the A _{10} promoter state allows a protein to be produced at g _{ 1 }. The other three A promoter states result in a protein being produced at rate g _{ 0 }. Similarly, the rate of b protein production depends only on the binding configuration of the Bpromoter. In both the ETS and MISA networks, protein dimerization is assumed to occur simultaneously with binding to DNA. All rate parameters are given in Additional file 1: Tables S1 and S2.
Chemical master equation
where p(x, t) is the probability over the system statespace at time t, and K is the reaction ratematrix. The offdiagonal elements K _{ ij } give the timeindependent rate of transitioning from state x _{ i } to x _{ j } , and the diagonal elements are given by K _{ ii } = − ∑_{ j ≠ i } K _{ ji }. We assume a wellmixed system of reacting species, and the state of the system is fully specified by x ∈ ℕ^{ S }, a statevector containing the positiveinteger values of all S molecular species/configurations. We hereon denote these statevectors as “microstates” of the system. In the ETS network, x = [n _{ A }, n _{ B }, P _{ ab }], where n _{ A } is the copynumber of a molecules (protein monomers expressed by gene A, and similar for B), and P _{ ab } indexes the promoter bindingconfiguration. In the MISA network, x = [n _{ A }, n _{ B }, A _{ ab }, B _{ ba }], which lists the protein copy numbers and promoter configurationstates associated with both genes.
The reaction rate matrix K ∈ ℝ^{ N × N } is built from the stochastic reaction propensities (Additional file 1: Eq. 1), for some choice of enumeration over the statespace with N reachable microstates. In general, if a system of S molecular species has a maximum copy number per species of n _{max}, then N ~ n _{max} ^{ S }. To enumerate the system statespace, we neglect microstates with protein copynumbers larger than a threshold value, which exceeds the maximum steadystate gene expression rate, g _{1}/k, (where g _{1} is the maximum production rate of protein and k is the degradation rate), as these states are rarely reached. This truncation of the statespace introduces a small approximation error, which we calculate using the Finite State Projection method [31] (Additional file 1: Figure S1).
Stochastic simulations
Stochastic simulations were performed according to the SSA method, implemented by the software package StochKit2 [43].
Quasipotential landscape
The steadystate probability π(x) over N microstates is obtained from K as the normalized eigenvector corresponding to the zeroeigenvalue, satisfying K π(x) = 0 [44]. Quasipotential landscapes were obtained from π(x) using a Boltzmann definition, U(x) = − ln(π(x)) [22]. All matrix calculations were performed with MATLAB [45].
Markov State Models: mathematical background
The last 15 years have seen continual progress in development of theory, algorithms, and software implementing the MSM framework. We briefly summarize the theoretical background here; the reader is referred to other works (e.g., [41, 46–49]) for more details.
The MSM is a highly coarsegrained projection of system dynamics over N microstates onto a reduced space of selected size C (generally, C ≪ N). The C states in the projected dynamics are constructed by clustering together microstates that experience relatively fast transitions among them. The C clusters, also called “almost invariant aggregates” [48], are hereon denoted “macrostates”.
If the system exhibits multistability, then the dynamics can be approximately separated into fast and slow processes, with fast transitions occurring between microstates belonging to the same metastable macrostate, and slow transitions carrying the system from one macrostate to another. Then T(τ) is nearly decomposable, and will exhibit an almost blockdiagonal structure (for an appropriate ordering of microstates) with C nearly uncoupled blocks. In this case, the eigenvalue spectrum of T(τ) shows a cluster of C eigenvalues near λ _{1} = 1, denoting C slow processes (including the stationary process), and for i > C, λ _{ i } ≪ λ _{ C }, corresponding to rapidly decaying processes. The system timescales can be computed from the eigenvalue spectrum according to t _{ i } = − τ/ln λ _{ i }(τ).
where D is the diagonal matrix obtained from the stationary probability vector, D = diag(π _{1}, …, π _{ N }). The coarsegrained probability \( \tilde{\boldsymbol{\pi}}(x) \) is obtained by \( \tilde{\boldsymbol{\pi}}\left(\mathbf{x}\right)={\boldsymbol{\chi}}^{\mathrm{T}}\boldsymbol{\pi} \left(\mathbf{x}\right) \), and \( \tilde{\boldsymbol{D}}=\mathrm{diag}\left({\tilde{\pi}}_1,\dots, {\tilde{\pi}}_C\right) \). The elements of the linear transformation matrix B are obtained by an optimization procedure, with “metastability” of the resultant coarsegrained projection as the objective function to be maximized. The trace of the coarsegrained transition matrix, \( trace\left[T\right] \) has been taken to be the measure of metastability, because it expresses the probabilities for the system to remain in metastable states over the lagtime (i.e., maximizing the sum over the diagonal elements). The original PCCA method [48] used the sign structure of the eigenvectors to identify almost invariant aggregates (instead of this optimization procedure), and more recent work has identified an alternative objective function [49]. The results of this paper were generated using the PCCA+ implementation of MSMBuilder2 [51].
Construction of Markov State Models and pathway decomposition
where \( \tilde{\mathbf{T}}\left(\tau \right)\in {\mathrm{\mathbb{R}}}^{C\times C} \) is the coarsegrained Markov State Model and D is the diagonal matrix obtained from the stationary probability vector, D = diag(π _{1}, …, π _{ N }). The coarsegrained probability \( \tilde{\boldsymbol{\pi}}\left(\mathbf{x}\right) \) is obtained by \( \tilde{\boldsymbol{\pi}}\left(\mathbf{x}\right)={\boldsymbol{\chi}}^T\boldsymbol{\pi} \left(\mathbf{x}\right) \), and \( \tilde{\boldsymbol{D}}=\mathrm{diag}\left({\tilde{\pi}}_1,\dots, {\tilde{\pi}}_C\right) \).
The Markov State Model is visualized using the PyEmma 2 plotting module [46], where the magnitude of the transition probabilities and steady state probabilities are represented by the thickness of the arrows and size of the circles, respectively.
Upon construction of the Markov State Model, transitionpath theory [52–54] was applied in order to compute an ensemble of transition paths connecting two states of interest, along with their relative probabilities. This was achieved by applying a pathway decomposition algorithm adapted from Noe, et al. in a study of protein folding pathways [54] (details in Additional file 1). A summary of the workflow used in generating the results of this paper is included in the Additional file 1: Supplement S5.
Results
Eigenvalues and Eigenvectors of the stochastic transition matrix reveal slow dynamics in gene networks
The Markov State Model framework has been applied in studies of protein folding, where dynamics occurs over rugged energetic landscapes characterized by multiple longlived states (reviewed in [40, 41]). Therefore, we reasoned that the approach could be useful for studying global dynamics of multistable GRNs. The method identifies the slowest system processes based on the dominant eigenvalues and eigenvectors of the stochastic transition matrix, T(τ), which gives the probability of the system to transition from every possible initial state to every possible destination state within lagtime τ (with τ having units of k ^{− 1} and k being the rate of protein degradation). Inspection of the eigenvalue spectrum of T(τ = 5) for the MISA network in Fig. 1b reveals four eigenvalues near 1 followed by a gap, indicating four system processes that are slow on this timescale. Decreasing τ to 0.5 reveals a stepstructure in the eigenvalue spectrum, suggesting a hierarchy of system timescales. The timescales are related to the eigenvalues according to t _{ i } = − τ/ln λ _{ i }(τ). The Perron eigenvalue λ _{1} = 1 is associated with the stationary (infinite time) process, and the lifetimes t _{2} through t _{5} are computed to be {95.6, 49.4, 30.8, 2.6} (in units of k ^{− 1}). Thus, the first gap in the eigenvalue spectrum arises from a more than tenfold separation in timescales between t _{4} and t _{5}. The original PCCA method [48] used the sign structure of the eigenvectors to assign cluster memberships. Plotting the lefteigenvectors corresponding to the four dominant eigenvalues in the MISA network is instructive: the stationary landscape is obtained from the first lefteigenvector (ϕ _{1} = π(x)), which is positive over all microstates, while the oppositesign regions in ϕ _{2}, ϕ _{3}, ϕ _{4} reveal the nature of the slow processes (Fig. 1d). An eigenvector with regions of opposite sign corresponds to an exchange between those two regions (in both directions, since eigenvectors are signinterchangeable). For example, the slowest process corresponds to exchange between the a > b and b > a regions of statespace, i.e., switching between Bgene dominant and Agenedominant expression states. Eigenvectors ϕ _{3} and ϕ _{4} show that somewhat faster timescales are associated with exchange in and out of the Lo/Lo and Hi/Hi basins.
The Markov State Model approach identifies multistability in GRNs
Reduced models of the MISA network
Parameterdependence of landscapes and MSMs
The PCCA+ algorithm seeks C longlived macrostates, where C is userspecified. We constructed Markov State Models for the MISA network over varying f _{ r }, specifying four macrostates. The MSMs are shown graphically in Fig. 3d. The sizes of the circles are proportional to the relative steadystate probability of the macrostate, and the thickness of the directed edges are proportional to the relative transition probability within τ. In agreement with the landscapes, the MSMs over this parameter regime show increasing probability of the Hi/Hi state, as a result of an increasing ratio of transition probability “into” versus “out of” the Hi/Hi state. The locations of the clusters in the statespace (according to 50% (of the total) stationary probability contours) do not change appreciably. The choice of lagtime τ sets the timescale on which metastability is defined in the system. However, in practice, the PCCA+ seeks an assignment of C clusters regardless of whether C metastable states exist in the system on the τ timescale, and the resulting aggregated macrostates are generally invariant to τ. Thus, for f _{ r } = 1, the algorithm locates four macrostates, although the (lowprobability) Hi/Lo, Lo/Lo, and Lo/Hi macrostates are likely to experience transitions away, into the Hi/Hi macrostate, within τ. These lowprobability states appear in the landscape as shoulders on the outskirts of the Hi/Hi basin. Overall, Fig. 3 demonstrates that, for this parameter regime, the quasipotential landscape and the MSM yield similar information on the global system dynamics in terms of the number and locations of longlived states, and their relative probabilities as a function of the unbinding rate parameter f _{ r. } The MSM further provides quantitative information on the probabilities (and thus timescales) of transitioning between each pair of macrostates.
MSM identifies purely stochastic multistability
The MSM approach identifies three metastable macrostates for the monomer ETS in this parameter regime, as seen in the eigenvalue spectrum, which shows a gap after the third index. The reduced Markov State Model constructed for this network thus reduces the system from N = 7, 803 (51 × 51 × 3) microstates to C = 3 macrostates (Fig. 4b), corresponding to the same Hi/Lo, Hi/Hi, and Lo/Hi metastable phenotypes seen in the quasipotential landscape. Figure 4 demonstrates that the MSM approach can accurately identify purely stochastic multistability in systems where continuous models predict only a single stable fixedpoint steady state. Similar results were found for a selfregulating, singlegene network (Additional file 1: Figure S5 and Table S4). This network, which has been solved analytically, gives rise to a bimodal or monomodal stationary distribution depending on the protein binding/unbinding rates [28, 29, 61].
Analyzing global gene network dynamics with the Markov State Model
MSM provides good approximation to relaxation dynamics from a given initial configuration
Parameterdependence of MSM error
Decomposition of statetransition pathways in gene networks using the MSM framework
MSMs can be constructed with different resolutions of coarsegraining
Each subnetwork in the MSM constructed with C = 16 corresponds to a single macrostate in the MSM constructed with C = 4. Thus, in the C = 4 MSM, four different promoter configurations are lumped together in a single macrostate, and dynamics of transitions among them is neglected. Counterintuitively, the locations of the C = 4 macrostates do not correspond directly to the four basins visible in the quasipotential landscape (Fig. 8b, d). Instead, the clusters combine distinct phenotypes—e.g., the red macrostate combines the A/B Lo/Lo and Lo/Hi phenotypes, because it includes the promoter configurations A _{01} B _{10} and A _{11} B _{10} (corresponding to Lo/Hi expression) and A _{01} B _{00} and A _{11} B _{00} (corresponding to Lo/Lo expression) (Fig. 8b, Additional file 1: Table S5 and Figure S7). This result demonstrates that the barriers visible in the quasipotential landscape do not reflect the slowest timescales in the system. This occurs because of the loss of information inherent to visualizing global dynamics via the quasipotential landscape, which often projects dynamics onto two system coordinates. In this case, projecting onto the protein a and protein b copy numbers loses information about the sixteen promoter configurations, obscuring the fact that barriercrossing transitions can occur faster than some withinbasin transitions. Plotting a time trajectory of brute force SSA simulations for this network supports the findings from the MSM: the dynamics shows frequent transitions within subnetworks, and lessfrequent transitions between subnetworks, indicating the same hierarchy of system dynamics as was revealed by the 4 and 16state MSMs (Fig. 8e).
Transition path decomposition reveals nonequilibrium dynamics
Mapping the most probable paths forward and backward between macrostate “1” (promoter configuration: A _{01} B _{00}) and macrostate “11” (promoter configuration: A _{00} B _{01}) revealed that a number of alternative transition paths are accessible to the network, and the paths typically transit between three and five intermediate macrostates. The decomposition shows three paths with significant (i.e., >15%) probability and 12 distinct paths with >1% probability (for both forward and backward transitions, Additional file 1: Tables S3S4). The pathway decomposition also reveals a great deal of irreversibility in the forward and reverse transition paths, which is a hallmark of nonequilibrium dynamical systems. For example, the most probable forward and reverse paths both transit three intermediates, but have only one intermediate (macrostate 5) in common (Fig. 8c and Additional file 1: Tables S6S7). Thus, the complete process of transitioning away from macrostate 1, through macrostate 11, and returning to 1 maps a dynamic cycle.
Discussion
Our application of the MSM method to representative GRN motifs yielded dynamic insights with potential biological significance. Decomposition of transition pathways revealed that stochastic statetransitions between phenotypic states can occur via multiple alternative routes. Preference of the network to transition with higher likelihood through one particular pathway depended on the stability of intermediate macrostates, in a manner not directly intuitive from the steadystate probability landscape. The existence of “spurious attractors”, or metastable intermediates that act as trap states to hinder stem cell reprogramming, has been discussed previously [11] as a general explanation for the existence of partially reprogrammed cells. By analogy, MSMs constructed in protein folding studies predict an ensemble of folding pathways, as well as the existence of misfolded trap states that reduce folding speed [54]. Our results suggest that multiple partially reprogrammed cell types could be accessible from a single initial cell state. Successful phenotypetransitions can occur predominantly through highpotential (unstable)—and thus difficult to observe experimentally—intermediate cell types. In future applications to specific gene GRNs, the MSM approach could predict a complex map of cellreprogramming pathways, and thus potentially suggest combinations of targets towards improved safety and efficiency of reprogramming protocols. In synthetic biology applications, the method could be potentially used to optimize biochemical parameters in the design of synthetic gene circuits. For example, it may be desirable to realize synthetic switches with a very crisp on/off macrostate partitioning (i.e., lacking spurious intermediate states) to give a highly digital response.
Our study revealed that the twogene MISA network can exhibit complex dynamic phenomena, involving a large number of metastable macrostates (up to 16), cycles and hierarchical dynamics, which can be conveniently visualized using the MSM. The quasipotential landscape has been used recently as a means of visualizing global dynamics and assessing locations and relative stabilities of phenotypic states of interest, in a manner that is quantitative (deriving strictly from underlying gene regulatory interactions), rather than qualitative or metaphorical (as was the case for the original Waddington epigenetic landscape) [21]. However, our study highlights the potential difficulty of interpreting global network dynamics based solely on the steadystate landscape, which is often projected onto one or two degrees of freedom. We found that phenotypically identical cell states—that is, network states marked by identical patterns of protein expression, inhabiting the same position in the projected landscape—can be separated by kinetic barriers, experiencing slow interconversion due to slow timescales for update to the epigenetic state (or promoter binding occupancy). Conversely, phenotypically distinct states marked by different levels of protein expression can be kinetically linked, experiencing relatively rapid interconversion. This type of stochastic interconversion is thought to occur in embryonic stem cells—for example, fluctuations in expression of the Nanog gene have been proposed to play a role in maintaining pluripotency [66, 67]. The hierarchical dynamics revealed by our study supports the idea that the phenotype of a cell could be more appropriately defined by dynamic patterns of regulator or marker expression levels [67], rather than on singletimepoint levels alone. This was seen in the 16state MSM for the MISA network, where a given expression pattern (e.g., the Lo/Lo peak) comprised multiple macrostates from separate dynamic subnetworks.
Complex, highdimensional dynamical systems call for systematic methods of coarsegraining (or dimensionality reduction), for analysis of mechanisms and extraction of information that can be compared with experimental results. In the field of Molecular Dynamics, the complexity of, e.g., macromolecular conformational changes—involving thousands of atomic degrees of freedom and multiple dynamic intermediates—has driven the development of automated methods for prediction and analysis of essential system dynamics from simulations [68, 69]. In that field, coarsegraining has been achieved based on a variety of socalled geometric (structural) or, alternatively, kinetic clustering methods [70, 71]. Noe, et al. [71], discussed that geometric (or structurebased) coarsegraining methods can fail to produce an accurate description of system dynamics when structurally similar molecular conformations are separated by large energy barriers or, conversely, when dissimilar structures are connected by fast transitions, as they found in a study of polypeptide folding dynamics. In such cases, kinetic (i.e., separationoftimescalebased) coarsegraining methods such as the MSM approach are more appropriate. Our application of the MSMs to GRNs demonstrates how similar complex dynamic phenomena can manifest at the “network”scale.
The challenge of solving the CME due to the curseofdimensionality is well known. The MSM approach is related to other projectionbased model reduction methods that aim to reduce the computational burden of solving the CME directly by projecting the rate (or transition) matrix onto a smaller subspace or aggregated statespace with fewer degrees of freedom. Such approaches include the Finite State Projection algorithm [31], and methods based on Krylov subspaces [33, 72, 73], sparsegridding [74], and separationoftimescales [34, 74, 75] (related timescaleseparationbased reduction methods have also been developed to analyze complex ODE models of biochemical networks, e.g., [76, 77]). The MSM is distinct from other timescalebased model reductions in that, rather than partitioning the system into categories of slow versus fast reactions [78] or species [34], or basing categories on physical intuition [75], it systematically groups microstates in such a way that maximizes metastability of aggregated states [40]. The practical benefit of this approach is its capacity to describe a system compactly in terms of longlived, perhaps experimentally observable, states. Another important distinction between the MSM approach and other CME model reduction methods is that its primary endgoal is not to solve the CME per se. Rather, the emphasis in studies employing MSMs has generally been on gaining mechanistic, physical, or experimentallyrelevant insights to complex system dynamics [79–81]. As such, the approach does not optimally balance the tradeoff between computational expense versus quantitative accuracy of the solution, as other methods have done explicitly [82]. Instead, the method can be considered to balance the tradeoff between accuracy and “humaninterpretability”, where decreasing the number of macrostates preserved in the MSM coarsegraining tends to favor the latter over the former.
A potential drawback of the workflow presented in this paper is that it requires an enumeration of the system statespace in order to construct the biochemical rate matrix K. Networks of increased complexity or molecular copy numbers will lead to prohibitively large matrix sizes. Here, we restricted our study to model systems with a relatively small number of reachable microstates (i.e., ~ 10^{4} microstates permitted tractable computations on desktop computers with MATLAB [45]). However, it is important to point out that in typical applications of the MSM framework in Molecular Dynamics, the computational complexity of the coarsegraining procedure is largely decoupled from the full dimensionality of the system statespace, because it is often applied as part of a suite of tools for postprocessing atomistic simulation data. An advantage of the MSM approach is its use of the stochastic transition matrix T(τ) (rather than K), which can be estimated from simulations by sampling transition counts between designated regions of statespace in trajectories of length τ [47]. Systems of increased complexity/dimensionality are generally more accessible to simulations, because the size of the statespace is automatically restricted to those states visited within finitelength simulations. Furthermore, in macromolecular systems with highdimensional configuration spaces, clustering algorithms have been applied in order to obtain a tractable partitioning of statespace, prior to application of the MSM coarsegraining [47]. Typically, a large number of sampled configurations (10^{4}10^{7}) is lumped into a more tractable number of ‘microstates’ (10^{2}10^{4}), and the MSM framework subsequently identifies ~ tens of metastable macrostates. A recent study of Gproteincoupled receptor activation showcased the high complexity of systems that can be analyzed by MSMs: 250,000 sampled molecular structures were projected to coarsegrained MSMs with either 3000 or 10 states [83]. Based on these previous studies in Molecular Dynamics, we anticipate that the MSM framework will likewise prove useful in analysis of highly complex biochemical networks, particularly when coupled with stochastic simulations and thus bypassing the need for enumerating the CME. In ongoing work (Tse, et al., in preparation), we find that the MSM approach interfaces well with SSA simulations of biochemical network dynamics, combined with enhanced sampling techniques [84–86]. We anticipate that the approach could also potentially interface with other numerical approximation techniques that have been developed in recent years for reduction of the CME.
A potential challenge for the application of the PCCA + −based spectral clustering method to biochemical networks is that, as open systems, biochemical networks generally do not obey detailed balance. This means that the stochastic transition matrices do not have the property of irreversibility, which was originally taken to be a requirement for application of the PCCA algorithm [48]. However, later work by Roblitz et al. [49] found that the PCCA+ method also delivers an optimal clustering for irreversible systems. In this study, we found that the PCCA+ method could determine appropriate clusters in GRNs, and could furthermore uncover nonequilibrium cycles, as seen in the irreversibility (distinct forward and backward) of transition paths in the 16state system. Newer methods of MSM building, which are specifically designed to treat nonequilibrium dynamical systems, have appeared recently [87]. It may prove fruitful to explore these alternative methods in order to identify the most appropriate, general MSM framework for application to various biochemical networks. On a separate note, another possible area for future study could be the relationship between the MSM framework, specifically its estimation of switching times in multistable networks, to the results from other theoretical approaches to GRNs, such as Large Deviation Theory [88] or WentzelKramersBrillouin theory [89].
Conclusions
In this work, we present a method for analyzing multistability and global stateswitching dynamics in gene networks modeled by stochastic chemical kinetics, using the MSM framework. We found that the approach is able to: (1) identify the number and identities of longlived phenotypicstates, or network “macrostates”, (2) predict the steadystate probabilities of all macrostates along with probabilities of transitioning to other macrostates on a given timescale, and (3) decompose global dynamics into a set of dominant transition pathways and their associated relative probabilities, linking two system states of interest. Because the method is based on the discretespace, stochastic transition matrix, it correctly identified stochastic multistability where a continuum model failed to find multiple steady states. The quantitative accuracy of the dynamics propagated by the coarsegrained MSM was highest in a parameter regime with slow DNAbinding and unbinding kinetics, indicating that in GRNs the assumption of memoryless hopping among a small number of macrostates is most valid in this regime. By projecting dynamics encompassing a large statespace onto a tractable number of macrostates, the MSMs revealed complex dynamic phenomena in GRNs, including hierarchical dynamics, nonequilibrium cycles, and alternative possible routes for phenotypic statetransitions. The ability to unravel these processes using the MSM framework can shed light on regulatory mechanisms that govern cell phenotype stability, and inform experimental reprogramming strategies. The MSM provides an intuitive representation of complex biological dynamics operating over multiple timescales, which in turn can provide the key to decoding biological mechanisms. Overall, our results demonstrate that the MSM framework—which has been generally applied thus far in the context of molecular dynamics via atomistic simulations—can be a useful tool for visualization and analysis of complex, multistable dynamics in gene networks, and in biochemical reaction networks more generally.
Abbreviations
 CME:

Chemical master equation
 ETS:

Exclusive toggle switch
 GRN:

Gene regulatory network
 GRNs:

Gene regulatory networks
 MISA:

Mutual inhibition/Selfactivation
 MSM:

Markov state model
 ODE:

Ordinary Differential Equation
 PCCA+:

Robust Perron Cluster Analysis
 SSA:

Stochastic simulation algorithm
Declarations
Acknowledgements
We thank Jun Allard for helpful discussions.
Funding
We acknowledge financial support from the UC Irvine Henry Samueli School of Engineering.
Availability of data and materials
The datasets supporting the conclusions of this article are included in the Additional file 2.
Authors’ contributions
BC and ER designed and performed research. MT contributed to data analysis and manuscript preparation. RS contributed to data analysis. BC and ER wrote the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Arkin A, Ross J, McAdams HH. Stochastic kinetic analysis of developmental pathway bifurcation in phage lambdainfected Escherichia coli cells. Genetics. 1998;149(4):1633–48.PubMedPubMed CentralGoogle Scholar
 Xiong W, Ferrell JE. A positivefeedbackbased bistable ‘memory module’ that governs a cell fate decision. Nature. 2003;426(6965):460–5.View ArticlePubMedGoogle Scholar
 Zhou JX, Huang S. Understanding gene circuits at cellfate branch points for rational cell reprogramming. Trends Genet. 2011;27(2):55–62.View ArticlePubMedGoogle Scholar
 Lu M, Jolly MK, Gomoto R, Huang B, Onuchic J, BenJacob E. Tristability in CancerAssociated MicroRNATF Chimera Toggle Switch. J Phys Chem B. 2013;117(42):13164–74.View ArticlePubMedGoogle Scholar
 Feng H, Wang J. A new mechanism of stem cell differentiation through slow binding/unbinding of regulators to genes. Sci Rep. 2012;2:550.PubMedPubMed CentralGoogle Scholar
 Zhang B, Wolynes PG. Stem cell differentiation as a manybody problem. Proc Natl Acad Sci. 2014;111(28):10185–90.View ArticlePubMedPubMed CentralGoogle Scholar
 Wang P, Song C, Zhang H, Wu Z, Tian XJ, Xing J. Epigenetic state network approach for describing cell phenotypic transitions. Interface Focus. 2014;4(3):20130068.View ArticlePubMedPubMed CentralGoogle Scholar
 Hong T, Xing J, Li L, Tyson JJ. A mathematical model for the reciprocal differentiation of T helper 17 cells and induced regulatory T cells. PLoS Comput Biol. 2011;7(7):e1002122.View ArticlePubMedPubMed CentralGoogle Scholar
 Graf T, Enver T. Forcing cells to change lineages. Nature. 2009;462(7273):587–94.View ArticlePubMedGoogle Scholar
 Huang S. The molecular and mathematical basis of Waddington’s epigenetic landscape: a framework for postDarwinian biology? Bioessays. 2012;34(2):149–57.View ArticlePubMedGoogle Scholar
 Lang AH, Li H, Collins JJ, Mehta P. Epigenetic landscapes explain partially reprogrammed cells and identify key reprogramming genes. PLoS Comput Biol. 2014;10(8):e1003734.View ArticlePubMedPubMed CentralGoogle Scholar
 Elowitz MB. Stochastic gene expression in a single cell. Science. 2002;297(5584):1183–6.View ArticlePubMedGoogle Scholar
 Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A. Regulation of noise in the expression of a single gene. Nat Genet. 2002;31(1):69–73.View ArticlePubMedGoogle Scholar
 Golding I, Paulsson J, Zawilski SM, Cox EC. Realtime kinetics of gene activity in individual bacteria. Cell. 2005;123(6):1025–36.View ArticlePubMedGoogle Scholar
 Balaban NQ. Bacterial persistence as a phenotypic switch. Science. 2004;305(5690):1622–5.View ArticlePubMedGoogle Scholar
 Acar M, Mettetal JT, van Oudenaarden A. Stochastic switching as a survival strategy in fluctuating environments. Nat Genet. 2008;40(4):471–5.View ArticlePubMedGoogle Scholar
 Sharma SV, Lee DY, Li B, Quinlan MP, Takahashi F, Maheswaran S, McDermott U, Azizian N, Zou L, Fischbach MA, et al. A chromatinmediated reversible drugtolerant state in cancer cell subpopulations. Cell. 2010;141(1):69–80.View ArticlePubMedPubMed CentralGoogle Scholar
 Chang HH, Hemberg M, Barahona M, Ingber DE, Huang S. Transcriptomewide noise controls lineage choice in mammalian progenitor cells. Nature. 2008;453(7194):544–7.View ArticlePubMedGoogle Scholar
 Dietrich JE, Hiiragi T. Stochastic patterning in the mouse preimplantation embryo. Development. 2007;134(23):4219–31.View ArticlePubMedGoogle Scholar
 Yuan L, Chan GC, Beeler D, Janes L, Spokes KC, Dharaneeswaran H, Mojiri A, et al. A role of stochastic phenotype switching in generating mosaic endothelial cell heterogeneity. Nat Commun. 2016;7:10160.View ArticlePubMedPubMed CentralGoogle Scholar
 Waddington CH. The Strategy of the Genes. London: Allen & Unwin; 1957.
 Wang J, Zhang K, Xu L, Wang E. Quantifying the Waddington landscape and biological paths for development and differentiation. Proc Natl Acad Sci. 2011;108(20):8257–62.View ArticlePubMedPubMed CentralGoogle Scholar
 Karlebach G, Shamir R. Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008;9(10):770–80.View ArticlePubMedGoogle Scholar
 Kepler TB, Elston TC. Stochasticity in transcriptional regulation: origins, consequences, and mathematical representations. Biophys J. 2001;81(6):3116–36.View ArticlePubMedPubMed CentralGoogle Scholar
 Shahrezaei V, Swain PS. Analytical distributions for stochastic gene expression. Proc Natl Acad Sci. 2008;105(45):17256–61.View ArticlePubMedPubMed CentralGoogle Scholar
 Mackey MC, TyranKamińska M, Yvinec R. Dynamic behavior of stochastic gene expression models in the presence of bursting. SIAM J Appl Math. 2013;73(5):1830–52.View ArticleGoogle Scholar
 Jiao F, Sun Q, Tang M, Yu J, Zheng B. Distribution modes and their corresponding parameter regions in stochastic gene transcription. SIAM J Appl Math. 2015;75(6):2396–420.View ArticleGoogle Scholar
 Schultz D, Onuchic JN, Wolynes PG. Understanding stochastic simulations of the smallest genetic networks. J Chem Phys. 2007;126(24):245102.View ArticlePubMedGoogle Scholar
 Ramos AF, Innocentini GCP, Hornos JEM. Exact timedependent solutions for a selfregulating gene. Phys Rev E. 2011;83(6):062902.View ArticleGoogle Scholar
 Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977;81(25):2340–61.View ArticleGoogle Scholar
 Munsky B, Khammash M. The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys. 2006;124(4):044104.View ArticlePubMedGoogle Scholar
 Cao Y, Liang J. Optimal enumeration of state space of finitely buffered stochastic molecular networks and exact computation of steady state landscape probability. BMC Syst Biol. 2008;2(1):30.View ArticlePubMedPubMed CentralGoogle Scholar
 Wolf V, Goel R, Mateescu M, Henzinger TA. Solving the chemical master equation using sliding windows. BMC Syst Biol. 2010;4(1):42.View ArticlePubMedPubMed CentralGoogle Scholar
 Pahlajani CD, Atzberger PJ, Khammash M. Stochastic reduction method for biological chemical kinetics using timescale separation. J Theor Biol. 2011;272(1):96–112.View ArticlePubMedGoogle Scholar
 Sidje RB, Vo HD. Solving the chemical master equation by a fast adaptive finite state projection based on the stochastic simulation algorithm. Math Biosci. 2015;269:10–6.View ArticlePubMedGoogle Scholar
 Huang S, Guo YP, May G, Enver T. Bifurcation dynamics in lineagecommitment in bipotent progenitor cells. Dev Biol. 2007;305(2):695–713.View ArticlePubMedGoogle Scholar
 Ma R, Wang J, Hou Z, Liu H. Smallnumber effects: a third stable state in a genetic bistable toggle switch. Phys Rev Lett. 2012;109(24):248107.View ArticlePubMedGoogle Scholar
 Cao Y, Lu HM, Liang J. Probability landscape of heritable and robust epigenetic state of lysogeny in phage lambda. Proc Natl Acad Sci. 2010;107(43):18445–50.View ArticlePubMedPubMed CentralGoogle Scholar
 Munsky B, Fox Z, Neuert G. Integrating singlemolecule experiments and discrete stochastic models to understand heterogeneous gene transcription dynamics. Methods. 2015;85:12–21.View ArticlePubMedPubMed CentralGoogle Scholar
 Pande VS, Beauchamp K, Bowman GR. Everything you wanted to know about Markov State Models but were afraid to ask. Methods. 2010;52(1):99–105.View ArticlePubMedPubMed CentralGoogle Scholar
 Chodera JD, Noé F. Markov state models of biomolecular conformational dynamics. Curr Opin Struct Biol. 2014;25:135–44.View ArticlePubMedPubMed CentralGoogle Scholar
 Bowman GR, Huang X, Pande VS. Network models for molecular kinetics and their initial applications to human health. Cell Res. 2010;20(6):622–30.View ArticlePubMedPubMed CentralGoogle Scholar
 Sanft KR, Wu S, Roh M, Fu J, Lim RK, Petzold LR. StochKit2: software for discrete stochastic simulation of biochemical systems with events. Bioinformatics. 2011;27(17):2457–8.View ArticlePubMedPubMed CentralGoogle Scholar
 van Kampen NG. Stochastic processes in physics and chemistry. Amsterdam; Boston; London: Elsevier; 2007.Google Scholar
 The MathWorks. MATLAB Release. Natick: Massachusetts;2015a.
 Scherer MK, TrendelkampSchroer B, Paul F, PerezHernandez G, Hoffmann M, Plattner N, Wehmeyer C, Prinz J, Noé F. PyEMMA 2: a software package for estimation, validation, and analysis of Markov Models. J Chem Theory Comput. 2015;11(11):5525–42.View ArticlePubMedGoogle Scholar
 Prinz JH, Wu H, Sarich M, Keller B, Senne M, Held M, Chodera JD, Schütte C, Noé F. Markov models of molecular kinetics: generation and validation. J Chem Phys. 2011;134(17):174105.View ArticlePubMedGoogle Scholar
 Deuflhard P, Huisinga W, Fischer A, Schütte C. Identification of almost invariant aggregates in reversible nearly uncoupled Markov chains. Linear Algebra Its Appl. 2000;315(1–3):39–59.View ArticleGoogle Scholar
 Röblitz S, Weber M. Fuzzy spectral clustering by PCCA+: application to Markov state models and data classification. Adv Data Anal Classif. 2013;7(2):147–79.View ArticleGoogle Scholar
 Buchete NV, Hummer G. Coarse master equations for peptide folding dynamics ^{†}. J Phys Chem B. 2008;112(19):6057–69.View ArticlePubMedGoogle Scholar
 Beauchamp KA, Bowman GR, Lane TJ, Maibaum L, Haque IS, Pande VS. MSMBuilder2: modeling conformational dynamics on the picosecond to millisecond scale. J Chem Theory Comput. 2011;7(10):3412–9.View ArticlePubMedPubMed CentralGoogle Scholar
 W. E and VandenEijnden E. Towards a Theory of Transition Paths. J Stat Phys. 2006;123(3):503–523.
 Metzner P, Schütte C, VandenEijnden E. Transition path theory for Markov jump processes. Multiscale Model Simul. 2009;7(3):1192–219.View ArticleGoogle Scholar
 Noe F, Schutte C, VandenEijnden E, Reich L, Weikl TR. Constructing the equilibrium ensemble of folding pathways from short offequilibrium simulations. Proc Natl Acad Sci. 2009;106(45):19011–6.View ArticlePubMedPubMed CentralGoogle Scholar
 Schultz D, Walczak AM, Onuchic JN, Wolynes PG. Extinction and resurrection in gene networks. Proc Natl Acad Sci. 2008;105(49):19165–70.View ArticlePubMedPubMed CentralGoogle Scholar
 Morelli MJ, TănaseNicola S, Allen RJ, ten Wolde PR. Reaction coordinates for the flipping of genetic switches. Biophys J. 2008;94(9):3413–23.View ArticlePubMedPubMed CentralGoogle Scholar
 Huang S. Reprogramming cell fates: reconciling rarity with robustness. Bioessays. 2009;31(5):546–60.View ArticlePubMedGoogle Scholar
 Huang S. Hybrid Thelper cells: stabilizing the moderate center in a polarized system. PLoS Biol. 2013;11(8):e1001632.View ArticlePubMedPubMed CentralGoogle Scholar
 Gardmer T, Cantor C, Collins J. Construction of a genetic toggle switch in Escherichia coli. Nature. 2000;403(6767):339–42.View ArticleGoogle Scholar
 Lipshtat A, Loinger A, Balaban NQ, Biham O. Genetic toggle switch without cooperative binding. Phys Rev Lett. 2006;96(18):188101.View ArticlePubMedGoogle Scholar
 Hornos JEM, Schultz D, Innocentini GC, Wang JA, Walczak AM, Onuchic JN, Wolynes PG. Selfregulating gene: An exact solution. Phys Rev E. 2005;72(5):051907.View ArticleGoogle Scholar
 Lane TJ, Bowman GR, Beauchamp K, Voelz VA, Pande VS. Markov State Model reveals folding and functional dynamics in ultralong MD trajectories. J Am Chem Soc. 2011;133(45):18413–9.View ArticlePubMedPubMed CentralGoogle Scholar
 Tse MJ, Chu BK, Roy M, Read EL. DNAbinding kinetics determines the mechanism of noiseinduced switching in gene networks. Biophys J. 2015;109(8):1746–57.View ArticlePubMedPubMed CentralGoogle Scholar
 Berezhkovskii A, Hummer G, Szabo A. Reactive flux and folding pathways in network models of coarsegrained protein dynamics. J Chem Phys. 2009;130(20):205102.View ArticlePubMedPubMed CentralGoogle Scholar
 Walczak AM, Onuchic JN, Wolynes PG. Absolute rate theories of epigenetic stability. Proc Natl Acad Sci. 2005;102(52):18926–31.View ArticlePubMedPubMed CentralGoogle Scholar
 Chambers I, Silva J, Colby D, Nichols J, Nijmeijer B, Robertson M, Vrana J, Jones K, Grotewold L, Smith A. Nanog safeguards pluripotency and mediates germline development. Nature. 2007;450(7173):1230–4.View ArticlePubMedGoogle Scholar
 Kalmar T, Lim C, Hayward P, MuñozDescalzo S, Nichols J, GarciaOjalvo J, Arias AM. Regulated fluctuations in nanog expression mediate cell fate decisions in embryonic stem cells. PLoS Biol. 2009;7(7):e1000149.View ArticlePubMedPubMed CentralGoogle Scholar
 Chodera JD, Singhal N, Pande VS, Dill KA, Swope WC. Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. J Chem Phys. 2007;126(15):155101.View ArticlePubMedGoogle Scholar
 Bowman GR, Beauchamp KA, Boxer G, Pande VS. Progress and challenges in the automated construction of Markov state models for full protein systems. J Chem Phys. 2009;131(12):124101.View ArticlePubMedPubMed CentralGoogle Scholar
 Deuflhard P, Weber M. Robust Perron cluster analysis in conformation dynamics. Linear Algebra Its Appl. 2005;398:161–84.View ArticleGoogle Scholar
 PérezHernández G, Paul F, Giorgino T, De Fabritiis G, Noé F. Identification of slow molecular order parameters for Markov model construction. J Chem Phys. 2013;139(1):015102.View ArticlePubMedGoogle Scholar
 Burrage K, Hegland M, Macnamara S, Sidje R. A Krylovbased finite state projection algorithm for solving the chemical master equation arising in the discrete modelling of biological systems, Proceedings of the Markov 150th Anniversary Conference. 2006.Google Scholar
 Cao Y, Terebus A, Liang J. Accurate chemical master equation solution using multifinite buffers. Multiscale Model Simul. 2016;14(2):923–63.View ArticlePubMedPubMed CentralGoogle Scholar
 Hegland M, Burden C, Santoso L, MacNamara S, Booth H. A solver for the stochastic master equation applied to gene regulatory networks. J Comput Appl Math. 2007;205(2):708–24.View ArticleGoogle Scholar
 Peleš S, Munsky B, Khammash M. Reduction and solution of the chemical master equation using time scale separation and finite state projection. J Chem Phys. 2006;125(20):204104.View ArticlePubMedGoogle Scholar
 Anna L, CsikászNagy A, Gy Zsély I, Zádor J, Turányi T, Novák B. Time scale and dimension analysis of a budding yeast cell cycle model. BMC Bioinformatics. 2006;7:494.View ArticleGoogle Scholar
 Surovtsova I, Simus N, Lorenz T, Konig A, Sahle S, Kummer U. Accessible methods for the dynamic timescale decomposition of biochemical systems. Bioinformatics. 2009;25(21):2816–23.View ArticlePubMedGoogle Scholar
 Haseltine EL, Rawlings JB. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys. 2002;117(15):6959.View ArticleGoogle Scholar
 Kuroda Y, Suenaga A, Sato Y, Kosuda S, Taiji M. Allatom molecular dynamics analysis of multipeptide systems reproduces peptide solubility in line with experimental observations. Sci Rep. 2016;6:19479.View ArticlePubMedPubMed CentralGoogle Scholar
 Jayachandran G, Vishal V, Pande VS. Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the villin headpiece. J Chem Phys. 2006;124(16):164902.View ArticlePubMedGoogle Scholar
 Singhal N, Snow CD, Pande VS. Using path sampling to build better Markovian state models: predicting the folding rate and mechanism of a tryptophan zipper beta hairpin. J Chem Phys. 2004;121(1):415.View ArticlePubMedGoogle Scholar
 Tapia JJ, Faeder JR, Munsky B. Adaptive coarsegraining for transient and quasiequilibrium analyses of stochastic gene regulation. 2012. p. 5361–6.Google Scholar
 Kohlhoff KJ, Shukla D, Lawrenz M, Bowman GR, Konerding DE, Belov D, Altman RB, Pande VS. Cloudbased simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways. Nat Chem. 2013;6(1):15–21.View ArticlePubMedPubMed CentralGoogle Scholar
 Bhatt D, Bahar I. An adaptive weighted ensemble procedure for efficient computation of free energies and first passage rates. J Chem Phys. 2012;137(10):104101.View ArticlePubMedPubMed CentralGoogle Scholar
 Zhang BW, Jasnow D, Zuckerman DM. The ‘weighted ensemble’ path sampling method is statistically exact for a broad class of stochastic processes and binning procedures. J Chem Phys. 2010;132(5):054107.View ArticlePubMedPubMed CentralGoogle Scholar
 Adelman JL, Grabe M. Simulating rare events using a weighted ensemblebased string method. J Chem Phys. 2013;138(4):044105.View ArticlePubMedPubMed CentralGoogle Scholar
 Marcus W, Fackeldey K. Gpcca: Spectral clustering for nonreversible markov chains. ZIB Rep. 2015;15(35).
 Lv C, Li X, Li F, Li T. Constructing the energy landscape for genetic switching system driven by intrinsic noise. PLoS One. 2014;9(2):e88167.View ArticlePubMedPubMed CentralGoogle Scholar
 Assaf M, Roberts E, LutheySchulten Z. Determining the Stability of Genetic Switches: Explicitly Accounting for mRNA Noise. Phys Rev Lett. 2011;106(24):248102.View ArticlePubMedGoogle Scholar