Case A: slow regime
We first consider the slow regime where the gene states change very slowly compared with the synthesis and degradation of proteins and mRNAs. We start from constructing the energy landscape, which is a useful way to illustrate the dynamics of the system. We simply take the definition of the potential as U(m,n)=− lnP(m,n), which is commonly used by Wang et al. [14, 28, 29]. Here P is the invariant distribution with the form P(m,n)=P
_{0}(m,n)+P
_{1}(m,n). Solving the CME of the system (1)(2) with brutal force, we get the energy landscape as shown in Fig. 2a. It contains two attracting basins, among which the one with large number of molecules is related to the open state, whereas the other one located at the origin is related to the closed state. As the global energy landscape reflects the steady state distribution of the system, it can not tell us nonequilibrium information. Below we will address this issue by reducing the dynamics and revealing the essential nature of the system.
A careful observation shows that the gene will stay on a single state for a very long time and then make transition to another state. So we decouple the landscape into two separated layers according to the gene state (Fig. 2
b). The upper layer stands for the open state, whereas the lower layer stands for the closed state. Transitions between two layers do not often occur. Each layer has only one steady state, which can be found through deterministic rate equations
$$ \begin{aligned} \frac{\mathrm{d}m}{\mathrm{d}t}= &\alpha\left(k_{R0}+k_{R}\frac{n}{n+K}\right)d_{R}m,\\ \frac{\mathrm{d}n}{\mathrm{d}t}= &k_{P}md_{P}n, \end{aligned} $$
((3))
where α takes value 0 if the gene is closed and value 1 if the gene is open. The steady states of Eq. 3 when α=0 and 1 characterize the attraction points in the overall energy landscape and the relaxation from one steady state to the other corresponds to the transition paths. If the noise effect can not be neglected, we can further get the diffusion approximation beyond Eq. 3 (see Additional file 1 for details). When the system size is large, the dynamical process on each level can be approximated by diffusion process [30–32]. We consider the concentration variables x=m/V, y=n/V on the lattice \(\mathbb {N}^{2}/V\), where V is the volume size. Define x=(x,y), then the first and second moments of the process satisfy
$$ \begin{aligned} \dot{\boldsymbol{x}}=&\;\boldsymbol{c}(\alpha,\boldsymbol{x}(t)),\\ \dot{\boldsymbol{\sigma}}=&\;\boldsymbol{\sigma}(t)\boldsymbol{J}^{T}(\alpha,t)+\boldsymbol{J}(\alpha,t)\boldsymbol{\sigma}(t)+2\boldsymbol{D}(\alpha,\boldsymbol{x}(t)), \end{aligned} $$
((4))
where c(α,x) and D(α,x) have the form
$$ \begin{array}{l} \boldsymbol{c}=\left[\begin{array}{c} d_{R} x+\alpha\left(k_{R0}+k_{R}\frac{y}{y+k}\right)\\ k_{P} xd_{P}y \end{array}\right],\\ \boldsymbol{D}=\frac{1}{2}\left[ \begin{array}{cc} d_{R} x+\alpha\left(k_{R0}+k_{R}\frac{y}{y+k}\right) &0\\ 0& k_{P}x+d_{P}y \end{array}\right]. \end{array} $$
((5))
J(α,t) is the Jacobian of c(α,x(t)), and σ(t) is the covariance matrix of the system. Thus, given α=0 or 1, the dynamics of the system can be well approximated by the time evolution of the moment Eqs. 4 and the transition paths mainly follow the solution of the ODEs (3) (Fig. 2
a, b). This is verified by Gillespie’s stochastic simulation algorithm and the results are shown in Fig. 2
c, d. The darkness of the shading points represents the number of visits for transition paths, and the fitted curve shows nice coincidence with the deterministic path, which strongly supports our methodology.
This understanding can also be utilized to get a simple approximation of the invariant distribution. When the rates of genetic switches are extremely slow, the system will mainly stay in closedstate basin or openstate basin with rare transitions. So we have κ≈0, and we can approximate the steady state distribution \(P^{ss}_{\alpha }\) by simple Gaussian distribution \(N(\boldsymbol {x}_{\alpha }^{ss},\boldsymbol {\sigma }_{\alpha }^{ss})\), where \(\boldsymbol {x}_{\alpha }^{ss}\) and \(\boldsymbol {\sigma }_{\alpha }^{ss}\) are the corresponding steady state solution of Eq. 4. Furthermore we can combine the two distributions to depict the invariant distribution of the whole system through a Gaussian mixture model
$$ P^{ss}(\boldsymbol{x})\approx wN(\boldsymbol{x}_{0}^{ss},\boldsymbol{\sigma}_{0}^{ss})+(1w)N(\boldsymbol{x}_{1}^{ss},\boldsymbol{\sigma}_{1}^{ss}), $$
((6))
where the mixture weight w is the proportion of time that the system stays on the closed layer. Since the mean time that gene stays on the open layer is \(d_{G}^{1}\), and the mean time that gene stays on the closed layer is approximately \(k_{G0}^{1}\) (the steady state O
_{
P
}=0 on this layer), the value of w satisfies
$$ \frac{w}{1w}\approx \frac{k_{G0}}{d_{G}}. $$
((7))
Solve it and we get w=k
_{
G0}/(k
_{
G0}+d
_{
G
}). This simple Gaussian mixture approximation is shown and compared with the accurate distribution in Fig. 3
a, b. The approximation seems to fit the solutions of Eqs. 1(2) well, especially around the steady point. But some details corresponding with the switching paths are neglected, which makes some difference with the accurate solutions.
When the rates of genetic switches are relatively slow but not extremely slow, we can refine the approximation (Eq. 6) by taking into account the transition path information. On the open layer, we solve Eq. 4 with initial values \((\boldsymbol {x}_{0}^{ss},\boldsymbol {\sigma }_{0}^{ss})\) to get solutions (\(\boldsymbol {x}_{1}^{t},\boldsymbol {\sigma }_{1}^{t}\)). Then we use the timeaverage of Gaussian distributions to approximate \(P_{1}^{ss}(\boldsymbol {x})\), i.e.
$$ P_{1}^{ss}(\boldsymbol{x})\approx \frac{1}{T_{1}}\int_{0}^{T_{1}} p(\boldsymbol{x},t)\mathrm{d}t, $$
((8))
where p(x,t) is the probability distribution density of \(N(\boldsymbol {x}_{1}^{t},\boldsymbol {\sigma }_{1}^{t})\), and T
_{1} is the average time that system keeps staying on the open layer. The same approach applies to the approximation of \(P_{0}^{ss}(\boldsymbol {x})\). Then we combine \(P_{0}^{ss}(\boldsymbol {x})\) and \(P_{1}^{ss}(\boldsymbol {x})\) to get
$$ P^{ss}(\boldsymbol{x})\approx wP_{0}^{ss}(\boldsymbol{x})+(1w)P_{1}^{ss}(\boldsymbol{x}), $$
((9))
where the value of w remains the same. We call Eq. 9 the modified Gaussian mixture approximation and compare it with the accurate distribution in Fig. 3
c. Indeed, Eq. 6 is the limit of Eq. 9 when κ→0.
We also compute the relative errors of the two approximations and find that the modified Gaussian mixture approximation is superior than the simple Gaussian mixture approximation nearly in the whole slow regime (Fig. 3
d, e, f). On both the open and the closed layers, the modified Gaussian approximation performs better than the simple Gaussian approximation. When the switching rates are extremely slow, the two approximations perform equally well. When κ is not very small but still belongs to the slow regime, the simple Gaussian approximation on the closed layer which is a singlepoint delta distribution, deviates far from the accurate solution. This is the main reason that simple Gaussian mixture approximation has relatively large errors.
Case B: fast regime
We now consider the case that genetic switches are much faster than the other reactions in the system. In this regime, the effective synthesis rate of mRNA is (k
_{
R
}
O
_{
P
}+k
_{
R0})k
_{
on
}/(k
_{
on
}+k
_{
off
}) according to the QSSA, where k
_{
on
} and k
_{
off
} stand for the gene activation and inactivation rates. The deterministic meanfield description of this model yields the ODEs:
$$ \begin{aligned} \frac{\mathrm{d}m}{\mathrm{d}t}&= \frac{\left(k_{R0}+k_{R}\frac{n}{n+K}\right)\left(k_{G0}+k_{G}\frac{n}{n+K}\right)}{\left(d_{G}+k_{G0}+k_{G}\frac{n}{n+K}\right)}d_{R}m,\\ \frac{\mathrm{d}n}{\mathrm{d}t}&= k_{P}md_{P}n. \end{aligned} $$
((10))
This system has two stable fixed points and one saddle in physically reasonable parameter regime. These two stable fixed points correspond to the expressed and unexpressed states at which the copy number of proteins is at high or low state, respectively. As shown in the previous section, the global energy landscape of the slow regime also has two basins, which looks like the fast case in the first sight. But the essential nature is totally different. The bistability of the system in the slow regime originates from the simple twolayer dynamics induced by genetic switching, whereas the bistability in the fast regime is from an adiabatic reduction of the switching dynamics. The previous work of the authors has established a good framework to understand the metastability based on large deviation theory for Markov processes [15, 33]. In this framework, we assume that the steady state distribution P
^{ss}(x) satisfies P
^{ss}(x)∝ exp(V
S(x)), where S(x) is the global quasipotential function that we focus on, and the volume size V is inversely proportional to the noise strength. From WKB asymptotics [13] or the large deviation theory [15, 33], we know that S(x) satisfies the HamiltonJacobi equation H(x,∇S)=0 from the leading order analysis for the corresponding CME, where H is the Hamiltonian of the system. However, solving the HamiltonJacobi equation is not an easy task, and we utilize an alternative efficient optimization approach to obtain S(x). We define the local quasipotential S(x;x
_{0}) with respect to a metastable state x
_{0} as
$$ S(\boldsymbol{x};\boldsymbol{x}_{0})=\inf_{T>0}\inf_{\phi,\phi(0)=\boldsymbol{x}_{0},\phi(T)=\boldsymbol{x}}{\int_{0}^{T}} L(\phi,\dot{\phi})\mathrm{d}t, $$
((11))
where L is the Lagrangian, which is the Legendre transform of the Hamiltonian H, and the path ϕ are all possible continuous connections from x
_{0} to x within time T. From classical mechanics, the double minimization problem (11) can be transformed to a single minimization with respect to the intrinsic representation of the curves ϕ(s) by Maupertuis’ principle [15, 34] as
$$ S(\boldsymbol{x}; \boldsymbol{x}_{0}) = \inf_{\stackrel{\phi(0)=\boldsymbol{x}_{0},\phi(1)=\boldsymbol{x}}{H(\phi,\boldsymbol{p})=0, H_{\boldsymbol{p}}(\phi,\boldsymbol{p})=\lambda \phi'}}{\int_{0}^{1}} p(s) \cdot d\phi(s), $$
((12))
where p(s) and λ are determined from the constraints for each ϕ(s). This minimization problem can be efficiently solved by using the geometric minimum action method (gMAM) [15, 34]. S(x) can be obtained from its local version S(x;x
_{0}) by a suitable sticking procedure [15, 33]. With this approach, we can construct the global quasipotential energy landscape S(x) and obtain the most probable transition paths between two metastable states. More details can be referred to [15].
With proper values of parameters, the system has two metastable states and the global quasipotential energy landscape can be constructed as shown in Fig. 4
a. The landscape looks similar with Fig. 2, but transition paths between two basins show different features. In the slow regime, the transition paths are characterized by the solutions of deterministic ODEs (3) and they do not intersect at any point. In the fast regime, the most probable transition paths of ontooff switching and offtoon switching are not identical, either. They intersect at an unique point, i.e. the saddle point of the deterministic ODEs (10), but they do not cross over each other (Fig. 4
a, b). We also present the results by Gillespie’s algorithm and compare the theoretically predicted optimal transition paths and the fitted curves in Fig. 4
c, d. The comparison shows good matching between the fitted curves and the theoretical transition paths. Our results are consistent with the previous studies [15, 35].
Case C: intermediate regime and comparisons
In both the fast and slow regimes, we have already illustrated some analytical approaches to quantitatively understand the global energy landscape and transitions between metastable states in gene expression model with positive feedback. But in the intermediate regime, where all of the reactions are on the same time scale, these analytical methods fail, since it is difficult to find the appropriate deterministic equations and hard to know whether steady states exist in the system. Thus we compute the key properties in the intermediate regime through Gillespie’s algorithm and compare the results with the slow and fast regimes.
We simulate the mean switching time (MST) with different parameters. The MST is the average transition time between two basins on the global energy landscape, which is often used to characterize the stability of the attractors. Figure 5 shows the MST with respect to different κ and positive feedback strength K. When κ increases from 10^{−5} to 10^{5}, the gene activation rate changes from slow to fast regimes. Since we focus on switching times between metastable states, the value of K should be chosen to guarantee that bistability always exists in the system with respect to all the possible values of κ. We define the existence of bistability as that the system stay at each metastable state with probability more than 0.01. In the slow regime, bistability always exists for any positive K. In the intermediate and fast regimes, we find two boundary values of K, which are K=2754 and K=3211. When K<2754, the system will stay at the off state with probability less than 0.01, whereas when K>3211, the system will stay at the on state with probability less than 0.01. The smaller K is, the larger positive feedback strength is.
In the slow regime, the MST can be characterized by the waiting time for transitions to occur, so the MST decreases as κ increases. In this regime, K has little effect on MST, which is consistent with our analysis in Case A that the MST of ontooff and offtoon transitions is approximately \(d_{G}^{1}\) and \(k_{G0}^{1}\), respectively. When κ increases to a value between 10^{−2} and 10^{−1}, the MST decreases to its global minimum point, and a neighborhood of this point is corresponding to the intermediate regime. Then the MST goes up with κ increasing, and finally reaches a stable value in the fast regime. In the fast regime, when K changes from 2754 to 3211 (i.e. the strength of positive feedback decreases), the MST of offtoon switch increases by a factor of about 10, whereas the MST of ontooff switch decreases by a factor of about 10^{3}. Thus, if the positive feedback becomes stronger, the system will be more probable to stay at the on state, with shorter offtoon switching time and longer ontooff switching time.
Longer MST indicates that metastable states are more stable. Thus, our results in Fig. 5 show that the slow and fast regimes perform better in stability, whereas in the intermediate regime, genetic switches, transcription and translation are on the same time scale, which contributes to large fluctuations of molecule numbers and relatively short MST. These properties remain the same with different positive feedback strength. Furthermore, we also did some interesting simulations to study the change of transition paths as a function of κ (see Additional file 1 for details).
We should mention that the above points are direct observations from numerical results, which are hard to be proven analytically. However, we can confirm from the derivations in Case A that when κ is extremely small, the MST of ontooff and offtoon transitions is approximately \(d_{G}^{1}\) and \(k_{G0}^{1}\). That is to say, the downwardsloping shapes of the left parts of the MST curves are qualitatively robust. Besides, when κ goes to infinity, the MST will converge to some constant values [15]. So the flat parts on the right of the MST curves are also theoretically supported. The existence of “valleys” in the MST curves is in accordance with the intuition, but it is difficult to give a rigorous proof. To investigate the robustness of the whole qualitative shape of the MST curve with respect to different feedback actions, we simulated all the results with O
_{
P
}=n
^{2}/(n
^{2}+K
^{2}), and found that the shapes of the MST curves are almost the same as in Fig. 5. We may reasonably infer that all results still hold assuming that O
_{
P
} is a Hill function and the bistability exists in both slow and fast regimes (see Additional file 1 for more details). These results suggest a natural methodology to distinguish the different genetic activation/inactivation rates in the cellular phenotype observations, which is described in detail in the continued text.
Distinguishing the three regimes
In molecular experiments, it is hard to directly measure the switching rates of the active or inactive gene state. Based on the obtained results, we propose a method to distinguish which regime the considered system falls in. Consider a gene expression circuits with the positive feedback in a eukaryotic cellular system (Fig. 1), such as the yeast or the immune cell etc, we can observe the level of proteins through fluorescence or other markers by flow cytometry. Below we will show how to take advantage of these perceivable data and related analysis to make corresponding distinction.
First let us introduce how to identify the intermediate regime. We utilize the strength of cellular memory as the index, and more concretely, we sort the cells by the protein level and observe how long the sorted cells will take to reach the original invariant distribution of proteins. The detailed flow cytometry procedure of our experiments in silico is stated in the following two steps.

Step 1: Sorting the cells by the protein levels. For example, we choose n=400 as the threshold of classification in our model setup, where n is the number of protein molecules in a cell. We call the cells with large number of proteins (n>400) as P+ cells, and the cells with small number of proteins (n≤400) as P − cells. We define the fraction of P+ cells in a cell group as the P+ fraction. Figure 6
a illustrates the numerically sorted results of the system, including 50,000 sample points which are randomly selected according to the invariant distribution, and P+/P − cells are classified by the protein level in each cell. To apply this approach to the gene regulatory network with different parameters, one need to choose an appropriate threshold according to the real situation, which needs to lie between two metastable states.

Step 2: Culturing the sorted P+ and P − groups of cells separately, and recoding the changes of P+ fractions in each group. Figure 6
b, c show the simulation results of the P+ and P − groups respectively. The evolution of the protein distributions of the sorted P+ group is simulated and illustrated in Fig. 6
g, including results of the slow, intermediate and fast regimes. The sorted P+ group of cells in the intermediate regime can be observed to evolve quickly to its initial equilibrium state. In the third day, a large percent of P+ cells in the intermediate regime have evolved to the P − group, whereas the P+ cells in the slow and fast regimes have little change. This suggests that cellular memory is much stronger in the slow and fast regimes compared with the intermediate regime.
In the real experiments, we do not know whether the time for an appreciable change of protein levels is long or short without reliable reference. But the MST in the intermediate regime is approximately the time for cells to reach a metastable state after leaving another metastable state, thus we can use it as a reference time. We define T as the ontooff (or offtoon) transition time, which does not contain the dwell time at the on (or off) state. Since the synthesis and degradation rates of mRNAs and proteins can be estimated through experimental approaches, we can theoretically figure out the time that the system will spend to reach the off state after leaving the on state. That is to say, T can be roughly estimated a priorily. So we make experiments like Fig. 6
g for a period of time which is close to T. If the value of P+ fraction changes evidently towards the equilibrium level, we will know that the system belongs to the intermediate regime.
Second we try to identify the fast and slow regimes through the perturbationresponse behavior of changing κ. As shown in Fig. 5
a, the MST becomes larger and larger when κ approaches to zero, but it increases and gradually stays close to a fixed value when κ is large. Thus, if we perturb κ through injection of particular chemicals, cells in the two regimes will perform differently in the experiments we designed above. Figure 6
d, e, f illustrate the results of corresponding numerical simulations. In the slow regime (κ=0.001), P+ fraction changes more quickly when κ increases, whereas in the relatively fast regime (κ=2, 50), the time for cells to recover to the invariant distribution is longer or unchanged when κ increases. With this perturbationresponse approach, the slow and fast regimes can also be distinguished from each other (Fig. 6
h). When the positive feedback strength K takes the boundary values mentioned in case C, the simulation experiments above still work well (see Additional file 1 for details). This proposal, based on our theoretical understanding of the feature of the three regimes, is yet to be justified by experimentalists.