A computational method for the investigation of multistable systems and its application to genetic switches

Background Genetic switches exhibit multistability, form the basis of epigenetic memory, and are found in natural decision making systems, such as cell fate determination in developmental pathways. Synthetic genetic switches can be used for recording the presence of different environmental signals, for changing phenotype using synthetic inputs and as building blocks for higher-level sequential logic circuits. Understanding how multistable switches can be constructed and how they function within larger biological systems is therefore key to synthetic biology. Results Here we present a new computational tool, called StabilityFinder, that takes advantage of sequential Monte Carlo methods to identify regions of parameter space capable of producing multistable behaviour, while handling uncertainty in biochemical rate constants and initial conditions. The algorithm works by clustering trajectories in phase space, and iteratively minimizing a distance metric. Here we examine a collection of models of genetic switches, ranging from the deterministic Gardner toggle switch to stochastic models containing different positive feedback connections. We uncover the design principles behind making bistable, tristable and quadristable switches, and find that rate of gene expression is a key parameter. We demonstrate the ability of the framework to examine more complex systems and examine the design principles of a three gene switch. Our framework allows us to relax the assumptions that are often used in genetic switch models and we show that more complex abstractions are still capable of multistable behaviour. Conclusions Our results suggest many ways in which genetic switches can be enhanced and offer designs for the construction of novel switches. Our analysis also highlights subtle changes in correlation of experimentally tunable parameters that can lead to bifurcations in deterministic and stochastic systems. Overall we demonstrate that StabilityFinder will be a valuable tool in the future design and construction of novel gene networks. Electronic supplementary material The online version of this article (doi:10.1186/s12918-016-0375-z) contains supplementary material, which is available to authorized users.

One of the foundational constructs in synthetic biology is the genetic toggle switch. The toggle switch consists of a set of transcription factors that mutually repress each other [1,[22][23][24]. Genetic switches play a major role in binary cell fate decisions such as stem cell differentiation, as they are capable of exhibiting bistable behaviour, which gives rise to the existence of two distinct phenotypic states. This allows populations of cells to maintain a heterogeneous response to environmental cues and can increase fitness by bet-hedging [25]. Switches are powerful building blocks; they underlie electronics and logic systems, and have great potential in synthetic biology. The genetic toggle switch has been used for a number of applications including the construction of a synthetic genetic clock [22], the regulation of mammalian gene expression [2,5], the development of a predictable genetic timer [26], and the formation of biofilms in response to engineered stimuli [27].
The stability of the toggle switch has been investigated extensively in the literature, but the conclusions drawn vary according to model abstraction. Numerous studies have concluded that cooperativity is a necessary condition for bistability to arise [1,[28][29][30][31]. However, Lipshtat et al. found that stochastic effects can give rise to bistability even without cooperativity [32]. In another study, Ma et al. found that stochastic fluctuations can stabilize the unstable steady state in the deterministic system, giving rise to tristability [33]. In addition, Biancalani et al. identified multiplicative noise as the source of bistability in the stochastic case [34]. As is clear from the above, there is yet to exist a consensus on the stability a switch is capable of, and the most appropriate method of modelling it. Most of these studies assumed the quasi-steady state approximation (QSSA) [35], which cannot always be assumed to hold in vivo [36]. In terms of system design, extensions of the basic toggle switch motif, including additional positive feedback mechanisms, have been investigated [37,38], and optimization methods have been used to identify topologies and parameter values for bistable and tristable genetic switches [39][40][41][42]. For stochastic switch design, control theoretic approaches [43], and simulation-based frameworks [44], have been developed. However, none of these existing approaches can be be applied to reasonably sized models, under the assumption of deterministic and stochastic dynamics, and identify regions of parameter space for which switching occurs, which we argue is critical in designing systems under considerable uncertainty.
Here we present a computational framework based on sequential Monte Carlo [45] that can determine the parameter region for a given model to produce a given number of (stable) steady states. Uniquely, multistable parameter regions can be identified for both deterministic and stochastic systems, and also complex models with many parameters, thus removing the need for simplifying assumptions. Our framework can be used for comparing the conclusions drawn by various modelling approaches and thus provides a way to investigate appropriate abstractions. This framework is available as a Python package, called StabilityFinder. We investigate genetic toggle switches and uncover the design principles behind making bistable and tristable switches (all models used in this study are summarised in Table 1.) We find that both production and degradation rates of transcription factors are key parameters for bistability, and outline how the addition of positive autoregulation, combined with particular parameter combinations, can create multistable switching behaviour. Finally we demonstrate the ability of the framework to examine more complex systems and examine the design principles of a three gene switch. These examples demonstrate that StabilityFinder will be a valuable tool in the future design and construction of novel gene networks.

Methods
StabilityFinder is based on a statistical inference method that combines approximate Bayesian computation (ABC) with sequential Monte Carlo [46]. This simulation-based method uses an iterative process to arrive at a distribution of parameter values that can give rise to observed data or a desired system behaviour [44]. ABC methods are used for inferring the posterior distribution in cases where the likelihood is intractable or is too computationally expensive to evaluate. Instead of computing the likelihood, ABC methods simulate the data and then compare the simulated and observed data through a distance function [46]. Given the prior distribution π(θ) we can approximate the posterior distribution, π(θ |x) ∝ f (x|θ)π(θ), where f (x|θ) is the likelihood of a parameter, θ, given the data, x. There are a number of different variations of the ABC algorithm depending on how the approximate posterior distribution is sampled.
The simplest ABC algorithm is the ABC rejection sampler [47]. In this method, parameters are sampled from the prior and data simulated through the data generating model. For each simulated data set, x * , a distance from that of the desired behaviour is calculated, ρ(x * , y), and if greater than a threshold, , the sample is rejected, otherwise it is accepted.
Algorithm 1 ABC rejection algorithm to generate samples {θ i ; i ≤ N} from π(θ|ρ(x, y) ≤ ) 1: Initialise i = 0 2: Sample a parameter vector θ * from prior, θ ∼ π(θ * ) 3: Simulate a dataset, x * , from the the model, x * ∼ f (x|θ) 4: Calculate the distance d = ρ(x * , y). 5: if d ≤ then 6: θ i = θ * * . i = i + 1 7: if i ≤ N then GoTo step 2 8: end if 9: end if The main disadvantage of this method is that if the prior distribution is very different from the posterior, the acceptance rate is very low [46]. An alternative method is ABC Markov Chain Monte Carlo (MCMC) [48]. The disadvantage of this method is that if it gets stuck in an area of low probability it can be very slow to converge [49]. The method used here is based on sequential Monte Carlo, which avoids both issues faced by the rejection and MCMC methods. It propagates the prior through a series of intermediate distributions in order to arrive at an approximation of the posterior. The tolerance, , for the distance of the simulated data to the desired data is made smaller at each iteration. When is sufficiently small, the result will approximate the posterior distribution [46].
To investigate the multistable behaviour of systems, a number of extensions to existing approaches are required. For a given set of parameter values, sample points are taken across initial conditions using latin hypercube sampling [50], and the ensemble system simulated in time until steady state. The distance function in ABC is replaced by a distance on the desired stability of the simulated model. To do this we cluster the steady state coordinates using K-means clustering [51] and use the Gap statistic to determine the number of clusters [52]. At each iteration, the number of steady states is determined by the number of clusters in phase space. A particle is accepted only if the number of clusters present is within an acceptable distance from the threshold . The algorithm is summarised below.
This algorithm is available as a Python package, called StabilityFinder. The user provides an SBML model file [53,54] and an input file that contains all the necessary information to run the algorithm, including the desired stability and the final tolerance, , for the distance from the desired behaviour necessary for the algorithm to terminate. The flow of execution is illustrated in Fig. 1. Since the algorithm is computationally intensive, all deterministic and stochastic simulations are performed using algorithms implemented on Graphics Processing Units Algorithm 2 StabilityFinder algorithm 1: Initialise t = 0, 2: i = 0 3: if t = 0 then 4: Sample particle from prior, θ * * ∼ π(θ) 5: else 6: Sample θ * from the previous population {θ i t−1 } with weights w t−1 . 7: Perturb the particle, θ * * ∼ K t (θ|θ * ) where K t is the perturbation kernel. 8: end if 9: Sample k initial conditions {x k 0 } via latin hypercube sampling. 10: Simulate k datasets to steady state, {x * k }, from the the model, x * ∼ f (x|θ, x 0 ) 11: Apply clustering in phase space on {x * k } 12: Calculate the distance d = ρ({x * k }, y). 13: if d ≤ t then 14: 15: if i ≤ N then GoTo step 3 16: else 17: Calculate weight for each accepted θ i t 18: , if t ≥ 1.

19:
Normalise weights 20: if t ≤ N t then 22: GoTo step 3 23: end if 24: end if 25: end if (GPUs), which are used for mutli-threaded computation [55]. The algorithm returns the final accepted particles and their associated weights, as well as the initial conditions sampled and the steady state values obtained. The The algorithm takes as input a model and a desired stability structure in phase space. c The system is evolved to achieve the stability of choice via intermediate populations. In this example model, there are two species and two parameters. For the model to be considered bistable, the phase plot of the two species of interest must have two distinct clusters, as shown in (c). d The parameter posterior distribution for the model to achieve the desired behaviour are given as an output. e Flow chart of the Python package, StabilityFinder, which implements the algorithm final accepted particles can be used to study the characteristics of the posterior distribution. The sampled initial conditions and the resulting steady state values can be used to study the basins of attraction of the system.

The Gardner switch under deterministic and stochastic dynamics
The first synthetic genetic toggle switch was constructed in E. coli by Gardner et al. and consisted of two mutually repressing transcription factors [1]. The model used to design and interpret the system is shown in Fig. 2a, and in the deterministic case is defined by the following ODEs where u is the concentration of repressor 1, v the concentration of repressor 2, α 1 and α 2 denote the effective rates of synthesis of repressors 1 and 2 respectively, and β and γ are the cooperativity of repression of promoter 1 and of repressor 2 respectively. Gardner et al. studied the deterministic case and concluded that there are two conditions for bistability for this model; that α 1 and α 2 are balanced and that β, γ >1 [1]. In order to test Stabil-ityFinder we used it to find the posterior distribution for which this model exhibits bistable behaviour. We therefore set the desired behaviour to two (stable) steady states, and using a wide range of values as priors as shown in the Additional file 1, we used StabilityFinder to find the parameter values necessary for bistability to occur. The posterior distribution calculated by StabilityFinder for the Gardner deterministic case is shown in Fig. 2c. These results agree with the results reported by Gardner et al. [1]. For this switch to be bistable α 1 and α 2 must be balanced while β and γ must both be >1, as can be seen in the marginal distributions of β and γ in Fig. 2c.
We next applied StabilityFinder to the case of the Gardner switch under stochastic dynamics using the same priors as the deterministic case, and again searched the parameter space for bistable behaviour. The posterior is shown in Fig. 2d. We can see that the conditions on the parameters required for bistability in the deterministic case generally still stand in the stochastic case. There appears to be slightly looser requirements on the parameters of the stochastic model (wider marginal distributions), which is expected due to the nature of clustering deterministic steady states versus stochastic steady states. The Gap statistic is used in the case of the stochastic case, as it is capable of dealing with noisier data whereas a simpler and faster algorithm is used for clustering the deterministic solutions (see Additional file 1). These results demonstrate that StabilityFinder can be used to find the parameter values that produce a desired stability and allow us to confidently apply the methodology to more complex models.

Repressor degradation rates are key for achieving bistablity
We next analyzed an extension of the Gardner switch model previously studied by Lu et al. [38]. They considered two types of switches, the classic switch consisting of two mutually repressing transcription factors (model LU-CS), as well as a switch with double positive autoregulation (model LU-DP). The LU-CS switch was found to be bistable given the set of parameters used, while the LU-DP switch was found to be tristable [38]. The classical model used in their study is given by the following system of ODEs , and g I represents the production rate, k I the degradation rate, n I the Hill coefficient, x I the Hill threshold concentration and λ I the fold change of the transcription rates, and I ∈ {xy, yx, xx, yy} (see Additional file 1 for the details of all models used).
For the parameter values used in the Lu study, the classical switch exhibits three steady states (Fig. 3), two of which are stable and one is unstable. Using StabilityFinder with priors centred around the parameter values used in the original paper (see Additional file 1), we can identify the most important parameters for achieving bistability. The posterior distribution of these models are shown in Fig. 3a. We find that the parameters representing the rates of degradation of the transcription factors in the system (k x , k y ) must both be large in relation to the prior range, and approximately equal, for bistability to occur. Protein degradation rates have been shown to be important for many system behaviours including oscillations [7,56,57]. We also find that the steady states of the LU-CS model Fig. 3 The three variants of the deterministic Lu models. a The classic switch with no autoregulation is bistable as shown in the stream plot and the phase plot. In the stream plot, the colours indicate the magnitude of the vectors, with yellow indicating high and red low values. The blue points represent stable steady states and the grey points represent unstable steady states. The phase plot shows the steady state values for 100 particles at the final population. Each particle is represented by a different shade of blue. The most restricted parameters for this behaviour are the degradation rates (k x and k y ), which both have to be high while the net protein production for X and Y must be balanced. b The extended Lu model with a single positive autoregulation on X. This model is bistable when the production rate of X, g x , is small. c The Lu model with double positive autoregulation is tristable as shown in the stream plot and the phase plot. We find two types of tristable behaviour, one where the third steady state is zero-zero and one where the third state is high are symmetric: the values for the dominant and repressed species are equivalent in both steady states.

The addition of symmetric and asymmetric positive autoregulation
It is known that the addition of positive autoregulation to the classical toggle switch can induce tristability [37,38,58]. Here we investigate the interplay of positive autoregulation on the values of the other parameters in the model. We extended the analysis presented in Lu et al. by including the switch with single positive autoregulation (model LU-SP), where a single positive autoregulatory feedback is present on one of the genes. This system topology has also been constructed previously [23,59]. The advantage of using StabilityFinder over traditional bifurcation analysis is that the full parameter space is explored rather than solving the system for a single set of parameters. This allows us to deduce model properties that could not otherwise be identified. Robustness to parameter fluctuations can be explored, as well as parameter correlations and restrictions on the values they can take while still producing the desired behaviour.
The LU-DP model is given by whereas the LU-SP switch is modelled using the following ODE system We find that the switch with single positive autoregulation is capable of bistable behaviour as seen in Fig. 3b, but this is only possible when the strength of the promoter under positive autoregulation, g x , is small. There appear to be no such constraints on the strength of the original, unmodified, promoter, g y . We also find that unlike the LU-CS and LU-DP models, the steady states of the bistable LU-SP are not symmetric. The levels of Y are around zero and always lower than the levels of X. The levels of both are lower than those found in the LU-CS and LU-DP models.
Upon examination of the LU-DP model, we also find that tristability in the switch is relatively robust, as this phenotype is found across a large range of parameter values, with no parameters strongly constrained (see Additional file 1) but the two parameters for gene expression, g x and g y tend to be small compared to the priors. Two types of tristable behaviour are identified, one where the third steady state is at (0,0) and and one where the third steady state has non-zero values. This result agrees with previous work where it was found that a switch can exhibit two kinds of tristability, one in which the third steady state is high (III H ) and one in which it is low (III L ) [37].

Design principles for a switch capable of two, three and four steady states
The LU-DP switch is capable of both bistable and tristable behaviour as well as four coexisting states under deterministic dynamics (quadristability) [37]. It is of great interest to understand the conditions under which these three behaviours occur. We carried out a bifurcation analysis of the DP switch using the PyDSTool [60] in order to get an indication of the stabilities this model is capable of, and at which parameter ranges these are found (Fig. 4b). This shows that by varying the parameter for gene expression (g x ) while all other parameters remain constant we can obtain all three behaviours. We find that if 100 ≤ g x ≤ 120 the system exhibits four steady states, if 9 ≤ g x ≤ 10 the system is tristable and if 10 ≤ g x ≤ 100 the system is bistable (see Additional file 1).
Using StabilityFinder we obtained posterior distributions for the bistable, tristable and quadristable phenotypes (Fig. 4). Upon examination of the these distributions, we observe that a subset of the parameter values are different for the three behaviours, although the differences are surprisingly subtle. We find differences in the univariate distribution of the parameters for gene expression, g x , as highlighted in Fig. 4c, box 1. This parameter must be small for four steady states to occur but there are no such restraints for a bistable or a tristable switch. Furthermore, parameter x xx (the dissociation constant for autoregulation) must be small for tristable and quadristable behaviour to be achieved, but there are no such restraints for a bistable switch, as can be seen in Fig. 4c, box 2. Additionally, we find a difference in the bivariate distributions in the posterior. Most notably, we find that parameters x xx and g x are tightly constrained in the tristable and quadristable cases, where both parameters are required to be small, but less so in the bistable case (Fig. 4c, box  3). Another notable difference is between parameters x xx and n xx shown in Fig. 4c, box 5, where they are constrained in the tristable and quadristable cases but not the bistable case. Interestingly, we also find parameter correlations conserved between the three behaviours, as seen in Fig. 4c, box 4, where parameters l xx and g x , (positive autoregulation and gene expression) are negatively correlated in both cases. This highlights the importance of treating unknown parameters as distributions rather than fixed values when studying complex system behaviour. These ensemble based methods are capable of uncovering not only the ranges and values required for certain behaviour, but also the correlations between parameters, which would be missed by optimisation methods.

Bistability and tristability using more realistic abstractions
In order to study the switch system in the most realistic way, we avoid using the quasi-steady state approximation (QSSA) that is often used in modelling the toggle switch. Under the assumption of mass action kinetics, the two-equation system becomes a system of 14 equations and 9 parameters in the classical switch case. These are shown in the Additional file 1. In the cellular environment stochastic effects can be non-negligible and should therefore be taken into account when trying to elucidate the behaviours that a system is capable of. Therefore, we model these switches using stochastic dynamics.
We find that under stochastic dynamics, both the simple switch, CS-MA, and positive autoregulation switch, DP-MA, are capable of bistable and tristable behaviour (Fig. 5). The fact that tristability can occur in the classical model is consistent with the effect of small molecule numbers; if gene expression remains low, it provides the opportunity for small number effects to be observed, and the third unstable steady state to stabilise [33]. To verify the robustness of the tristability found in the stochastic case, we re-sampled the posterior distributions, simulated to steady state and confirmed tristable behaviour. As can be seen in Fig. 5, differences in the parameter values are observed between the bistable and tristable switches, in both CS-MA and DP-MA models. The design principles for both the CS-MA model and the DP-MA model are summarised in Table 2.

Achieving higher multistability
To further demonstrate the flexibility of our framework we investigated a system capable of higher stabilities. Multistability is found in some differentiating pathways, such as the myeloid differentiation pathway [61,62]. We allow Fig. 4 Design principles of multistable switches. a Using the Lu model with added positive autoregulation we uncover the design principles dictating if a switch will be bistable, tristable, or quadristable. b An initial bifurcation analysis of the LU-DP switch uncovers the stabilities it is capable when varying the parameter for gene expression (while keeping all other parameters fixed). c-e By considering the bivariate distributions of the parameters we can uncover the differences in the parameters of a bistable switch compared to a tristable switch, compared to a quadristable switch. The posterior distribution of the bistable switch is shown in purple, the tristable switch in red and the quadristable switch in green, all plotted on the prior ranges. The bivariate distributions for which a difference is observed between the stabilities are in black boxes. An example of the phenotype (phase plot) from each switch is shown next to the corresponding posterior distribution. Parameter legend key: g x production rates; k x degradation rates; n xy Hill coefficients; x xy Hill threshold concentrations; l xy transcription rate fold change; n xx autoregulation Hill coefficients; x xx autoregulation Hill threshold concentrations; l xx autoregulation transcription rate fold change for these more complex dynamics by extending the LU-DP model by adding another gene, making it a three gene switch. This new system is depicted in Fig. 6a. In StabilityFinder we look for six steady states, the output being in nodes X and Y and use the priors shown in the Additional file 1. We successfully find that the system is capable of six steady states, as shown in Fig. 6c. Consistent with the LU-DP switch capable of 2, 3 and 4 steady states, we find that the steady states are symmetric (Fig. 6c). Each of the six steady states exists in symmetry with another one, in tightly constrained regions. We find that the most constrained parameters for this behaviour are again the degradation rate of the proteins, k x . If they are too large or too small the system will not exhibit hexastability. Additionally we find that the Hill coefficients for the repressors, n xy , are constrained to be smaller than 1.5 as seen in Fig. 6d. This example demonstrates that Stability Finder can be used to elucidate the dynamics of more complex network architectures, which will be key to the successful design and construction of novel gene networks as synthetic biology advances.

Conclusions
We have developed an algorithm that can identify the parameter regions necessary for a model to achieve a given number of stable steady states. The novelty in our framework over existing methodology is that complex models can be analyzed assuming both deterministic and stochastic dynamics. We have shown that the algorithm can be used to infer the parameter ranges that give rise to specific behaviour in various models. We uncovered the design principles that make a bistable, a tristable  The three-node mutual repression model, with added positive auto-regulation on each node. a The model is studied in two dimensions using StabilityFinder, with clustering performed on the levels of X and Y. b Null clines and steady states of the three-node switch. c An example phase plot from the posterior of the three-node switch and d the phase plot of 100 particles from the final model. Each particle is represented by a different shade of blue. e The posterior distribution of the 6-steady state three-node system plotted on the prior ranges. Parameters for degradation, k x , and the Hill coefficient, n xy , are the most constrained. Parameter legend key: g x production rate; k x degradation rate; n xy Hill coefficient; x xy Hill threshold concentration; l xy transcription rate fold change; n xx autoregulation Hill coefficient; x xx autoregulation Hill threshold concentration; l xx autoregulation transcription rate fold change and a quadristable switch. We also found that a threenode switch is capable of hexastability. Importantly, we removed assumptions made to simplify the switch models and showed that they are still capable of bistable and tristable behaviour. Although we only examined models containing combined transcription and translation, our approach could be applied to any models of switching behaviour, including more detailed kinetic models and more complex multistable switches that exist in natural biological systems, such as developmental pathways. We also limited our framework to the objective behaviour of a given number of stable steady states. However, this approach is extremely flexible, and could be extended to find systems with a given switching rate, or systems robust to a particular set of perturbations, both of which could be of great importance for building more complex genetic circuits.
One limitation of our approach is that we cannot rule out a specific behaviour; it is always possible that some part of parameter space remains unexplored, or because the search space must be predefined, interesting regions are not included in the search. In the Bayesian sense, this predefined space is the prior distribution for the parameters that give rise to the stability under investigation. In principle, once our knowledge of these biochemical rate constants grows, we can incorporate these data into the prior regions for exploration. Another limitation is that of scalability. Our framework can currently be applied to small and medium size gene networks since the computational time is exponential in size, whereas optimization methods are more scalable [39][40][41]. This is a manifestation of a general tradeoff between finding an optimal value and exploring a parameter space. However, we argue that for current and relevant problems in synthetic biology, this computational burden is acceptable.
Approaches based on parameter space exploration are indispensible tools for providing understanding of general system properties and guiding more detailed experimental and theoretical studies. They will also be key for the design and construction of synthetic gene networks. By selecting standardized parts accordingly, such as promoters, RBS sequences and other untranslated regions [18,[63][64][65], in vivo systems can be matched to parameter regions with a high probability of function.
More generally our results highlight that changing the level of abstraction, in addition to the modification of the feedback structure and parameter values, can significantly alter the qualitative behaviour of a system model. These results advocate the need for a programme of experimental work, combined with systems modelling, to understand the rules of thumb for abstraction in model based design of synthetic biological systems.

Additional file
Additional file 1: Supplementary information. Contains detailed descriptions of the models and algorithms used. (PDF 2350 kb)