- Methodology Article
- Open Access
- Published:

# A method for inverse bifurcation of biochemical switches: inferring parameters from dose response curves

*BMC Systems Biology***volume 8**, Article number: 114 (2014)

## Abstract

### Background

Within cells, stimuli are transduced into cell responses by complex networks of biochemical reactions. In many cell decision processes the underlying networks behave as bistable switches, converting graded stimuli or inputs into all or none cell responses. Observing how systems respond to different perturbations, insight can be gained into the underlying molecular mechanisms by developing mathematical models. Emergent properties of systems, like bistability, can be exploited to this purpose. One of the main challenges in modeling intracellular processes, from signaling pathways to gene regulatory networks, is to deal with high structural and parametric uncertainty, due to the complexity of the systems and the difficulty to obtain experimental measurements. Formal methods that exploit structural properties of networks for parameter estimation can help to overcome these problems.

### Results

We here propose a novel method to infer the kinetic parameters of bistable biochemical network models. Bistable systems typically show hysteretic dose response curves, in which the so called bifurcation points can be located experimentally. We exploit the fact that, at the bifurcation points, a condition for multistationarity derived in the context of the Chemical Reaction Network Theory must be fulfilled. Chemical Reaction Network Theory has attracted attention from the (systems) biology community since it connects the structure of biochemical reaction networks to qualitative properties of the corresponding model of ordinary differential equations. The inverse bifurcation method developed here allows determining the parameters that produce the expected behavior of the dose response curves and, in particular, the observed location of the bifurcation points given by experimental data.

### Conclusions

Our inverse bifurcation method exploits inherent structural properties of bistable switches in order to estimate kinetic parameters of bistable biochemical networks, opening a promising route for developments in Chemical Reaction Network Theory towards kinetic model identification.

## Background

In the context of systems biology, modeling is used to drive the acquisition of knowledge about the molecular mechanisms governing intracellular processes by evaluating the system’s behavior. In this way, measuring how the levels of some key species in the network evolve, for example, upon environmental perturbations, may help to retrieve important features concerning connectivities and rate constants of the underlying mechanism (i.e. model structure and parameters, in mathematical modeling terms). One key challenge in systems biology is the identification of kinetic models from limited quantitative data [1],[2], hence the increasing interest in results connecting structure, parameters and dynamic behavior [3]-[5].

Biochemical switches are biochemical reaction networks with the capacity for two different stable steady states. Depending on the initial and environmental conditions the system will evolve towards one steady state or the other. This property enables the cell to *take decisions*, by transforming, for example, graded stimuli into all or none responses. Bistable switches are ubiquitous in cell signaling networks, from pheromone sensing [6] to cell proliferation regulation [7]. The presence of bistability reveals structural properties that the network might or might not have, allowing to discriminate among different structural hypotheses. There are a number of results connecting particular structural features with the absence or presence of bistability in biochemical reaction networks. Some of these results are central to the Chemical Reaction Network Theory (CRNT) [8],[9], which has received in the recent years a considerable amount of attention from the (systems) biology community [10].

How can bistability bring insight on the parameters of a model?

The *forward problem* of determining the mapping from the parameter space to the space of the model behavior and, in particular, to the capacity of the model for bistability, can be approached using standard bifurcation analysis [11] when the number of parameters is small. To build bifurcation diagrams the steady states of the system are computed by varying one or a small number of parameters with the rest of the parameters being fixed. In a previous work [12] a condition for multiple steady states was derived within the framework of CRNT. The condition can be easily checked by evaluating the determinant of a matrix, thus providing an efficient tool to tackle the aforementioned forward problem: once values are assigned to the parameters it is immediate to assess the capacity or not for multiple steady states.

The inverse problem of finding parameter regions compatible with bistable behaviour can be tackled by checking the multistationarity condition derived in [12] through the whole parameter space. In [12] an algorithm based on interval methods was suggested for a reliable and systematic search in order to obtain a complete partition of the parameter space. Starting from a reaction mechanism with (qualitative) experimental evidence of bistability, the algorithm allowed us, based on this qualitative information, to restrict the parameter space to where multiple steady states can be found.

Here, our goal is to exploit quantitative information from experimental data in order to infer the actual values of the model parameters, and what we develop is a method for parameter estimation from experimental data. Typical evidences for bistability obtained from experiments are, for example, hysteretic dose response curves. Dose response curves indicate how the steady state of a system evolves upon increase or decrease of a given stimulus. In a hysteretic dose response curve a continuous change in the stimulus (dose or input), produces a discontinuous jump in the measured steady state concentration or output beyond the so called bifurcation point. In this work we address the inverse problem of determining the parameters that produce certain desired properties of the bifurcation diagram and, consequently, in the observed experimental dose response curves. In particular, the method uses the location of the bifurcation points given by experimental data^{a} to infer the kinetic parameters, exploiting the condition derived in [12]. In contrast to other methods developed for inverse bifurcation analysis which use generic bifurcation conditions [13],[14] and can therefore be used for any system of ordinary differential equations, our method is specific for chemical reaction networks since it exploits their particular structural properties, bringing additional insight into the inverse bifurcation problem for chemical reaction systems.

The basic ingredients of the inverse bifurcation method are described in the Methods section, where a brief introduction of some essential CRNT concepts is given together with the condition for multiple steady states derived in [12], here extended to a broader class of systems. The main contribution of the paper is presented in the Results and discussion section where we develop the inverse bifurcation method exploiting the multistationarity condition to infer kinetic parameters of bistable networks. The method presented provides an alternative way of estimating parameters with few (but very informative) experimental data, and can also be combined with standard techniques (estimation from time course measurements etc). As a proof of concept we apply our method to estimate the kinetic parameters of a genetic toggle switch model.

## Methods

### Reaction network graph and dynamics

We consider biochemical reaction networks with mass action kinetics where the dynamics are modeled by systems of ordinary differential equations (ODEs). The structure of mass action kinetic models is given by the connectivity and the stoichiometry of the network and the parameters are the kinetic rate constants. Let us denote by *m* the number of species and by *r* the number of reactions in a reaction network, the balance equations describing the dynamics of the concentrations of the species involved are a set of ODEs usually written as $\u010b=\mathit{\text{Nv}}\left(c\right)$, with $c\in {\mathbb{R}}_{\ge 0}^{m}$ being the vector of concentrations, $N\in {\mathbb{R}}^{m\times r}$ the stoichiometric matrix and $v\in {\mathbb{R}}^{r}$ the vector of reaction rates^{b}. Within the context of CRNT alternative ways to encode the balance equations are used such that the connection between structure and dynamics can be more easily exploited [4]. Results in this paper rely on a particular representation of the balance equations already introduced in [8], which is based on the graph of complexes of a reaction network and briefly presented next.

For illustrative purposes we consider the mechanism in Figure 1 where the transformation of a protein *S* to its active form *P* is mediated by another protein or enzyme *E*, whereas *S* inhibits its own activation. The complexes of the network are the linear combinations of species at both sides of the reaction arrows and the number of complexes is denoted by *n*. In the *graph of complexes* of a reaction network the nodes are the complexes ${\mathcal{C}}_{1},\dots ,{\mathcal{C}}_{n}$ and the edges are the reactions. The subgraphs are called *linkage classes* and the number of linkage classes is denoted by *ℓ*. Two nodes ${\mathcal{C}}_{i}$, ${\mathcal{C}}_{j}$ are *strongly linked* if there is a directed path from ${\mathcal{C}}_{i}$ to ${\mathcal{C}}_{j}$ and also a directed path from ${\mathcal{C}}_{j}$ to ${\mathcal{C}}_{i}$. A *terminal strong linkage class* is a maximal set of nodes within a linkage class such that there is no edge pointing to any other set of nodes strongly linked. The number of terminal linkage classes is denoted by *t*. Networks with *t*>*ℓ* are considered in general to be unsuited for the description of real chemical systems [8]. In what follows, we focus on networks where *t*=*ℓ* (note that this includes weakly reversible networks, i.e. networks where every linkage class is a strong terminal linkage class).

For a reaction network being bistable there is necessarily an exchange of matter and/or energy through the boundary of the system. According to the second law of thermodynamics isolated systems evolve towards the thermodynamic equilibrium. Isolated reaction networks endowed with mass action kinetics achieve the thermodynamic equilibrium which has been demonstrated to be globally asymptotically stable [15],[16] precluding any kind of complex nonlinear behavior such as bistability or limit cycles. A reaction kinetic system is thermodynamically isolated if there is no matter exchange with the environment and the kinetic constants are such that the detailed balance condition is fulfilled [9].

Exchange of matter can appear implicitly, as it happens in reactions of the form *S*→2*S*, or explicitly in the form of pseudo reactions of the species that are entering or going out of the control volume with a zero complex representing the environment. In the context of signaling, we use the same formalism of pseudo reactions with the environment to account for the formation or degradation of a species *S* (*∅*→*S* or *S*→*∅* respectively).

In our example we consider the degradation of the species *S* and *P*, and the constitutive formation of the inactive form *S*. The corresponding graph of complexes is depicted in Figure 1 where every complex is labeled with a number *i* from 1 to *n*.

Once species and complexes are labeled (note that the ordering of both the species and the complexes is arbitrary) we build the molecularity matrix $Y\in {\mathbb{R}}^{m\times n}$, where *y* _{
ij
} is the molecularity of the species *i* in the complex ${\mathcal{C}}_{j}$. Every complex ${\mathcal{C}}_{j}$ for *j*=1,…,*n* has an associated mass action monomial:

and every arrow in the graph, going from complex ${\mathcal{C}}_{j}$ to complex ${\mathcal{C}}_{k}$ is weighted by a rate constant *k* _{
jk
}. The corresponding reaction rate is *v* _{
jk
}(*c*)=*k* _{
jk
}
*ψ* _{
j
}(*c*). Within this framework the balance equations read:

where the matrix $A\in {\mathbb{R}}^{n\times n}$ is built from the adjacency matrix of the graph and contains the kinetic rate constants [8].

As a consequence of the mass conservation laws the trajectories of Eq. (2) in the concentration space are constrained to a convex region known as the reaction polyhedron [17] or *stoichiometric compatibility class*, which is next defined after introducing the *stoichiometric subspace*. The *stoichiometric subspace* is the subspace spanned by the columns of the stoichiometric matrix *N*. Considering the formulation in Eq. (2), it can be defined for networks with *t*=*ℓ* as $\mathcal{S}:=\mathit{\text{Im}}\left(\mathit{\text{YA}}\right)$. We denote by *s* the dimension of the stoichiometric subspace, i.e. $\mathit{\text{dim}}\left(\mathcal{S}\right)=s$. The number of independent conservation relations is given by *m*−*s*. Let *B* be a matrix such that the rows of *B* ^{T} are a basis of the left null space^{c} of *YA*, i. e. (*Y* *A*)^{T}
*B*=0. Let us take a reference concentration vector *c* _{0} and define

the *reaction polyhedron* consists of all *c*≥0 such that *W*(*c*;*c* _{0})=0. Closed reaction networks are conservative systems [9], i.e. all the species participate in at least one conservation law and therefore there is a strictly positive row vector *v* (with *m* elements) such that *v* *S*=0. On the contrary, open systems are not conservative. The existence of conservation laws implies that the ODE system (2) is not minimal, i.e. some equations depend linearly on the others. We can alternatively describe the system by a set of *s* linearly independent ODEs and a set of *m*−*s* algebraic equations of the form:

with *b*=*B* ^{T}
*c* _{0}. The original system of ODEs is thus transformed into a system of differential algebraic equations (DAEs). The derivation of the set of ODEs (2) and the DAE system for the protein activation network is included in Additional file 1.

### Locus of equilibria and tangent bifurcation condition

A condition for multiple steady states in biochemical reaction networks is derived and demonstrated in [12] for weakly reversible networks. In the context of intracellular processes, networks are often non weakly reversible. However, in the general case they fulfill *t*=*ℓ*, i.e. they have a number of linkage classes equal to the number of terminal linkage classes (as indicated previously, networks with *t*>*ℓ* might not be suitable to represent realistic systems). For networks with *t*=*ℓ* the condition in [12] is still valid. For the sake of completeness, we briefly summarize next the derivation of this condition, simplified and adapted to *t*=*ℓ* networks. The reader can find the detailed derivation and proof in [12], from which the validity for *t*=*ℓ* networks follows.

The system in Eq. (2) is at equilibrium for any *c* that satisfies *Y* *A* *ψ*(*c*)=0. Equivalently, one can say that a vector $z\in {\mathbb{R}}^{n}$ defined as *z*=*A* *ψ*(*c*) belongs to the *deficiency subspace*:

The dimension of this subspace is the *deficiency* of the network, *δ*=*d* *i* *m*(*D* _{
δ
}). For networks with *t*=*ℓ* the deficiency is given by the formula *δ*=*n*−*ℓ*−*s*, and a basis *ω* _{1},…,*ω* _{
δ
} for the deficiency subspace *D* _{
δ
} can be computed from graph matrices taking into account that:

where *Λ* is a *n*×*ℓ* matrix in which entry *Λ* _{
ij
}= corresponds to the node ${\mathcal{C}}_{i}$ in the linkage class ${\mathcal{\mathcal{L}}}_{j}$ such that *Λ* _{
ij
}=1 if ${\mathcal{C}}_{i}\in {\mathcal{\mathcal{L}}}_{j}$ and *Λ* _{
ij
}=0 otherwise.

Since the vector *z* defined above is a linear combination of the vectors of the basis of *D* _{
δ
}, we have:

On the other hand, from Eq. (1) provided that *ψ*(*c*)>0, we get:

From Eqs. (7) and (8), a set of *n*−*ℓ* linearly independent equations can be obtained in terms of the state vector $c\in {\mathbb{R}}_{>0}^{m}$, the deficiency parameter vector $\alpha \in {\mathbb{R}}^{\delta}$ and the kinetic parameter vector $k\in {\mathbb{R}}_{>0}^{r}$:

where ${\mathcal{\mathscr{H}}}_{s}(c,\alpha ;k)\phantom{\rule{0.3em}{0ex}}:{\mathbb{R}}_{>0}^{m}\times {\mathbb{R}}^{\delta}\to \phantom{\rule{0.3em}{0ex}}{\mathbb{R}}^{n-\ell}$ is continuous and differentiable, given a rate constant vector *k*. Eq. (9) describes the locus of equilibria of a reaction network (see Additional file 1 for the detailed derivation of this expression in the protein activation network example). Wherever this locus is continuous Eq. (9) defines the so called *equilibrium manifold* of dimension *λ*=*m*−*s*.

The system in Eq. (2) together with an initial concentration vector *c* _{0} describes an initial value problem. A reaction network has a unique steady state if for all the initial conditions stoichiometrically compatible with *c* _{0} the system reaches the same equilibrium *c* ^{∗}, which is also stoichiometrically compatible with *c* _{0}. The steady state is, in fact, the intersection between the equilibrium manifold and the reaction polyhedron defined by *W*(*c*;*c* _{0})=0, where *W* has been introduced in Eq. (3). Typical bistable systems show three intersections, one of them corresponding to an unstable steady state [18], as illustrated in Figure 2. If the equilibrium manifold and the reaction polyhedron become tangent at some point of the state space, multiple steady states appear, at least, for some stoichiometric compatibility classes. Here it is important to remind that if the network has deficiency 0 and is weakly reversible then bistability is necessarily excluded according to the deficiency zero theorem [8].

Let us assume *δ*=*λ*=1 such that the manifold is a one dimensional curve. Starting from the equilibrium manifold (9) and taking derivatives with respect to *α*, we obtain^{d}
${D}_{c}{\mathcal{\mathscr{H}}}_{s}\frac{\mathit{\text{dc}}}{\mathrm{d\alpha}}(\alpha ;k)+{D}_{\alpha}{\mathcal{\mathscr{H}}}_{s}=0$ and therefore the derivative of *c* with respect to *α* is given by

The equilibrium manifold and the reaction polyhedron are tangent provided that ${B}^{T}\frac{\mathit{\text{dc}}}{\mathrm{d\alpha}}(\alpha ;k)=0$, with *B* being introduced in Eq. (3) (see Figure 2). This geometric condition is generalized for arbitrary deficiency and manifold dimension by defining the following matrix:

where *D* _{
c
}
*W*=*B* ^{T} and *D* _{
α
}
*W*=0. The equilibrium manifold and the reaction polyhedron become tangent when the determinant of matrix (11) vanishes, i.e.:

The tangent bifurcation condition for the protein activation network is derived in Additional file 1. This network fulfills *δ*=*λ*=1. The corresponding equilibrium curves for a given set of parameters are depicted in Figure 3 where it can be observed how the equilibrium manifold is tangent to the direction of the reaction simplex in two different points of the state space (Figure 3A). These points correspond to tangent bifurcations, also known as saddle nodes or limit point bifurcations, indicated by LP in Figure 3B and C, where the steady state concentration values of the species *ESS* and *S* are depicted with respect to the concentration of total enzyme *E* _{
T
}.

## Results and discussion

### Parameter inference from dose response curves: deficiency one networks

*Dose response curves*, also known as stimulus response curves [3], are x-y diagrams commonly used to represent the input-output behavior of biochemical networks where *x* is the magnitude of a measurable stimulus (dose or exposure level) and *y* is the magnitude of the associated system’s response.

In what follows we refer to dose response diagrams obtained when the steady state concentration of a given species (output or *observable*) is plotted against some parameter (input). The parameter can be a kinetic rate or a conservation law constant, i. e. the total amount of an enzyme or conserved moiety, *b* _{
i
} in Eq. (4). Linear combinations of species of the form *q* _{
i
}=*Q* _{
i
}
*c* are also admitted as outputs, where *Q* is a matrix defining a linear map from the space of the species to the space of the *observables*.

If the structure and parameters of a biochemical reaction network model are known, dose response curves can be obtained computationally from the model equations using a continuation algorithm [19]. In a typical bistable system two tangent bifurcations appear during the equilibrium continuation, where the direction of the parameter reverses as the curve is followed (see Figure 4A). These two tangent bifurcations enclose the range of the bifurcation parameter leading to bistability [18].

Experimentally, dose response curves are obtained by measuring the steady state concentration of a given species for increasing/decreasing values of the parameter at hand. Increasing or decreasing the parameter we obtain the forward (see Figure 4B) or the backward (see Figure 4C) dose response curve respectively. Typical bistable systems show hysteretic behavior, leading to differences between parameter increase (forward curve) and decrease (backward curve) as illustrated in Figure 4B and C. For bistable systems, the tangent bifurcation points in the dose response curves can be located by means of experimental measurements. Details on the experimental procedures are out of the scope of this paper.

Suppose that we know the structure of a biochemical network with capacity for bistability and we want to know the values of the kinetic parameters. Next, we introduce a methodology to infer the parameters of the network from dose response curves, or more precisely, from the *tangent bifurcation points*.

For illustrative purposes we focus first on structures where *δ* = *λ*=1 and we consider dose response curves where the input is the total amount of a particular moiety, i.e. where a conservation law constant *b* _{
i
} is plotted on the x-axis, and the concentration of a species *c* _{
j
}, or a combination of species *q* _{
j
}, is plotted on the y-axis. As it is illustrated in Figure 4, from the forward and backward dose response curves we can elucidate the values of the parameter *b* where the tangent bifurcations appear, or in other words, the lower and upper values of the mass conservation constant (*b* ^{l} and *b* ^{u}, respectively) enclosing the region where bistable behavior is observed.

The two points defining the lower and upper bound of the region of bistability in the dose response curve indicate two different stoichiometric compatibility classes tangent to the equilibrium manifold. They correspond, in fact, to two different points in the state space *c* ^{l}=*c*(*α* ^{l};*k*) and *c* ^{u}=*c*(*α* ^{u};*k*) where the tangent bifurcation condition is fulfilled. Therefore, we have:

where $\frac{\mathit{\text{dc}}}{\mathrm{d\alpha}}(\alpha ;k)$ is given by Eq. (10) and *c*(*α* ^{l};*k*)>0, *c*(*α* ^{u};*k*)>0 belong to the equilibrium manifold, i.e.:

We search for a parameter vector *k* such that the equilibrium manifold is tangent to the reaction polyhedrons fixed by *b* ^{l} and *b* ^{u}.

In order to efficiently search for a point in the space of the kinetic parameters satisfying all the conditions above we can solve an optimization problem. First, we define the objective function to be minimized (in this expression *α* ^{1}=*α* ^{l}, *α* ^{2}=*α* ^{u}):

such that at the optimum *J* _{0}=0 the geometric condition is fulfilled. The optimization problem can be formulated as:

where *r* is the number of kinetic parameters, *p* is the decision vector containing the kinetic parameters *k* and the deficiency parameters *α* ^{l} and *α* ^{u}. The lower and upper bounds for the decision variables are denoted by *p* _{
L
} and *p* _{
U
} respectively and the equilibrium manifold equations are treated as equality constraints. The values of *b* ^{l} and *b* ^{u}, as well as the corresponding concentrations (*c* ^{l} and *c* ^{u}) or combinations of concentrations (*q* ^{l} and *q* ^{u}) at the bifurcation points are obtained from dose response curves. This optimization problem is non convex and multimodal and consequently a global optimization algorithm [20] is used to search for the solution.

By solving the optimization problem (13) we obtain a parameter vector *k* which is compatible with the observed dose response behavior. It is important to note here that, although we obtain a single point *k* in the parameter space as the output of the global optimization algorithm, the solution of the optimization problem (13) might not be unique, indicating that the set of parameters chosen cannot be identified univoquely. Identifiability of the parameters *a priori*, which depends on the network structure and the available observables is tackled next, where we propose a procedure to find a set of parameters and/or parameter combinations that can be identified univoquely.

### Parameter inference from dose response curves: general formulation

We have illustrated, for bistable networks of deficiency one and equilibrium manifold of dimension one, how to efficiently find a set of parameters starting from dose response curves, such that the tangent bifurcation condition is fulfilled at the two points enclosing the bistability interval in the dose response diagram. Next we generalize the method such that:

it is valid for networks of arbitrary deficiency and manifold dimension,

it is valid for different types of dose response curves,

it allows to identify a set of parameters of the network with their confidence intervals starting from experimental data subject to noise.

To this aim, we first propose a numerical strategy to elucidate a priori a set of parameters or parameter combinations which can be identified univoquely from the data. Then, we use a Monte Carlo approach to estimate the parameter values and confidence intervals.

#### Error free analysis and identifiability a priori: Constraint Satisfaction Problem

In order to tackle the identifiability a priori, we consider the error free case where neither experimental nor model error is assumed, and explore the parameter space in order to find the *viable* parameter regions i.e. subsets of the parameter space where the reaction network model maintains the desired behaviour. In this case, the location of the bifurcation points of the model in the bifurcation diagram must coincide with the location observed in the experimental dose response curves. To this aim we formulate next a constraint satisfaction problem [21] where a set of constraints impose the desired conditions on the decision variables.

We define the *decision vector* $p\in {\mathbb{R}}^{r+2\lambda}$. It contains, on the one hand, the kinetic parameters that are unknown (*r* in case that all the rate constants are unknown) and, on the other hand, the degrees of freedom which are needed to define the equilibrium manifold at the two bifurcation points. For networks with *λ*=*δ* we take the deficiency parameter vectors ${\alpha}^{l},{\alpha}^{u}\in {\mathbb{R}}^{\delta}$. For networks with *λ*≠*δ* a total of 2*λ* variables are taken from the vectors of deficiency parameters ${\alpha}^{l},{\alpha}^{u}\in {\mathbb{R}}^{\delta}$ and/or the concentration vectors ${c}^{l},{c}^{u}\in {\mathbb{R}}^{m}$.

A first set of (equality) constraints is given by:

with $k\in {\mathbb{R}}_{>0}^{r}$, ${\alpha}^{l},{\alpha}^{u}\in {\mathbb{R}}^{\delta}$. A second set of (equality) constraints is given by the equilibrium manifold equations:

The mass conservation laws and the observed concentrations constitute the third set of (equality) constraints:

where the values of *b* ^{l} and *b* ^{u}, as well as the corresponding concentrations (*c* ^{l} and *c* ^{u}) or combinations of concentrations (*q* ^{l} and *q* ^{u}) at the bifurcation points are obtained from dose response curves. Finally, a set of (inequality) constraints is given by

where *p* _{
L
} and *p* _{
U
} are lower and upper bounds for the decision vector variables.

When the stimulus in the dose response curve is a manipulable kinetic parameter or rate constant, *k* _{
i
}, two extra equality constraints need to be satisfied, indicating the values of the manipulable parameter at the lower and upper bifurcation points (${k}_{i}^{l}$ and ${k}_{i}^{u}$). The dose response curves are obtained in this case, for example, by fixing the kinetic parameter at hand at different values and measuring the corresponding steady state concentration of the species of interest.

As indicated, the dimension of the decision vector will depend on the number of kinetic parameters that remain unknown and the number of degrees of freedom that need to be fixed to define the equilibrium manifold.

In order to find the regions in the parameter space where the constraints (14-17) are satisfied we use the algorithm for exploration of parameter spaces by Zamora-Sillero *et al.* [22]. This algorithm has been shown to efficiently search for pre-defined conditions in high dimensional, non convex and poorly connected viable spaces, characteristic of complex biological networks. It combines local and global explorations of the parameter space, where for the global search, an adaptive Metropolis Monte Carlo method allows to identify poorly connected viable regions. Here, we use the algorithm to identify the *viable regions* containing those parameter vectors allowing for bistability and, in addition, compatible with the expected dose response behavior. The constraints are encoded in a cost function and the algorithm can search through the parameter space for different cost thresholds. The algorithm requires an initial viable point as an input. To start the exploration we take the optimum obtained by minimizing (in this expression *α* ^{1}=*α* ^{l}, *α* ^{2}=*α* ^{u}, *c* ^{1}=*c* ^{l} and *c* ^{2}=*c* ^{u}):

subject to the constraints already defined by (15-17). This optimization problem is non convex and multimodal and we solve it by means of a global optimization method [20].

The exploration algorithm provides as an output a large set of uniformly distributed viable points that allow to characterize the viable regions in the parameter space, i.e. those regions in the space of parameters that fulfill the constraints imposed [22]. The large set of viable points obtained is used to assess numerically which parameters and/or combinations of parameters are locally identifiable *a priori* starting from a given set of observables, by evaluating the viable regions obtained.

We use this approach in order to find a set of identifiable parameters in the protein activation network model. We take as observables the total enzyme concentrations at the lower and upper bifurcation points, ${b}^{l}={E}_{T}^{l}$ and ${b}^{u}={E}_{T}^{u}$ respectively, as well as the concentrations of *S*, *ESS* and *P*. The corresponding values are included in Table 1. Starting from these observables, a set of identifiable parameters *ρ* _{1},…,*ρ* _{7} (including rate constants and/or combinations of rate constants and deficiency parameters) is obtained. These identifiable parameters are shown in Table 2 together with the expressions in terms of the kinetic rate constants and the corresponding viable ranges. Here it is important to remark that, if the set of parameters is identifiable, a lower threshold in the cost function value leads to narrower viable parameter regions. As an indication of performance, for the protein activation model with its original parameterisation (ten dimensional parameter space) a single run of the exploration algorithm found 22327 viable points in 660 seconds.

The bifurcation diagrams and tangent bifurcation points computed for the viable parameter regions are depicted in Figure 5, where the different colors indicate different thresholds of the cost value.

#### Experimental uncertainty and practical identifiability: Monte Carlo Approach

In a real experimental scenario we expect to get a number of replicate measurements for each data point which are subject to experimental error, and we need to estimate the unobserved kinetic parameters *k* from the data^{e}.

Let $\widehat{k}$ be a vector of estimates of *k*, and ${\widehat{\alpha}}^{l}$, ${\widehat{\alpha}}^{u}$ the estimated values of the deficiency parameter vectors for the lower and upper bifurcation points. The lower and upper conservation law constants can be expressed as:

where *ε* ^{bl} and *ε* ^{bu} represent vectors of residuals. Similarly, the expressions for the concentration *observables* read:

We can express the sum of squares of the residuals between the observed responses in the data set and the model prediction as a function of the kinetic parameters and deficiency parameters *α* ^{l} and *α* ^{u} (in this expression *α* ^{1}=*α* ^{l} and *α* ^{2}=*α* ^{u}):

We seek for estimated values of the unknown parameters that, on the one hand minimize the sum of the residuals between observed and predicted bifurcation points, and on the other hand ensure that the tangency condition is fulfilled. Therefore, the parameter estimation is formulated as an optimization problem with two individual objectives. Both the function *J* _{1} defined in Eq. (18) and the function *J* _{2} are optimized simultaneously. Since the objectives are not conflicting we can directly minimize a weighted sum of *J* _{1} and *J* _{2}, with weights *w* _{1} and *w* _{2}. The problem thus reads:

Besides the actual values of the parameters, in presence of experimental noise our goal is to obtain information on the confidence intervals of the estimated parameters. To this aim, we propose a Monte Carlo approach consisting of the following steps:

Assuming we have enough number of experimental replicates, a probability distribution is fitted to the experimental data. Here we assume normally distributed data, but the best choice of the probability density function is to be decided upon the retrieval of the experimental data with regards to the specificities of the system under study and the data acquisition technique. Lets remind that the input data for the inverse bifurcation method are experimental measurements of the bifurcation points in dose response curves. Depending on the experimental set up, we might choose univariate or multivariate distributions to fit our data.

A number *M* of replicates of the observed data is sampled from the distributions previously fitted. For each sample, we solve the optimization problem (20), obtaining one optimal parameter set. In this way, starting from *M* samples we obtain a distribution for every parameter.

Statistical properties of the resulting distributions for the estimated set of parameters are calculated. Starting from the probability distribution estimates, robust confidence intervals and parameter correlations analysis can be computed [23].

As a proof of concept, we apply the approach proposed to the protein activation example with *in silico* noisy data. We take as observables, as in the previous analysis, the total enzyme concentrations at the lower and upper bifurcation points, ${b}^{l}={E}_{T}^{l}$ and ${b}^{u}={E}_{T}^{u}$, as well as the concentrations of *S*, *ESS* and *P*. In particular, we assume that we get from experimental dose response curve analysis measurements for the following pairs: $\left({E}_{T}^{l},{\left[S\right]}^{l}\right)$, $\left({E}_{T}^{l},{\left[\mathit{\text{ESS}}\right]}^{l}\right)$, $\left({E}_{T}^{l},{\left[P\right]}^{l}\right)$, and $\left({E}_{T}^{u},{\left[S\right]}^{u}\right)$, $\left({E}_{T}^{u},{\left[\mathit{\text{ESS}}\right]}^{u}\right)$, $\left({E}_{T}^{u},{\left[P\right]}^{u}\right)$. The goal is to obtain the corresponding distributions for the identifiable parameters *ρ* _{1},…,*ρ* _{7} defined in Table 2.

We generate the *in silico* data starting from the noise free data in Table 1, assuming normally distributed and uncorrelated error, proportional to the mean. Note that the assumptions about the error distribution and correlation might be different for different problems and experimental conditions without affecting the applicability of the method. A number of random replicates is taken from the fitted distributions in order to apply the proposed inverse bifurcation approach. The complete set of experimental data used is depicted in Figure 6A, B and C, where the fitted distributions obtained from the data are shown (we have assumed a measurement error of 10%), together with the replicate samples (1000 replicates). The optimization problem (20) is solved for each sample using the scatter search global solver by Egea *et al.* [20]. The average computation time spent by the solver in every run to get the optimal set of parameters is 30 seconds.

The histograms obtained for every parameter by applying the inverse bifurcation method proposed are shown in Figure 6D-J.

### Parameter estimation via inverse bifurcation of a genetic toggle switch

The capacity for bistability is found in some specialized gene circuits like the bacteriophage *lambda* switch [24]. The interest in genetic switches, also involving networks of non-specialized regulatory components, led to the engineering of the first toggle switch in bacteria described in [25]. In that work, the authors designed a synthetic, bistable gene circuit based on the predictions of a simple mathematical model consisting of two ordinary differential equations where the repressor binding was described by Hill functions. The conditions for bistability in the two dimensional ODE model were explored by [25],[26] using nullcline analysis.

Here we are interested in the inverse problem. Starting from a gene regulatory network with experimental evidence for bistability and assuming that the underlying model structure is known, we want to estimate the kinetic parameters of the model. Through this example, we show how our inverse bifurcation method allows to exploit the capacity for bistability to infer the kinetic parameters of the network. *In silico* data are used aiming to mimic a realistic experimental scenario.

We assume that the gene regulatory network under study is shown experimentally to be bistable. In particular, a characteristic all or none hysteretic response is observed in the system when one of the decay constants is gradually varied.

The underlying mechanism is a regulatory network with the same design principle described in [25], i.e. a repressor *P* _{1} inhibits the transcription of the repressor *P* _{2} from gene *G* _{2}, and the repressor *P* _{2} inhibits the transcription of the repressor *P* _{1} from gene *G* _{1}. The corresponding reaction network is depicted in Figure 7. The mutually inhibitory arrangement of the repressor genes together with the cooperativity of repression of gene *G* _{2} endow the system with capacity for bistability. Assuming mass action kinetics for all the interactions this model structure is, in qualitative terms, compatible with the experimental behavior observed.

In order to infer the parameters of the model using the inverse bifurcation method, we need to localize the tangent bifurcation points based on dose response measurements. In this case, a manipulable decay constant, the degradation rate constant corresponding to the protein *P* _{2}, is chosen as the bifurcation parameter. This degradation rate constant is fixed at different values and the corresponding concentrations of different species are measured at steady state. In this case we chose as observable the concentration of proteins *P* _{1}, *P* _{2} and the intermediate complex *G* _{2}
*P* _{1}, but different selections are possible, depending on what can be measured in a particular experimental scenario.

From the dose response measurements, two tangent points are localized, at low and high decay constant values ${k}_{7\phantom{\rule{0.3em}{0ex}}13}^{l}=0.86$, ${k}_{7\phantom{\rule{0.3em}{0ex}}13}^{u}=2.96$. The steady state concentrations of the selected species (*P* _{1}, *P* _{2} and *G* _{2}
*P* _{1}) are measured at these values of the decay constant. The *in silico* experimental dose response data (99% confidence intervals of the fitted Gaussians) are shown in the figures contained in the Additional file 2, where we plot the steady state concentration measurement of each species obtained at ${k}_{7\phantom{\rule{0.3em}{0ex}}13}^{l}$ versus the one obtained at ${k}_{7\phantom{\rule{0.3em}{0ex}}13}^{u}$.

The gene regulation network has two mass conservation laws (all the derivations are included in Additional file 1) and the corresponding mass conservation constants are *G* _{1}
_{
T
}=*G* _{2}
_{
T
}=0.16605, computed from the gene copy numbers and the cell volume.

In order to find a set of identifiable parameters we first make use of the constraint satisfaction method proposed, starting from noise free data (we take the means of the distributions corresponding to the experimental data in Additional file 2). The numerical values for every observable are included in Table 3. The set of parameters *ρ* _{1},…,*ρ* _{9} defined in Table 4 is found to be identifiable *a priori*. For the original 14 dimensional parameter space the exploration algorithm found 23857 viable points in 990 seconds.

We apply the Monte Carlo approach previously introduced taking 1000 random samples from the data distributions (included in Additional file 2). The parameter distributions obtained by the inverse bifurcation method are depicted as histograms in Additional file 3. From the parameter distributions we compute the confidence intervals and the parameter correlations. The corresponding robust Monte Carlo confidence intervals are depicted together with the histograms in Additional file 3 (for the numeric values see Table 4). The two dimensional correlation among the parameters is illustrated in Figure 8. These results show the successful performance of the method for parameter estimation of the toggle switch model with *in silico* noisy data.

## Conclusions

We develop an inverse bifurcation method that allows to estimate kinetic parameters from experimental dose response curves in bistable systems. The method is suitable for a broad class of biochemical models and dose response data where the stimuli are kinetic or mass conservation constants and the responses are the steady state concentrations of species involved.

In order to infer a set of kinetic parameters of a bistable switch compatible with the available dose response experimental data, the inverse bifurcation is formulated as an optimization problem and solved by a global optimization algorithm [20].

The method exploits the inherent structural properties of biochemical reaction networks. On the one hand, inherent properties of reaction networks were used to derive (in a previous work) the ‘tangency’ condition from the reaction graph, used here instead of a generic bifurcation condition. On the other hand, the method uses the insight that tangent bifurcation points are precisely the loci where the equilibrium manifold and the reaction simplex become tangent. The tangency condition is an algebraic expression dependent on the parameters that can be systematically computed from the graph matrices. In this way our method avoids simulating the system at every iteration of the optimization algorithm reducing dramatically the computational effort in comparison to direct fitting (which in this case would entail the non trivial task of obtaining computationally the dose response curve at every iteration in order to compare it with the experimental one).

As a first approach to assess the identifiability *a priori* and select a set of parameters which are identifiable, we propose a constraint satisfaction problem. This problem is numerically solved by an algorithm [22] that allows to efficiently explore large search spaces to obtain a large set of uniformly distributed points characterizing the viable regions in the parameter space. The parameters that can be directly identified depend on the structure of the equilibrium manifold equations and the dose response data available. This first approach to identifiability is local and heuristic. A complete analytic understanding on how to find *a priori* a set of identifiable parameters is subject of future work.

In presence of experimental uncertainty, Monte Carlo simulation and global optimization are combined to generate the parameter distributions compatible with the experimentally observed behavior.

In this work we show how a set of parameters can be identified using only information about the location of the bifurcation points in the dose response curves. The inverse bifurcation method could also be applied in combination with standard approaches (for example, parameter estimation from time course data), to significantly improve identifiability and facilitate the parameter estimation task.

The method has been successfully applied to a number of examples of interest in biology including a protein activation network and a gene regulatory toggle switch, illustrating the potential of the Chemical Reaction Network Theory for kinetic model identification.

## Endnotes

^{a} In this paper we use experimental data generated *in silico*.

^{b} The time derivative $\frac{\mathit{\text{dx}}}{\mathit{\text{dt}}}\left(t\right)$ of a vector *x* depending on time *t* is denoted by $\stackrel{\u0307}{x}$. The space of *n* dimensional real vectors is denoted by ${\mathbb{R}}^{n}$ and the space of *m*×*n* real matrices by ${\mathbb{R}}^{m\times n}$. The space of *n* dimensional real vectors with all strictly positive entries is denoted by ${\mathbb{R}}_{>0}^{n}$ and the space of *n* dimensional real vectors with all nonnegative entries by ${\mathbb{R}}_{\ge 0}^{n}$.

^{c} In what follows we denote the null space (or kernel) of a matrix X by Ker(X).

^{d}
*D* _{
x
}
*F* denotes the Jacobian of *F* with respect to *x*.

^{e} We here assume for simplicity that the set of parameters *k* is identifiable. In case a different set of parameters *ρ* is found to be identifiable *a priori* the decision vector *k* is replaced by *ρ*.

## Additional files

## References

- 1.
Model-based inference of biochemical parameters and dynamic properties of microbial signal transduction networks . Curr Opin Biotech. 2011, 22 (1): 109-116. 10.1016/j.copbio.2010.09.014.

- 2.
Berthomieux S, Brilli M, de Jong H, Cinquemani E: On the identifiability of metabolic network models . J Math Biol. 2013, 67 (6–7): 1795-1832. 10.1007/s00285-012-0614-x.

- 3.
Angeli D, Ferrell JE: Detection of multistability, bifurcations, and hysteresis in a large class of biological positive-feedback systems . Proc Natl Acad Sci USA. 2004, 101 (7): 1822-1827. 10.1073/pnas.0308265100.

- 4.
Conradi C, Flockerzi D, Raisch J, Stelling J: Subnetwork analysis reveals dynamic features of complex (bio)chemical networks . Proc Natl Acad Sci USA. 2007, 104 (49): 19175-19180. 10.1073/pnas.0705731104.

- 5.
Feliu E, Wiuf C: A computational method to preclude multistationarity in networks of interacting species . Bioinformatics. 2013, 29 (18): 2327-2334. 10.1093/bioinformatics/btt400.

- 6.
Paliwal S, Iglesias PA, Campbell K, Hilioti Z, Groisman A, Levchenko A: MAPK-mediated bimodal gene expression and adaptive gradient sensing in yeast . Nature. 2007, 446: 46-51. 10.1038/nature05561.

- 7.
Yao G, Lee TJ, Mori S, Nevins JR, You L: A bistable Rb-E2F switch underlies the restriction point . Nat Cell Biol. 2008, 10: 476-482. 10.1038/ncb1711.

- 8.
Feinberg M: Chemical reaction network structure and the stability of complex isothermal reactors i. the deficiency zero and deficiency one theorems . Chem Eng Sci. 1987, 42 (10): 2229-2268. 10.1016/0009-2509(87)80099-4.

- 9.
Angeli D: A tutorial on chemical reaction network dynamics . Eur J Control. 2009, 15 (3–4): 398-406. 10.3166/ejc.15.398-406.

- 10.
Bailey JE: Complex biology with no parameters . Nat Biotechnol. 2001, 19: 503-504. 10.1038/89204.

- 11.
Kuznetsov Y:

*Elements of Applied Bifurcation Theory*, Vol. 112 in Applied Mathematical Sciences, New York: Springer; 2004. - 12.
Otero-Muras I, Banga JR, Alonso AA: Characterizing multistationarity regimes in biochemical reaction networks . PLoS ONE. 2012, 7 (7): e39194-10.1371/journal.pone.0039194.

- 13.
Lu J, Engl HW, Schuster P: Inverse bifurcation analysis: application to simple gene systems . Algorithm Mol Biol. 2006, 1: 1-11. 10.1186/1748-7188-1-11.

- 14.
Lu J, August E, Koeppl H: Inverse problems from biomedicine: inference of putative disease mechanisms and robust therapeutic strategies . J Math Biol. 2013, 67 (1): 143-168. 10.1007/s00285-012-0523-z.

- 15.
Gorban AN, Karlin IV: Method of invariant manifold for chemical kinetics . Chem Eng Sci. 2003, 58: 4751-4768. 10.1016/j.ces.2002.12.001.

- 16.
Alberty RA: Principle of detailed balance in kinetics . J Chem Educ. 2004, 81: 1206-1209. 10.1021/ed081p1206.

- 17.
Otero-Muras I, Szederkenyi G, Alonso AA, Hangos KM: Local dissipative hamiltonian description of reversible reaction networks . Syst Control Lett. 2008, 57: 554-560. 10.1016/j.sysconle.2007.12.003.

- 18.
Conradi C, Flockerzi D, Raisch J: Saddle-node bifurcations in biochemical reaction networks with mass action kinetics and application to a double phosphorylation mechanism. In

*Proceedings of the American Control Conference (ACC 2007)*. New York, USA,2007. - 19.
Dhooge A, Govaerts W, Kuznetsov Y: Matcont: a Matlab package for numerical bifurcation analysis of ODEs . ACM T Math Software. 2003, 29: 141-164. 10.1145/779359.779362.

- 20.
Egea JA, Rodriguez-Fernandez M, Banga JR, Marti R: Scatter search for chemical and bioprocess optimization . J Global Optim. 2007, 37 (3): 481-503. 10.1007/s10898-006-9075-3.

- 21.
Jaulin L: Solving set-valued constraint satisfaction problems . Computing. 2012, 94 (2–4): 297-311. 10.1007/s00607-011-0169-5.

- 22.
Zamora-Sillero E, Hafner M, Ibig A, Stelling J, Wagner A: Efficient characterization of high-dimensional parameter spaces for systems biology . BMC Syst Biol. 2011, 5: 142-10.1186/1752-0509-5-142.

- 23.
Balsa-Canto E, Banga JR: Amigo, a toolbox for advanced model identification in systems biology using global optimization . Bioinformatics. 2011, 27 (16): 2311-2313. 10.1093/bioinformatics/btr370.

- 24.
Ptashne M:

*A Genetic Switch: Phage λ and Higher Organisms*, Cambridge, MA: Cell Press and Blackwell Scientific Publications; 1992. - 25.
Gardner TS, Cantor CR, Collins JJ: Construction of a genetic toggle switch in

*Escherichia coli*. Nature. 2000, 403: 339-342. 10.1038/35002131. - 26.
Cherry JL, Adler FR: How to make a biological switch . J Theor Biol. 2000, 203: 117-133. 10.1006/jtbi.2000.1068.

## Acknowledgements

We gratefully acknowledge financial support from the EU FP7 project IFNAction (contract 223608). We would like to thank Eva Balsa-Canto for her helpful comments and the three anonymous reviewers for their comments and suggestions that allowed us to improve the paper.

## Author information

## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

IOM and PY performed the research. IOM developed the theory and conceived and designed the experiments. JS contributed to experiment design. IOM and PY wrote the manuscript. All authors read and approved the final manuscript.

## Electronic supplementary material

## Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

## Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

## About this article

#### Received

#### Accepted

#### Published

#### DOI

### Keywords

- Biochemical reaction network
- Bistability
- Saddle node bifurcation
- Dose response curve
- Chemical reaction network theory