A combined model reduction algorithm for controlled biochemical systems
 Thomas J. Snowden^{1, 2}View ORCID ID profile,
 Piet H. van der Graaf^{3, 2} and
 Marcus J. Tindall^{1, 4}Email author
DOI: 10.1186/s1291801703971
© The Author(s) 2017
Received: 8 May 2016
Accepted: 18 January 2017
Published: 13 February 2017
Abstract
Background
Systems Biology continues to produce increasingly large models of complex biochemical reaction networks. In applications requiring, for example, parameter estimation, the use of agentbased modelling approaches, or realtime simulation, this growing model complexity can present a significant hurdle. Often, however, not all portions of a model are of equal interest in a given setting. In such situations methods of model reduction offer one possible approach for addressing the issue of complexity by seeking to eliminate those portions of a pathway that can be shown to have the least effect upon the properties of interest.
Methods
In this paper a model reduction algorithm bringing together the complementary aspects of proper lumping and empirical balanced truncation is presented. Additional contributions include the development of a criterion for the selection of statevariable elimination via conservation analysis and use of an ‘averaged’ lumping inverse. This combined algorithm is highly automatable and of particular applicability in the context of ‘controlled’ biochemical networks.
Results
The algorithm is demonstrated here via application to two examples; an 11 dimensional model of bacterial chemotaxis in Escherichia coli and a 99 dimensional model of extracellular regulatory kinase activation (ERK) mediated via the epidermal growth factor (EGF) and nerve growth factor (NGF) receptor pathways. In the case of the chemotaxis model the algorithm was able to reduce the model to 2 statevariables producing a maximal relative error between the dynamics of the original and reduced models of only 2.8% whilst yielding a 26 fold speed up in simulation time. For the ERK activation model the algorithm was able to reduce the system to 7 statevariables, incurring a maximal relative error of 4.8%, and producing an approximately 10 fold speed up in the rate of simulation. Indices of controllability and observability are additionally developed and demonstrated throughout the paper. These provide insight into the relative importance of individual reactants in mediating a biochemical system’s inputoutput response even for highly complex networks.
Conclusions
Through application, this paper demonstrates that combined model reduction methods can produce a significant simplification of complex Systems Biology models whilst retaining a high degree of predictive accuracy. In particular, it is shown that by combining the methods of proper lumping and empirical balanced truncation it is often possible to produce more accurate reductions than can be obtained by the use of either method in isolation.
Keywords
Model reduction Lumping Empirical balanced truncation Controlled systemsBackground
The field of Systems Biology has seen a considerable increase in both the number of models created and their complexity across the past decade. The BioModels Database, which acts as an open repository for Systems Biology models, saw the number of models it stores increase approximately tenfold between 2005 and 2010, with the average number of reactions per model having nearly tripled in the same period [1]. This increase in complexity, specifically in the number of species or reactions modelled by each system, has become a defining characteristic of research in this area.
Such systems are typically developed by bringing together biochemical and physiological knowledge to inform highly detailed mechanistic models of biological networks (e.g. signalling pathways, proteinprotein interactions, and genetic cascades). Mathematically, these networks are typically modelled via highdimensional systems of stiff, nonlinear ordinary differential equations (ODEs).
This model complexity, however, can present a number of issues with regards to their use and analysis. For example parameter estimation techniques and realtime numerical simulation can both be difficult to perform for high dimensional or overparameterised systems. Even the basic intuitiveness of a system can potentially be obscured by its complexity. Additionally, complexity of this form is often associated with the ‘curse of dimensionality’, whereby the data that can be obtained for such systems in practice are sparse relative to the volume of the state and parameter spaces.
Model reduction techniques [2] offer one possible approach to easing complexity. A method of model reduction here refers to any method designed to construct a lower order representation (either in terms of the number of state variables or parameters) of a model with which some set of the original dynamical behaviour can be satisfactorily approximated.
A range of model reduction methods exist in the literature, many of which have previously been applied to models of biochemical reaction networks. The most commonly applied methods are based upon timescale separation. These simplify a system by exploiting the wide ranges in reaction rates and equilibration speeds typical of biochemical reaction networks. These approaches include variants of the quasisteady state approximation [3–9], variants of the rapid equilibrium approximation [10–15], the intrinsic low dimensional manifold method [16–20], and computational singular perturbation [21–24]. Beyond timescale exploitation, sensitivity analysis can also be used to guide model reduction by identifying and eliminating those portions of a network least responsible for the dynamical behaviour of interest [25–29]. Optimisation based methods [30–35] seek to evaluate a range of possible reduced models under a given metric of reduction accuracy before returning the best available option. Lumping methods, meanwhile, reduce a network by treating groups of statevariables as a single dynamical, ‘lumped’ variable [36–41]. Additionally, there exist a range of singular value decomposition (SVD) reduction methods based upon the matrix decomposition of the same name. These exploit the property that SVD can be used to give lower rank approximations of matrices. Such methods have seen limited application to biochemical systems, but a number of publications employing a particular variant known as balanced truncation can be found in the literature [42–44].
Each model reduction approach has advantages and disadvantages in the reduction of large scale models of biochemical reaction networks. There is no ‘one size fits all’ method of reduction; the best method available depends inextricably on the properties of the model and the aims of reduction.
Systems Biology models, such as those this paper seeks to reduce, usually possess a number of mathematical properties that can influence the suitability of specific reduction methodologies: most notably, such models are often stiff, nonlinear and of extremely high dimensionality (e.g. containing 10s or even 100s of statevariables and parameters). Stiffness, in this context, refers to a parameter dependent property of a system of differential equations whereby their numerical integration can require taking a steplength that is excessively short relative to the exact solution’s smoothness in a given interval. This has relatively severe implications for simulating such models, as stiffness is associated with issues of numerical stability and computational processing time. In the case of largescale Systems Biology models, stiffness is typically a consequence of reactions in the system evolving and equilibrating at greatly different timescales as compared with one another. Meanwhile, the nonlinearity of these models implies that a number of analytical methodologies will not be applicable. Often linearisation is used in this context, but in many such cases this incurs a prohibitively large error. Finally, the high dimensionality of such systems can also present issues of combinatorial complexity or an excessive computational burden for some methods of mathematical analysis.
This paper specifically seeks to address the topic of controlled biochemical reaction networks. Here a controlled network refers to any model for which the dynamics are influenced by the concentration of a particular reactant that can be considered as an input into the network, within which some given combinations of the reactants can be considered as outputs of the system. In the context of Systems Biology this includes, for example, models of receptor signalling pathways where the concentration of an extracellular ligand may be seen as an input controlling the pathway. The concentration of some subset of the intracellular signalling species may also be considered an output that is directly observed or inferred from some measure of the cellular response. The recently emerging field of Quantitative Systems Pharmacology [45, 46], which proposes to mechanistically model the effects of drug action from the genetic scale upwards is particularly amenable to such a formulation.
Here we develop a model reduction algorithm focused on maintaining the inputoutput relationship of a controlled biochemical reaction network. The algorithm combines several approaches including conservation analysis, proper lumping and empirical balanced truncation. For controlled systems with a specified output, empirical balanced truncation is designed to give a reduction that accurately maintains the inputoutput relationship. Unfortunately, due to the need to repeatedly simulate the system under a range of perturbed conditions, empirical balanced truncation can be highly sensitive to model stiffness. Hence, given the stiffness that is commonly associated with such systems, in the combined algorithm proper lumping is employed as a preconditioner to enable the application of empirical balanced truncation whilst retaining an accurate reduced model.
Our paper is structured as follows: we will first outline the general model reduction problem and then proceed to provide an overview of conservation analysis, empirical truncation, and proper lumping. A detailed account of how these methods can be brought together to obtain more accurate reductions than can be obtained via application of any single method in isolation will then be given. Finally, we demonstrate the algorithm via application to two examples: an 11 dimensional model of bacterial chemotactic signalling in Escherichia coli [47] and a 99 dimensional model of extracellular signalregulated kinase (ERK) phosphorylation mediated via the epidermal growth factor (EGF) and nerve growth factor (NGF) receptor pathways [48]. Results are compared and a number of enhancements to the core methods are discussed.
Problem outline
with initial conditions x(0)=x _{0} and where the overdot represents the time derivative, such that \(\dot {\boldsymbol {x}}=\frac {\mathrm {d}\boldsymbol {x}}{\mathrm {d}t}\). Here the state variables \(\boldsymbol {x}(t)\in \mathbb {R}^{n}\) typically represent the timevarying concentrations of the modelled species, \(\boldsymbol {u}(t)\in \mathbb {R}^{l}\) (such that u _{ i }(t)∈u(t)) represent the input variables, and \(\boldsymbol {y}\in \mathbb {R}^{p}\) represent the output variables. Here f(x(t)) is the set of functions describing the dynamical interaction between the individual reactants, each set of functions g _{ i }(x(t)) describes the dynamical behaviour of the reactants interacting with each of the inputs, and h(x(t)) describes the combinations of the reactant concentrations corresponding to each of the outputs.
where \(\tilde {\boldsymbol {x}}\in \mathbb {R}^{\tilde {n}}\) represents a reduced set of state variables (such that \(\tilde {n}<n\)) and for which, given a set of inputs u(t), the reduced set of outputs \(\bar {\boldsymbol {y}}(t)\) represents an approximation of the original set y(t). Similarly to f, g, and h in the unreduced model, \(\tilde {\boldsymbol {f}}(\tilde {\boldsymbol {x}}(t))\) and \(\tilde {\boldsymbol {g}}_{i}(\tilde {\boldsymbol {x}}(t))\) are sets of functions describing the dynamical effects of interactions between the reduced statevariables and inputs. Meanwhile \(\tilde {\boldsymbol {h}}(\tilde {\boldsymbol {x}}(t))\) approximately maps the reduced statevariables to the outputs.
Here, the relative error is selected such reduced models can be compared fairly for a range of different perturbation magnitudes applied both to the inputs and the initial condition of the statevariables.
Methods
Our combined model reduction algorithm is designed to bring together the methods of nondimensionalisation, conservation analysis, proper lumping and empirical balanced truncation. At its core the method employs proper lumping as a preconditioner (to reduce model stiffness) prior to the application of empirical balanced truncation. In this section we briefly review the variants of the methods employed before providing a more detailed account of the algorithm.
Nondimensionalisation
Nondimensionalisation refers to the process of rescaling the variables in a system such that the physical units (typically units of concentration and time) are removed from the model [49]. There are number of purposes for nondimensionalisation in the analysis of biochemical systems — primary amongst these is its use in accessing characteristic or intrinsic properties of the reaction network. Usually these are ratios of kinetic rate constants and conserved values that enable greater intuition into how the parameterisation of a model governs its behaviour.
This yields a nondimensionalised parameter set \(\tilde {\boldsymbol {p}}\) with entries representing ratios of the original parameters p. Often, nondimensionalisation can result in a reduction in the dimension of the new parameter set \(\tilde {\boldsymbol {p}}\) by finding ratios that are fixed to 1 irrespective of the original parameterisation. This does not, however, result in a reduction in the number of modelled reactions or reactants and hence does not reduce complexity as previously defined. Additionally, the dimensionless parameters may lose their innate biological meaning as the ratios they represent may not always hold particular biological significance.
Conservation analysis
It is common in models of biochemical reaction networks for the total concentration of certain subsets of the species to remain constant at all times independent of the model’s specific parameterisation [50]. Such subsets are commonly referred to as conserved moieties. Replacement of statevariables via the algebraic exploitation of conservation relations is a common first step in the analysis of biochemical reaction networks. Eliminating one of the conserved statevariables for each of the conservation relations can be used to yield a simplified realisation of the system.
In principle, all such conservation relations for a given biochemical reaction network can be determined by finding the linear dependencies in the associated stoichiometry matrix of the system. Mathematically this relies upon computing the left null space of the stoichiometry matrix for which a number of wellestablished methods exist. A review of a range of such techniques, including Gaussian elimination and singular value decomposition, can be found in Sauro and Ingalls [51]. For smaller scale systems such methods will usually find all available conservation relations, but for higher dimensional systems difficulties may occur. In particular, it is often necessary to select more stable computational methods to avoid missing conservation relations or finding false ones due to numerical error. A particularly stable method based upon the construction of a QR decomposition via Householder reflections has been developed by Vallbhajosyula et al. [52]. We therefore employ a form of this approach here, a more detailed mathematical treatment of which can be found in Section 1.2 of the Additional file 1.
This form of analysis leads to a simplified realisation of the system under which it is possible to obtain nonsingular Jacobians for given states of the model. As such it can be seen as a first step in model reduction schemes involving numerical methods.
Proper lumping
Finding an optimal lumping
There exist a number approaches for selecting an appropriate lumping matrix L for producing a system of given reduced dimensionality \(\tilde {n}\). Here we choose to employ the scheme described by Dokoumetzidis and Aarons [37]. This algorithm runs an exhaustive search of possible lumping matrices to determine which produces the lowest error between simulation of the original model and the reduced model. To speed up this process, it is assumed (from justifications given in the original paper) that the lowest error k dimensional reduction obtained via lumping of an n dimensional system can also be found as the optimal lumping of two states in the k+1 dimensional reduction. This yields a ‘forward selection’ strategy, where 2 of the statevariables are lumped at each step, which greatly decreases the combinatorial burden of possible lumping matrices that must be evaluated.
Lumping and stiffness
A high stiffness coefficient implies that parts of the system are evolving at significantly greater rates than others which can lead to traditional methods of simulation being numerically unstable.
The lumping algorithm of Dokoumetzidis and Aarons [37] will tend to sum together those statevariables that interact on faster timescales than their neighbours, and hence rapidly reach a form of proportional equilibrium. As a result the reduced model will tend to contain a lower range of timescales and a lower stiffness coefficient with every additional dimension eliminated. Furthermore, proper lumping, which creates reduced statevariables as straightforward sums of the originals, retains a degree of a biological interpretability that may not hold for alternative, coordinate transforming methods of model reduction.
Empirical balanced truncation
Our approach to empirical balanced truncation is based upon the procedure developed by Hahn and Edgar [54]. The aim is to construct two covariance matrices, known as the empirical controllability and observability Gramians, via repeated simulations of the system under perturbations of the input and the initial conditions respectively. The controllability Gramian provides information on how changes in the input will alter the state of the system. The observability Gramian provides information on the magnitude of the output any given initial condition of the system can produce. As these Gramians are positive semidefinite matrices, one good way to interpret them is as ellipsoids in the phasespace of the model [55]. The ellipsoid of the controllability Gramian represents the set of states that can be reached for a given magnitude of input, the ellipsoid of the observability Gramian describes the set of initial states that can produce an output of a given magnitude. A detailed mathematical account of the computation of empirical Gramians can be found in Section 1.3 of the Additional file 1.
We have modified the empirical balanced truncation procedure of Hahn and Edgar to deal with a system where statevariables have been eliminated via conservation analysis. In particular the conserved totals should be treated as functions of the initial conditions of the system and be altered in accordance with the perturbations used to create the observability Gramian. This allows us to perturb the initial conditions of the system without risking violation of conservation.
Once the Gramians have been computed the aim is to construct a balancing transformation of the state variables which also acts to equalise and diagonalise the Gramians. As the Gramians are now diagonal, each of the associated statevariables is therefore orthogonal in terms of their contribution to the inputoutput relationship. Hence, under this transformation, the statevariables which contribute least can be truncated without influencing the remaining terms.
This reduced system will not only feature fewer statevariables, but will also typically be faster to simulate and contain fewer parameters.
Stiffness and empirical balanced truncation
The construction of empirical Gramians via repeated simulation of the system under perturbations of input and initial conditions can be sensitive to numerical error. Where balanced truncation is applied using Gramians with a high degree of associated error, blowup problems can occur for simulations of the reduced system. This accumulation of numerical error is particularly likely to occur for systems with a highstiffness coefficient (χ≫1).
Given the stiffness reducing property of lumping, however, and its retention of some biological meaning it can potentially be treated as a preconditioning step  enabling the application of more numerically sensitive methods (such as empirical balanced truncation) to previously lumped systems.
The combined model reduction algorithm
The algorithm is designed to be automatically applicable to models given in Systems Biology Markup Language (SBML) form — a standardised format for the representation, storage and easy communication of Systems Biology models. Many such models of this form can commonly be obtained from online repositories. Publicly open databases, such as the previously mentioned BioModels Database, contain thousands of such models enabling researchers to share their work in a more accessible way.
As preliminary steps, nondimensionalisation and conservation analysis are applied to the model. Nondimensionalisation is applied to improve numerical accuracy by reducing the range of parameters accounted for in the model. Conservation analysis is then applied in order to obtain a simplified realisation of the system and remove any associated singularities from the system’s associated Jacobian.
At the core of the algorithm, however, are the methods of proper lumping and empirical balanced truncation. Theoretically, empirical balanced truncation should produce lower output error reductions across a range of inputs than lumping. However, due to the need to construct accurate covariance matrices of data from repeated numerical simulations, empirical balanced truncation can fail when applied to highly stiff systems. Conversely, lumping strips the model of some of its stiffness for each reduced dimension. In the combined algorithm, therefore, the complimentary aspects of proper lumping and empirical balanced truncation are exploited. Lumping is used to reduce the system until the stiffness coefficient \(\chi _{\tilde {n}}\) of the reduced model is within a satisfactory range \(\chi _{\tilde {n}}<\chi _{c}\), for χ _{ c } some prechosen critical stiffness value (from numerical experimentation with example models we set this to be 250 in the automated algorithm). Empirical balanced truncation is then employed to obtain further model reduction whilst maintaining a good error bound between the outputs of the original and reduced models.
The algorithm as presented will proceed until the reduced system exceeds the maximum tolerated error, here defined to be 5%. The algorithm then returns the lowest dimensional reduced system that meets this constraint. It is not possible to know a priori what degree of reduction will be attainable by the algorithm, and the actual reduction achieved is both model structure and parameterisation dependent. Application of the combined reduction algorithm does require that the model it is applied to is fully parameterised prior to reduction. As such, the reduced model is only guaranteed to be accurate locally to a point in parameter space. For small perturbations to the initial parameterisation it is likely that the reduced model will remain accurate, however for larger deviations it may not be suitable for describing the overall inputoutput behaviour of the system.
Combined algorithm implementation
 1.
Import the model of interest with a set of predefined statevariables (\(\boldsymbol {x}(t)\in \mathbb {R}^{n}\)), reactions, and rate parameters into the algorithm.
 2.
Request the user to define the set of inputs (u) and the set of outputs (y=h(x(t))) in which they are interested.
 3.
Randomly select a sample of 50 possible nondimensionalisations under the original parameterisation of the model.
 4.
Calculate the range of the new parameter set for each nondimensionalisation.
 5.
Select and apply the nondimensionalisation sampled with the smallest range of orders of magnitude of parameters to help avoid roundingoff and truncation errors.
 6.
Compute the conservation relationships using a standard QR factorization method via Householder reflections (outlined, for example, in [51]).
 7.
Prioritise which statevariables to replace in the dynamical system via exploitation of the conservation relations.
 8.
Compute the linear, proper lumping matrix for all lumpable pairs of statevariables remaining in the system. Expressed alternatively, compute the set of all possible proper lumping matrices that will reduce the number of statevariables by one.
 9.
Compute the lumping inverse.
 10.
Simulate each lumped system and compute the associated maximal relative output error.
 11.
Select the lumping matrix L and associated inverse that produces the lowest error and apply this reduction.
 12.
Calculate the stiffness coefficient \(\chi _{\tilde {n}}\) for this reduced system.
 13.
Return to step (8). Exit the loop of steps (8) (13) when the lumping either violates the maximum tolerated error (typically set to 5%) or reduces the stiffness coefficient to within an acceptable range \(\chi _{\tilde {n}}<\chi _{c}\) or when the reduced dimensionality \(\tilde {n}=1\).
 14.
Compute the empirical Gramians for the lumped system.
 15.
Compute the balancing transformation for the given Gramians.
 16.
Apply the transformation to the model via the Galerkin projection and truncate statevariables until the maximum acceptable error is reached.
 17.
Produce the symbolic form of the reduced model and terminate.
The main algorithm was implemented in a commercial software package (Matlab 2013b, MathWorks Inc., Natick, MA). To enable the importing and manipulation of models stored in SBML, additional use was made of two opensource toolboxes  libSBML [57] and SBtoolbox2 [58]. Note that a more detailed account of the steps and implementation of the combined algorithm can be found in Section 3 of the Additional file 1.
Algorithm enhancements
The algorithm additionally features two specific enhancements to the combined methodologies that will be outlined in the following section. The first of these addresses the question of how to select which statevariables should be explicitly eliminated from the system of ODEs via the exploitation of conservation relations. The second addresses the question of which of the possible generalised inverses should be used to reduce the system via lumping with the Galerkin projection.
Selection of statevariables for elimination via conservation analysis:
Typically, the most accurate possible proper lumping of two statevariables in a model will occur between those two that most rapidly equilibrate with respect to the remainder of the system. In short, if two variables quickly reach a point where their proportional concentration with respect to each other is approximately constant, they can be lumped accurately. If a statevariable from this ‘best’ lumped pair has been replaced by an algebraic equation from the conservation relations, then it will necessarily be unavailable for lumping in the combined algorithm. Another concern is that the choice of statevariables eliminated will have an effect upon the reduction of stiffness attainable via the application of lumping; if the fastest interacting statevariables are not represented explicitly the stiffness will require many more steps of lumping in order to be reduced.
To avoid this issue the combined algorithm initially ‘speed’ ranks the statevariables in the model such that the slowest variables can then be selected for elimination. This speed ranking is achieved via numerical calculation of the Jacobian evaluated at the steadystate \(\boldsymbol {x}=\boldsymbol {x}^{\ast }_{\boldsymbol {u}_{0}}\) of the system attained under an unperturbed input u=u _{0}. The absolute value of the diagonal entries of the Jacobian \(\left. J\right _{\boldsymbol {x}=\boldsymbol {x}^{\ast }_{\boldsymbol {u}_{0}}}\) are then used as an approximate metric of the initial timescale or speed of each of the statevariables. This is equivalent to the outgoing rate of concentration from each statevariable at the unperturbed steadystate of the network. Those statevariables corresponding to the largest values are deemed fast, whilst those corresponding to the smallest values are deemed slow and prioritised for elimination via application of conservation analysis.
Alternative lumping inverses:
In order to apply a lumping L under the Galerkin projection, some generalised rightinverse \(\bar {L}\) of L such that \(L\bar {L}=I_{\tilde {n}}\) is required. Note that, as \(\bar {L}\) could potentially be any generalised rightinverse of L, and there exists an infinite number of ways to construct such a matrix. However, the particular choice made will have an effect on the maximal relative output error incurred by the reduced model as defined by Eq. (4). Some of the alternatives are evaluated here.
As, in the case of a controlled biochemical network, the steadystate of the system is necessarily dependent on the value of the inputs u(t). We here seek to maintain the generality of the reduced model by employing the steadystate attained under an unperturbed input u=u _{0}, such that \(\boldsymbol {x}=\boldsymbol {x}^{\ast }_{\boldsymbol {u}_{0}}\).
Typically the approach of reconstructing the steadystate values will lead to a better approximation of the original dynamics. In the case where a lumping sums a group of species all with a steadystate value of zero, however, this will lead to a singularity and thus the MoorePenrose approach is to be preferred.
In the case where the time T→∞ and either or both statevariables have a nonzero steadystate this reduces to the steadystate reconstructing lumping inverse defined by Eq. (12). For the zero steadystate situation, however, it avoids the issue of numerical singularities.
Indices of controllability and observability
One of the benefits of the application of empirical balanced truncation is its potential use in obtaining metrics of observability and controllability for the individual statevariables of nonlinear systems. Given the typically nonlinear nature of cell signalling models, this potentially allows a framework for accessing indices of the controllability of each statevariable via receptor activation or suspension, the observability of each statevariable in influencing the output of interest, and the contribution of each statevariable to the overall inputoutput relationship of the model.
The interpretation of standard Gramians as ellipsoids describing the controllability and observability of the directions in phasespace [55] can also be extended to the interpretation of empirical Gramians. Hence it is possible to employ these matrices to obtain indices of controllability and observability for individual statevariables in nonlinear systems. Given this, we define the following indices.
Observability index:
Controllability index:
Inputoutput importance index:
Results and discussion
In this section two examples are employed to demonstrate the application of the combined model reduction algorithm, the enhancements made to the base methods, and the calculation of the indices of controllability and observability. The first of these systems is a relatively simple (11 dimensional) model of bacterial chemotaxis in Escherichia coli  the modest scope of this model allows the application of our methods to be more easily intuited. The second is a significantly more substantial model (99 dimensional) describing the mediation of ERK activation via both the EGF and NGF receptor pathways. Through application to a model of this size, our methods demonstrate their potential usefulness in analysing models that might be inscrutable under traditional approaches.
A model of bacterial chemotaxis
We have applied our combined model reduction methodology to a model of chemotactic signalling in E. coli as detailed in Tindall et al. [47] and summarised in Section 2 of the Additional file 1. This is a modest 11 dimensional model detailing a system of 12 biochemical reactions. This serves as a reasonable starting example as the model is intuitively tractable, but also of a sufficient size and complexity to be meaningfully reduced by the combined algorithm.
E. coli is a common, rod shaped, gramnegative bacterium often used as a model organism in biological studies due to both the large body of existing literature characterising its behaviour as well as the relative ease and inexpensiveness in its growth and experimental manipulation. There are many strains of E. coli present in nature, but the model discussed here pertains specifically to those strains that exhibit a chemotactic response. Chemotaxis is the process by which a cell senses an environmental chemical gradient and biases its movement towards those regions most suitable for growth and reproduction. In the model presented here, this process involves the transmembrane receptors on the surface of the bacterium sensing the local concentrations of an attractant or repellent; a decrease in attractant or an increase in repellent will cause the receptors to activate a signalling pathway inside the cell resulting in an increase of the intracellular concentration of the phosphorylated chemotactic CheY protein, referred to here as CheY_{ P }. This concentration, in turn, modulates the flagellum’s movement, resulting in a change of direction for the cell either towards attractants or away from repellents.
The model has a stiffness coefficient at the initial condition of the system of 958.3, which is relatively high. Values for these initial conditions can be found in Section 2 of the Additional file 1.
Reduction
We sought to compare the reduction of this example via lumping and empirical balanced truncation alone, and our combined algorithm.
In respect of the combined algorithm, the process of reduction began by nondimensionalising the variables of the system  specifically seeking to rescale the initial conditions and coefficients in the model such that they spanned the fewest orders of magnitude with the aim of avoiding possible issues of numerical truncation. The conservation relations for the system were then calculated; as 4 conservation relations were found, this lead to a system of 7 ODEs that exactly replicated the original system’s dynamics.
The algorithm then lumped the statevariables until the stiffness coefficient was less than 250. In our example this occurred at 5 statevariables. Finally, empirical balanced truncation was applied to the 5 dimensional reduced model. In this case, the empirical Gramians were constructed using data from 100 distinct simulations covering perturbations to both the models input parameter u(t) and the initial conditions x _{0}. In both cases, perturbations were uniformly sampled from 0.2 to 1.8 times each parameter’s original, unperturbed value. Using the balancing transformation and straightforward truncation the associated error for each possible dimensionality of reduction (between four and one) was calculated. A significantly more detailed account of reducing the bacterial chemotaxis model can be found in Section 2.1 of the Additional file 1.
Error results for the nonlinear methods of model reduction applied to the E. coli chemotaxis model
Number of  Empirical balanced  Lumping  Combined method 

dimensions  truncation  
6  1.396%  0.15%   
5  1.655%  0.51%   
4  #  0.54%  0.51% 
3  #  4.77%  2.00% 
2  #  12.88%  2.84% 
1  #  75.56%  21.86% 
Selection of species eliminated via conservation analysis
As was previously discussed, the selection of species eliminated via conservation analysis can have a substantial effect on model reduction error. The combined algorithm hence applies a speedranking step designed to estimate which of the biochemical reactants are likely to be most readily lumped.
Comparison of reduction error and stiffness coefficients at each level of dimensional reduction for the E coli signalling model under different approaches to conservation analysis using the combined algorithm
Dimensions  Naive elimination  Selective elimination  

Lumping error (ε)  Stiffness  Lumping error (ε)  Stiffness  
6  1.37%  876.73  0.15%  794.91 
5  2.45%  904.80  0.51%  56.22 
4  4.8%  907.59  0.54%  60.16 
3  74.18%  542.24  4.77%  58.95 
2  339.99%  99.32  12.88%  25.95 
1  424.33%  1  75.56%  1 
Analysis of alternative lumping inverses
Comparison of maximal relative error under differing lumping inverses for the reduction of the E coli model via lumping in isolation
Dimensions  MoorePenrose  Steadystate  Average 

6  6.67%  0.15%  0.18% 
5  10.58%  0.55%  0.51% 
4  13.99%  0.54%  1.26% 
3  26.93%  4.78%  4.77% 
2  78.36%  13.45%  12.88% 
1  70.92%  75.56%  82.34% 
These results show that, whilst the MoorePenrose inverse performs worse than the others, both the steadystate and averaged inverses can produce very good results. Given that the optimal approach can vary between lumping steps, the combined algorithm has been designed in such a way as to trial both the average and steadystate inverse at each step before selecting the most accurate. The overall inverse is then returned as the composition of the sequentially best inverses for each dimensionality of lumping.
Indices of controllability and observability
In the 5 dimensional lumped model created via the algorithm, only one lumped statevariable was created. This variable represents the sum of species Y_{ P }, A·Y_{ P }, and A_{ P }·Y_{ P } which can be thought of as representing the phosphotransfer chain between species i.e. A and Y.
Controllability and observability index values for the model of chemotactic signalling in E. Coli
Species  Controllability  Observability  Inputoutput 

CheA·CheY  0.865  0.881  0.762 
CheA_{ P }  0.846  0.497  0.421 
CheY_{ P }+CheA·CheY_{ P }+CheA_{ P }·CheY  1  1  1 
CheY_{ P }·CheZ  0.343  0.703  0.241 
CheB_{ P }  0.004  0.006  2×10^{−5} 
A model of extracellular regulatory kinase activation
The second example provided here is a model of ERK phosphorylation mediated via the EGF and NGF receptor pathways that was originally detailed in Sasagawa et al. [48]. This biological system commonly arises in the study of cancer and pain, and remains an area of ongoing clinical significance. The SBML representation employed here of this model including the parameterisation and initial conditions is available at www.ebi.ac.uk/biomodelsmain/BIOMD0000000049.
This is a relatively large biochemical model describing 150 reactions and 99 species. The model additionally integrates two receptor pathways (EGFR and NGFR) allowing exploration as to how they interact. Due to its size and clinical relevance, this model therefore represents a prime candidate for the application of model reduction techniques. Although fully parameterised systems of this scope remain somewhat uncommon, their occurrence in the literature is increasing; primarily a result of increases in data and knowledge of cellular systems at finer spatial scales. However, even with such data available, approximations may still be required to model parameters particularly in cases where the model acts as a representation of more complex underlying biochemical mechanisms. We have thus employed this system as an example to demonstrate that our methodology remains valid for systems of varying complexity and size, assuming all parameter values are known.
Additionally, the system employs initial conditions such that it begins at a nonzero steadystate with no input into the model (i.e. at the natural rate of EGF binding). Note that we chose to look at the EGF receptor  ERK activation pathway in particular as its dynamical behaviour exhibits an adaptive response [48]. The retention of this nonlinear behaviour in a reduced model serves as a good demonstration of the combined algorithm’s strengths and particular applicability in the field.
Error results comparing the application of empirical balanced truncation, lumping, and the combined method of model reduction to the ERK activation model
Dimension  EBT error  Lumping error  Stiffness  Combined error 

75  0.76%  ≈0% ^{∗}  42658  − 
50  #  0.01%  42633  − 
25  #  0.52%  10664  − 
15  #  1.26%  7934  − 
14  #  2.21%  7934  − 
13  #  2.29%  7934  − 
12  #  1.21%  1591  − 
11  #  3.07%  236  − 
10  #  6.02%  264  2.84% 
9  #  10.96%  211  4.02% 
8  #  13.12%  43  4.32% 
7  #  14.18%  42  4.77% 
6  #  29.53%  44  13.08% 
5  #  39.03%  45  20.81% 
4  #  46.47%  212  31.09% 
3  #  54.67%  42  34.58% 
2  #  53.52%  18  41.10% 
1  #  55.73%  1  50.46% 
The original model had an extremely high stiffness coefficient at the initial condition of the system of 42658. Via lumping this could be reduced to 235.7 for the 11 dimensional lumped reduction enabling application of empirical balanced truncation. In this case, the empirical Gramians were constructed using data from 50 distinct simulations covering perturbations to both the models input parameter u(t) and the initial conditions x _{0}. In both cases, perturbations were uniformly sampled from 0.4 to 1.6 times each parameter’s original, unperturbed value.
The results shown in Table 5 highlight the extent of the reductions that can be obtained via the combined method with very little associated error. In particular the combined method has again demonstrated that it can produce better reductions than either method in isolation. The 7 dimensional reduced model had only a 4.77% associated maximal relative error as compared to the original model when simulated from steadystate under a 50% inhibition of EGF receptor binding. This is equivalent to an approximately 68% improvement in error over the 7 dimensional reduction achieved by lumping in isolation. Repeated simulation revealed that the original model had an average simulation time of 1.357 seconds, whilst the reduced 7 dimensional model required only 0.144 seconds.
Note that Matlab files regarding the reduction of the ERK activation model can be found online [62].
Indices of controllability and observability
Controllability and observability index values for the model of ERK activation controlled via the EGFR pathway
Lumped  Controllability  Observability  Inputoutput 

statevariable  index  index  index 
x _{1}(t)  0.3022  0.0176  0.0053 
x _{2}(t)  0.0201  0.0153  0.0003 
x _{3}(t)  0.0189  0.0009  1.681×10^{−5} 
x _{4}(t)  0.8547  0.2752  0.2352 
x _{5}(t)  0.6194  0.0079  0.0049 
x _{6}(t)  1  1  1 
x _{7}(t)  0.1412  0.1043  0.0147 
x _{8}(t)  0.1378  0.0018  0.0003 
x _{9}(t)  0.1539  0.0164  0.0025 
x _{10}(t)  0.0473  0.0092  0.0004 
x _{11}(t)  0.3312  0.0442  0.0146 
Despite this, however, it is still possible to obtain some insight from the indices calculated. The most controllable and observable lumped statevariable, x _{6}(t), for example, primarily contains the concentration of singly phosphorylated mitogen/extracellular signalregulated kinase (MEK), in isolation and in its various complex forms. This concurs with the known biology as phosphorylated MEK represents the point in the pathway that all possible routes of activation are directed towards before the phosphorylation of ERK occurs. The second most important statevariable, x _{4}(t), is somewhat more difficult to parse, but is most distinguishable from the other statevariables by its inclusion of the SOS protein and associated complexes and species along this pathway. This, again, concurs with the known biology as SOS represents the most responsive branch of the pathway to EGFR binding that is described by the model. Similarly, the most unimportant statevariable, x _{3}(t), can be seen as disproportionately representing the FRS2C3G branch of the pathway, which is significantly more responsive to binding of the NGF receptor than that of EGF.
Overall the indices do seem to automatically provide a degree of model intuition that is often not possible for such large systems. As such they represent one possible solution to a core issue in the study of complex models. Due to the nature of the lumping algorithm, however, they are somewhat limited in their usefulness. An alternative approach, which we would hope to explore in future work, would be to constrain the lumping algorithm so as to obtain more readily interpreted lumped variables prior to the calculation of indices.
Conclusions
In this paper a combined model reduction algorithm incorporating conservation analysis, proper lumping and empirical balanced truncation for the reduction of high dimensional systems of nonlinear ODEs has been presented. This algorithm was designed for specific application to controlled models of biochemical systems. Such models typically have a number of associated properties, including nonlinearity, stiffness and high dimensionality, which any model reduction approach must address. Under the combined algorithm conservation analysis is first used to obtain a simplified realisation of the system, proper lumping is then used to reduce the model whilst also reducing the stiffness coefficient and finally empirical balanced truncation is employed to achieve a more accurate and lower dimensional reduction than could be obtained by further lumping. The algorithm has additionally been designed to be highly automatable; it is implemented so as to take models stored in the SBML format and return highly accurate reduced systems without significant user input.
Whilst the algorithm has broad applicability, crucially, it focuses on the reduction of controlled biochemical systems that are amenable to an inputoutput formulation. In reducing a model the algorithm seeks primarily to maintain the inputoutput response profile, such that the reduced model can accurately predict the output of the original model for a wide range of possible inputs. Such an approach may be particularly applicable to the newly emerging field of Quantitative Systems Pharmacology, which aims to mechanistically describe the effects of drug administration across multiple scales of action. Here the preservation of doseresponse behaviour is of primary importance and tallies well with the algorithm’s emphasis on maintaining a model’s inputoutput relationship.
One of the difficulties with reduced models obtained under this approach is that the meaning of the dynamical equations is necessarily obfuscated by the coordinate transformations that have been applied. The combined algorithm starts with a fully mechanistic ‘white box’ model of a signalling pathway, whereby all of the biological reactions and reactants have a onetoone correspondence with terms in the system of ODEs. The algorithm then produces a reduced ‘greybox’ model where this biological interpretation of the governing equations is now hidden by the transformations applied. Computationally, however, the reduced model remains an accurate approximation to the original. The reduced statevariables can still be mapped back to the original biological species, thus allowing mechanistic insight to be gained as opposed to solely empirical observation. Hence, whilst the reduced system may no longer serve as a directly interpretable description of the original biological system, it still retains a high degree of utility for the analysis of the model’s overall inputoutput relationship and its mechanistic underpinnings.
The algorithm in the form presented here assumes that the model in question is initially fully parameterised. It is however possible to extend the algorithm such that the reduced models constructed are guaranteed to remain valid across a range of parameterisations by additionally evaluating perturbations to the parameter values whilst constructing the lumping matrices and the empirical Gramians. Hence, at each step the possible reduced models are tested not only for perturbations to the original input and the initial conditions, but also for perturbations to the parameterisation of the system. To maintain validity across a defined range of values in parameter space, sampling approaches such as latin hypercube sampling would likely work well, but such ideas are not explored in this initial study.
The algorithm was demonstrated via application to two systems; an 11 dimensional model of bacterial chemotactic signalling in E. coli and a 99 dimensional model of ERK activation mediated via the EGF and NGF receptor pathways. The results for both models highlight the extent of reduction that can be obtained with very little associated error and therefore the potential usefulness of these model reduction methods in the analysis of Systems Biology models.
From the results it can be seen that both models began with a high stiffness coefficient. As a result, the application of empirical balanced truncation to the unreduced systems yielded higher error than lumping alone and numerically failed after truncating only a small proportion of the statevariables. Crucially, the proper lumping scheme was able to act as a good preconditioner and efficiently stripped stiffness from the systems, hence closing eigengaps in the associated Jacobians. Via subsequent application of empirical balanced truncation the combined algorithm was able to produce significantly better reductions than could be obtained by any of the individual methods in isolation. Furthermore, these reduced systems exhibited a significant speedup in simulation time. In the case of the E. coli model we observed a 96.1% speedup in simulation time whilst only incurring a 2.8% error. The ERK activation model demonstrated a 89.4% speedup in simulation time in exchange for a 4.77% maximal relative error. For applications such as parameter optimisation or agent based modelling approaches, where an extremely large number of simulations may be required, speedup is a substantial benefit. This speedup can be attributed to two main factors  the reduction in the number of state variables present in each system and the reduction in their overall stiffness coefficients.
There are still a number of open questions with regards to this work, in particular to what extent does a model reduction (specifically, under a coordinate transformation of the state variables) remain accurate if the parameters in the original system are altered? Additionally, are there reduced models that remain valid over a larger range of parameterisations than others? If now, instead of considering the rate parameters to represent a set of fixed values, we consider the parameters to represent a distribution of physically acceptable values how does this affect our selection of the best reduction? Understanding these questions is a necessary next step in further growing the practical applicability of model reduction for biochemical systems.
The paper also introduced the concept of empirical indices of controllability and observability for biochemical systems; as was demonstrated these indices can be used to provide substantial insight into models of even a very large size. This usefulness, however, is dependent upon the biological interpretation of the specific lumping used to precondition the system. Developing a lumping algorithm specifically for the calculation of these indices represents a reasonable extension of this work.
The combined model reduction algorithm has demonstrated good results for the reduction of controlled, nonlinear, stiff, highdimensional models of biochemical reaction networks. Reduced systems produced under this approach maintain a high degree of inputoutput accuracy and see a significant speedup in simulation time. These results justify research into the combining of complementary reduction methodologies and highlight the use of empirical balanced truncation, which had not previously seen application in the reduction of biochemical systems.
Abbreviations
 EGF:

Epidermal growth factor
 ERK:

Extracellular regulatory kinase
 MEK:

Mitogen/extracellular signalregulated kinase
 NGF:

Nerve growth factor
 ODE:

Ordinary differential equations
 SBML:

Systems Biology Markup Language
 SVD:

Singular value decomposition
Declarations
Acknowledgements
None.
Funding
TS was funded for this research through an Engineering and Physical Sciences Research Council (EPSRC) Collaborative Awards in Science and Engineering (CASE) studentship in conjunction with Pfizer Global Research and Development. MT was in receipt of a Research Councils UK (RCUK) Fellowship.
Availability of data and materials
The majority of the datasets supporting the conclusions of this article are included within the article and its Additional file 1. A number of Matlab files that can be used to reproduce the reduced forms of the ERK activation model, however, are made available online [62]. Use of these files requires Matlab’s Symbolic Math toolbox.
Authors’ contributions
PHvdG initially proposed the study of model reduction for Systems Biology models. MT was the primary supervisor of the study with input from PHvdG. TS designed and implemented the combined algorithm, and performed the calculations for both case studies. TS drafted the manuscript with input from MT. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, Li L, He E, Henry A, Stefan MI, Snoep JL, Hucka M, Le Novere N, Laibe C. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Syst Biol. 2010; 4:92.PubMedPubMed CentralGoogle Scholar
 Okino M, Mavrovouniotis M. Simplification of mathematical models of chemical reaction systems. Chem Rev. 1998; 98(2):391–408.PubMedGoogle Scholar
 Petrov V, Nikolova E, Wolkenhauer O. Reduction of nonlinear dynamic systems with an application to signal transduction pathways. Syst Biol IET. 2007; 1(1):2–9.Google Scholar
 Schneider KR, Wilhelm T. Model reduction by extended quasisteadystate approximation. J Math Biol. 2000; 40(5):443–50.PubMedGoogle Scholar
 Vejchodskỳ T, Erban R, Maini PK. Reduction of chemical systems by delayed quasisteady state assumptions. 2014. arXiv preprint arXiv:1406.4424.
 Vejchodskỳ T. Accurate reduction of a model of circadian rhythms by delayed quasi steady state assumptions. 2013. arXiv preprint arXiv:1312.2825.
 Choi J, Yang KW, Lee TY, Lee SY. New timescale criteria for model simplification of bioreaction systems. BMC bioinformatics. 2008; 9(1):338.PubMedPubMed CentralGoogle Scholar
 West S, Bridge LJ, White MR, Paszek P, Biktashev VN. A method of speed coefficients for biochemical model reduction applied to the NF κB system. J Math Biol. 2015; 70(3):591–620. arXiv preprint arXiv:1403.1610.PubMedGoogle Scholar
 Härdin HM, Zagaris A, Krab K, Westerhoff HV. Simplified yet highly accurate enzyme kinetics for cases of low substrate concentrations. FEBS J. 2009; 276(19):5491–506.PubMedGoogle Scholar
 Vora N, Daoutidis P. Nonlinear model reduction of chemical reaction systems. AIChE J. 2001; 47(10):2320–32.Google Scholar
 Gerdtzen ZP, Daoutidis P, Hu WS. Nonlinear model reduction for energy metabolism in Saccharomyces cerevisiae. In: American Control Conference, 2002. Proceedings of the 2002: 2002. p. 2867–72. IEEE.
 Gerdtzen ZP, Daoutidis P, Hu WS. Nonlinear reduction for kinetic models of metabolic reaction networks. Metab Eng. 2004; 6(2):140–54.PubMedGoogle Scholar
 Prescott TP, Papachristodoulou A. Layering in networks: The case of biochemical systems. In: American Control Conference (ACC), 2013: 2013. p. 4544–49. IEEE.
 Prescott TP, Papachristodoulou A. Layered decomposition for the model order reduction of timescale separated biochemical reaction networks. J Theor Biol. 2014; 356:113–22.PubMedGoogle Scholar
 Sivakumar H, Hespanha JP. Towards modularity in biological networks while avoiding retroactivity. In: American Control Conference (ACC), 2013: 2013. p. 4550–556. IEEE.
 Maas U, Pope SB. Simplifying chemical kinetics: intrinsic lowdimensional manifolds in composition space. Combustion Flame. 1992; 88(3):239–64.Google Scholar
 Vallabhajosyula RR, Sauro HM. Complexity reduction of biochemical networks. In: Simulation Conference, 2006. WSC 06. Proceedings of the Winter: 2006. p. 1690–1697. IEEE.
 Zobeley J, Lebiedz D, Kammerer J, Ishmurzin A, Kummer U. A new timedependent complexity reduction method for biochemical systems. In: Transactions on Computational Systems Biology I. Berlin, Heidelberg: Springer: 2005. p. 90–110.Google Scholar
 Surovtsova I, Zobeley J. Focusing on dynamic dimension reduction for biochemical reaction systems. Understanding Exploiting Syst Biol Biomed Bioprocesses. 2006; 31:31–46.Google Scholar
 Surovtsova I, Simus N, Lorenz T, König A, Sahle S, Kummer U. Accessible methods for the dynamic timescale decomposition of biochemical systems. Bioinformatics. 2009; 25(21):2816–23.PubMedGoogle Scholar
 Kourdis PD, Goussis DA, Steuer R. Physical understanding via reduction of complex multiscale models: glycolysis in Saccharomyces cerevisiae. In: BioInformatics and BioEngineering, 2008. BIBE 2008. 8th IEEE International Conference On: 2008. p. 1–6. IEEE.
 Kourdis PD, Steuer R, Goussis DA. Physical understanding of complex multiscale biochemical models via algorithmic simplification: Glycolysis in Saccharomyces cerevisiae. Physica D: Nonlinear Phenomena. 2010; 239(18):1798–817.Google Scholar
 Kourdis PD, Palasantza AG, Goussis DA. Algorithmic asymptotic analysis of the NF κB signaling system. Comput Math Appl. 2013; 65(10):1516–34.Google Scholar
 Surovtsova I, Simus N, Hübner K, Sahle S, Kummer U. Simplification of biochemical models: a general approach based on the analysis of the impact of individual species and reactions on the systems dynamics. BMC Syst Biol. 2012; 6(1):14.PubMedPubMed CentralGoogle Scholar
 Degenring D, Froemel C, Dikta G, Takors R. Sensitivity analysis for the reduction of complex metabolism models. J Process Control. 2004; 14(7):729–45.Google Scholar
 Liu G, Swihart MT, Neelamegham S. Sensitivity, principal component and flux analysis applied to signal transduction: the case of epidermal growth factor mediated signaling. Bioinformatics. 2005; 21(7):1194–202.PubMedGoogle Scholar
 Smets I, Bernaerts K, Sun J, Marchal K, Vanderleyden J, Van Impe J. Sensitivity functionbased model reduction: a bacterial gene expression case study. Biotech Bioeng. 2002; 80(2):195–200.Google Scholar
 Apri M, de Gee M, Molenaar J. Complexity reduction preserving dynamical behavior of biochemical networks. J Theor Biol. 2012; 304:16–26.PubMedGoogle Scholar
 Maurya M, Bornheimer S, Venkatasubramanian V, Subramaniam S. Reducedorder modelling of biochemical networks: application to the GTPasecycle signalling module. IEE Proc Syst Biol. 2005; 152(4):229–42.Google Scholar
 Maurya MR, Scott JB, Venkatasubramanian V, Subramaniam S. Modelreduction by simultaneous determination of network topology and parameters: Application to modules in biochemical networks. In: 2005 Annual Meeting AIChE: 2005.
 Maurya M, Bornheimer S, Venkatasubramanian V, Subramaniam S. Mixedinteger nonlinear optimisation approach to coarsegraining biochemical networks. IET Syst Biol. 2009; 3(1):24–39.PubMedPubMed CentralGoogle Scholar
 Hangos KM, Gábor A, Szederkényi G. Model reduction in biochemical reaction networks with michaelismenten kinetics. In: European Control Conference (ECC), July 1719 2013, Zurich: 2013. p. 4478–483.
 Taylor SR, Petzold LR, et al. Oscillator model reduction preserving the phase response: application to the circadian clock. Biophys J. 2008; 95(4):1658–73.PubMedPubMed CentralGoogle Scholar
 Anderson J, Chang YC, Papachristodoulou A. Model decomposition and reduction tools for largescale networks in systems biology. Automatica. 2011; 47(6):1165–74.Google Scholar
 Prescott TP, Papachristodoulou A. Guaranteed error bounds for structured complexity reduction of biochemical networks. J Theor Biol. 2012; 304:172–82.PubMedGoogle Scholar
 Danø S, Madsen MF, Schmidt H, Cedersund G. Reduction of a biochemical model with preservation of its basic dynamic properties. FEBS J. 2006; 273(21):4862–77.PubMedGoogle Scholar
 Dokoumetzidis A, Aarons L. Proper lumping in systems biology models. IET Syst Biol. 2009; 3(1):40–51.PubMedGoogle Scholar
 Gulati A, Isbister G, Duffull S. Scale reduction of a systems coagulation model with an application to modeling pharmacokinetic–pharmacodynamic data. CPT: Pharmacometrics Syst Pharmacol. 2014; 3(1):90.Google Scholar
 Koschorreck M, Conzelmann H, Ebert S, Ederer M, Gilles ED. Reduced modeling of signal transduction–a modular approach. BMC Bioinformatics. 2007; 8(1):336.PubMedPubMed CentralGoogle Scholar
 Sunnåker M, Schmidt H, Jirstrand M, Cedersund G. Zooming of states and parameters using a lumping approach including backtranslation. BMC Syst Biol. 2010; 4(1):28.PubMedPubMed CentralGoogle Scholar
 Sunnåker M, Cedersund G, Jirstrand M. A method for zooming of nonlinear models of biochemical systems. BMC Syst Biol. 2011; 5(1):140.PubMedPubMed CentralGoogle Scholar
 Liebermeister W, Baur U, Klipp E. Biochemical network models simplified by balanced truncation. FEBS J. 2005; 272(16):4034–43.PubMedGoogle Scholar
 Härdin H, van Schuppen J. System reduction of nonlinear positive systems by linearization and truncation. Positive Syst. 2006;:431–38.
 Sootla A, Anderson J. On projectionbased model reduction of biochemical networks – Part I: The deterministic case. Los Angeles: IEEE 53rd Annual Conference on Decision and Control (CDC): 2014. arXiv preprint arXiv:1403.3579.
 van der Graaf PH, Benson N. Systems pharmacology: bridging systems biology and pharmacokineticspharmacodynamics (PKPD) in drug discovery and development. Pharm Res. 2011; 28(7):1460–4.PubMedGoogle Scholar
 Sorger PK, Allerheiligen SR, Abernethy DR, Altman RB, Brouwer KL, Califano A, D’Argenio DZ, Iyengar R, Jusko WJ, Lalonde R, et al. Quantitative and systems pharmacology in the postgenomic era: new approaches to discovering drugs and understanding therapeutic mechanisms. In: An NIH White Paper by the QSP Workshop GroupOctober: 2011.
 Tindall M, Porter S, Wadhams G, Maini P, Armitage J. Spatiotemporal modelling of CheY complexes in Escherichia coli chemotaxis. Progress Biophys Mol Biol. 2009; 100(1):40–6.Google Scholar
 Sasagawa S, Ozaki YI, Fujita K, Kuroda S. Prediction and validation of the distinct dynamics of transient and sustained ERK activation. Nat Cell Biol. 2005; 7(4):365–73.PubMedGoogle Scholar
 Murray JD. Mathematical Biology I: An Introduction. New York: Springer; 2002.Google Scholar
 Klipp E, Liebermeister W, Wierling C, Kowald A, Lehrach H, Herwig R. Systems Biology. Oxford: WileyBlackwell; 2013.Google Scholar
 Sauro HM, Ingalls B. Conservation analysis in biochemical networks: computational issues for software writers. Biophys Chem. 2004; 109(1):1–15.PubMedGoogle Scholar
 Vallabhajosyula RR, Chickarmane V, Sauro HM. Conservation analysis of large biochemical networks. Bioinformatics. 2006; 22(3):346–53.PubMedGoogle Scholar
 Antoulas AC. Approximation of Largescale Dynamical Systems. Advances in Design and Control. Philadelphia: Society for Industrial and Applied Mathematics; 2005.Google Scholar
 Hahn J, Edgar TF. An improved method for nonlinear model reduction using balancing of empirical Gramians. Comput Chem Eng. 2002; 26(10):1379–97.Google Scholar
 Dullerud GE, Paganini F, Vol. 6. A Course in Robust Control Theory. New York: Springer; 2000.Google Scholar
 Laub AJ, Heath MT, Paige CC, Ward RC. Computation of system balancing transformations and other applications of simultaneous diagonalization algorithms. Autom Control IEEE Trans. 1987; 32(2):115–22.Google Scholar
 Bornstein BJ, Keating SM, Jouraku A, Hucka M. LibSBML: an API library for SBML. Bioinformatics. 2008; 24(6):880–1.PubMedPubMed CentralGoogle Scholar
 Schmidt H, Jirstrand M. Systems biology toolbox for matlab: a computational platform for research in systems biology. Bioinformatics. 2006; 22(4):514–5.PubMedGoogle Scholar
 Wei J, Kuo JC. Lumping analysis in monomolecular reaction systems. analysis of the exactly lumpable system. Ind Eng Chem Fundam. 1969; 8(1):114–23.Google Scholar
 Kuo JC, Wei J. Lumping analysis in monomolecular reaction systems. analysis of approximately lumpable system. Ind Eng Chem Fundam. 1969; 8(1):124–33.Google Scholar
 Li G, Rabitz H. A general analysis of approximate lumping in chemical kinetics. Chem Eng Sci. 1990; 45(4):977–1002.Google Scholar
 Snowden TJ, van der Graaf PH, Tindall MJ. Reduction results for a mathematical model of ERK activation (Matlab Files). 2016. doi:10.5281/zenodo.192503 https://doi.org/10.5281/zenodo.192503. Accessed 12 Apr 2016.