 Methodology article
 Open Access
 Published:
Simplification of biochemical models: a general approach based on the analysis of the impact of individual species and reactions on the systems dynamics
BMC Systems Biology volume 6, Article number: 14 (2012)
Abstract
Background
Given the complex mechanisms underlying biochemical processes systems biology researchers tend to build ever increasing computational models. However, dealing with complex systems entails a variety of problems, e.g. difficult intuitive understanding, variety of time scales or nonidentifiable parameters. Therefore, methods are needed that, at least semiautomatically, help to elucidate how the complexity of a model can be reduced such that important behavior is maintained and the predictive capacity of the model is increased. The results should be easily accessible and interpretable. In the best case such methods may also provide insight into fundamental biochemical mechanisms.
Results
We have developed a strategy based on the Computational Singular Perturbation (CSP) method which can be used to perform a "biochemicallydriven" model reduction of even large and complex kinetic ODE systems. We provide an implementation of the original CSP algorithm in COPASI (a COmplex PAthway SImulator) and applied the strategy to two example models of different degree of complexity  a simple oneenzyme system and a fullscale model of yeast glycolysis.
Conclusion
The results show the usefulness of the method for model simplification purposes as well as for analyzing fundamental biochemical mechanisms. COPASI is freely available at http://www.copasi.org.
1 Background
Biochemical systems are inherently high dimensional due to the large number of interrelated cellular components and processes, the temporal organization of which spans time scales of several orders of magnitude. Aiming at a comprehensive understanding of the dynamic behavior of such systems has led to the development of an ever increasing number of computational models which are in the majority of cases formulated on the basis of ordinary differential equations (ODEs) [1]. Even though the purpose of computational models is to facilitate understanding and analysis of the underlying biochemical mechanisms, this again becomes cumbersome with the growing complexity of modern models. Therefore, it is important to identify those parts of the biochemical systems and of the model that are responsible for the observed physiological behavior. This necessitates the development of methods for the rational simplification of computational models and to make them comfortably accessible to the community.
Numerous methods have been developed to simplify (bio)chemical reaction networks (see review [2]). For biochemical systems many of the reduction methods aim at analyzing the steady state behavior either heuristically [3] or employing mathematical analysis (e.g. sensitivities [4, 5]). Since biochemical systems usually do not reside in a steady state timedependent approaches have recently been proposed (see for example [6, 7]). Most of these use a mathematical analysis of the different timescales occurring in the biochemical systems, e.g. the Intrinsic LowDimensional Manifolds (ILDM) method [8–11] and the Computational Singular Perturbation (CSP) method [12–14]. Apart from the advantage of a timeresolved analysis, these methods can provide useful insights, such as the support of the detection of fast reactions and species as well as the identification of potential rate controlling reactions. However, a disadvantage of the above methods is that the reduced models are systems of mathematically transformed differential or differential algebraic equations (DAE) which may not relate onetoone to biochemical species and reactions hampering the biochemical interpretation. In contrast, the methods based on steadystate or partial equilibrium approximations keep the onetoone relation and are therefore simple to biochemically interpret.
In this paper, we focus on deriving simplified biochemical models by discarding fast reactions and species. For this purpose we present a reduction strategy which is based on the CSP algorithm developed by Lam and Goussis [14]. The algorithm examines the time scales of ODE systems and supports the separation of the biochemical network into fast and slow subsystems. This is achieved through the elimination of the detected quasistationary species and quasiequilibrium reactions.
The original CSP algorithm is implemented in the software COPASI [15] making it accessible to the scientific community. COPASI is a platformindependent, userfriendly software tool that allows easy access to powerful numerical methods for simulation and analysis of biochemical reaction networks.
We apply the simplification strategy to two different systems to exemplify its use. Thus, as a simple system, we present the derivation of the MichaelisMenten Kinetics. As a realistic case, we analyze the glycolysis in Saccharomyces cerevisiae [16] in three different dynamic regimes. We show that several variables can be eliminated still keeping the original dynamics intact. Furthermore, regulatory mechanisms cause different players to participate with different relative importance in the dynamics of the system.
Time Scale Separation Analysis
In order to explain the basic notions of a time scale decomposition we start with a firstorder kinetics system. Then, the differential equations describing the system dynamics y are linear:
with constant and diagonalisable Jacobian J. By using the set of right eigenvectors A of J as the new basis we can decompose the Jacobian [17] and transform the original equations into:
The components of the transformed concentration vector x are called modes. Because Λ is a diagonal matrix of real or complex eigenvalues λ_{ i } of J, the transformed ODE system is fully decoupled:
Thus, the modes ${x}^{i}~{e}^{{\lambda}^{i}t}$ evolve independently of each other. The reciprocals of ℜ(λ^{i} ):
have a dimension of time and are called time scales (TS). Ordering them w.r.t. magnitudes τ_{1} <τ_{2} < ... <τ_{ N } leads to approximate speed ranking of the modes [14]. The modes corresponding to fast time scales (eigenvalues with large negative real part) approach 0 very quickly and can be eliminated from the system for t ≫ τ_{ M } , where τ_{ M } is a fast time scale.
Two additional aspects are worth being emphasized here. First, although the transformed representation of the system dynamics in terms of modes provides a systematic basis for the decomposition of the reaction network, it does not guarantee reducing the number of biochemical species or reactions in the system, since many different species might contribute to one and the same transformed equation. So, there is no straightforward relation to reduction methods commonly used in biochemistry such as the quasi steady state (QSS) approximation or the quasi equilibrium (QE) assumption.
An additional aspect of TS decomposition is that it is based on the local analysis of the system dynamics. For general nonlinear problems however the Jacobian is timedependent. Its eigenvalues and eigenvectors change with time. Hence, in order to obtain a reasonable characterization of the systems dynamics the time scale decomposition has to be applied repeatedly at many time points through the evaluation time of the reaction system.
2 Results
2.1 CSP in COPASI
Consider a system consisting of K biochemical reactions, the dynamics of which is determined by a system of N ordinary differential equations:
here y is the Ndimensional concentrations vector, s_{ r }(r = 1, . . . , R) are the Ndimensional stoichiometric vectors and F^{r} (y) is the rate of the rth reaction.
The main idea of the CSP method is to split the Ndimensional space of the vector g into two subspaces, a fast and a slow subspace:
In general, an Ndimensional vector may be expressed in terms of any set of N linearly independent basis vectors (e.g. [17]). The objective of the CSP method is to express g in a new basis, one that is tuned to the dynamics of the system, where the fast and slow components evolve independently of each other.
The subspace g^{fast} relates to the fast time scales of the system. If its contribution is negligible (according to some error criteria), the original system (Eq. 1) simplifies to the system of the following differential algebraic equations (DAEs):
This DAE system does not contain the fastest time scales of the original model. Hence, it is much less stiff than the original system and can be simulated easily. Nevertheless such a simplification does not guarantee the reduction of the number of species and reactions ("real" system reduction) as explained above. Therefore, a similarly important aspect of our strategy is the identification of QSS metabolites and QE reactions by means of CSP data (see [14] and below).
The Algorithm
Let us differentiate equation (1) with respect to time and get the following form with the Jacobian J:
Now we (again) focus on the choice of an appropriate basis a_{ i }, i = 1, . . . , N, but in contrast to the approach of time scale separation, these linearly independent vectors do not have to be eigenvectors of the Jacobian J (not even orthogonal). Then g always has the unique representation:
where a_{ i } f^{i} is a socalled reaction mode, the amplitude f^{i} is given by:
The notation ⊙ abbreviates the standard scalar product, i.e. here ${\mathbf{b}}^{i}\odot {\mathbf{a}}_{j}={\sum}_{n=1}^{N}{\mathbf{b}}_{n}^{i}{\mathbf{a}}_{j}^{n}.$ The set of N row vectors b^{i} are the inverses of a _{ i }; together they satisfy the following orthonormal condition: ${\mathbf{b}}_{i}\odot {\mathbf{a}}_{j}={\delta}_{j}^{i},\phantom{\rule{1em}{0ex}}i,j=1,2,...,N.$
The CSP provides an algorithm to determine the number of fast modes M, and to compute the sets of linearly independent a_{ i } and b^{i} , such that
Differentiating equation (4) with respect to time we get:
where A and B are matrices consisting of the column vector a_{ i } and row vector b^{i} as basis vectors.
For linear problems the matrix Λ is time independent. In this case the choice of eigenvectors of the Jacobian as the new basis leads to the diagonal matrix Λ (see above). The corresponding amplitudes f evolve independently of each other with their own characteristic time scale τ_{ i } . For general nonlinear systems, however, Λ is time dependent and usually not diagonal. The CSP method provides an iterative procedure of refinement of basis vectors a_{ i } and b _{ j }. When recursively applied, the refinement procedure weakens the coupling between the M fast and the N  M slow amplitudes. The matrix Λ built from the final refined set of basis vectors is blockdiagonal and the fast amplitudes are uncoupled from the slow ones approximately, so that the residual coupling can be neglected.
The process starts with an arbitrary initial guess for the basis vectors a_{ i } and the assumption that the first M basis vectors span the Mdimensional fast subdomain. The corresponding time scales should be much faster than some characteristic time scale of interest, then M is to be selected to provide a gap between the slow (time of interest) and fast time scales:
When for the final set of basis vectors the sum of M fast reaction modes falls below some userspecified threshold:
these can be eliminated from the initial system (Eq. 1), because their contributions to g are negligible. As a consequence, the evolution of the reduced system depends on the slow modes only (see Eq. 2).
Implementation
The CSP algorithm was implemented as an integral part of the COPASI software in C++ and is freely available with the current releases of the package. In this implementation, the CSP algorithm is applied to the models for which linear dependencies due to conservation relationship are eliminated. This is achieved by the analysis of the stoichiometric matrix and is performed by COPASI automatically.
The following CSP parameters have to be defined by the user.
Intervals
The user specifies the number of time points for which the CSP analysis is carried out by setting the time interval. The time interval should be large in comparison with the user's time scale of interest.
Ratio of time scale separation ε
This parameter specifies the gap between the time scales related to the fast and slow modes (Eq. 7).
Error tolerance
Absolute ε_{abs} and relative error ε_{rel} are set to control when a fast mode is considered to be exhausted (Eq. 8).
The CSP algorithm described above provides local information at certain time steps. To obtain global features of the system behavior the analysis must be performed at all points in the range of interest. For this purpose the CSP step involves numerical integration using the LSODA solver [18]. LSODA is part of the ODEPACK library [19]. It solves ODE systems with a dense or banded Jacobian when the problem is stiff, but it automatically selects between nonstiff (Adams) and stiff (BDF) methods. The Jacobian is generated numerically.
When the CSP algorithm at time point t has been performed and both final refined sets of basis vectors a _{ i }(t) and b^{i} (t) are available the M(t) is set to the number of fast exhausted modes and τ_{ M } (t) is then the time scale of the slowest of fast reaction modes at time point t.
The CSP output data (see below) can either be exported to a text file (save as Report in COPASI) for the use in other software (gnuplot, Octave etc.) or displayed in the graphical user interface as tables. In this case a color coding is used where the numbers are additionally visualized by different shades of color. This makes it easy to immediately spot e.g. the most important contributions to a specific mode for a large model (where the result tables are correspondingly large).
We also use three dimensional bar graphs for visualizing the matrices employing the qwtplot3D library (http://qwtplot3d.sourceforge.net) integrated in COPASI. These bar graphs can be turned and zoomed interactively. Furthermore single rows or columns of the matrix can be highlighted. An additional diagram shows the distribution of the timescales of the different modes at chosen points of time (Figure 1). Applying the time slider in the graphical user interface it is very simple to switch between the results for different time points. Therefore the user can easily get an overview of the timedependent changes of the timescale separation.
CSP Data used for model analysis
The CSP algorithm supplies the modeler with local CSP output data [14] that relates the time scales to species and reactions of the original biochemical system. The data is computed by the help of the refined sets of basis vectors a _{ i }(t) and b^{i} (t). The user is provided with the CSP output at each defined time point during the interval of interest and can use it to reduce the model in a rational way. The CSP output data are displayed in COPASI in a number of matrices. Here we briefly explain the most important CSP output data which are available in COPASI:
Time scales
The analysis of time scales evolution can provide useful information about the system dynamics. The fast dissipative time scales relate to the eigenvalues of the Jacobian with large negative real parts. The explosive modes are associated with positive eigenvalues. Modes with equal time scales correspond to pairs of complex conjugate eigenvalues indicating oscillatory components in the system behavior.
Radical Pointer (RP)
The CSP Radical pointer identifies the species for which the QSSA can be justified. Whenever the ith diagonal element of mth fast mode projection matrix Q_{ m } = a _{ m } b^{m} is not a small number, species i is said to be a CSP radical.
Participation Index (PI) and QE reactions
The relative level of participation of the rth elementary reaction to the nth CSP reaction mode can be represented by the mode participation index, ${P}_{r}^{i}$, defined as follows:
where i = 1, . . . , N, r = 1, . . . , R.
Importance Index (II)
The relative importance of the contribution of the rth elementary reaction to the rate of change of the ith element of y can be represented by the importance index, ${I}_{r}^{i}$:
i = 1, . . . , N, r = 1, . . . , R. An effective stoichiometric vector of the rth elementary reaction, ${\mathbf{r}}_{r}^{\mathsf{\text{slow}}}=\left(\mathbf{I}\mathbf{Q}\left(M\right)\right)\odot {\mathbf{s}}_{r}$ is computed using the fast subspace projection matrix $\mathbf{Q}\left(M\right)=\sum _{m=1}^{M}{\mathbf{Q}}_{m}$. The reaction with the largest ${I}_{r}^{i}$ for the species y_{ i } is the rate controlling reaction.
CSP  based model reduction
In this paragraph we summarize the most important steps in the reduction of the kinetic mechanism based on the results of the CSP algorithm described above. Model reduction is mainly the outcome of a sequence of QSSA for species and QEA for reactions which leads to the lumping or elimination of corresponding variables. The QSSA identifies species whose production and destruction rate are in approximate balance. Mathematically it means that the righthand side of the corresponding differential equation is zero. The QE assumption corresponds to reactions whose forward and reverse rates are nearly equal (see for instance, [20]). In either case an approximate algebraic relation (equation of state) is obtained between participating species.
As described in the previous paragraph the CSP method provides the numerical data (RP, PI and II) that are an effective diagnostic tool allowing the detection of species which can be approximated by an equation of state, as well as the determination of the relative level of participation of distinct reactions to the modes.
In contrast to the original CSP method [14] we introduce and use the "subspace" radical pointers and the "subspace" participation indices rather than the individual mode RP and PI. This is based on the fact, that even though the matrix Λ (Eq. 6) built by the help of the final refined set of the basis vectors is blockdiagonal and the fast modes are decoupled from the slow ones, the fast and slow modes could be coupled between themselves. So, it appears to be more reasonable to consider a projection of the CSP indices on the full fast and slow subspaces.
We consider the sum of all CSP radicals as selected by M fast modes and define the species with the largest "fast subspace" radical pointers as QSS. Similarly the sum of Participation indices over all slow and fast modes should be considered separately in order to detect the fast reactions. The normed PIs over fast and slow subspaces are:
We declare the reaction k as QE, if it is active in the fast and does not influence the slow space: $P{I}_{k}^{\mathsf{\text{fast}}}\gg P{I}_{k}^{\mathsf{\text{slow}}}$ at all time points, where the CSP analysis was carried out.
Practically, there exist only very few guidelines in the literature for deriving model simplifications based on the QEA and QSSA. Therefore, we would like to quickly summarize the procedure for the CSPbased model simplification:

1.
First, a time scale of interest should be selected. This can for instance correspond to the time resolution of the experiment which is the basis for the model. The aim of the model simplification is to reduce all scales that are faster than this chosen scale.

2.
Second, user defined parameters have to be selected in COPASI as explained above. Since the CSP information will be available for every time interval and is the basis for the timedependent model reduction, the time interval should be large enough in comparison with the time scale of interest.

3.
Third, performing the CSP and analyzing the results in order to find the QSS species and QE reactions.

4.
Fourth, solving the corresponding algebraic equations of state and eliminating the respective variables from the reaction networks. The kinetic laws for slow reactions should be modified by substitution with explicit expressions for CSP radicals.

5.
Fifth, parameter adjustment (e.g. by parameter estimation) with respect to quasi equilibrium constants of eliminated reactions in order to achieve the desired accuracy.
It is worth mentioning that during the simplification the existing conservation laws have to be preserved. The algebraic equations should be solved under conditions that the equations of moieties are fulfilled.
2.2 Application examples
We have applied the method to two models. All informations and scripts needed to reproduce the figures in this subsection are available in Additional Files 1, 2, 3.
MichaelisMenten Kinetics
As in [10] we start our discussion with the simplest enzymatic reaction mechanism, the irreversible MichaelisMenten kinetics:
The model was build in COPASI and consists of the two reactions R_{1} (S +E ⇔ C) and R2 (C → P +E). In order to illustrate the handling of the CSP based model reduction we consider two limit situations for the dimensionless parameters: ${S}_{t}=\frac{{k}_{2}}{{k}_{1}}\to 0$ and ${M}_{r}=\frac{{E}_{0}}{{S}_{0}}\to 0$ (here E_{0} and S_{0} are initial enzyme and substrate concentrations, respectively). We used the following CSP parameter values: ε = 0.01, ε_{rel} = 10^{5}, and ε_{abs} = 10^{10}. In both cases, a clear timescale separation occurs.
(i) M_{ r } → 0 (E_{ 0 } ≪ S_{ 0 }): This is the standard situation for MichaelisMenten kinetics. The motion on the fast timescale is dominated by the complex C decoupled from the substrate S. On the slow part, the changes of substrate and complex are balanced. The quasi steady state assumption for complex C leads then to the MichaelisMenten kinetic law.
The CSP method allows the distinction between slow and fast modes (for times t >0.03). The Radical Pointer from the CSP data shows that the complex C dominates the fast mode. The contributions of both reactions to the slow and fast modes are comparable (see Figure 2, which displays the evolution in time of Radical Pointer and Participation Indices). Thus, the QSSA for the complex C is justified in this case.
(ii) S_{ t } → 0 (k_{ 2 } ≪ k_{ 1 }). This limit means that an equilibrium between the enzyme E, the substrate S and the enzymesubstrate complex C is established quickly. The slow step is the breakdown of C to produce the product P and the enzyme E.
The CSP analysis leads to the occurrence of two independent dynamical modes. After the short transient phase (t <0.006, when no reduction is possible) the contribution of C to the fast mode is larger than the one of S (nevertheless no real dominance occurs). Over the time the contribution of both variables becomes equal (Figure 2). Thus, the QSSA for complex C is incorrect.
Nevertheless, there is a clear separation of reactions in the modes. The reaction R_{2} of product formation dominates clearly the slow mode. Both reactions are active in the fast mode (see Figure 2). Thus, the reaction ${R}_{1}:S+E\underset{{k}_{1}}{\overset{{k}_{1}}{\iff}}C$ is always practically in equilibrium and the QEA for reaction R_{1} is correct and leads to a similar equations as for "standard" MichaelisMenten kinetics (compare [21] and [10]):
The reader is referred to Additional File 1 for more details of the reduction of the MichaelisMenten kinetics.
Glycolysis in Saccharomyces cerevisiae
We now use the CSP method to examine a more complex model for simplification purposes. We take the quantitative model of yeast glycolysis developed by [16] as an application example which has been also used before in similar studies [13, 22].
The model is based on ODEs and consists of 24 reactions among 22 metabolites with a total of 59 kinetic parameters. The reaction scheme is depicted in Figure 3. From the reaction stoichiometries two moiety conservations are derived:
The complete model is available for download in SBML format at the BioModels database [23] (BIOMD 61) or JWS online [24] (http://jjj.biochem.sun.ac.za/database/hynne/Hynne.xml), the latter version being used in this study. For model details the reader is also referred to [16]. However, there are some model properties we want to mention here explicitly.
The model reproduces experimental data observed in intact yeast cells in a continuousflow stirred tank reactor. Here, the mixed flow glucose concentration, [Glc_{ x }]_{0}, is a bifurcation parameter which means that depending on its value the system behavior changes qualitatively. To be concrete, this glycolysis model exhibits two stationary (<9.6 mM; 16.7 <[Glc_{ x }]_{0} <18.5 mM) and two oscillatory state regimes (9.6 ≤ [Glc_{ x }]_{0} ≤ 16.7; ≥ 18.5 mM). Please refer to Figure eight in [16] for the bifurcation diagram. Notably, the first oscillatory regime has not yet been observed in experiments. So, we consider this as an important model property.
First step: CSP Analysis in COPASI
When performing a model reduction analysis it is indispensable to determine beforehand which properties of the system are to be maintained in the simplified model. We aimed at preserving (within an acceptable error range) the following features in order of priority:

1.
A Hopf bifurcation occurs at some value of [Glc_{ x }]_{0}.

2.
Bifurcation points w.r.t. [Glc_{ x }]_{0} change only little, i.e. different dynamic regimes (including the first oscillatory domain) appear at values of [Glc_{ x }]_{0} close to the corresponding values in the full system.

3.
Steady state levels of metabolite concentrations.

4.
Periods of the oscillations.

5.
Amplitudes of the oscillations.
We, therefore, perform the CSP analysis on the different dynamic regimes separately, i.e. using three different initial conditions for [Glc_{ x }]_{0}, namely 9 mM (steady state), 14 mM and 24 mM (first and second oscillatory state, respectively). All other parameters of the model are taken as in [16].
For each CSP analysis we simulate the system for a time period from 0 to 100 min, thereby taking also the initial transients into consideration, and inspect 250 time points along the trajectory which yields a time interval of 0.4 min. At each time point a full set of CSP data is computed. Example time course trajectories of the concentrations of ATP and NADH are shown in Figure 4. The CSP parameters Ratio of mode separation, Relative Error and Absolute Error are set to 0.99, 1e3 and 1e4, respectively.
In the following, we present the CSP output data (Time Scales, Radical Pointer, Participation Index, Importance Index and so on, see 2.1) one after the other. For each type of data, we point out the major differences between the three dynamic regimes which we interpret as glucosedependent phenomena. If appropriate, special emphasis is given to timedependent differences.
Since the amount of data produced in this comprehensive analysis exceeds the scope of the paper we present each CSP output data with compelling examples. The complete set of data is provided in Additional file 2.
Time scales
The full model exhibits in total 20 different time scales with values that span about seven orders of magnitude (from min to ms). Figure 1 shows the time scale distribution (logarithmic values) of the full model exemplarily for [Glc_{ x }]_{0} = 14 mM at time step 25. Notably, the time scale values change over time. In the steady state regime ([Glc_{ x }]_{0} = 9 mM), we observe two eigenvalue pairs corresponding to the 8 ^{th} and 9 ^{th} as well as 15 ^{th} and 16 ^{th} time scales that consist of complex conjugates (τ_{8} = τ_{9}, τ_{15} = τ_{16}) indicating the system's intrinsic oscillatory vicinity. We see that the real part of these eigenvalue pairs become equal at a certain point in time during the initial transient (Figure 5(a)).
In both oscillatory state regimes, after the initial transients, the values of time scales become oscillating and show in part substantial amplitudes which sometimes also overlap with the values of adjacent time scales. As an example, Figures 5(b) and 5(c) show the time evolution of the 15 ^{th} to 18 ^{th} time scales for [Glc_{ x }]_{0} = 14 and 24 mM, respectively.
Number of fast modes M (Figure 5(d)): As explained above, each time scale corresponds either to a fast or slow socalled mode in the CSP analysis. Like the values of the time scales the number of modes constituting the entire fast or slow subspace changes over time. Since for model reduction only the fast modes are relevant we focus on these. Initially, all three dynamic regimes show seven fast modes. In the steady state regime, after a highly variable transient, M settles to 17. In contrast, M varies between 7 and 9 for the first and between 9 and 10 for the second oscillatory regime. Consequentially, we do not fix the number of fast modes in our CSP analysis but rather take their varying number over time into account in search for QSS metabolites (see RP) and QE reactions (see PI).
CSP Radical Pointer
Figure 6 shows how Radical Pointers are visualized in COPASI. Five metabolites (BPG, GAP, PEP, F6P and NAD) are fast in all of the three dynamic regimes.
CSP Participation Index (PI)
When comparing the normed sum of PIs for the three different regimes, four different categories of reactions can be identified depending on their respective PIs, e.g. the reaction can always be classified as fast or it changes its role between regimes. A heuristic threshold value based on our analysis and experience is chosen. Thus, if the normed sum of PIs over all fast modes exceeds 0.7, the reaction is defined as fast.
1. "fast  fast  fast": vGAPDH, vlpPEP, vPK, vPGI, vALD, vTIM and vAK are fast in all regimes. These reactions, therefore, may be approximated as QE and eliminated in a simplified model. Not surprisingly, the known fast reactions vPGI and vTIM turn up in this group. Interestingly, the group also contains all reactions that either produce energy or redox equivalents, i.e. ATP and NADH, respectively. Obviously, especially in case of reactions being at the edge of the threshold, model reduction still has to be done with care.
2. "fast  slow  slow": vHK, vPFK, vPDC, glycerol production, glycogen production, and ATP consumption are reactions that belong to this group which switch from fast to slow with increasing [Glc_{ x }]_{0}. These reactions (except vPDC) share the property of consuming energy and redox equivalents, i.e. ATP and NADH, respectively. The continuous flow transport reactions between the outside and the chemostat (vinCN, vinGlc) as well as vlacto also belong to this group.
3. "fast  slow  fast": vADH and the transport reactions across the cell membrane (vGlcTrans, vdifACA, vdifEtOH, vdifGlyc) behave differently from all others as they are fast for low and high concentrations of [Glc_{ x }]_{0}. Participation in slow modes seems to be limited to the first oscillatory regime.
4. "slow  slow  slow": All reactions from the chemostat to the outside (voutEtOH, voutGlyc, voutACA) are slow in all regimes.
A typical example of time evolution of the normed PIs for each class of reactions is given in Figure 7.
CSP Importance Index (II)
The majority of reactions exhibit significant importance on a number of metabolites (Normed Importance Index >0.1). Exceptions are vPGI, vALD, vTIM, vlpPEP, vPK, vconsum, vAK and vdifACA, where Importance Indices are of values less than 0.1 for all metabolites. The weak importance of the first five reactions (already indicated as QE by the normed PIs) further confirms that they may be removed from the model. In some cases, the importance index changes in between regimes, depending on [Glc_{ x }]_{0}. Examples for glucosesensitive importance are vinGlc (important at low and unimportant at high glucose concentrations), vHK, vPFK and vGAPDH (unimportant at low and important at high glucose concentration). Obviously, the importance index gives similar information as control coefficients derived from MCA, a fact that we studied and verified (data not shown). However, the CSP IIs give a richer picture of the control distribution compared to MCA.
Second Step: Model Reduction
Based on the time scale separation analysis we suggest four steps to derive a simplified minimal model. A short description is given in the following. For any detail the reader is referred to Additional file 3. Each simplification step concerns a subset of the original model scheme which we call Module, hereafter.
Module 1. QEA for vPGI, AE for F6P. The normed PI revealed that PGI can be approximated as QE and the Radical Pointer of the 5th fast mode identifies F6P as CSP Radical, for which the algebraic equation holds
So, in order to eliminate F6P from the system and to lump PGI together with PFK we need to modify the chemical equation of the PFK reaction to
as well as the kinetic rate law to
Module 2. QEA for vALD and vTIM. The normed PI revealed that vALD and vTIM can be approximated as QE, for which the equations hold
The metabolites which are either substrate or product of the two reactions are FBP, DHAP and GAP. The latter is identified as CSP Radical (see Radical Pointer of the 2nd fast mode). In order to lump vALD and vTIM together we introduce a pool metabolite which we name
and express any of the three metabolites in terms of trioseP. The new chemical equations of the associated reactions are:
Module 3: QEA for vlpPEP. The equilibria for the vlpPEP reaction is expressed as:
BPG is identified as CSP Radical in the first mode and at the same time PEP in the 4th mode. Again, we introduce a pool metabolite
and reduce the vlPEP reaction from the network. The new chemical equations of the associated reactions are:
Module 4: QEA for vPK. The vPK reaction is modeled as irreversible. So, the QEA leads to its lumping together with vPDC and to eliminating pyruvate from the network. The new chemical equation for vPDC is:
In summary, after these four simplification steps the full model (original values in parentheses) has been reduced eventually to 17 (22) species and 19 (24) reactions with a total of 43 (59) parameters (the reduced reaction network is depicted on the Figure 8).
Third step: Parameter adjustment and verification of the reduced model
Due to the fact that the meaning of parameters has been changed in the course of model reduction these parameters (e.g. K4eq) need to be adjusted in order to obtain the full original behavior. This can be simply achieved by parameter scanning around the initial value. It is worth emphasizing here, that not all parameters have to be refitted, only the ones that result from the simplification of the lumping terms (e.g. quasi equilibrium constants resulting from the QEAs).
Finally we evaluate the reduced model by comparing its dynamic properties with the ones of the original full model. Comparative simulations are shown in Figure 9 and reveal that the reduced model captures the essential dynamics of the full model quantitatively very well  except for the amplitudes and the exact location of the bifurcation points for the first oscillatory regime. This discrepancy is of (only) quantitative nature and it does not occur if the full model is reduced by just three reactions (instead of five) as presented in Additional file 3.
3 Discussion and Conclusions
In this paper, we have presented a strategy for model simplification and reduction based on the CSP method. For this purpose and in order to make the method publicly available we implemented the original CSP algorithm in the COPASI software.
The CSP method is restricted to ODE models. Previously described simplification routines based on CSP mainly focus on the conversion of ODEs into DAE systems. In contrast, we use the CSP method to simplify models by lumping those reactions together that could be identified as being in QE. In addition, algebraic equations are used for species that are identified by Radical Pointers. Accordingly, we redefine chemical equations and kinetic rate laws of affected reactions. We demonstrated the usability of this approach using the COPASI implementation of the CSP method for a simple oneenzyme reaction and for a rather complex model of yeast glycolysis [16].
The time scale separation analysis of the glycolysis model revealed five reactions (vPGI, vALD, vTIM, vlpPEP, and vPK) for which the simplification strategy can be applied. We demonstrated that the resulting reduced model is capable of maintaining characteristics of the full model within an acceptable error range:
(i) same dynamic regimes, e.g. Hopf bifurcation point at [Glc_{ x }]_{0} = 18.5 mM; (ii) similar steady state levels of metabolite concentrations; (iii) similar periods for both and amplitudes for the second oscillatory regimes.
Studying different dynamics underlines again (as in [11]) the importance of timeresolved analyses since the contribution of the players in the system may vary over time and in between different dynamical regimes. This is ignored if either steady state data (or single time point data in general) or single dynamic regimes are studied.
Compared to our previous work on the ILDM method [10, 11]  or the ILDM method in general  the CSP allows a more straightforward interpretation of its results with respect to the identification of QSS species and especially QE reactions. In addition, the Importance Index of CSP allows to analyze the impact of individual reactions on the dynamics of the species in the system.
An interesting outcome of our analysis is that it is possible to follow the general inherent temporal organization of the entire system when analyzing the distinctive time scales. Thus, we could observe that for the second oscillatory regime, all time scales oscillate in phase, partially overlapping each other which indicates that the whole system shows slower or faster dynamics in the course of a period.
Moreover, the number of fast modes changes over time and is also different for different dynamic regimes. Both factors prohibit the use of a fixed number of modes for time scale decomposition.
Furthermore, we suggest that the results of the CSP analysis can also be used for studying the relative importance of different reactions for the dynamics of the system. As an example, we observed that the overall participation of PFK in the slow modes increases with increasing glucose levels. In a simple way, this may be explained by the increasing energy charge (ATP concentration) which inhibits the PFK. Therefore, the relative importance of the PFK to the slower modes of the system increases.
Another beneficial result of the simplification process is of course that the number of system parameters is considerably reduced, especially concerning parameters which are involved in processes on a faster time scale than the time scale of interest which are then usually hard to identify. Therefore, using this process less system parameters will be unidentifiable.
Our study is not the first trying to reduce the original glycolysis model by [16]. [13] analyzed exclusively the limit cycle of the second oscillatory regime ([Glc_{ x }]_{0} = 24 mM) employing CSP without taking into account transient behavior. In contrast, we analyzed the model with original initial values taking into consideration also the initial transient time period. In addition, there are major methodological differences. First, our approach focuses on simplifying the underlying biochemical reaction network rather than on approximating the ODE system with a DAE system. Second, we do not fix the number of fast modes. Third, we compute the normed sum of PIs over the entire fast subspace in order to justify QEA.
A completely different approach was taken by [22]. Their sole criterium for the reduction was the fulfillment of a StuartLandau equation which is in principle only valid in the vicinity of a Hopf bifurcation and therefore does not offer a general strategy for system reduction.
Obviously, there are some relations between CSP output data and sensitivity analyses like metabolic control analysis (MCA). Learning e.g. about the impact of individual reactions on systems properties like dynamics could in principle also result from sensitivity analyses. We did a preliminary comparison of the results of our CSP analysis and a conventional MCA for the steady state. This resulted in a similar global picture, but the CSP gave a more finegrained picture w.r.t. the relative importance of reactions on species. In addition, the timeresolved analysis for oscillations is not possible with MCA.
With all the mentioned benefits of using CSP for systems analysis, there are also problems and limitations arising from this approach. We employed several heuristic thresholds for the discrimination of the reactions and species mainly contributing to the fast subspace of the system. These were based on our experience and obviously, this might not be optimal for arbitrary systems. Thus, other systems might demand slightly altered thresholds. This is underlined by the fact that we observed one reaction  AK  that in principle fulfilled all of our criteria for elimination, but in the end, it turned out to be impossible to eliminate from the system without introducing a large error. Therefore, it is always important to carefully check the behavior of the reduced system. The CSP can only support this process in a rational way, but does not allow for a fully automated analysis.
Even though, accordingly, scientists will always have to be on top of this method, it would be useful to support the reduction of the system in a stronger way than just providing the CSP. A semiautomated reduction which then quickly allows to be checked for error compared to the original model would reduce workload considerably and is currently planned to be included in the software. An additional planned extension of the software is the support of different compartment sizes (if multicompartment models are analyzed) which is currently not the case.
All in all, we were surprised that taking into account different dynamic regimes only allowed the elimination of 5 reactions and 5 species of the glycolysis model which is considerably less than previous attempts that focused on particular regimes. This once again supports the view that it is crucial to define which systems behaviors should be reproduced by the simplified model before entering reduction strategies and these initial decisions might result in different models in the end.
References
 1.
Hübner K, Sahle S, Kummer U: Applications and trends in systems biology in biochemistry. FEBS Journal. 2011, 278: 27672857. 10.1111/j.17424658.2011.08217.x.
 2.
Okino MS, Mavrovouniotis ML: simplification of mathematical models of chemical reaction systems. Chemical Reviews. 1998, 98: 391408. 10.1021/cr950223l.
 3.
Schuster S, Pfeiffer T, Moldenhauer F, Koch I, Dandekar T: Exploring the pathway structure of metabolism: decomposition into subnetworks and application to Mycoplasma pneumoniae. Bioinformatics. 2002, 18: 351361. 10.1093/bioinformatics/18.2.351.
 4.
Kauffman K, Pajerowski J, Jamshidi N, Palsson BO, Edwards J: Description and analysis of metabolic connectivity and dynamics in the human red blood cell. Biophysical Journal. 2002, 83: 646662. 10.1016/S00063495(02)751989.
 5.
Price ND, Reed JL, Papin JA, Famili I, Palsson BO: Analysis of metabolic capabilities using singular value decomposition of extreme pathway matrices. Biophysical Journal. 2003, 84: 794804. 10.1016/S00063495(03)748991.
 6.
Jamshidi N, Palsson BO: Topdown analysis of temporal hierarchy in biochemical reaction networks. PLoS Computational Biology. 2008, 4: 110. 10.1371/journal.pcbi.0040001.
 7.
Radulescu O, Gorban AN, Zinovyev A, Lilienbaum A: Robust simplifications of multiscale biochemical networks. BMC Systems Biology. 2008, 2: 86111. 10.1186/17520509286.
 8.
Maas U, Pope SB: Simplifying chemical reaction kinetics: Intrinsic lowdimensional manifolds in composition space. Combustion and Flame. 1992, 88: 239264. 10.1016/00102180(92)90034M.
 9.
Shaik OS, Kammerer J, Gorecki J, Lebiedz D: Derivation of a quantitative minimal model for the photosensitive BelousovZhabotinsky reaction from a detailed elementarystep mechanism. Journal of Chemical Physics. 2005, 123: 23410310.1063/1.2136882.
 10.
Surovtsova I, Simus N, Lorenz T, König A, Sahle S, Kummer U: Accessible methods for the dynamic timescale decomposition of biochemical systems. Bioinformatics. 2009, 25: 28162823. 10.1093/bioinformatics/btp451.
 11.
Zobeley J, Lebiedz D, Kammerer J, Ishmurzin A, Kummer U: A new timedependent complexity reduction method for biochemical systems. Transactions on Computational Systems Biology. 2005, 1: 90110.
 12.
Goussis DA, Najm HN: Model reduction and physical understanding of slowly oscillating processes: the circadian cycle. Multiscale Modelling and Simulation. 2006, 5: 12971332. 10.1137/060649768.
 13.
Kourdis PD, Steuer R, Goussis DA: Physical understanding of complex multiscale biochemical models via algorithmic simplification: glycolysis in Saccharomyces Cerevisiae. Physica D. 2010, 239: 17891817. 10.1016/j.physd.2010.06.002.
 14.
Lam H: Using CSP to understand complex chemical kinetics. Combustion Science and Technology. 1993, 89: 375404. 10.1080/00102209308924120.
 15.
Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, Singhal M, Xu L, Mendes P, Kummer U: COPASI  a complex pathway simulator. Bioinformatics. 2006, 22: 30673074. 10.1093/bioinformatics/btl485.
 16.
Hynne F, Danø S, Sorensen PG: Fullscale model of glycolysis in Saccharomyces cerevisiae. Biophysical Chemistry. 2001, 94: 121163. 10.1016/S03014622(01)002290.
 17.
Golub GH, van Loan CF: Matrix computations. 1996, Johns Hopkins University Press, Baltimore
 18.
Petzold L: Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM Journal on Scientific and Statistical Computing. 1983, 4: 136148. 10.1137/0904010.
 19.
Hindmarsh AC: ODEPACK, a systematized collection of ODE solvers. Transactions on Scientific Computation. 1983, 1: 5564.
 20.
Visser D, van der Heijden R, Mauch K, Reuss M, Heijnen S: Tendency modeling: a new approach to obtain simplified kinetic models of metabolism applied to Saccharomyces cerevisiae. Metabolic Engineering. 2000, 2: 252275. 10.1006/mben.2000.0150.
 21.
Palsson BO, Lightfoot EN: Mathematical modeling of dynamics and control in metabolic networks I. On Michaelis Menten kinetics. Journal of Theoretical Biology. 1984, 111: 273302. 10.1016/S00225193(84)802118.
 22.
Danø S, Madsen MF, Schmidt H, Cedersund G: Reduction of a biochemical model with preservation of its basic dynamic properties. FEBS Journal. 2006, 273: 48624877. 10.1111/j.17424658.2006.05485.x.
 23.
Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, Li L, He E, Henry A, Stefan MI, Snoep JL, Hucka M, Le Novére N, Laibe C: BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Systems Biology. 2010, 4: 92106. 10.1186/17520509492.
 24.
Olivier BG, Snoep JL: Webbased kinetic modelling using JWS Online. Bioinformatics. 2004, 20: 21432144. 10.1093/bioinformatics/bth200.
Acknowledgements
We would like to thank the COPASI team for support and BIOMS, the NIH, the BMBF (Virtual Liver) and the Klaus Tschira Foundation for funding.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
IS conceived the procedure for CSP based model reduction, performed the CSP analysis and drafted the initial manuscript. NS adapted the CSP for COPASI and implemented the algorithm. KH analyzed the CSP data, classified and interpreted it biochemically. SS supported the method implementation. UK initiated the project and interpreted the results biochemically. All authors participated in discussions and writing of the final manuscript. All authors also read and approved the final manuscript.
Electronic supplementary material
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Surovtsova, I., Simus, N., Hübner, K. et al. Simplification of biochemical models: a general approach based on the analysis of the impact of individual species and reactions on the systems dynamics. BMC Syst Biol 6, 14 (2012). https://doi.org/10.1186/17520509614
Received:
Accepted:
Published:
Keywords
 Fast Mode
 Dynamic Regime
 Oscillatory Regime
 Quasi Equilibrium
 Fast Time Scale