# Simplification of biochemical models: a general approach based on the analysis of the impact of individual species and reactions on the systems dynamics

- Irina Surovtsova
^{1}Email author, - Natalia Simus
^{1}, - Katrin Hübner
^{1}, - Sven Sahle
^{1}and - Ursula Kummer
^{1}

**6**:14

https://doi.org/10.1186/1752-0509-6-14

© Surovtsova et al; licensee BioMed Central Ltd. 2012

**Received: **4 October 2011

**Accepted: **5 March 2012

**Published: **5 March 2012

## Abstract

### Background

Given the complex mechanisms underlying biochemical processes systems biology researchers tend to build ever increasing computational models. However, dealing with complex systems entails a variety of problems, e.g. difficult intuitive understanding, variety of time scales or non-identifiable parameters. Therefore, methods are needed that, at least semi-automatically, help to elucidate how the complexity of a model can be reduced such that important behavior is maintained and the predictive capacity of the model is increased. The results should be easily accessible and interpretable. In the best case such methods may also provide insight into fundamental biochemical mechanisms.

### Results

We have developed a strategy based on the Computational Singular Perturbation (CSP) method which can be used to perform a "biochemically-driven" model reduction of even large and complex kinetic ODE systems. We provide an implementation of the original CSP algorithm in COPASI (a COmplex PAthway SImulator) and applied the strategy to two example models of different degree of complexity - a simple one-enzyme system and a full-scale model of yeast glycolysis.

### Conclusion

The results show the usefulness of the method for model simplification purposes as well as for analyzing fundamental biochemical mechanisms. COPASI is freely available at http://www.copasi.org.

## 1 Background

Biochemical systems are inherently high dimensional due to the large number of interrelated cellular components and processes, the temporal organization of which spans time scales of several orders of magnitude. Aiming at a comprehensive understanding of the dynamic behavior of such systems has led to the development of an ever increasing number of computational models which are in the majority of cases formulated on the basis of ordinary differential equations (ODEs) [1]. Even though the purpose of computational models is to facilitate understanding and analysis of the underlying biochemical mechanisms, this again becomes cumbersome with the growing complexity of modern models. Therefore, it is important to identify those parts of the biochemical systems and of the model that are responsible for the observed physiological behavior. This necessitates the development of methods for the rational simplification of computational models and to make them comfortably accessible to the community.

Numerous methods have been developed to simplify (bio)chemical reaction networks (see review [2]). For biochemical systems many of the reduction methods aim at analyzing the steady state behavior either heuristically [3] or employing mathematical analysis (e.g. sensitivities [4, 5]). Since biochemical systems usually do not reside in a steady state time-dependent approaches have recently been proposed (see for example [6, 7]). Most of these use a mathematical analysis of the different time-scales occurring in the biochemical systems, e.g. the Intrinsic Low-Dimensional Manifolds (ILDM) method [8–11] and the Computational Singular Perturbation (CSP) method [12–14]. Apart from the advantage of a time-resolved analysis, these methods can provide useful insights, such as the support of the detection of fast reactions and species as well as the identification of potential rate controlling reactions. However, a disadvantage of the above methods is that the reduced models are systems of mathematically transformed differential or differential algebraic equations (DAE) which may not relate one-to-one to biochemical species and reactions hampering the biochemical interpretation. In contrast, the methods based on steady-state or partial equilibrium approximations keep the one-to-one relation and are therefore simple to biochemically interpret.

In this paper, we focus on deriving simplified biochemical models by discarding fast reactions and species. For this purpose we present a reduction strategy which is based on the CSP algorithm developed by Lam and Goussis [14]. The algorithm examines the time scales of ODE systems and supports the separation of the biochemical network into fast and slow subsystems. This is achieved through the elimination of the detected quasi-stationary species and quasi-equilibrium reactions.

The original CSP algorithm is implemented in the software COPASI [15] making it accessible to the scientific community. COPASI is a platform-independent, user-friendly software tool that allows easy access to powerful numerical methods for simulation and analysis of biochemical reaction networks.

We apply the simplification strategy to two different systems to exemplify its use. Thus, as a simple system, we present the derivation of the Michaelis-Menten Kinetics. As a realistic case, we analyze the glycolysis in *Saccharomyces cerevisiae* [16] in three different dynamic regimes. We show that several variables can be eliminated still keeping the original dynamics intact. Furthermore, regulatory mechanisms cause different players to participate with different relative importance in the dynamics of the system.

### Time Scale Separation Analysis

**y**are linear:

**J**. By using the set of right eigenvectors

**A**of

**J**as the new basis we can decompose the Jacobian [17] and transform the original equations into:

**x**are called

*modes*. Because

**Λ**is a diagonal matrix of real or complex eigenvalues

*λ*

_{ i }of

**J**, the transformed ODE system is fully decoupled:

*λ*

^{ i }):

have a dimension of time and are called time scales (TS). Ordering them w.r.t. magnitudes *τ*_{1} <*τ*_{2} < ... <*τ*_{
N
} leads to approximate speed ranking of the modes [14]. The modes corresponding to fast time scales (eigenvalues with large negative real part) approach 0 very quickly and can be eliminated from the system for *t* ≫ *τ*_{
M
} , where *τ*_{
M
} is a fast time scale.

Two additional aspects are worth being emphasized here. First, although the transformed representation of the system dynamics in terms of modes provides a systematic basis for the decomposition of the reaction network, it does not guarantee reducing the number of biochemical species or reactions in the system, since many different species might contribute to one and the same transformed equation. So, there is no straightforward relation to reduction methods commonly used in biochemistry such as the quasi steady state (QSS) approximation or the quasi equilibrium (QE) assumption.

An additional aspect of TS decomposition is that it is based on the local analysis of the system dynamics. For general nonlinear problems however the Jacobian is time-dependent. Its eigenvalues and eigenvectors change with time. Hence, in order to obtain a reasonable characterization of the systems dynamics the time scale decomposition has to be applied repeatedly at many time points through the evaluation time of the reaction system.

## 2 Results

### 2.1 CSP in COPASI

*K*biochemical reactions, the dynamics of which is determined by a system of

*N*ordinary differential equations:

here **y** is the *N*-dimensional concentrations vector, **s**_{
r
}(*r* = 1, . . . , *R*) are the *N*-dimensional stoichiometric vectors and *F*^{
r
} (**y**) is the rate of the *r*-th reaction.

**The main idea of the CSP method**is to split the

*N*-dimensional space of the vector

**g**into two subspaces, a fast and a slow subspace:

In general, an *N*-dimensional vector may be expressed in terms of any set of *N* linearly independent basis vectors (e.g. [17]). The objective of the CSP method is to express **g** in a new basis, one that is tuned to the dynamics of the system, where the fast and slow components evolve independently of each other.

**g**

^{fast}relates to the fast time scales of the system. If its contribution is negligible (according to some error criteria), the original system (Eq. 1) simplifies to the system of the following differential algebraic equations (DAEs):

This DAE system does not contain the fastest time scales of the original model. Hence, it is much less stiff than the original system and can be simulated easily. Nevertheless such a simplification does not guarantee the reduction of the number of species and reactions ("real" system reduction) as explained above. Therefore, a similarly important aspect of our strategy is the identification of QSS metabolites and QE reactions by means of CSP data (see [14] and below).

#### The Algorithm

**J**:

**a**

_{ i }

*, i*= 1, . . . ,

*N*, but in contrast to the approach of time scale separation, these linearly independent vectors do not have to be eigenvectors of the Jacobian

**J**(not even orthogonal). Then

**g**always has the unique representation:

**a**

_{ i }

*f*

^{ i }is a so-called reaction mode, the amplitude

*f*

^{ i }is given by:

The notation ⊙ abbreviates the standard scalar product, i.e. here ${\mathbf{b}}^{i}\odot {\mathbf{a}}_{j}={\sum}_{n=1}^{N}{\mathbf{b}}_{n}^{i}{\mathbf{a}}_{j}^{n}.$ The set of *N* row vectors **b**^{
i
} are the inverses of **a** _{
i
}; together they satisfy the following orthonormal condition: ${\mathbf{b}}_{i}\odot {\mathbf{a}}_{j}={\delta}_{j}^{i},\phantom{\rule{1em}{0ex}}i,j=1,2,...,N.$

*M*, and to compute the sets of linearly independent

**a**

_{ i }and

**b**

^{ i }, such that

where **A** and **B** are matrices consisting of the column vector **a**_{
i
} and row vector **b**^{
i
} as basis vectors.

For linear problems the matrix **Λ** is time independent. In this case the choice of eigenvectors of the Jacobian as the new basis leads to the diagonal matrix **Λ** (see above). The corresponding amplitudes **f** evolve independently of each other with their own characteristic time scale *τ*_{
i
} . For general nonlinear systems, however, **Λ** is time dependent and usually not diagonal. The CSP method provides an iterative procedure of refinement of basis vectors **a**_{
i
} and **b** _{
j
}. When recursively applied, the refinement procedure weakens the coupling between the *M* fast and the *N - M* slow amplitudes. The matrix **Λ** built from the final refined set of basis vectors is block-diagonal and the fast amplitudes are uncoupled from the slow ones approximately, so that the residual coupling can be neglected.

**a**

_{ i }and the assumption that the first

*M*basis vectors span the

*M*-dimensional fast subdomain. The corresponding time scales should be much faster than some characteristic time scale of interest, then

*M*is to be selected to provide a gap between the slow (time of interest) and fast time scales:

these can be eliminated from the initial system (Eq. 1), because their contributions to **g** are negligible. As a consequence, the evolution of the reduced system depends on the slow modes only (see Eq. 2).

#### Implementation

The CSP algorithm was implemented as an integral part of the COPASI software in C++ and is freely available with the current releases of the package. In this implementation, the CSP algorithm is applied to the models for which linear dependencies due to conservation relationship are eliminated. This is achieved by the analysis of the stoichiometric matrix and is performed by COPASI automatically.

The following **CSP parameters** have to be defined by the user.

##### Intervals

The user specifies the number of time points for which the CSP analysis is carried out by setting the time interval. The time interval should be large in comparison with the user's time scale of interest.

##### Ratio of time scale separation *ε*

This parameter specifies the gap between the time scales related to the fast and slow modes (Eq. 7).

##### Error tolerance

Absolute *ε*_{abs} and relative error *ε*_{rel} are set to control when a fast mode is considered to be exhausted (Eq. 8).

The CSP algorithm described above provides local information at certain time steps. To obtain global features of the system behavior the analysis must be performed at all points in the range of interest. For this purpose the CSP step involves numerical integration using the LSODA solver [18]. LSODA is part of the ODEPACK library [19]. It solves ODE systems with a dense or banded Jacobian when the problem is stiff, but it automatically selects between non-stiff (Adams) and stiff (BDF) methods. The Jacobian is generated numerically.

When the CSP algorithm at time point *t* has been performed and both final refined sets of basis vectors **a** _{
i
}(*t*) and **b**^{
i
} (*t*) are available the *M*(*t*) is set to the number of fast exhausted modes and *τ*_{
M
} (*t*) is then the time scale of the slowest of fast reaction modes at time point *t*.

The CSP output data (see below) can either be exported to a text file (save as Report in COPASI) for the use in other software (gnuplot, Octave etc.) or displayed in the graphical user interface as tables. In this case a color coding is used where the numbers are additionally visualized by different shades of color. This makes it easy to immediately spot e.g. the most important contributions to a specific mode for a large model (where the result tables are correspondingly large).

#### CSP Data used for model analysis

The CSP algorithm supplies the modeler with local CSP output data [14] that relates the time scales to species and reactions of the original biochemical system. The data is computed by the help of the refined sets of basis vectors **a** _{
i
}(*t*) and **b**^{
i
} (*t*). The user is provided with the CSP output at each defined time point during the interval of interest and can use it to reduce the model in a rational way. The CSP output data are displayed in COPASI in a number of matrices. Here we briefly explain the most important CSP output data which are available in COPASI:

##### Time scales

The analysis of time scales evolution can provide useful information about the system dynamics. The fast dissipative time scales relate to the eigenvalues of the Jacobian with large negative real parts. The explosive modes are associated with positive eigenvalues. Modes with equal time scales correspond to pairs of complex conjugate eigenvalues indicating oscillatory components in the system behavior.

##### Radical Pointer (RP)

The CSP Radical pointer identifies the species for which the QSSA can be justified. Whenever the *i*-th diagonal element of *m*-th fast mode projection matrix *Q*_{
m
} = **a** _{
m
} **b**^{
m
} is not a small number, species *i* is said to be a CSP radical.

##### Participation Index (PI) and QE reactions

*r*-th elementary reaction to the n-th CSP reaction mode can be represented by the mode participation index, ${P}_{r}^{i}$, defined as follows:

where *i* = 1, . . . , *N, r* = 1, . . . , *R*.

##### Importance Index (II)

*r*-th elementary reaction to the rate of change of the

*i*-th element of

**y**can be represented by the importance index, ${I}_{r}^{i}$:

*i* = 1, . . . , *N, r* = 1, . . . , *R*. An effective stoichiometric vector of the *r*-th elementary reaction, ${\mathbf{r}}_{r}^{\mathsf{\text{slow}}}=\left(\mathbf{I}-\mathbf{Q}\left(M\right)\right)\odot {\mathbf{s}}_{r}$ is computed using the fast subspace projection matrix $\mathbf{Q}\left(M\right)=\sum _{m=1}^{M}{\mathbf{Q}}_{m}$. The reaction with the largest ${I}_{r}^{i}$ for the species *y*_{
i
} is the rate controlling reaction.

#### CSP - based model reduction

In this paragraph we summarize the most important steps in the reduction of the kinetic mechanism based on the results of the CSP algorithm described above. Model reduction is mainly the outcome of a sequence of QSSA for species and QEA for reactions which leads to the lumping or elimination of corresponding variables. The QSSA identifies species whose production and destruction rate are in approximate balance. Mathematically it means that the right-hand side of the corresponding differential equation is zero. The QE assumption corresponds to reactions whose forward and reverse rates are nearly equal (see for instance, [20]). In either case an approximate algebraic relation (equation of state) is obtained between participating species.

As described in the previous paragraph the CSP method provides the numerical data (RP, PI and II) that are an effective diagnostic tool allowing the detection of species which can be approximated by an equation of state, as well as the determination of the relative level of participation of distinct reactions to the modes.

In contrast to the original CSP method [14] we introduce and use the "subspace" radical pointers and the "subspace" participation indices rather than the individual mode RP and PI. This is based on the fact, that even though the matrix **Λ** (Eq. 6) built by the help of the final refined set of the basis vectors is block-diagonal and the fast modes are decoupled from the slow ones, the fast and slow modes could be coupled between themselves. So, it appears to be more reasonable to consider a projection of the CSP indices on the full fast and slow subspaces.

*M*fast modes and define the species with the largest "fast subspace" radical pointers as QSS. Similarly the sum of Participation indices over all slow and fast modes should be considered separately in order to detect the fast reactions. The normed PIs over fast and slow subspaces are:

We declare the reaction *k* as QE, if it is active in the fast and does not influence the slow space: $P{I}_{k}^{\mathsf{\text{fast}}}\gg P{I}_{k}^{\mathsf{\text{slow}}}$ at all time points, where the CSP analysis was carried out.

- 1.
First, a time scale of interest should be selected. This can for instance correspond to the time resolution of the experiment which is the basis for the model. The aim of the model simplification is to reduce all scales that are faster than this chosen scale.

- 2.
Second, user defined parameters have to be selected in COPASI as explained above. Since the CSP information will be available for every time interval and is the basis for the time-dependent model reduction, the time interval should be large enough in comparison with the time scale of interest.

- 3.
Third, performing the CSP and analyzing the results in order to find the QSS species and QE reactions.

- 4.
Fourth, solving the corresponding algebraic equations of state and eliminating the respective variables from the reaction networks. The kinetic laws for slow reactions should be modified by substitution with explicit expressions for CSP radicals.

- 5.
Fifth, parameter adjustment (e.g. by parameter estimation) with respect to quasi equilibrium constants of eliminated reactions in order to achieve the desired accuracy.

It is worth mentioning that during the simplification the existing conservation laws have to be preserved. The algebraic equations should be solved under conditions that the equations of moieties are fulfilled.

### 2.2 Application examples

We have applied the method to two models. All informations and scripts needed to reproduce the figures in this subsection are available in Additional Files 1, 2, 3.

#### Michaelis-Menten Kinetics

The model was build in COPASI and consists of the two reactions R_{1} (*S* +*E* ⇔ *C*) and R2 (*C* → *P* +*E*). In order to illustrate the handling of the CSP based model reduction we consider two limit situations for the dimensionless parameters: ${S}_{t}=\frac{{k}_{2}}{{k}_{-1}}\to 0$ and ${M}_{r}=\frac{{E}_{0}}{{S}_{0}}\to 0$ (here *E*_{0} and *S*_{0} are initial enzyme and substrate concentrations, respectively). We used the following CSP parameter values: *ε* = 0.01, *ε*_{rel} = 10^{-5}, and *ε*_{abs} = 10^{-10}. In both cases, a clear time-scale separation occurs.

(**i**) **M**_{
r
} **→ 0** (**E**_{
0
} **≪ S**_{
0
})*:* This is the standard situation for Michaelis-Menten kinetics. The motion on the fast time-scale is dominated by the complex *C* decoupled from the substrate *S*. On the slow part, the changes of substrate and complex are balanced. The quasi steady state assumption for complex *C* leads then to the Michaelis-Menten kinetic law.

*t >*0.03). The Radical Pointer from the CSP data shows that the complex

*C*dominates the fast mode. The contributions of both reactions to the slow and fast modes are comparable (see Figure 2, which displays the evolution in time of Radical Pointer and Participation Indices). Thus, the QSSA for the complex

*C*is justified in this case.

(**ii**) **S**_{
t
} **→ 0** (**k**_{
2
} **≪ k**_{
-1
}). This limit means that an equilibrium between the enzyme *E*, the substrate *S* and the enzyme-substrate complex *C* is established quickly. The slow step is the breakdown of *C* to produce the product *P* and the enzyme *E*.

The CSP analysis leads to the occurrence of two independent dynamical modes. After the short transient phase (*t <*0.006, when no reduction is possible) the contribution of *C* to the fast mode is larger than the one of *S* (nevertheless no real dominance occurs). Over the time the contribution of both variables becomes equal (Figure 2). Thus, the QSSA for complex *C* is incorrect.

*R*

_{2}of product formation dominates clearly the slow mode. Both reactions are active in the fast mode (see Figure 2). Thus, the reaction ${R}_{1}:S+E\underset{{k}_{-1}}{\overset{{k}_{1}}{\iff}}C$ is always practically in equilibrium and the QEA for reaction

*R*

_{1}is correct and leads to a similar equations as for "standard" Michaelis-Menten kinetics (compare [21] and [10]):

The reader is referred to Additional File 1 for more details of the reduction of the Michaelis-Menten kinetics.

#### Glycolysis in *Saccharomyces cerevisiae*

We now use the CSP method to examine a more complex model for simplification purposes. We take the quantitative model of yeast glycolysis developed by [16] as an application example which has been also used before in similar studies [13, 22].

The complete model is available for download in SBML format at the BioModels database [23] (BIOMD 61) or JWS online [24] (http://jjj.biochem.sun.ac.za/database/hynne/Hynne.xml), the latter version being used in this study. For model details the reader is also referred to [16]. However, there are some model properties we want to mention here explicitly.

The model reproduces experimental data observed in intact yeast cells in a continuous-flow stirred tank reactor. Here, the mixed flow glucose concentration, [Glc_{
x
}]_{0}, is a bifurcation parameter which means that depending on its value the system behavior changes qualitatively. To be concrete, this glycolysis model exhibits two stationary (*<*9.6 mM; 16.7 *<*[Glc_{
x
}]_{0} *<*18.5 mM) and two oscillatory state regimes (9.6 ≤ [Glc_{
x
}]_{0} ≤ 16.7; ≥ 18.5 mM). Please refer to Figure eight in [16] for the bifurcation diagram. Notably, the first oscillatory regime has not yet been observed in experiments. So, we consider this as an important model property.

##### First step: CSP Analysis in COPASI

- 1.
A Hopf bifurcation occurs at some value of [Glc

_{ x }]_{0}. - 2.
Bifurcation points w.r.t. [Glc

_{ x }]_{0}change only little, i.e. different dynamic regimes (including the first oscillatory domain) appear at values of [Glc_{ x }]_{0}close to the corresponding values in the full system. - 3.
Steady state levels of metabolite concentrations.

- 4.
Periods of the oscillations.

- 5.
Amplitudes of the oscillations.

We, therefore, perform the CSP analysis on the different dynamic regimes separately, i.e. using three different initial conditions for [Glc_{
x
}]_{0}, namely 9 mM (steady state), 14 mM and 24 mM (first and second oscillatory state, respectively). All other parameters of the model are taken as in [16].

*Ratio of mode separation, Relative Error*and

*Absolute Error*are set to 0.99, 1e-3 and 1e-4, respectively.

In the following, we present the CSP output data (Time Scales, Radical Pointer, Participation Index, Importance Index and so on, see 2.1) one after the other. For each type of data, we point out the major differences between the three dynamic regimes which we interpret as glucose-dependent phenomena. If appropriate, special emphasis is given to time-dependent differences.

Since the amount of data produced in this comprehensive analysis exceeds the scope of the paper we present each CSP output data with compelling examples. The complete set of data is provided in Additional file 2.

###### Time scales

*min*to

*ms*). Figure 1 shows the time scale distribution (logarithmic values) of the full model exemplarily for [Glc

_{ x }]

_{0}= 14 mM at time step 25. Notably, the time scale values change over time. In the steady state regime ([Glc

_{ x }]

_{0}= 9 mM), we observe two eigenvalue pairs corresponding to the 8

^{ th }and 9

^{ th }as well as 15

^{ th }and 16

^{ th }time scales that consist of complex conjugates (

*τ*

_{8}=

*τ*

_{9},

*τ*

_{15}=

*τ*

_{16}) indicating the system's intrinsic oscillatory vicinity. We see that the real part of these eigenvalue pairs become equal at a certain point in time during the initial transient (Figure 5(a)).

In both oscillatory state regimes, after the initial transients, the values of time scales become oscillating and show in part substantial amplitudes which sometimes also overlap with the values of adjacent time scales. As an example, Figures 5(b) and 5(c) show the time evolution of the 15 ^{
th
} to 18 ^{
th
} time scales for [Glc_{
x
}]_{0} = 14 and 24 mM, respectively.

*Number of fast modes M* (Figure 5(d)): As explained above, each time scale corresponds either to a fast or slow so-called mode in the CSP analysis. Like the values of the time scales the number of modes constituting the entire fast or slow subspace changes over time. Since for model reduction only the fast modes are relevant we focus on these. Initially, all three dynamic regimes show seven fast modes. In the steady state regime, after a highly variable transient, *M* settles to 17. In contrast, *M* varies between 7 and 9 for the first and between 9 and 10 for the second oscillatory regime. Consequentially, we do not fix the number of fast modes in our CSP analysis but rather take their varying number over time into account in search for QSS metabolites (see RP) and QE reactions (see PI).

###### CSP Radical Pointer

###### CSP Participation Index (PI)

When comparing the normed sum of PIs for the three different regimes, four different categories of reactions can be identified depending on their respective PIs, e.g. the reaction can always be classified as fast or it changes its role between regimes. A heuristic threshold value based on our analysis and experience is chosen. Thus, if the normed sum of PIs over all fast modes exceeds 0.7, the reaction is defined as fast.

1. *"fast - fast - fast"*: vGAPDH, vlpPEP, vPK, vPGI, vALD, vTIM and vAK are fast in all regimes. These reactions, therefore, may be approximated as QE and eliminated in a simplified model. Not surprisingly, the known fast reactions vPGI and vTIM turn up in this group. Interestingly, the group also contains all reactions that either produce energy or redox equivalents, i.e. ATP and NADH, respectively. Obviously, especially in case of reactions being at the edge of the threshold, model reduction still has to be done with care.

2. *"fast - slow - slow"*: vHK, vPFK, vPDC, glycerol production, glycogen production, and ATP consumption are reactions that belong to this group which switch from fast to slow with increasing [Glc_{
x
}]_{0}. These reactions (except vPDC) share the property of consuming energy and redox equivalents, i.e. ATP and NADH, respectively. The continuous flow transport reactions between the outside and the chemostat (vinCN, vinGlc) as well as vlacto also belong to this group.

3. *"fast - slow - fast"*: vADH and the transport reactions across the cell membrane (vGlcTrans, vdifACA, vdifEtOH, vdifGlyc) behave differently from all others as they are fast for low and high concentrations of [Glc_{
x
}]_{0}. Participation in slow modes seems to be limited to the first oscillatory regime.

4. *"slow - slow - slow"*: All reactions from the chemostat to the outside (voutEtOH, voutGlyc, voutACA) are slow in all regimes.

###### CSP Importance Index (II)

The majority of reactions exhibit significant importance on a number of metabolites (Normed Importance Index *>*0.1). Exceptions are vPGI, vALD, vTIM, vlpPEP, vPK, vconsum, vAK and vdifACA, where Importance Indices are of values less than 0.1 for all metabolites. The weak importance of the first five reactions (already indicated as QE by the normed PIs) further confirms that they may be removed from the model. In some cases, the importance index changes in between regimes, depending on [Glc_{
x
}]_{0}. Examples for glucose-sensitive importance are vinGlc (important at low and unimportant at high glucose concentrations), vHK, vPFK and vGAPDH (unimportant at low and important at high glucose concentration). Obviously, the importance index gives similar information as control coefficients derived from MCA, a fact that we studied and verified (data not shown). However, the CSP IIs give a richer picture of the control distribution compared to MCA.

##### Second Step: Model Reduction

Based on the time scale separation analysis we suggest four steps to derive a simplified minimal model. A short description is given in the following. For any detail the reader is referred to Additional file 3. Each simplification step concerns a subset of the original model scheme which we call Module, hereafter.

*Module 1*.

**QEA for vPGI, AE for F6P**. The normed PI revealed that PGI can be approximated as QE and the Radical Pointer of the 5-th fast mode identifies F6P as CSP Radical, for which the algebraic equation holds

*Module 2*.

**QEA for vALD and vTIM**. The normed PI revealed that vALD and vTIM can be approximated as QE, for which the equations hold

*trioseP*. The new chemical equations of the associated reactions are:

*Module 3:*

**QEA for vlpPEP**. The equilibria for the vlpPEP reaction is expressed as:

*Module 4:*

**QEA for vPK**. The vPK reaction is modeled as irreversible. So, the QEA leads to its lumping together with vPDC and to eliminating pyruvate from the network. The new chemical equation for vPDC is:

##### Third step: Parameter adjustment and verification of the reduced model

Due to the fact that the meaning of parameters has been changed in the course of model reduction these parameters (e.g. K4eq) need to be adjusted in order to obtain the full original behavior. This can be simply achieved by parameter scanning around the initial value. It is worth emphasizing here, that not all parameters have to be refitted, only the ones that result from the simplification of the lumping terms (e.g. quasi equilibrium constants resulting from the QEAs).

## 3 Discussion and Conclusions

In this paper, we have presented a strategy for model simplification and reduction based on the CSP method. For this purpose and in order to make the method publicly available we implemented the original CSP algorithm in the COPASI software.

The CSP method is restricted to ODE models. Previously described simplification routines based on CSP mainly focus on the conversion of ODEs into DAE systems. In contrast, we use the CSP method to simplify models by lumping those reactions together that could be identified as being in QE. In addition, algebraic equations are used for species that are identified by Radical Pointers. Accordingly, we redefine chemical equations and kinetic rate laws of affected reactions. We demonstrated the usability of this approach using the COPASI implementation of the CSP method for a simple one-enzyme reaction and for a rather complex model of yeast glycolysis [16].

The time scale separation analysis of the glycolysis model revealed five reactions (vPGI, vALD, vTIM, vlpPEP, and vPK) for which the simplification strategy can be applied. We demonstrated that the resulting reduced model is capable of maintaining characteristics of the full model within an acceptable error range:

(i) same dynamic regimes, e.g. Hopf bifurcation point at [Glc_{
x
}]_{0} = 18.5 mM; (ii) similar steady state levels of metabolite concentrations; (iii) similar periods for both and amplitudes for the second oscillatory regimes.

Studying different dynamics underlines again (as in [11]) the importance of time-resolved analyses since the contribution of the players in the system may vary over time and in between different dynamical regimes. This is ignored if either steady state data (or single time point data in general) or single dynamic regimes are studied.

Compared to our previous work on the ILDM method [10, 11] - or the ILDM method in general - the CSP allows a more straightforward interpretation of its results with respect to the identification of QSS species and especially QE reactions. In addition, the Importance Index of CSP allows to analyze the impact of individual reactions on the dynamics of the species in the system.

An interesting outcome of our analysis is that it is possible to follow the general inherent temporal organization of the entire system when analyzing the distinctive time scales. Thus, we could observe that for the second oscillatory regime, all time scales oscillate in phase, partially overlapping each other which indicates that the whole system shows slower or faster dynamics in the course of a period.

Moreover, the number of fast modes changes over time and is also different for different dynamic regimes. Both factors prohibit the use of a fixed number of modes for time scale decomposition.

Furthermore, we suggest that the results of the CSP analysis can also be used for studying the relative importance of different reactions for the dynamics of the system. As an example, we observed that the overall participation of PFK in the slow modes increases with increasing glucose levels. In a simple way, this may be explained by the increasing energy charge (ATP concentration) which inhibits the PFK. Therefore, the relative importance of the PFK to the slower modes of the system increases.

Another beneficial result of the simplification process is of course that the number of system parameters is considerably reduced, especially concerning parameters which are involved in processes on a faster time scale than the time scale of interest which are then usually hard to identify. Therefore, using this process less system parameters will be unidentifiable.

Our study is not the first trying to reduce the original glycolysis model by [16]. [13] analyzed exclusively the limit cycle of the second oscillatory regime ([Glc_{
x
}]_{0} = 24 mM) employing CSP without taking into account transient behavior. In contrast, we analyzed the model with original initial values taking into consideration also the initial transient time period. In addition, there are major methodological differences. First, our approach focuses on simplifying the underlying biochemical reaction network rather than on approximating the ODE system with a DAE system. Second, we do not fix the number of fast modes. Third, we compute the normed sum of PIs over the entire fast subspace in order to justify QEA.

A completely different approach was taken by [22]. Their sole criterium for the reduction was the fulfillment of a Stuart-Landau equation which is in principle only valid in the vicinity of a Hopf bifurcation and therefore does not offer a general strategy for system reduction.

Obviously, there are some relations between CSP output data and sensitivity analyses like metabolic control analysis (MCA). Learning e.g. about the impact of individual reactions on systems properties like dynamics could in principle also result from sensitivity analyses. We did a preliminary comparison of the results of our CSP analysis and a conventional MCA for the steady state. This resulted in a similar global picture, but the CSP gave a more fine-grained picture w.r.t. the relative importance of reactions on species. In addition, the time-resolved analysis for oscillations is not possible with MCA.

With all the mentioned benefits of using CSP for systems analysis, there are also problems and limitations arising from this approach. We employed several heuristic thresholds for the discrimination of the reactions and species mainly contributing to the fast subspace of the system. These were based on our experience and obviously, this might not be optimal for arbitrary systems. Thus, other systems might demand slightly altered thresholds. This is underlined by the fact that we observed one reaction - AK - that in principle fulfilled all of our criteria for elimination, but in the end, it turned out to be impossible to eliminate from the system without introducing a large error. Therefore, it is always important to carefully check the behavior of the reduced system. The CSP can only support this process in a rational way, but does not allow for a fully automated analysis.

Even though, accordingly, scientists will always have to be on top of this method, it would be useful to support the reduction of the system in a stronger way than just providing the CSP. A semi-automated reduction which then quickly allows to be checked for error compared to the original model would reduce workload considerably and is currently planned to be included in the software. An additional planned extension of the software is the support of different compartment sizes (if multi-compartment models are analyzed) which is currently not the case.

All in all, we were surprised that taking into account different dynamic regimes only allowed the elimination of 5 reactions and 5 species of the glycolysis model which is considerably less than previous attempts that focused on particular regimes. This once again supports the view that it is crucial to define which systems behaviors should be reproduced by the simplified model before entering reduction strategies and these initial decisions might result in different models in the end.

## Declarations

### Acknowledgements

We would like to thank the COPASI team for support and BIOMS, the NIH, the BMBF (Virtual Liver) and the Klaus Tschira Foundation for funding.

## Authors’ Affiliations

## References

- Hübner K, Sahle S, Kummer U: Applications and trends in systems biology in biochemistry. FEBS Journal. 2011, 278: 2767-2857. 10.1111/j.1742-4658.2011.08217.x.View ArticleGoogle Scholar
- Okino MS, Mavrovouniotis ML: simplification of mathematical models of chemical reaction systems. Chemical Reviews. 1998, 98: 391-408. 10.1021/cr950223l.View ArticleGoogle Scholar
- Schuster S, Pfeiffer T, Moldenhauer F, Koch I, Dandekar T: Exploring the pathway structure of metabolism: decomposition into subnetworks and application to Mycoplasma pneumoniae. Bioinformatics. 2002, 18: 351-361. 10.1093/bioinformatics/18.2.351.View ArticleGoogle Scholar
- Kauffman K, Pajerowski J, Jamshidi N, Palsson BO, Edwards J: Description and analysis of metabolic connectivity and dynamics in the human red blood cell. Biophysical Journal. 2002, 83: 646-662. 10.1016/S0006-3495(02)75198-9.View ArticleGoogle Scholar
- Price ND, Reed JL, Papin JA, Famili I, Palsson BO: Analysis of metabolic capabilities using singular value decomposition of extreme pathway matrices. Biophysical Journal. 2003, 84: 794-804. 10.1016/S0006-3495(03)74899-1.View ArticleGoogle Scholar
- Jamshidi N, Palsson BO: Top-down analysis of temporal hierarchy in biochemical reaction networks. PLoS Computational Biology. 2008, 4: 1-10. 10.1371/journal.pcbi.0040001.View ArticleGoogle Scholar
- Radulescu O, Gorban AN, Zinovyev A, Lilienbaum A: Robust simplifications of multiscale biochemical networks. BMC Systems Biology. 2008, 2: 86-111. 10.1186/1752-0509-2-86.View ArticleGoogle Scholar
- Maas U, Pope SB: Simplifying chemical reaction kinetics: Intrinsic low-dimensional manifolds in composition space. Combustion and Flame. 1992, 88: 239-264. 10.1016/0010-2180(92)90034-M.View ArticleGoogle Scholar
- Shaik OS, Kammerer J, Gorecki J, Lebiedz D: Derivation of a quantitative minimal model for the photosensitive Belousov-Zhabotinsky reaction from a detailed elementary-step mechanism. Journal of Chemical Physics. 2005, 123: 234103-10.1063/1.2136882.View ArticleGoogle Scholar
- Surovtsova I, Simus N, Lorenz T, König A, Sahle S, Kummer U: Accessible methods for the dynamic time-scale decomposition of biochemical systems. Bioinformatics. 2009, 25: 2816-2823. 10.1093/bioinformatics/btp451.View ArticleGoogle Scholar
- Zobeley J, Lebiedz D, Kammerer J, Ishmurzin A, Kummer U: A new time-dependent complexity reduction method for biochemical systems. Transactions on Computational Systems Biology. 2005, 1: 90-110.View ArticleGoogle Scholar
- Goussis DA, Najm HN: Model reduction and physical understanding of slowly oscillating processes: the circadian cycle. Multiscale Modelling and Simulation. 2006, 5: 1297-1332. 10.1137/060649768.View ArticleGoogle Scholar
- Kourdis PD, Steuer R, Goussis DA: Physical understanding of complex multiscale biochemical models via algorithmic simplification: glycolysis in Saccharomyces Cerevisiae. Physica D. 2010, 239: 1789-1817. 10.1016/j.physd.2010.06.002.View ArticleGoogle Scholar
- Lam H: Using CSP to understand complex chemical kinetics. Combustion Science and Technology. 1993, 89: 375-404. 10.1080/00102209308924120.View ArticleGoogle Scholar
- Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, Singhal M, Xu L, Mendes P, Kummer U: COPASI - a complex pathway simulator. Bioinformatics. 2006, 22: 3067-3074. 10.1093/bioinformatics/btl485.View ArticleGoogle Scholar
- Hynne F, Danø S, Sorensen PG: Full-scale model of glycolysis in Saccharomyces cerevisiae. Biophysical Chemistry. 2001, 94: 121-163. 10.1016/S0301-4622(01)00229-0.View ArticleGoogle Scholar
- Golub GH, van Loan CF: Matrix computations. 1996, Johns Hopkins University Press, BaltimoreGoogle Scholar
- Petzold L: Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations. SIAM Journal on Scientific and Statistical Computing. 1983, 4: 136-148. 10.1137/0904010.View ArticleGoogle Scholar
- Hindmarsh AC: ODEPACK, a systematized collection of ODE solvers. Transactions on Scientific Computation. 1983, 1: 55-64.Google Scholar
- Visser D, van der Heijden R, Mauch K, Reuss M, Heijnen S: Tendency modeling: a new approach to obtain simplified kinetic models of metabolism applied to Saccharomyces cerevisiae. Metabolic Engineering. 2000, 2: 252-275. 10.1006/mben.2000.0150.View ArticleGoogle Scholar
- Palsson BO, Lightfoot EN: Mathematical modeling of dynamics and control in metabolic networks I. On Michaelis Menten kinetics. Journal of Theoretical Biology. 1984, 111: 273-302. 10.1016/S0022-5193(84)80211-8.View ArticleGoogle Scholar
- Danø S, Madsen MF, Schmidt H, Cedersund G: Reduction of a biochemical model with preservation of its basic dynamic properties. FEBS Journal. 2006, 273: 4862-4877. 10.1111/j.1742-4658.2006.05485.x.View ArticleGoogle Scholar
- Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, Li L, He E, Henry A, Stefan MI, Snoep JL, Hucka M, Le Novére N, Laibe C: BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Systems Biology. 2010, 4: 92-106. 10.1186/1752-0509-4-92.View ArticleGoogle Scholar
- Olivier BG, Snoep JL: Web-based kinetic modelling using JWS Online. Bioinformatics. 2004, 20: 2143-2144. 10.1093/bioinformatics/bth200.View ArticleGoogle Scholar

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.