A method for zooming of nonlinear models of biochemical systems
© Sunnåker et al; licensee BioMed Central Ltd. 2011
Received: 3 February 2011
Accepted: 7 September 2011
Published: 7 September 2011
Models of biochemical systems are typically complex, which may complicate the discovery of cardinal biochemical principles. It is therefore important to single out the parts of a model that are essential for the function of the system, so that the remaining non-essential parts can be eliminated. However, each component of a mechanistic model has a clear biochemical interpretation, and it is desirable to conserve as much of this interpretability as possible in the reduction process. Furthermore, it is of great advantage if we can translate predictions from the reduced model to the original model.
In this paper we present a novel method for model reduction that generates reduced models with a clear biochemical interpretation. Unlike conventional methods for model reduction our method enables the mapping of predictions by the reduced model to the corresponding detailed predictions by the original model. The method is based on proper lumping of state variables interacting on short time scales and on the computation of fraction parameters, which serve as the link between the reduced model and the original model. We illustrate the advantages of the proposed method by applying it to two biochemical models. The first model is of modest size and is commonly occurring as a part of larger models. The second model describes glucose transport across the cell membrane in baker's yeast. Both models can be significantly reduced with the proposed method, at the same time as the interpretability is conserved.
We introduce a novel method for reduction of biochemical models that is compatible with the concept of zooming. Zooming allows the modeler to work on different levels of model granularity, and enables a direct interpretation of how modifications to the model on one level affect the model on other levels in the hierarchy. The method extends the applicability of the method that was previously developed for zooming of linear biochemical models to nonlinear models.
One of the main reasons for the rapid growth of the field of systems biology is that it makes extensive use of mathematical modeling [1–3]. This allows for a better handling of high complexity, which is an inherent property of all living systems. Using modeling, complex hypotheses can be formulated and tested in a more systematic manner than is possible using only biochemical reasoning [4–6]. However, even if one can obtain a detailed model of the system with a high predictive power, the model in itself does not automatically lead to a full understanding of the underlying biochemistry. One should for instance analyze the model to single out its essence, i.e., to identify those parts of the model that can be eliminated, while still preserving the model's crucial behavior. This latter task is referred to as model reduction, and it is the topic of this paper. There is an extensive literature available on the topic of model reduction. However, most of these studies have been done outside the field of systems biology, and since systems biology brings about new types of challenges, reduction of biochemical models is still in its early stages. Traditional engineering approaches like balanced truncation have focused on preserving the input-output profile in an optimal manner, both for linear [7–10], and for nonlinear  systems. However, these methods are not suitable for systems biology, because the reduced model has no natural interpretation in itself (nevertheless, some special cases where this problem can be circumvented have been identified [12, 13]). This lack of interpretation is a problem because systems biology models are usually developed to help characterizing the dominating parts and structure of the system, and not only to obtain a black-box predictor. Methods have therefore been developed with traditional chemical approaches that are more centered on reducing the internal dynamics of the system. These methods are typically based on a sensitivity analysis [14–17], on time-scale separation [18–21], or on the lumping of state variables [22–26] (see  for a general review on model reduction). The perhaps most widely used method is lumping. Two of the main reasons for this are that an effective lumping scheme can be identified from basic properties of the model (e.g., the stoichiometry), and that lumped state variables are formed as easily interpretable pools of state variables in the original model. However, lumping does normally not come with the possibility of back-translation from the lumped state variables in the reduced model to the original state variables. In  we provided such relations. This means that we can take the result from a simulation of a reduced model, and without performing a new simulation, directly compute the corresponding trajectories of the desired original state variables. Because of this back-translation possibility, we refer to the resulting two models as two degrees of zooming of the same model. Nevertheless, like in other recent model reduction papers in systems biology [28–32], the results in  were mainly developed with linear systems in mind. Linear systems virtually only appear in the cases of mono-molecular reaction networks and for models describing the probabilistic evolution of a single protein complex [27, 33]. However, already in  we proposed that zooming may in principle also be applicable to nonlinear models, but we did not derive formulae for back-translation. Note that a majority of the currently available systems biology models are in fact nonlinear.
With the method introduced in this paper, we provide the extension of the previously proposed method in  to nonlinear models. We show that new challenges arise due to the nonlinearities, but also how these challenges can be overcome, for instance with a wise choice of state variables in the reduced model. The method is demonstrated by application to two closed models of metabolic systems.
In this paper we present a more general version of the method that was introduced in , which is applicable to nonlinear models. We start with some basic definitions and key observations that are illustrated on a small example model, before we turn to the details of the method.
Basic Definitions and Assumptions
is then commonly used to reduce the model. We refer to a state as fast in the time interval ≤ T0 ≤ t < T1 if Eq. (4) is valid in this time interval, and holds for the class of all considered inputs to the system. Note that T1 = ∞ in the case that the systems remains in QSS, which may for example not be the case for models with switches (where, e.g., the values of a subset of the state variables may change when a certain condition is fulfilled) [34, 35].
Note that Eq. (3) (steady state) necessitates that Eq. (4) (quasi-steady state) is fulfilled, but not vice versa. Although QSS implies that (some of) the terms of the right-hand side of the ODE are large and leaves the left-hand side (derivative term) negligible, the derivative term may still be large enough for the state variable in QSS to change considerably during the time-span of a simulation; the key is that these changes mainly occur on a slow manifold.
Zooming of Linear Models
where x r ∈ ℝ(n - w + 1), A r ∈ ℝ(n - w + 1) × (n - w + 1), B r ∈ ℝ(n - w + 1) × m, and C r ∈ ℝl × (n - w + 1).
where is the rate parameter in the reaction from state variable to state variable x j in the reduced model, and k ji is the rate parameter in the reaction from state variable x i to state variable x j in the original model.
Finally, note that Eqs. (5) and (6) provide a link between and , which constitute two different levels of granularity. It is this link between the models that make us consider them as two different degrees of zooming, and the primary goal of this paper is to establish such a link also for nonlinear models.
Extension to Nonlinear Models - Initial Observations
We will now present some key observations that are used in the derivation of the method for zooming of nonlinear models.
Eq. (12) defines the ODEs of the reduced model (the lumped state variables), and Eqs. (11) and (13) can in principle be used to calculate back-translation formulae, as is demonstrated with the small example model in the next section. However, as we shall see, this approach requires the explicit algebraic solution to a system of nonlinear equations, which is typically an infeasible task. Furthermore, there is not a clear one-to-one mapping between the state variables of the original and reduced models as in the case of proper lumping .
A Small Example Model
We will now present a small example model, with three fast state variables, which is reduced with the approach discussed above. An alternative approach is then demonstrated with the advantage that it scales better to larger models.
where the lumped state variables L1 and L2 are introduced. Note that Eq. (12) defines the dynamics of the lumped state variables.
We can now employ Eq. (12) to solve the ODEs for the lumped state variables L1 and L2, and use Eqs. (19)-(21) as back-translation formulae to compute the trajectories of the original state variables A, B, and C. However, note that for even slightly larger clusters of fast species than the one discussed here it would not be possible to calculate algebraic expressions of the original state variables with this approach, since it builds on the explicit solution of a system of nonlinear equations, which quickly becomes infeasible with growing problem size.
The reduced model consists of the two state variables L1 and B (note that L2 does not appear in the reduced model), and the dynamics is described by Eqs. (12) and (25), respectively. Note that the state variables A and C can be back-translated from the reduced model with Eqs. (23) and (24). This approach is a bit more intricate than the first, but comes with the advantage that we do not need to solve a system of nonlinear equations.
A Method for Zooming of Nonlinear Models
We will now step-by-step present a method that can be used to construct zoomable nonlinear biochemical models. This involves two sub-goals: i) to identify a reduced model that shares important characteristics with the original model, ii) to derive back-translation formulae that can be used to compute the original state variables and parameters from the reduced model.
In an initialization step of the method for a model we first formulate mathematical equations for all conservation relations Eq. (8), state variables in steady-state Eq. (3), and quasi-steady state assumptions Eq. (4). If additional properties of the system are known, we also formulate the corresponding equations.
The first step of the method is to identify the apparent conservation relations in the model.
Definition 1: Let S f be the stoichiometric matrix for the reactions r f (x f , p) as defined in Eq. (9). Each subset of state variables for which the corresponding rows of S f are linearly dependent constitutes an apparent conservation relation. Hence the apparent conservation relations lie in the left null space of S f and the dimension of this space is n - rank(S f ).
Note that the apparent conservation relations are defined in Eq. (11). It is trivial to identify the set of all linearly dependent rows of S f with a mathematical computing software (e.g., SBtoolbox for Matlab ).
The second step of the method is to define the state variables of the reduced model, which we refer to as modified lumped state variables.
Definition 2: Let x be a lumped state variable corresponding to a subset of the state variables in an apparent conservation relation. Then x is a modified lumped state variable if the lumping scheme with respect to the state variables of the original model is proper.
where M m is a × n f matrix with elements equal to 0 or 1 and column sums equal to 1, and l m denotes the modified lumped state variables. We typically have a large freedom in the choice of M m . The number of state variables is maximally reduced if all exact conservation relations in the model are retained as modified lumped state variables (and replaced by constants).
where we used Eq. (27) in the last step.
A modified lumped state variable for which an insufficient number of linear and linearly independent equations are available may still be used in the reduced model. However, the back-translation of the modified lumped state variable to the original state variables is then not possible, and step 3 of the method is ignored.
where the matrix M m is defined in Eq. (26), and is introduced to simplify the notation.
where Eq. (12) was used in the last step. □
The matrix I + J(l m ; p) is symbolically invertible, but may in general contain singularities for particular combinations of parameters values and state variable values. However, the matrix is always invertible for the models discussed in this paper, since the corresponding determinants are strictly positive.
The final step of the method is to back-translate the modified lumped state variables to the original state variables with the fraction parameters derived in step 3. This allows a comparison between the predictions by the reduced model to those of the original model.
The implementation of the method is straight-forward, and we have used Matlab (R2008b) together with the SBtoolbox  as computing software for the models in this paper.
We will now demonstrate the method through application to two example models.
Enzyme Kinetics Model
Note that the complexes C s and C p are formed by S bound to E, and P bound to E, respectively. This model is frequently occurring as part of larger models of biological systems, although the reaction from C S to C P is sometimes neglected, or reversible. The ODEs for the model are listed in Appendix A.1, where the three reactions are defined as: r1 = k1SE - k-1C S , r2 = k2C S , and r3 = k3C P - k-3PE.
which is constant since the total amount of the enzyme E is conserved in the system.
Reduction of the Enzyme Kinetics Model
where L S and L P are apparent conservation relations and LE is an exact conservation relation.
where and . The two remaining modified lumped state variables correspond to S and P in the original model, so we define that and .
The two ODEs in Eqs (41)-(42) define the dynamics of the state variables in the reduced model. We finally note that the exact conservation relation for the substrate, L T = S + P + C S + C P , together with Eqs. (39)-(40) can be used to reduce the model further to a single state.
Robustness of the reduced model for large deviations from the nominal parameter point are presented for the enzyme kinetics model, with a sampling frequency of 0.1 (starting from 0.1) time units.
where the fraction parameters specify the distribution of the enzyme among the corresponding original state variables.
Glucose Transport in Budding Yeast
In  we described how the calculation of fraction parameters, based on a set of assumptions, leads to the same reaction rates in the reduced model as were reported in . The assumptions are that state variables participating in reactions for uptake and release of glucose and G6P across the cell membrane are in QSS, that the transporter is conserved, and that the concentrations of the transporter in the inner and outer regions of the cellular membrane are constant.
It is not clear how equations for back-translation of the state variables in the reduced model in [27, 37] can be derived. The reduced model has three state variables; external- and internal glucose and G6P, but only two differential equations for the in- and outflow of glucose, since the ODE for G6P is replaced by a representative function that is inferred from the G6P data. Our method does not rely on that such information is available, although it would in principle be possible to utilize data fitted functions for the state variables. Also note that the equations in the reduced model describe the total influx and efflux of glucose across the membrane , which cannot be interpreted w.r.t. the state variables of the original model. Other assumptions in  that complicates a comparison with our method are that the efflux of glucose is negligible, that the concentration of glucose in the cytosol is negligible, and that the concentrations of the transporter are constant in the inner and outer regions of the cell membrane (Eq. (50)).
It is not possible to generate a reduced model by direct substitution of the fraction parameters that were derived in  into the ODEs of the original model, since this would lead to the prediction that the state variables are constant (as discussed in ). We will now instead illustrate how our method can be used to derive a reduced and zoomable version of the glucose transport model.
Reduction of the Glucose Transport Model
Before applying our method to the glucose transport model we tried an alternative approach. Eqs. (43)-(46) were solved w.r.t. the state variables in QSS, and the resulting expressions were then substituted into the remaining ODEs. The details of the derivation of the reduced model are presented in Appendix A.3. The reduced model does not produce satisfactory predictions for any other state variable than , which remains approximately constant during the simulation. Implementations of the original model (Additional file 4), the reduced model (Additional file 5), and a script for simulation with SBtoolbox2 for MATLAB  (Additional file 6), are available in Additional files.
Since the first approach turned out to be insufficient for reduction of the glucose transport model we applied our method to the same model. Following , we initially assumed constant transporter concentrations in the inner and outer regions of the cellular membrane, as defined by Eq. (50). For the details on the derivation of the reduced model we refer to Appendix A.4. The reduced model clearly performs better than the model resulting from the first approach, but it is still not satisfactory. However, the assumption of constant regional concentrations of the transporter may not be valid since the transport of glucose across the cell membrane is a rate limiting step in the model, and appears to be important for the state variable dynamics. We therefore decided to neglect Eq. (50) in the reduction process.
Implementations of the original model (Additional file 4), the reduced model (Additional file 7), and a script for simulation with SBtoolbox2 for MATLAB  (Additional file 8), are available in Additional files. In the first step of the method we identify the following apparent conservation relations
We note that there are two disjoint clusters of fast reactions in the model, corresponding to the outer- and inner parts of the cell membrane.
Note that two of the state variables in the original model, and , are also modified lumped state variables.
where . We define that , , and . For the two original state variables that are kept as modified lumped state variables in the reduced model we define that and .
Robustness of the reduced model for large deviations from the nominal parameter point are presented for the glucose transport model, with a sampling frequency of 1 (starting from 1) time units.
In this paper we have presented a novel method for reduction of biochemical models that is compatible with the concept of zooming. Several methods for reduction of biochemical models already exist in the literature. However, few of these methods result in biochemically interpretable models, and to our knowledge there are no nonlinear lumping methods for which the state variables and parameters of the reduced model can be back-translated (mapped) to the original model.
The application of the QSS assumption has been a commonly used tool in the modeling of biochemical networks since the late 1960s, and in chemical kinetics for more than 80 years . The validity of the QSS approximation is well studied both for specific biochemical mechanisms [38, 39] and for more complex models [40, 41]. The resulting equations, together with conservation relations, are typically used to eliminate some of the state variables in the model (e.g., see ). However, with the examples in this paper we have showed that such an approach is not always sufficient, and we propose to use proper lumping of state variables in combination with back-translation.
Our method has several important advantages when applied to biochemical models. The most important advantage is that we end up with reduced models with a clear biological interpretation, meaning that each state variable of the original model corresponds to a fraction of exactly one of the state variables in the reduced model. A consequence is that neighboring species in the original model remain neighbors in the reduced model. Hence we can consider the original and reduced models as two different degrees of zooming; a concept that we discussed in some detail in  for linear models.
The work in this paper can be seen as an extension of the theory introduced for linear models in  to nonlinear models. The method is based on assumptions regarding the dynamics that result in a sufficient number of equations that are linear w.r.t. the state variables to be back-translated. Such equations are typically a natural result of QSS assumptions and conservations relations in models based on mass action kinetics , and in particular in models that involve transporters and enzymes (e.g., the models in this paper). However, note that our method may also be applicable to models with other types of reaction kinetics. We also note that if too few linear relations are available for calculation of fraction parameters for a part of a model, this part can still be reduced and the reduced model can be simulated, although we cannot back-translate the corresponding modified lumped state variables since no fraction parameters are available. However, depending on the purpose of the model it may be enough to calculate fraction parameters for a subset of the state variables in the reduced model. Linearization of the model around a steady state operating point may also be a feasible approach to calculate fraction parameters with the method in .
The proposed method enables mapping of the state variables and parameters of the reduced model to those of the original model. In  we referred to this mapping as back-translation. Back-translation is of great importance, since we can directly observe how modifications to the reduced model impact the original model. It also gives the modeler an opportunity to check whether the assumptions underlying the reduction are acceptable. To illustrate the power of back-translation we provide plots for comparison of simulations of the original and reduced models, w.r.t. the original state variables, for the models to which the method is applied in this paper.
Back-translation of state variables typically requires the solution of a system of nonlinear equations, which often results from the assumption of state variables in QSS and conservation relations. Unfortunately, analytic solutions to systems of nonlinear equations do in general not exist. An advantage with the proposed method is that such solutions are not required, since they are replaced by computation of the inverse of a matrix for each cluster of fast state variables, which is in general a more feasible task.
Our method was applied to a small model with five state variables that commonly appears as part of larger biochemical models, and to a previously published model for the transport of glucose in baker's yeast (S. cerevisiae) . The first model was reduced from five to one state variable, and from five to three parameters. However, note that our focus has been on the reduction of the number of state variables and not the number of parameters, which are reduced as a side-effect of the QSS assumptions.
The model for glucose transport was first reduced with an approach in which the QSS equations and conservation relations were directly substituted into the remaining ODEs. The results of this approach are not satisfactory since the reduced model gives predictions that are different from the original model for most state variables. Our method was then applied to the same model both together with the assumption of equal concentration of the transporter in the inner and outer regions of the cell membrane used in , and without any additional assumptions. The application of our method together with the assumptions used in  results in a model with three state variables. The state dynamics is significantly better preserved than with the first approach, although still not satisfactory. We then decided to reduce the original model without the assumption regarding the localization of the transporter, with two different definitions of modified lumped state variables. While one of these definitions results in a reduced model with four state variables and gave rather accurate predictions, the other choice reduces the number of state variables to five and gives an excellent description of the state dynamics. It is therefore apparent that there is a tradeoff between accuracy and the number of state variables in the reduction process. The glucose transport model corresponds to the first part of glycolysis, in which glucose is transported into the cell. We therefore propose that it might be rewarding to carefully re-investigate the assumptions underlying the reaction rate equations in complete models of glycolysis (see  for one example).
We have also observed a few issues regarding the implementation of the method. The symbolic inversion of the matrix that is necessary to compute the dynamics of the modified lumped state variables may be expensive. However, this is typically only a practical limitation for large matrices, which result from large clusters of fast state variables. In our experience large clusters of fast state variables are relatively rare also in large biochemical models. Another option, if it is not practically feasible to invert the symbolic matrix, is to solve the system of linear equations in Eq. (31) numerically. We also observed that the symbolic right-hand side of the resulting differential equations may be long. However, these are usually not practical limitations for the applicability of the method, e.g., the simulations of all examples in this paper are very fast on a modern computer. Available methods to reduce the analytic reaction rate expressions include sensitivity analysis w.r.t. state variables and parameters, and the method proposed in .
There is still no consensus method for automatic identification of state variables in QSS, although criteria for the detection of state variables in QSS have been proposed, for example in . A simple approach is to simulate the original model and investigate for which state variables the corresponding in- and outflow reaction rates are approximately equal. State variables for which this condition holds are then considered to be in QSS. Note that for the models in this paper it was already clear from the biochemical understanding of the corresponding systems which of the state variables that could be considered fast (see  for the glucose transport example). However, an appropriate general criterion for automatic identification of state variables in QSS is still lacking.
Although the theory presented in this paper constitutes a great leap forward for construction of zoomable models, more research is required to make the method fully automatic. An important challenge is to define a meaningful measure for the similarity between the hierarchical model layers (degrees of zooming). Another interesting, although trivial, observation that deserves further attention is that QSS assumptions typically do not hold in the whole parameter space. Although the reduced models in this paper appear to be robust to varying parameter values it may not be the case in general. It may therefore be revealing to compare the original model and the reduced model to characterize the parameter space regions in which the QSS-assumptions are valid.
We have presented a novel method for reduction of biochemical models that is compatible with the concept of zooming. Zooming allows the modeler to operate on different levels of model granularity, and enables a direct interpretation of how modifications to the model on one level affect the same model on other levels in the hierarchy. The proposed method is based on the application of proper lumping in combination with the identification of linear relations in nonlinear equations.
The method was applied to two example models. The first model is small and commonly occurring as a part of larger biochemical models. The second example is a model for glucose transport in baker's yeast, which constitutes the starting point for glycolysis. Both models could be significantly reduced with the proposed method, and the resulting state variables could be back-translated to the original state variables. The method that is presented in this paper constitutes an extension of the method that was previously developed for linear biochemical models to its nonlinear counterpart. Since most models in the systems biology community are in fact nonlinear, our method constitutes an important step towards zoomable biochemical models.
A.1 Appendix 1
where r1 = k1SE - k-1C S , r2 = k2C S , and r3 = k3C P - k-3PE.
where , S = A, L S = L1, and L E = L2.
A.2 Appendix 2
which was introduced in .
A.3 Appendix 3
In this section we investigate an alternative (naive) approach to reduce the glucose transport model. The first step is to identify state variables for which the QSS assumption holds, and the mass conservation relations in the model. In the second step of this approach we then substitute the corresponding system of equations into the ODEs corresponding to slow state variables.
There are three molecules (moieties) whose mass is conserved in the model as a whole, i.e., Glc, G6P, and E. However, we can not substitute any of the conservation relations into the remaining ODEs without re-introducing state variables that were already eliminated. So the final reduced model takes the form of Eqs. (61) - (64). However, the sum of the state variables and , and and is conserved in the reduced model, which makes it possible to reduce the model to two state variables.
Unfortunately, due to the form of the ODEs and the initial conditions of the state variables in the reduced model, the state variables and remain equal to zero at all times, and only the state variables and take non-zero values. We therefore decided to simulate the original model for a short time until the fast state variables reach QSS, and to use the final state variable values as initial conditions in the reduced model.
which can be used for back-translation of the state variables of the reduced model to those of the original model.
The predictions of the state variables of the original model, resulting from simulations of the original model and the reduced model with the parameter values set as in , is not satisfactory for any other state variable than G 6P, which remains approximately constant over time. Implementations of the original model and the reduced model in SBtoolbox2 for MATLAB  are included in Additional files.
A.4 Appendix 4
In this section we apply our method to the glucose transport model, and following  we will assume that the concentrations of the transporter are constant in the inner and outer regions of the cellular membrane. With this assumption the distribution among the transporter state variables of the original model, which constitute the lumped state variables L E , is uniquely defined.
We note that Eqs. (43)-(47), and (50) are all linear w.r.t. the state variables that constitute state L E , so the requirement for the existence of at least 6 (n m ) linear relations is satisfied, which enables back-translation in step three of the method.
where denotes the first three states variables in l m . Note that there are three state variables in the reduced model, which is the same number as for the reduced model in .
In the fifth step of our method we compare predictions of the original state variables between the original and reduced models, where L E is back-translated with the fraction parameters defined in Eqs. (66)-(67).
The simulation results are clearly more accurate than with the approach in Appendix A.3, although still not satifying. We refer to Additional files for implementations of the original and reduced models in SBtoolbox2 for MATLAB .
A.5 Appendix 5
In this section we apply our method to the glucose transport model, but with an alternative definition of the modified lumped state variables. We do not use the assumption of constant regional concentrations of the transporter (Eq. (50)).
In the first step of the method we note that the apparent conservations are given by Eq. (51).
Where , and we note that , , and are state variables both in the original and reduced models. Also note that the requirement of at least n m equations, that are linear w.r.t. the original state variables and linearly independent, is satified for each of the modified lumped state variables by Eqs. (43)-(49).
and the fraction parameters were calculated with Eq. (30).
In the fifth step of the method we simulate the reduced model with Eqs. (56)-(57) and (70)-(71) and we then use Eqs. (53) and (69) for back-translation of the state variables. A comparison between the original model and the reduced model, w.r.t. the state variables of the original model, is presented in Figure 5 and Figure 6. The agreement between the models is very good. We refer to Additional files for implementations of the original and reduced models in SBtoolbox2 for MATLAB .
The original and reduced versions of the models presented in this paper, and scripts for simulation and