Extraction of elementary rate constants from global network analysis of E. coli central metabolism
 Jiao Zhao^{1},
 Douglas Ridgway^{1},
 Gordon Broderick^{1, 2},
 Andriy Kovalenko^{3} and
 Michael Ellison^{1, 4}Email author
DOI: 10.1186/17520509241
© Zhao et al; licensee BioMed Central Ltd. 2008
Received: 30 November 2007
Accepted: 07 May 2008
Published: 07 May 2008
Abstract
Background
As computational performance steadily increases, so does interest in extending oneparticlepermolecule models to larger physiological problems. Such models however require elementary rate constants to calculate timedependent rate coefficients under physiological conditions. Unfortunately, even when in vivo kinetic data is available, it is often in the form of aggregated rate laws (ARL) that do not specify the required elementary rate constants corresponding to massaction rate laws (MRL). There is therefore a need to develop a method which is capable of automatically transforming ARL kinetic information into more detailed MRL rate constants.
Results
By incorporating proteomic data related to enzyme abundance into an MRL modelling framework, here we present an efficient method operating at a global network level for extracting elementary rate constants from experimentbased aggregated rate law (ARL) models. The method combines two techniques that can be used to overcome the difficult properties in parameterization. The first, a hybrid MRL/ARL modelling technique, is used to divide the parameter estimation problem into subproblems, so that the parameters of the mass action rate laws for each enzyme are estimated in separate steps. This reduces the number of parameters that have to be optimized simultaneously. The second, a hybrid algebraicnumerical simulation and optimization approach, is used to render some rate constants identifiable, as well as to greatly narrow the bounds of the other rate constants that remain unidentifiable. This is done by incorporating equality constraints derived from the KingAltman and Cleland method into the simulated annealing algorithm. We apply these two techniques to estimate the rate constants of a model of E. coli glycolytic pathways. The simulation and statistical results show that our innovative method performs well in dealing with the issues of high computation cost, stiffness, local minima and uncertainty inherent with largescale nonconvex nonlinear MRL models.
Conclusion
In short, this new hybrid method can ensure the proper solution of a challenging parameter estimation problem of nonlinear dynamic MRL systems, while keeping the computational effort reasonable. Moreover, the work provides us with some optimism that physiological models at the particle scale can be rooted on a firm foundation of parameters generated in the macroscopic regime on an experimental basis. Thus, the proposed method should have applications to multiscale modelling of the real biological systems allowing for enzyme intermediates, stochastic and spatial effects inside a cell.
Background
As systems biology matures, it is moving away from static representations of network interactions based on nodes and edges to dynamic representations that describe cellular processes in space and time. Dynamic metabolic processes are quantitatively modelled with ordinary differential equations (ODEs) in two principle ways: the aggregated rate law [1] and mass action rate law [2] approaches. These two models differ from one another in the level of detail at which they operate and hence the contexts in which they can be validly applied.
Dynamic metabolic networks are predominately modelled using aggregated rate laws (ARL). An ARL simplifies the description of a single enzymatic step by aggregating the elementary steps associated with a specific mechanism into a single reaction, where the rate becomes a nonlinear function of substrate, product and regulator concentrations and a typically linear function of enzyme concentration. The classical example of an ARL treatment is the MichaelisMenten equation for the simplest irreversible reaction, the UniUni mechanism. ARL models are not always derived from an underlying mechanism, and in some cases, more phenomenological formulas are adopted to empirically fit experimental data. While the simplified treatment of enzyme kinetics using an ARLbased approach has obvious appeal when attempting to model large metabolic networks, such a reduction of the underlying mechanism inevitably leads to a loss of some useful information about the reaction mechanism, e.g. the sequence of elementary reactions, dynamics of all enzyme intermediates, and so on. ARL models are therefore equipped to capture biological processes on long time scales wherein dynamics of enzyme intermediates can be ignored. To date most experimentally determined kinetic parameters are at this level.
An alternative to the ARL approach is the direct use of mass action rate laws (MRL), each of which states that the rate of an elementary reaction is directly proportional to the product of the effective concentrations of each participating molecule. By definition, MRL models involve the sequence of elementary reactions as well as track dynamics of all of the elements by describing the formation and degradation of all species in an enzymatic reaction. It is particularly suitable for modeling molecular events that happen on the microsecond to millisecond timescale [3, 4].
Regardless of the approach taken, good models of cellular systems are often guided by a pragmatic principle: a model should be as simple as possible, but as complex as necessary. The growing necessity of dealing with complexity is however highlighted by the apparent behavioural differences exhibited by biomolecules within an intracellular environment versus the test tube [5]. These differences can be largely attributed to a range of spatial phenomena including macromolecular crowding, caging, spatial segregation of reactants, and the unpredictable nature associated with the reaction of rare and nonuniformly distributed biomolecules [6]. Significantly, a comprehensive quantitative understanding of most of these phenomena is lacking. Meanwhile, the steady increase in computational capability, coupled with improved technology for making quantitative measurements of single molecules within single living cells, is fuelling interest in an alternative modelling approach in which individual molecules are represented as particles that are imbued with the dynamic properties of movement and reaction as a function of space and time [7–9]. This approach, referred to as Particle Based Simulation (PBS), has the advantage that it can seamlessly link stochastic and continuous processes in a modelling environment where the spatial and physicochemical complications referred to above are represented explicitly. These explicit simulations require however the direct elementary rate constants and enzyme intermediates that distinguish MRL modelling from ARL modelling. If a system has been parameterized as an ARL model, it must first be converted into MRL form in order to set up a PBS simulation. Thus the MRL model becomes the bridge between the ARL format and the PBS format.
As a consequence of its higher level of detail, the MRL approach is starting to receive special attention for the elucidation of complex biological systems [10–14]. To date the biggest limitation associated with the MRL approach is the lack of detailed quantitative biochemical data to fuel the models [12]. This dearth of data has prompted the development of several estimation methods for the massaction rate constants of enzymatic reactions.
The classic approach combines the Schematic Method of King and Altman [15] with the General Rule of Cleland [16] (SMKA/GRC). SMKA/GRC relates unknown elementary rate constants to known ARL kinetic constants. Individual rate constants are then calculated by solving linear or nonlinear algebraic equations. One limitation of SMKA/GRC is that most of the experimentally determined ARL constants are derived from isolated enzymes in vitro over a range of conditions. This lack of biological context calls into question the relevance of network models based on these parameters in describing complex cellular behaviour under physiological conditions. One other limitation is that it is not always possible to uniquely determine the rate constants from existing ARL rate constants, when: 1) the number of MRL parameters is large, 2) there is a redundancy of values among ARL kinetic constants or 3) there are technical difficulties in solving nonlinear equations. Moreover, use of SMKA/GRC method alone is not able to deal with empirical ARL equations where some kinetic constants are missing or reduced to empirical constants.
Recently, Yang et al. [11] have proposed a simpler alternative to SMKA/GRC termed the Lambda and Omega approximation method. Estimating rate constants from available in vitro kinetic data (K_{m}, K_{cat} and K_{i}) involves a rapid equilibrium approximation that assumes that reactants, free enzymes and enzymebound intermediates reach equilibrium quickly relative to the rate of catalysis. The approach to steadystate can be controlled in the model by using large, timeinvariant constant numbers (Λ and Ω) that are associated with enzymesubstrate and enzymeinhibitor binding reactions, respectively. When compared with SMKA/GRC, this approximation method is based on fewer kinetic constants and simpler algebraic relations, leading to easier mathematical manipulations. It does however suffer from limitations such as the existence of trimolecular association reactions that are physiologically improbable (e.g. the enzyme can interact simultaneously with two substrates in a Bi Bi mechanism) and the inherent ambiguity imposed by the dependence of the rate constants on arbitrary values of Λ and Ω.
The final method is numerical simulation and optimization (NSO) [17] for network models, where nonlinear least squares regression is used in combination with simulation to optimize parameters from timecourse variables. In principle, this method could be used to find optimal MRL rate constants provided that enough timecourse data and constraints are available. However, this strategy often meets with difficulties because even relatively simple metabolic pathways modelled by MRL expand to large and stiff systems of ODEs. Unsurprisingly then, the use of NSO in MRL modelling is not prevalent and is often constrained to the analysis of small systems [17]. Furthermore, the complexity of the parameter space coupled with poor knowledge of the in vivo rate constants means that the optimization algorithm is easily trapped by local minima [18] or returns a family of solutions [19]. By comparison, for ARL models, NSO has been successfully applied to globally optimize the parameters for more complex metabolic networks [20, 21]. If a method could be developed that deals with the issues of network scale, stiffness, local minima and parameter identifiability, then NSO could play an important role in the development of detailed, complex and therefore useful MRL models.
In this work, we present a novel methodology for extracting MRL elementary rate constants from ARL network models, which combines the advantages of two techniques with respect to model structure. Our ultimate goal is to generate an automatic transformation from ARL kinetic information into the elementary MRL rate constants required for our PBS modelling effort. When it is applied to a challenging parameterization problem in regards to central metabolism of E. coli, our method proved efficient and robust, thereby enabling systemic investigation of the mass action rate laws of a largescale cellular network.
Results
Method evaluation
Reaction mechanism and protein abundance for glycolytic pathways of E. coli
Enzyme  Abundance ^{a} (molecules/cell)  Concentration ^{b} (μM)  Kinetic mechanism  Elementary reactions 

Phosphogluco isomerase (pgi)  1099  2.7  Reversible Uni Uni [51]  G6P + EPGI → EPGIG6P 
EPGIG6P → G6P + EPGI  
EPGIG6P → F6P +EPGI  
F6P + EPGI → EPGIG6P  
Phosphofructo kinase (pfkA)  3287  8.2  Allosteric regulation and ordered sequential mechanism [39] [52]  ATP + EPFKA_{TR} → EPFKA_{R}ATP 
EPFKA_{R}ATP → ATP + EPFKA_{TR}  
F6P + EPFKA_{R}ATP → EPFKA_{R}ATPF6P  
EPFKA_{R}ATPF6P → F6P + EPFKA_{R}ATP  
EPFKA_{R}ATPF6P → FDP + EPFKA_{R}ADP  
EPFKA_{R}ADP → ADP + EPFKA_{TR}  
Fructosebisphosphate aldolase (fba)  16326  40.5  Ordered Uni Bi [51]  FDP + EFBA → EFBAFDP 
EFBAFDP → FDP + EFBA  
EFBAFDP → GAP + EFBADHAP  
GAP + EFBADHAP → EFBAFDP  
EFBADHAP → DHAP + EFBA  
DHAP + EFBA → EFBADHAP  
Triosephosphate isomerase (tpiA)  9106  22.6  Reversible Uni Uni [51]  DHAP + ETPIA → ETPIADHAP 
ETPIADHAP → DHAP + ETPIA  
ETPIADHAP → GAP + ETPIA  
GAP + ETPIA → ETPIADHAP  
Glyceraldehyde 3phosphate dehydrogenase (gapA)  49091  121.7  Ordered sequential mechanism [37]  NAD + EGAP → EGAPNAD 
EGAPNAD → NAD + EGAP  
GAP + EGAPNAD → EGAPNADGAP  
EGAPNADGAP → GAP + EGAPNAD  
EGAPNADGAP → PGP + EGAPNADH  
PGP + EGAPNADH → EGAPNADGAP  
EGAPNADH → NADH + EGAP  
NADH + EGAP → EGAPNADH  
Phosphoglycerate kinase (pgk)  14682  36.4  Ordered sequential mechanism [53]  PGP + EPGK → EPGKPGP 
EPGKPGP → PGP + EPGK  
ADP + EPGKPGP → EPGKPGPADP  
EPGKPGPADP → ADP + EPGKPGP  
EPGKPGPADP → ATP + EPGKPG3  
ATP + EPGKPG3 → EPGKPGPADP  
EPGKPG3 → PG3 + EPGK  
PG3 + EPGK → EPGKPG3  
Phosphoglycerate mutase (pgm)  966  2.4  Reversible Uni Uni [21]  PG3 + EPGM → EPGMPG3 
EPGMPG3 → PG3 + EPGM  
EPGMPG3 → PG2 + EPGM  
PG2 + EPGM → EPGMPG3  
Enolase (eno)  11283  28.0  Reversible Uni Uni [21]  PG2 + EENO → EENOPG2 
EENOPG2 → PG2 + EENO  
EENOPG2 → PEP + EENO  
PEP + EENO → EENOPG2  
Pyruvate kinase (pykF) ^{b}  500  1.2  Allosteric regulation and ordered sequential mechanism [24, 39]  PEP + EPYKF_{TR} → EPYKF_{R}PEP 
EPYKF_{R}PEP → PEP + EPYKF_{TR}  
EPYKF_{R}PEP + ADP → EPYKF_{R}PEPADP  
EPYKF_{R}PEPADP → EPYKF_{R}PEP + ADP  
EPYKF_{R}PEPADP → PYR + EPYKF_{R}ATP  
EPYKF_{R}ATP → ATP + EPYKF_{TR} 
Summary statistics about rate constants estimated from the proposed hybrid method ^{a,b}
Enzyme  Ordinary differential equations  Kvalues ^{c} (Mean ± SD)  CV (%)  CI (95%)  fvalvalues (Mean ± SD) 

Phosphogluco isomerase (pgi)  K_{1f,PGI}*[G6P]*[EPGI]  (1.27 ± 0.00) × 10^{5}  0.0  (1.27, 1.27) × 10^{5}  (9.2 ± 2.4) × 10^{3} 
K_{1r,PGI}*[EPGIG6P]  (1.28 ± 0.00) × 10^{5}  0.0  (1.28, 1.28) × 10^{5}  
K_{2f,PGI}*[EPGIG6P]  (2.41 ± 0.00) × 10^{5}  0.0  (2.41, 2.41) × 10^{5}  
K_{2r,PGI}*[F6P]*[EPGI]  (1.39 ± 0.01) × 10^{6}  0.0  (1.39, 1.39) × 10^{6}  
Phosphofructo kinase (pfkA)^{d}  K_{1f,PFKA}*f*[EPFKA_{TR}]*[ATP]^{m}  (4.21 ± 0.63) × 10^{1}  15.0  (3.83, 4.59) × 10^{1}  (5.4 ± 1.2) × 10^{2} 
K_{1r,PFKA}*[EPFKA_{R}ATP]  (1.73 ± 1.94) × 10^{0}  112.7  (0.55, 2.90) × 10^{0}  
K_{2f,PFKA}*[EPFKA_{R}ATP]*[F6P]^{n}  (4.70 ± 0.50) × 10^{3}  10.7  (4.39, 5.00) × 10^{3}  
K_{2r,PFKA}*[EPFKA_{R}ATPF6P]  (6.52 ± 7.93) × 10^{1}  121.6  (1.73, 11.31)×10^{1}  
K_{3f,PFKA}*[EPFKA_{R}ATPF6P]  (3.50 ± 0.76) × 10^{2}  21.8  (3.05, 3.96) × 10^{2}  
K_{4f,PFKA}*[EPFKA_{R}ADP]  (3.67 ± 3.01) × 10^{5}  82.2  (1.85, 5.49) × 10^{5}  
Fructose bisphosphate aldolase (fba)  K_{1f,FBA}*[FDP]*[EFBA]  (4.98 ± 0.01) × 10^{3}  0.2  (4.98, 4.98) × 10^{3}  (4.8 ± 0.1) × 10^{3} 
K_{1r,FBA}*[EFBAFDP]  (9.14 ± 0.02) × 10^{2}  0.2  (9.13, 9.14) × 10^{2}  
K_{2f,FBA}*[EFBAFDP]  (4.33 ± 0.00) × 10^{2}  0.0  (4.33, 4.33) × 10^{2}  
K_{2r,FBA}*[GAP]*[EFBADHAP]  (9.98 ± 0.02) × 10^{4}  0.2  (9.98, 9.99) × 10^{4}  
K_{3f,FBA}*[EFBADHAP]  (5.95 ± 0.01) × 10^{4}  0.2  (5.94, 5.95) × 10^{4}  
K_{3r,FBA}*[DHAP]*[EFBA]  (1.04 ± 0.00) × 10^{4}  0.0  (1.04, 1.04) × 10^{4}  
Triosephosphate isomerase (tpiA)  K_{1f,TPIA}*[DHAP]*[ETPIA]  (1.17 ± 0.00) × 10^{3}  0.0  (1.17, 1.17) × 10^{3}  (1.5 ± 0.6) × 10^{4} 
K_{1r,TPIA}*[ETPIADHAP]  (2.34 ± 0.00) × 10^{2}  0.0  (2.34, 2.34) × 10^{2}  
K_{2f,TPIA}*[ETPIADHAP]  (3.04 ± 0.00) × 10^{3}  0.0  (3.04, 3.04) × 10^{3}  
K_{2r,TPIA}*[GAP]*[ETPIA]  (1.09 ± 0.00) × 10^{4}  0.0  (1.09, 1.09) × 10^{4}  
Glyceraldehyde 3 phosphate dehydrogenase (gapA)  K_{1f,GAP}*[NAD]*[EGAP]  (3.01 ± 0.00) × 10^{4}  0.0  (3.01, 3.01) × 10^{4}  (7.8 ± 0.4) × 10^{2} 
K_{1r,GAP}*[EGAPNAD]  (7.57 ± 0.00) × 10^{3}  0.0  (7.57, 7.57) × 10^{3}  
K_{2f,GAP}*[GAP]*[EGAPNAD]  (1.85 ± 0.25) × 10^{4}  13.5  (1.75, 1.95) × 10^{4}  
K_{2r,GAP}*[EGAPNADGAP]  (9.61 ± 0.90) × 10^{5}  9.4  (9.25, 9.96) × 10^{5}  
K_{3f,GAP}*[EGAPNADGAP]  (1.67 ± 0.83) × 10^{6}  50.0  (1.34, 2.00) × 10^{6}  
K_{3r,GAP}*[PGP]*[EGAPNADH]  (1.99 ± 0.64) × 10^{9}  32.3  (1.74, 2.24) × 10^{9}  
K_{4f,GAP}*[EGAPNADH]  (7.61 ± 0.01) × 10^{3}  0.2  (7.61, 7.62) × 10^{3}  
K_{4r,GAP}*[NADH]*[EGAP]  (7.36 ± 0.13) × 10^{1}  0.2  (7.35, 7.37) × 10^{1}  
Phosphoglycerate kinase (pgk)  K_{1f,PGK}*[PGP]*[EPGK]  (1.77 ± 0.00) × 10^{6}  0.0  (1.77, 1.77) × 10^{6}  (3.4 ± 1.4) × 10^{3} 
K_{1r,PGK}*[EPGKPGP]  (8.30 ± 0.00) × 10^{4}  0.0  (8.30, 8.30) × 10^{4}  
K_{2f,PGK}*[ADP]*[EPGKPGP]  (4.58 ± 0.05) × 10^{5}  0.0  (4.57, 4.60) × 10^{5}  
K_{2r,PGK}*[EPGKPGPADP]  (4.52 ± 0.35) × 10^{3}  7.8  (4.41, 4.63) × 10^{3}  
K_{3f,PGK}*[EPGKPGPADP]  (3.47 ± 3.45) × 10^{5}  99.5  (2.36, 4.57) × 10^{5}  
K_{3r,PGK}*[ATP]*[EPGKPG3]  (5.10 ± 4.99) × 10^{5}  97.9  (3.50, 6.69) × 10^{5}  
K_{4f,PGK}*[EPGKPG3]  (1.64 ± 0.88) × 10^{5}  53.6  (1.36, 1.92) × 10^{5}  
K_{4r,PGK}*[PG3]*[EPGK]  (1.24 ± 0.68) × 10^{5}  54.7  (1.03, 1.46) × 10^{5}  
Phosphoglycerate Mutase (pgm)  K_{1f,PGM}*[PG3]*[EPGM]  (2.01 ± 0.00) × 10^{6}  0.0  (2.01, 2.01) × 10^{6}  (2.8 ± 1.3) × 10^{2} 
K_{1r,PGM}*[EPGMPG3]  (3.64 ± 0.00) × 10^{5}  0.0  (3.64, 3.64) × 10^{5}  
K_{2f,PGM}*[EPGMPG3]  (3.71 ± 0.00) × 10^{4}  0.0  (3.71, 3.71) × 10^{4}  
K_{2r,PGM}*[PG2]*[EPGM]  (1.08 ± 0.01) × 10^{6}  1.3  (1.07, 1.09) × 10^{6}  
Enolase (eno)  K_{1f,ENO}*[PG2]*[EENO]  (1.42 ± 0.00) × 10^{5}  0.0  (1.42, 1.42) × 10^{5}  (2.6 ± 1.2) × 10^{2} 
K_{1r,ENO}*[EENOPG2]  (2.37 ± 0.00) × 10^{3}  0.0  (2.37, 2.37) × 10^{3}  
K_{2f,ENO}*[EENOPG2]  (1.18 ± 0.00) × 10^{4}  0.0  (1.18, 1.18) × 10^{4}  
K_{2r,ENO}*[PEP]*EENO]  (1.05 ± 0.02) × 10^{5}  1.9  (1.05, 1.06) × 10^{5}  
Pyruvate kinase (pykF) ^{e}  K_{1f,PYKF}*f*[EPYKF_{TR}]*[PEP]  (1.59 ± 0.00) × 10^{2}  0.0  (1.59, 1.59) × 10^{2}  (6.0 ± 2.0) × 10^{4} 
K_{1r,PYKF}*[EPYKF_{R}PEP]  (4.96 ± 0.17) × 10^{1}  3.4  (4.89, 5.03) × 10^{1}  
K_{2f,PYKF}*[EPYKF_{R}PEP]*[ADP]  (4.58 ± 2.92) × 10^{2}  63.9  (3.37, 5.78) × 10^{2}  
K_{2r,PYKF}*[EPYKF_{R}PEPADP]  (2.43 ± 1.85) × 10^{2}  76.2  (1.67, 3.19) × 10^{2}  
K_{3f,PYKF}*[EPYKF_{R}PEPADP]  (2.71 ± 1.71) × 10^{2}  63.2  (2.00, 3.41) × 10^{2}  
K_{4f,PYKF}*[EPYKF_{R}ATP]  (7.89 ± 3.09) × 10^{1}  39.2  (6.61, 9.16) × 10^{1} 
Computational cost
Given an optimization algorithm, one of the challenges for a largescale nonlinear model is the computational economy of that method. This includes how to deal with large numbers of parameters and how to circumvent bottlenecks that limit algorithm performance. These issues must be addressed to make parameter estimation a practical reality; otherwise the computational effort may go beyond a reasonable amount of time for MRL models [22].
The principle technique applied here is a hybrid ARL/MRL strategy, which reduces the number of parameters that need to be estimated simultaneously. This technique allows us to estimate parameters of the mass action rate laws for each enzyme in separate steps. In addition, the number of free parameters is further reduced by adding equality constraints derived from algebraic SMKA/GRC method. However, the global optimization still consumes substantial computational time, since it requires vast numerical integrations of ODEs in order to evaluate the cost function at each iteration step. Particularly, the computational cost increases further when the target model has the significant stiffness that often appears in mass action equations for enzymatic reaction systems. In such systems, each solution requires small integration steps to accommodate the introduction of fastvarying enzyme intermediates. By converting Matlab Mfiles for differential equations into MEXfiles (MEX stands for MATLAB Executable files, which are dynamically linked subroutines produced from C source code), as well as by opting for the stiff ode15s solver, up to 10fold faster optimization can be achieved. Each optimization run was able to quickly obtain optimal solutions within several minutes with the computer environment as follows: Intel Core™ 2 Duo Processor 1.73 GHz CPU with memory size of 2.5 GB, thereby facilitating the repetition of the optimization many times for statistical purposes.
Identifiability
Other challenges for optimizing a largescale nonlinear model include local minima and nonconvex regions over the objective function space. In this work, we adopted the simulated annealing algorithm, which combines the advantages of our two proposed techniques (i.e. hybrid MRL/ARL modeling and hybrid algebraicnumerical optimization), for globally minimizing multivariate functions.
To ensure that the algorithm is not trapped in suboptimal local minima within a large search space, a suitable number of optimizations (25 runs) were done with different random initial guesses over the entire range of the rate constants. Our cost function values (fval) show that the method is able to consistently return parameters from which the MRL model dynamics closely matched those observed for the original ARL model, thereby avoiding the local minima problem leading to a badness of fit between ARL and MRL dynamics.
It has been found that the candidate values of the rate constants may be highly correlated [12] and the search surface may consist of a very flat valley floor [23], resulting in unreliable or unreproducible estimates although the fit of model to data may be very good. Such illposed/nonconvex optimization problems must be taken into account while assessing the quality of MRL model fit to ARL data. The coefficient of variation (CV), defined as the standard deviation divided by the mean, was used as a measure of the reproducibility of the results from 25 optimization runs. Distributions with a CV < 10% are considered highprecision and lowvariance, while those with a CV varying between 10% and 50% are considered moderate precision and variance. Table 2 shows that CV for 80% of parameters was below 50%, with 66% of parameters with CV < 10% and with 14% of parameters with CV in the range of 10 – 50%, indicating that agreement between the optimization runs varied from moderate to very good for most parameters. Estimates of 20% of parameters are associated with CV ranging from 54% to 122%, which suggests that these parameters are not highly identifiable from the existing kinetic information. Nevertheless, the confidence interval shows that with 95 percent certainty the actual values for these unidentifiable rate constants fall within a much narrower range than that for the original biological bounds. Apparently, these unidentifiable parameters are intervalidentifiable, being bounded within a finite interval from the existing ARL dynamics, algebraic equality and inequality constraints.
Experimental uncertainty
Another important issue in the parameter estimation process is the existence of uncertainty in the experimental data, including both the concentration time courses and the enzyme abundances used in our method. These uncertainties can affect the parameter estimates, especially since the enzyme concentration for pykF has been obtained through analysis of 2D gels. Such gelbased proteome technique is frequently subject to geltogel variations [25], so it is more susceptible to noise as compared with other measurement techniques. Because of potential high noise levels in analysis of 2D gels, we use pykF as a representative best case to investigate the bias and variation of parameter estimation caused by uncertainty in experimental measurements. We added Gaussian distributed random variates to the experimentally determined value for pykF (1.2 μM, which is taken as the 'noisefree' value in this work). 25 such noisy enzyme concentrations were generated for each of two noise levels (10% and 20%). A 95% confidence interval (CI) for the fitted parameters and the relative errors (RE) between noise and noisefree solutions were used to evaluate the precision and bias due to experimental uncertainty of enzyme level.
Effect of uncertainty in pykF enzyme concentration measurement on rate constants estimation ^{a}
Parameters  0% noise ^{b}  10% noise ^{c}  20% noise ^{c}  

Mean ± SD × 10^{2}  95% CI × 10^{2}  RE %  Mean ± SD × 10^{2}  95% CI × 10^{2}  RE %  Mean ± SD × 10^{2}  95% CI × 10^{2}  RE %  
K _{ 1f }  1.59 ± 0.00  [1.59, 1.59]  0  1.61 ± 0.17  [1.54, 1.68]  1.3  1.51 ± 0.21  [1.42, 1.60]  5.0 
K _{ 1r }  0.50 ± 0.02  [0.49, 0.50]  0  0.50 ± 0.06  [0.47, 0.52]  0  0.47 ± 0.07  [0.44, 0.50]  6.0 
K _{ 2f }  4.58 ± 2.92  [3.37, 5.78]  0  4.61 ± 3.76  [3.05, 6.16]  0.7  4.52 ± 2.47  [3.50, 5.54]  1.3 
K _{ 2r }  2.43 ± 1.85  [1.67, 3.19]  0  1.96 ± 1.76  [1.24, 2.69]  19.3  2.62 ± 1.71  [1.90, 3.32]  7.8 
K _{ 3f }  2.71 ± 1.71  [2.00, 3.41]  0  2.69 ± 1.63  [2.02, 3.36]  0.7  2.70 ± 1.51  [2.08, 3.32]  0.4 
K _{ 4f }  0.79 ± 0.31  [0.66, 0.92]  0  0.89 ± 0.56  [0.66, 1.12]  12.7  0.66 ± 0.19  [0.58, 0.74]  16.5 
In short, our results indicate that the proposed method can be applied to moderately noisy data. In particular, we have shown for the pykF example the modest impact on parameter estimation for an underlying MRL model at a 20% uncertainty in enzyme level. For proteins with a dramatically high uncertainty from 2D gel analysis, several techniques, such as prefractionation, parallel and repeated run of gels, are available to reduce the noise level before these proteomic data are incorporated in our method.
Model evaluation
Parameter sensitivity
We then investigated how these optimal parameters influenced the systemic response, which are normally quantified through sensitivity analysis using the methods of Metabolic Control Analysis (MCA) at steady state. Our interest, however, was to examine the effect of changing these parameters on the MRL system's temporal response, where the behaviour of interest is often found. We have therefore focused on timedependent sensitivity analysis.
Model performance
Since the rate constants for each enzyme are estimated in separate steps, the next question of interest would be what happens when these individual MRL parts are assembled together to form a coupled enzymatic reaction system. We therefore assembled these MRL parts into Chassagnole's central metabolism model to check functioning of the new assembly under actual operating conditions.
To compare the relative stabilities of the ARL and MRL networks, we first computed the Jacobian matrix to determine eigenvalues for both systems at steadystate. The largest MRL eigenvalue observed was 0.00079 s^{1}, and the spectrum of ARL values indicated a maximum of 0.00092 s^{1}. The fact that all eigenvalues prove to be negative for both models indicates that the ARL and MRL models are able to return to equilibrium following small perturbations. Since the MRL form essentially introduces fast variables that have been eliminated in the ARL form, it is not surprising that the MRL model greatly increases the eigenvalue spread, changing the smallest eigenvalue from 3.4 × 10^{3} s^{1} to 1.9 × 10^{7} s^{1}and the smallest time constant from 2.9 × 10^{1} ms to 5.3 × 10^{5} ms.
As a result of the ensuing introduction of enzyme forms to the model, the time scale range of reaction is greatly expanded by many orders of magnitude, rendering the MRL model considerably stiffer than its ARL counterpart (ARL stiffness 3.7 × 10^{6}, MRL stiffness 2.4 × 10^{10}). These widely differing time scales means that the usual numerical methods require small time step sizes to achieve stable solutions. For the metabolic network described here, we initially opted for Matlab's builtin forward differencing integrator, ode45, which failed to achieve solutions through its selection of inordinately small time steps for the model with both fast and slow changing variables. Therefore, we used the backward differencing solver ode15s to accommodate the inherent stiffness of the MRL model. Using this approach, the computational cost of an MRL simulation was still 33% larger than an ARL simulation. However, by compiling the MRL model into MEX binaries, the CPU time was reduced 87%.
Discussion
The immediate motivation for our MRL modelling is to provide association and dissociation rate constants for the particlebased (PB) modelling aimed at building biophysical realism through fourdimensional simulations of in vivo elemental reactions. The incorporation of rate constants into PB modelling can be implemented by using the standard Smoluchowski theory [26] for computing timedependent reaction coefficients and survival probability [27–29]. Due to the lack of highquality experimentally determined rate constants, researchers have to make many, often arbitrary, assumptions on the values of these parameters, making this type of model less appealing to biologists. To the best of our knowledge, the method we present in this paper is the first step which allows passing in vivo data on an experimental basis to the dynamic multidimensional modelling at a finer scale. The work presented here provides us with some optimism that models operating at different scales can in fact be linked in a meaningful way.
In a more general sense, it is clear that increasingly sophisticated and reliable models of system dynamics will depend upon a sufficient underlying layer of biophysical detail so that they can respond and adapt realistically to changes in the physiological environment. Notably the ARL approach is capable of dealing with large networks by ignoring the details of enzyme intermediates and the rate constants that underpin biophysical reality. By comparison, MRL models provide the detailed framework required for a foundation for building biophysical realism. Thus, there is a distinct need for developing mechanistic MRL models which can provide more realistic predictions of cellular components and dynamics in a model organism.
MRL models belong to the class of nonconvex nonlinear models wherein a number of difficulties may arise when estimating parameters of a realistic dynamical system, like e.g. convergence to local solutions, flat objective function in the neighbourhood of the solution and unreasonable computational effort for a problem with a large number of parameters. While previously most research work in this area has focused on the search algorithms (e.g. the hybrid stochasticdeterministic search [30], scatter search [23] and modern evolution search [17]), our work, however, focuses on the other side of the problem of parameter estimation, i.e. the model structure. We exploit the model structure to improve the efficiency and robustness of parameterizing a largescale MRL model. It is based upon a strategy that identifiable structures or submodels can be generated by systematically eliminating parameters of the original model until it becomes identifiable [19]. We present a novel methodology with respect to parameter elimination without changing the original dynamics, which combines the advantages of two hybrid techniques. By replacing a single ARL pathway with its MRL equivalent, and installing this module to the same place as before while keeping other original ARL pathways unchanged, each set of MRL reactions can be independently and efficiently optimized. The alternative is an extremely cumbersome optimization process involving the simultaneous adjustment of an unreasonably large number of parameters. The model structure can be further manipulated by applying equality constraints to the rate constants associated to each enzyme. The resulting technique for incorporating parameter equality constraints into numerical simulation and optimization consistently reduces the number of parameters for a single enzyme, thereby ensuring maximum efficiency and robustness of the parameterization method. Consequently, our method may pave the way towards future systemic investigation of the mass action rate laws of largescale cell network from widely accessible ARL models.
A problem that attracts continuing interest is that not all parameters in a largescale nonconvex nonlinear model are uniquely identifiable. DiStefano [31] introduced the notion of interval identifiabilty to describe finite bounds on the unidentifiable rate constants of general mammillary models. Vicini et al. [19] used additional parameter knowledge to narrow the bounds of rate constants that remain unidentifiable in mammillary and catenary compartmental models. In MRL models with respect to glycolysis of E. coli, we also found that 20% of parameters are not highly identifiable from the existing dynamics data. We incorporate several levels of coupling, including equality and inequality constraints and global optimization algorithm, to successfully reduce the range of computable bounds for highly unidentifiable parameters. Comparing with a widevarying range applicable to massaction rate constants, the range shrinking applied here is the best way so far to acquire reasonable approximations of the parameters.
In addition to their immediate motivational value for the particlebased modelling, our novel methodology and the resulting MRL models have some other interesting applications. For example, starting from the steady state before glucose impulse, the initial concentration for every enzyme form can be derived from the Schematic Method of King and Altman (SMKA). Then all enzyme forms freely evolve to comply with systemic dynamics under the constraint of fixed total concentration, thereby releasing the constraint of the widelyused quasisteady state assumption (QSSA). This avoidance of QSSA will greatly extend the application area of the proposed method, since QSSA can be problematic for some in vivo pathways at high enzyme levels [32] and also for fast transient change reactions such as signalling and transduction pathways [14]. Parameter sensitivity is another important aspect that may be applicable to experiments regarding parameter identifiability. Through the timedependent sensitivity analysis, parameters within a certain period of time demonstrate little impact on the simulator results (Figure 3). It is therefore not worthwhile focusing experiments on this period to tune the parameters. Moreover, sensitivity analysis reveals key elementary reaction steps that would affect the overall dynamics of the metabolic network. One potential approach to accelerate optimization convergence would be to focus much of the computational effort on these crucial parameters. The prioritization of parameters and time interval to calibrate them is expected to evolve as an area of importance, providing a direction to future Omics efforts in this area to provide systemslevel measurements for virtually all types of cellular components and parameters in a model organism [33].
Conclusion
In this investigation we incorporated the protein abundance information into our MRL framework to globally optimize elementary rate constants through a novel hybrid method. We effectively deal with the issues of network scale, stiffness, local minima, computational burden and parameter unidentifiability inherent within a large MRL model. Since the proposed method makes full use of the available experimental data, it addresses the problem of the computer simulations of biological systems which have high resolution regimes but lack experimental support at such a finer scale. The work presented here provides us with optimism that models in the mesoscopic regime (e.g. particlebased methods) can be rooted on a firm foundation of parameters generated in the macroscopic regime on an experimental basis.
Moreover, the resulting MRL models are as close as possible to the biological experiments. Therefore, they can be used to steer further biological experiments aimed at supporting computer simulation. For example, specific direction and guidance for sampling procedures can be issued after a timedependent sensitivity analysis, through which the most sensitive parameters and time intervals are identified.
Methods
Hybrid model development
ARL and MRL
Before initiating the parameter optimization process, a kinetic model which will be the subject of the optimization has to be specified. This implies defining the ARL reactions, setting their MRL kinetic types, and identifying all of the variables and parameters. The ARL model, which is adapted from Chassagnole's dynamical model of E. coli central metabolism [21], consists of mass balance equations for extracellular glucose and for the intracellular metabolites as shown in Figure 1. The time course of unbalanced cometabolite (NAD+, NADH, NADP+, NADPH, AMP, ADP and ATP) concentrations were fitted with analytical functions [21]. Kinetic rate equations define the metabolic pathways through a coupled system of aggregated rate laws in the form of mechanistic or empirical rate expressions. The first step in transforming individual ARL reactions into massaction form is to define the elementary steps of a reaction mechanism from a literature search on the individual enzymes; what remains is to find an appropriate set of parameters for the MRL model.
Hybrid MRL/ARL modelling
Direct replacement of every reaction step with its MRL mechanism leads to an unreasonably large number of parameters for simultaneous optimization. Therefore we have developed a hybrid modelling approach to optimize parameters for the MRL model. In this approach, the network is partitioned into modules, one for each enzymatic reaction. Then, a series of hybrid ARL/MRL models can be constructed, one for each ARL reaction, with that single individual ARL reaction replaced by its MRL version. Parameter optimization is done on the hybrid models, optimizing the parameters for that single set of MRL reactions alone in the context of the remaining ARL network. In this way, the number of simultaneous parameters requiring optimization is substantially reduced. An example is shown in Figure 1, illustrating the replacement of the phosphoglucoisomerase (pgi) pathway with the two individual reversible elementary reaction steps in the MRL form.
Algebraicnumerical simulation and optimization
Once all the necessary information has been defined, it is then passed to the optimization module that combines SMKA/GRC with NSO.
Initial conditions
Initial concentrations of metabolites and cometabolites upon a glucose impulse are the same as for Chassagnole's ARL model [21] of E. coli growing on minimal medium. Initial enzyme concentrations shown in Table 1, except pyruvate kinase, are adapted from Lu et al. [34], where mass spectrometrybased absolute protein expression (APEX) profiling was developed for precise measurement of E. coli protein abundances. Protein abundance for pyruvate kinase is taken from 2D gel data [35], since so far only abundance data from gelbased proteome technique is available. The adjustable parameters that are the subject of optimization need to be given an initial value as the optimization methods must start at some point of parameter space. For our proposed hybrid method, the adjustable parameters can take any initial value within boundary constraints that are detailed below. Adjustable parameters are selected based on one or more of the following criteria: (1) choose a parameter which can be classified as more sensitive than others through sensitivity analysis; (2) choose a parameter which has relatively small acceptable range; or (3) choose a parameter through which other rate constants can be determined. The selected adjustable parameters for each mechanism are defined in Additional File 1.
Algebraic manipulation
With given guesses for the adjustable MRL parameters, all MRL rate constants for an individual ARL reaction are estimated by applying SMKA/GRC, where combinations of MRL rate constants are algebraically related to known ARL kinetic constants. Then, given the enzyme concentration listed in Table 1, the initial steadystate concentration of all enzyme forms are determined from the SMKA distribution equations consisting of metabolite concentrations and estimated rate constants.
Numerical simulation and optimization
With algebraderived MRL parameters from ARL constants, using the initial distribution of enzyme species, the dynamics of all of the elements and reaction rates upon glucose impulse were simulated in the hybrid ARL/MRL models using a stiff ODE solver (ode15s in Matlab). The resulting metabolite concentration and reaction rate curves were compared with the same curves generated using the ARL model alone, allowing the optimization process to find parameter values for the MRL steps which correspond as closely as possible to the ARL model.
where NEC is the Number of the Experimental Conditions for the data, NTS is the total Number of Time Series for metabolites and substep rates specific to an individual ARL reaction, NSP is the Number of Sampling Points in time series c, X_{ predicted }(n,c,t) is the Time Course Data from the hybrid model simulation, X_{ reference }(n,c,t) is the Time Course Data from the reference ARL simulation, and the weighting w(n,c) is $1/{\left\underset{t}{\mathrm{max}}{X}_{reference}(n,c,t)\right}^{2}$ for reference time series c at condition n.
Generation of new guess
If the maximum iteration number for simulated annealing (i.e. the termination criterion) is not reached, a new guess for the adjustable parameters is generated. All MRL rate constants are then reestimated by SMKA/GRC step and the cost function value is reevaluated by numerical simulation. This cycle is repeated until satisfactory results are achieved. The evaluation is considered satisfactory if the evaluation criterion fval is less than 0.08. We have found that if fval is larger than 0.08, the MRL simulation results are unable to provide a reasonable match to the ARL simulation results.
Simulated annealing procedure [18]
The rate constants are globally optimized via the following simulated annealing procedure: the starting annealing temperature (T_{0}) was set around the order of magnitude of the cost function at the initial estimates, and the annealing temperature was linearly decreased by a reduction factor (RF) of 0.1 until the temperature reached zero. In order to verify that the annealing schedule was able to explore the entire parameter space of the underlying MRL mechanism, multiple testing calculations were performed with varying RF around the preset value. The annealing schedule with the best cost function value is regarded as an optimal one for the global optimization process.
Statistical analysis
Optimized parameters
A suitable number of optimizations (25 runs) were done with different random initial guesses distributed over the entire acceptable range of the rate constants. Mean and standard deviation (SD) of parameters were calculated from these 25 optimization runs. A 95% confidence interval (CI) for the fitted parameters and the coefficient of variation (CV = SD/Mean) were used to evaluate the precision and variation of the parameters.
Effect of experimental uncertainty in enzyme level
The noise was computer generated with random numbers based on Gaussian distribution and added to the experimentally determined value for pykF (1.2 μM, which was measured by 2D gel with some uncertainty). Twentyfive noisy enzyme concentrations were randomly generated for each noise level (10% and 20%) and the optimization was repeated. A 95% confidence interval (CI) for the fitted parameters and the relative errors (RE) between noise and 'noisefree' solutions were used to evaluate the parameter precision and bias due to experimental uncertainty.
Example of a multisubstrate kinetic mechanism
Glyceraldehyde 3phosphate dehydrogenase (gapA), which obeys a multisubstrate kinetic mechanism, is used as an example to illustrate and explain the basic steps of the hybrid algebraicnumerical method.
Chassagnole's model for gapA is a simplified ARL equation, since neither the sequence of subreactions nor the enzyme forms can be deduced from the model. The first binding of NAD^{+} to the enzyme and the last release of NADH from the enzyme have been found for NAD^{+}linked dehydrogenases [37]. So an ordered sequential mechanism would be expected with gapA; what remains is to identify if the binding mechanism proceeds through a ternary complex (Figure 4A) or through a binary complex (Figure 4B). There is experimental evidence that a kinetically significant ternary complex exists for NAD+linked dehydrogenases [37], so the optimization process presented here starts with an ordered Bi Bi mechanism with a ternary central complex. Treating k_{2} and k_{4} as adjustable parameters leads to a set of equality constraints for rate constants and enzyme forms (See Additional file 1 for details). By varying k_{2} and k_{4}, the optimization process automatically tries to arrive at the best solution for MRL parameters, so that the resulting concentration and rate curves correspond as closely as possible to the same curves generated using the ARL model alone. In the cost function (Eqn. 1), NEC is 1, relating to a glucose impulse to extracellular concentration of 2 mM. NTS is 6, consisting of 2 timeseries for metabolites and 4 timeseries for net rates of four substeps. NSP is 21, relating to a sampling time interval of 0–20 s with a sampling point every 1 s. The time period over which we run the simulation and optimization is consistent with the original experiment, where all the intracellular metabolites could be sampled and measured within 20 seconds after the glucose impulse [21].
Example for allosteric regulation
Pyruvate kinase (pykF) is an allosteric enzyme whose kinetic behaviour is usually described by the concerted allosteric transition mode of the Monod, Wyman, and Changeux (MWC) model [38]. According to the MWC model, pykF can exist in an active state (ER) or an inactive state (ET). The fraction of active enzyme in the ER or ET states is determined by the concentrations and relative affinities of the inhibitor (ATP) and the activator (FDP) for the ER and ET states [39]. In addition to this allosteric regulation, the enzyme normally follows the substrates binding rule in the order PEP and ADP and the products releasing rule in the order PYR and ATP [24]. Since the equilibrium constant of the pykF reaction is much larger, of the order of 10^{5} [40], this reaction is always regarded as irreversible with the back reaction being ignored completely.
where K_{ eq }is the equilibrium constant and the power of 4 is the number of the allosteric sites for pykF [39]. Thus, k_{1}·f in Figure 2A represents an effective rate constant for the first step of the sequential multisubstrate reactions, and the SMKA/GRC treatments yields all the enzyme forms and rate constants as shown in Additional file 1.
MRL model evaluation
Parameter sensitivity
where J_{ i }(t) are the time course of glycolytic rates and p_{ j }are rate constants. Eqn. (3) is evaluated by numerical differentiation of the network model using finite differences.
Coupledenzyme system
To evaluate the performance of the MRL model for the coupledenzyme system, we simultaneously replaced ARL pathways corresponding to glycolysis (9 separate ARL reactions) with their MRL versions in the context of the remaining ARL model. For each enzyme, we picked the set of the rate constants with the best cost function values.
The eigenvalues u satisfy the characteristic equation of the matrix A
det(A  ul) = 0
Eqn. (5) is a polynomial equation in u of degree n, and I is the identity matrix. By the Fundamental Theorem of Algebra, this equation has n solutions. If all the eigenvalues of A have negative real part then the steady states of all species are stable. The time constant, which indicates how fast a deviation from a given steady state will decline, is defined as the reciprocal absolute values of the real parts of the eigenvalues [42]. In addition, the stiffness or eigenvalue spread of the model can be calculated as the ratio of the largest over the smallest eigenvalue.
Simplications, Boundary constraints, and Tools
Simplications for Phosphofructokinase
Allosterism can cause fractal properties [43] and kinetic changes due to enzyme conformational changes. To avoid the vast expansion of parameters to be optimized, in this situation we model the particularly complicated allosteric regulation for phosphofructokinase (pfkA) via nonintegral reaction orders, thereby breaking from the pure elementary reaction schema, but maintaining the type of bimolecular reactions (Figure 2B). This treatment for its simplicity greatly increased the ability to characterize reaction parameters. Since the original ARL model is almost an empirical expression for pfkA dynamics, it is impossible to establish a relationship between kinetic constants and rate constants through SMKA/GRC. For this reason, we treat all rate constants and nonintegral reaction orders as adjustable parameters that vary freely within the acceptable ranges that are detailed below. These parameters were used to calculate the initial concentrations of enzyme forms through SMKA (see Additional file 1) and then estimated by numerical simulation and optimization to reproduce the original timecourse data for the ARL model.
Boundary constraints
When two reactants for bimolecular reactions are approximately equal in size, the maximum diffusionlimited association rate constant for molecular interactions corresponds to k_{ a }≈ 10^{6}10^{7} mM^{1} s^{1}. Since the association could be even faster for enzymatic reactions where one molecule is small and diffuses rapidly while the other is large and provides a large target [44], the order of magnitude for association rate constants are constrained to be not over 10^{9} mM^{1} s^{1}. The dissociation rate constants are required to not be in excess of 10^{6} s^{1} [45].
Parameters for phosphofructokinase (pfkA) are confined within a relatively small parameter space, which is based on mechanistic analysis and timeconsuming trialanderror testing. Bacterial pfkA enzyme has been found to be inhibited by PEP as a result of interaction at an allosteric site [46] and also by a high concentration of ATP due to substrate antagonism [47]. Considering the high levels of PEP and ATP in this study, it is reasonable to assume that the involved elementary steps have low rate constants, except for the last release of ADP where a higher rate constant is necessary for ensuring the fast breakdown of ERADP in order to provide as much as possible of the enzyme in the active state (Figure 2B). For these reasons, the upper boundaries of the forward rate constants are set at 1000 mM^{1} s^{1}, 5000 mM^{1} s^{1}, 1000 s^{1} and 1000000 s^{1} for k_{1}, k_{2}, k_{3} and k_{4}, respectively, and the backward rate constants are confined within 100 s^{1} and 2000 s^{1} for k_{1} and k_{2}, respectively. The upper boundaries were further verified by random testing, where parameters out of the preset ranges appeared unable to reach a satisfactory cost function value.
Tools
Matlab (Mathworks, Nattick MA) toolboxes [48] were used for model simulation and optimization, with models compiled into MEX binaries for performance [49]. The simulated annealing (SA) code was partly based on the C code of Press and Teukolsky [50].
Abbreviations
 The following abbreviations are used for individual enzymes:

aroH– DAHP synthase (EC 2.5.1.54), eno– Enolase (EC 4.2.1.11), fba– Fructosebisphosphate aldolase (EC 4.1.2.13), gapA– Glyceraldehyde 3phosphate dehydrogenase (EC 1.2.1.12), glgC– Glucose1phosphate adenylyltransferase (EC 2.7.7.27), gnd 6phosphogluconate dehydrogenase (EC 1.1.1.44), gpsA– Glycerol 3phosphate dehydrogenase (EC 1.1.1.94), pdhA– Pyruvate dehydrogenase (EC 1.2.4.1), pfkA– Phosphofructokinase (EC 2.7.1.11), pgi– Phosphoglucoisomerase (EC 5.3.1.9), pgk– Phosphoglycerate kinase (EC 2.7.2.3), pgm– Phosphoglycerate mutase (EC 5.4.2.1), pgm1– Phosphoglucomutase (EC 5.4.2.2), ppc– Phosphoenolpyruvate carboxylase (EC 4.1.1.31), prs– Ribose phosphate pyrophosphokinase (EC 2.7.6.1), pykF– Pyruvate kinase (EC 2.7.1.40), rpe– Ribulose phosphate epimerase (EC 5.1.3.22), rpi– Ribose phosphate isomerase (EC 5.3.1.6), talA– Transaldolose (EC 2.2.1.2), tktA– Transketolase a (EC 2.2.1.1), tktB– Transketolose b (EC 2.2.1.1), tpiA– Triosephosphate isomerase (EC 5.3.3.1), zwf– Glucose6phosphate dehydrogenase (EC 1.1.1.49). The following abbreviations are used for metabolites DHAP – Dihydroxyacetonephosphate, E4P – Erythrose4phosphate, F6P – Fructose6phosphate, FDP – Fructose1,6bisphosphate, G1P – Glucose1phosphate, G6P – Glucos6phosphate, GAP – Glyceraldehyde3phosphate, GLC – Glucose, PEP – Phosphoenolpyruvate, PG – 6phosphogluconate, PG2 – 2phosphoglycerate, PG3 – 3phosphoglycerate, PGP – 1,3diphosphoglycerate, PYR – Pyruvate, Rib5P – Ribose5phosphate, Ribu5P – Ribulose5phosphate, Sed7P – Sedoheptulose7phosphate, Xyl5P – Xylulose5phosphate. Other abbreviations are: ARL – Aggregated Rate Law, CI – Confidence Interval, CV – Coefficient of Variation, fval:– Cost Function Value, GRC – General Rules of Cleland, MEX – MATLAB Executable Files, MRL – Mass Action Rate Law, NSO – Numerical Simulation and Optimization, ODE – Ordinary Differential Equation, PBS – Particle Based Simulation, QSSA – Quasi Stationary State Assumption, RE – Relative Errors compared to 0% noise case, SD – Standard Deviation, SMKA – Schematic Method of King and Altman.
Declarations
Acknowledgements
This work was supported by grants from the Canada Foundation for Innovation, IBM Canada, AVAC Ltd. and the Alberta Agricultural Research Institute.
Authors’ Affiliations
References
 Segel IH: Enzyme kinetics: behavior and analysis of rapid equilibrium and steady state enzyme systems. 1975, xxii, 957 pNew York, WileyGoogle Scholar
 Shiraishi F, Savageau MA: The tricarboxylic acid cycle in Dictyostelium discoideum. I. Formulation of alternative kinetic representations. J Biol Chem. 1992, 267 (32): 2291222918.PubMedGoogle Scholar
 Keane JF, Bradley C, Ebeling C: A compiled accelerator for biological cell signaling simulations. Symposium on Field Programmable Gate Arrays. 2004, 233 2241. Monterey, California, USA. ACM, New York, NY, USAGoogle Scholar
 Hansen RE, Tonsager MW: Determination of the regime of rapid reacting systems in stopped and steadyflow investigations by the velocity probe method. J Phys Chem. 1988, 92 (8): 2189 22196. 10.1021/j100319a022.View ArticleGoogle Scholar
 Schnell S, Turner TE: Reaction kinetics in intracellular environments with macromolecular crowding: simulations and rate laws. Progress in Biophysics and Molecular Biology. 2004, 85: 235260. 10.1016/j.pbiomolbio.2004.01.012View ArticlePubMedGoogle Scholar
 Ridgway D, Broderick G, Ellison MJ: Accommodating space, time and randomness in network simulation. Curr Opin Biotechnol. 2006, 17 (5): 493498. 10.1016/j.copbio.2006.08.004View ArticlePubMedGoogle Scholar
 Stiles JR, Bartol TM: Monte Carlo methods for simulating realistic synaptic microphysiology using MCell. Computational Neuroscience: realistic modeling for experimentalists. Edited by: Schutter E. 2001, 87127. Boca Raton, Fla, CRC PressGoogle Scholar
 Andrews SS, Bray D: Stochastic simulation of chemical reactions with spatial resolution and single molecule detail. Phys Biol. 2004, 1 (34): 137151. 10.1088/14783967/1/3/001View ArticlePubMedGoogle Scholar
 Broderick G, Ru'aini M, Chan E, Ellison MJ: A lifelike virtual cell membrane using discrete automata. In Silico Biol. 2005, 5 (2): 163178.PubMedGoogle Scholar
 Rohwer JM, Meadow ND, Roseman S, Westerhoff HV, Postma PW: Understanding glucose transport by the bacterial phosphoenolpyruvate:glycose phosphotransferase system on the basis of kinetic measurements in vitro. J Biol Chem. 2000, 275 (45): 3490934921. 10.1074/jbc.M002461200View ArticlePubMedGoogle Scholar
 Yang CR, Shapiro BE, Mjolsness ED, Hatfield GW: An enzyme mechanism language for the mathematical modeling of metabolic pathways. Bioinformatics. 2005, 21 (6): 774780. 10.1093/bioinformatics/bti068View ArticlePubMedGoogle Scholar
 Famili I, Mahadevan R, Palsson BO: kCone analysis: determining all candidate values for kinetic parameters on a network scale. Biophys J. 2005, 88 (3): 16161625. 10.1529/biophysj.104.050385PubMed CentralView ArticlePubMedGoogle Scholar
 Rangamani P, Iyengar R: Modelling spatiotemporal interactions within the cell. J Biosci. 2007, 32: 157–167 10.1007/s1203800700143View ArticlePubMedGoogle Scholar
 Millat T, Bullinger E, Rohwer J, Wolkenhauer O: Approximations and their consequences for dynamic modelling of signal transduction pathways. Math Biosci. 2007, 207 (1): 4057. 10.1016/j.mbs.2006.08.012View ArticlePubMedGoogle Scholar
 King EL, Altman C: A Schematic Method of Deriving the Rate Laws for EnzymeCatalyzed Reactions. J Phys Chem. 1956, 60: 13751378. 10.1021/j150544a010.View ArticleGoogle Scholar
 Cleland WW: The kinetics of enzymecatalyzed reactions with two or more substrates or products. I. Nomenclature and rate equations. Biochim Biophys Acta. 1963, 67: 104137. 10.1016/00063002(63)918006View ArticlePubMedGoogle Scholar
 Mendes P, Kell D: Nonlinear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics. 1998, 14 (10): 869883. 10.1093/bioinformatics/14.10.869View ArticlePubMedGoogle Scholar
 Kirkpatrick S, Gelatt CD, Vecchi MP: Optimization by Simulated Annealing. Science. 1983, 220 (4598): 671680. 10.1126/science.220.4598.671View ArticlePubMedGoogle Scholar
 Vicini P, Su H, DiStefano JJ: Identifiability and interval identifiability of mammillary and catenary compartmental models with some known rate constants. Math Biosci. 2000, 167 (2): 145161. 10.1016/S00255564(00)000353View ArticlePubMedGoogle Scholar
 Rizzi M, Baltes M, Theobald U, Reuss M: In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae .2. Mathematical model. Biotechnol Bioeng. 1997, 55 (4): 592608. 10.1002/(SICI)10970290(19970820)55:4<592::AIDBIT2>3.0.CO;2C.View ArticlePubMedGoogle Scholar
 Chassagnole C, NoisommitRizzi N, Schmid JW, Mauch K, Reuss M: Dynamic modeling of the central carbon metabolism of Escherichia coli. Biotechnol Bioeng. 2002, 79 (1): 5373. 10.1002/bit.10288View ArticlePubMedGoogle Scholar
 Tran TT, Mittal A, Aldinger T, Polli JW, Ayrton A, Ellens H, Bentz J: The elementary mass action rate constants of Pgp transport for a confluent monolayer of MDCKIIhMDR1 cells. Biophys J. 2005, 88 (1): 715738. 10.1529/biophysj.104.045633PubMed CentralView ArticlePubMedGoogle Scholar
 RodriguezFernandez M, Egea JA, Banga JR: Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinformatics. 2006, 7: 483 10.1186/147121057483PubMed CentralView ArticlePubMedGoogle Scholar
 Macfarlane N, Ainsworth S: A kinetic study of Baker'syeast pyruvate kinase activated by fructose 1, 6diphosphate. Biochem J. 1972, 129 (5): 10351047.PubMed CentralView ArticlePubMedGoogle Scholar
 SeillierMoiseiwitsch F: Statistical Analysis of 2D Gel Patterns . The Proteomics Protocols Handbook. Edited by: Walker JM. 2005, Clifton, Humana PressGoogle Scholar
 Smoluchowski MV: Investigation into a mathematical theory of the kinetics of coagulation of colloidal solutions. Z phys Chem. 1917, 92: 129168.Google Scholar
 Agmon N, Szabo A: Theory of reversible diffusioninfluenced reactions. J Chem Phys. 1990, 92 (9): 52705284. 10.1063/1.458533.View ArticleGoogle Scholar
 Popov AV, Agmon N: Threedimensional simulations of reversible bimolecular reactions: The simple target problem. Journal of Chemical Physics. 2001, 115: 89218932. 10.1063/1.1412609.View ArticleGoogle Scholar
 Ridgway D, Broderick G, LopezCampistrous A, Ru'aini M, Winter P, Hamilton M, Boulanger P, Kovalenko A, Ellison MJ: Coarsegrained molecular simulation of diffusion and reaction kinetics in a crowded virtual cytoplasm. Biophys J. 2008Google Scholar
 RodriguezFernandez M, Mendes P, Banga JR: A hybrid approach for efficient and robust parameter estimation in biochemical pathways. Biosystems. 2006, 83 (23): 248265. 10.1016/j.biosystems.2005.06.016View ArticlePubMedGoogle Scholar
 DiStefano JJI: Complete parameter bounds and quasiidentifiability of some unidentifiable linear systems, . Math Biosci. 1983, 65: 5168. 10.1016/00255564(83)900706.View ArticleGoogle Scholar
 Schnell S, Maini PK: Enzyme kinetics at high enzyme concentration. Bull Math Biol. 2000, 62 (3): 483499. 10.1006/bulm.1999.0163View ArticlePubMedGoogle Scholar
 Joyce AR, Palsson BO: The model organism as a system: integrating ‘omics’ data sets. Nature Reviews Molecular Cell Biology. 2006, 7: 198210. 10.1038/nrm1857View ArticlePubMedGoogle Scholar
 Lu P, Vogel C, Wang R, Yao X, Marcotte EM: Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol. 2007, 25 (1): 117124. 10.1038/nbt1270View ArticlePubMedGoogle Scholar
 Link AJ, Robison K, Church GM: Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K12. Electrophoresis. 1997, 18 (8): 12591313. 10.1002/elps.1150180807View ArticlePubMedGoogle Scholar
 Gadkar KG, Gunawan R, Doyle FJ: Iterative approach to model identification of biological networks. BMC Bioinformatics. 2005, 6: 155 10.1186/147121056155PubMed CentralView ArticlePubMedGoogle Scholar
 Preuveneers MJ, Peacock D, Crook EM, Clark JB, Brocklehurst K: D3hydroxybutyrate dehydrogenase from Rhodopseudomonas spheroides. Kinetic mechanism from steadystate kinetics of the reaction catalysed by the enzyme in solution and covalently attached to diethylaminoethylcellulose. Biochem J. 1973, 133 (1): 133157.PubMed CentralView ArticlePubMedGoogle Scholar
 Monod J, Wyman J, Changeux JP: On the Nature of Allosteric Transitions: A Plausible Model. J Mol Biol. 1965, 12: 88118.View ArticlePubMedGoogle Scholar
 Termonia Y, Ross J: Oscillations and control features in glycolysis: numerical analysis of a comprehensive model. Proc Natl Acad Sci U S A. 1981, 78 (5): 29522956. 10.1073/pnas.78.5.2952PubMed CentralView ArticlePubMedGoogle Scholar
 CornishBowden A, Cardenas ML: Information transfer in metabolic pathways. Effects of irreversible steps in computer models. Eur J Biochem. 2001, 268 (24): 66166624. 10.1046/j.00142956.2001.02616.xView ArticlePubMedGoogle Scholar
 Cha S: A simple method for derivation of rate equations for enzymecatalyzed reactions under the rapid equilibrium assumption or combined assumptions of equilibrium and steady state. J Biol Chem. 1968, 243 (4): 820825.PubMedGoogle Scholar
 Heinrich R, Schuster S: The Regulation of Cellular Systems. 1996, New York , Chapman and HallView ArticleGoogle Scholar
 Li HQ, Chen SH, Zhao HM: Fractal mechanisms for the allosteric effects of proteins and enzymes. Biophys J. 1990, 58 (5): 13131320.PubMed CentralView ArticlePubMedGoogle Scholar
 Berg OG, von Hippel PH: Diffusioncontrolled macromolecular interactions. Annu Rev Biophys Biophys Chem. 1985, 14: 131160. 10.1146/annurev.bb.14.060185.001023View ArticlePubMedGoogle Scholar
 Schomburg I, Chang A, Schomburg D: BRENDA, enzyme data and metabolic information. Nucleic Acids Res. 2002, 30 (1): 4749. 10.1093/nar/30.1.47PubMed CentralView ArticlePubMedGoogle Scholar
 Kimmel JL, Reinhart GD: Reevaluation of the accepted allosteric mechanism of phosphofructokinase from Bacillus stearothermophilus. Proc Natl Acad Sci U S A. 2000, 97 (8): 38443849. 10.1073/pnas.050588097PubMed CentralView ArticlePubMedGoogle Scholar
 Zheng RL, Kemp RG: The mechanism of ATP inhibition of wild type and mutant phosphofructo1kinase from Escherichia coli. J Biol Chem. 1992, 267 (33): 2364023645.PubMedGoogle Scholar
 Schmidt H, Jirstrand M: Systems Biology Toolbox for MATLAB: a computational platform for research in systems biology. Bioinformatics. 2006, 22 (4): 514515. 10.1093/bioinformatics/bti799View ArticlePubMedGoogle Scholar
 Schmidt H: SBaddon: high performance simulation for the Systems Biology Toolbox for MATLAB. Bioinformatics. 2007, 23 (5): 646647. 10.1093/bioinformatics/btl668View ArticlePubMedGoogle Scholar
 Press WH: Numerical recipes in C : the art of scientific computing. 1992, xxvi, 994 pCambridge; New York, Cambridge University Press, 2ndGoogle Scholar
 Richter O, Betz A, Giersch C: The response of oscillating glycolysis to perturbations in the NADH/NAD system: a comparison between experiments and a computer model. Biosystems. 1975, 7 (1): 137146. 10.1016/03032647(75)900519View ArticlePubMedGoogle Scholar
 Campos G, Guixe V, Babul J: Kinetic mechanism of phosphofructokinase2 from Escherichia coli. A mutant enzyme with a different mechanism. J Biol Chem. 1984, 259 (10): 61476152.PubMedGoogle Scholar
 Kohn MC, Garfinkel D: Computer simulation of metabolism in palmitateperfused rat heart. II. Behavior of complete model. Ann Biomed Eng. 1983, 11 (6): 511531. 10.1007/BF02364082View ArticlePubMedGoogle Scholar
 Sundararaj S, Guo A, HabibiNazhad B, Rouani M, Stothard P, Ellison M, Wishart DS: The CyberCell Database (CCDB): a comprehensive, selfupdating, relational database to coordinate and facilitate in silico modeling of Escherichia coli. Nucleic Acids Res. 2004, 32 (Database issue): D2935. 10.1093/nar/gkh108PubMed CentralView ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.