 Methodology article
 Open Access
A cooperative strategy for parameter estimation in large scale systems biology models
 Alejandro F Villaverde^{1},
 Jose A Egea^{2} and
 Julio R Banga^{1}
https://doi.org/10.1186/1752-0509-6-75
© Villaverde et al.; licensee BioMed Central Ltd. 2012
Received: 21 February 2012
Accepted: 11 June 2012
Published: 22 June 2012
Abstract
Background
Mathematical models play a key role in systems biology: they summarize the currently available knowledge in a way that makes it possible to derive experimentally verifiable predictions. Model calibration consists of finding the parameters that give the best fit to a set of experimental data, which entails minimizing a cost function that measures the goodness of this fit. Most mathematical models in systems biology present three characteristics which make this problem very difficult to solve: they are highly nonlinear, they have a large number of parameters to be estimated, and the information content of the available experimental data is frequently scarce. Hence, there is a need for global optimization methods capable of solving this problem efficiently.
Results
A new approach for parameter estimation of large-scale models, called Cooperative Enhanced Scatter Search (CeSS), is presented. Its key feature is the cooperation between different programs (“threads”) that run in parallel on different processors. Each thread implements a state-of-the-art metaheuristic, the enhanced Scatter Search algorithm (eSS). Cooperation, meaning information sharing between threads, modifies the systemic properties of the algorithm and speeds up its performance. Two parameter estimation problems involving models related to the central carbon metabolism of E. coli, which include different regulatory levels (metabolic and transcriptional), are used as case studies. The performance and capabilities of the method are also evaluated using benchmark problems of large-scale global optimization, with excellent results.
Conclusions
The cooperative CeSS strategy is a general-purpose technique that can be applied to any model calibration problem. Its capability has been demonstrated by calibrating two large-scale models of different characteristics, improving the performance of previously existing methods in both cases. The cooperative metaheuristic presented here can be easily extended to incorporate other global and local search solvers and specific structural information for particular classes of problems.
Background
The aim of systems biology is to understand the organization of complex biological systems by combining experimental data with mathematical modeling and advanced computational techniques. Mathematical models play a key role, since, among other functions, they summarize the currently available knowledge in a way that makes it possible to derive experimentally verifiable predictions. Due to the increasing amount of “omic” data available from high-throughput techniques, there is a need for large-scale model building methods.
Model building is a complex task that usually follows an iterative process[1–7]. It begins with the definition of the purpose of the model, that is, with the determination of the questions that the model should be able to answer. This conditions the modeling framework and the information that must be included in the model. The next step is to propose a mathematical structure with the necessary level of detail, which will in general include a number of unknown, nonmeasurable parameters. An estimation of these parameters is needed in order to obtain quantitative predictions; this is the next step, which is commonly known as parameter estimation or identification, data fitting, or model calibration[1, 4–6]. The final step is model validation, which entails testing the model with new data; if this reveals modeling errors, the process should be iteratively repeated.
In recent years a number of approaches to large-scale kinetic modelling have been presented. Jamshidi and Palsson[8] presented a procedure to construct dynamic network models in a scalable way using metabolomic data mapped onto existing stoichiometric models, thus incorporating kinetics and regulation into those stoichiometric models. Liebermeister and Klipp[9, 10] introduced a simplified, general rate law named convenience kinetics. It can be derived from a random-order enzyme mechanism, and it may be used to obtain a dynamical model from a biochemical network. The resulting model has plausible biological properties. More recently, the same authors proposed five modular rate laws[11], characterized by different formulae of their denominators, as a way of parameterizing metabolic network models. Tran et al[12] advocated the use of an ensemble of dynamic models. The models in the ensemble share the same mechanistic framework, span the space of all kinetics allowable by thermodynamics, and reach the same steady state. The size of the ensemble is reduced by acquiring data, converging to a smaller and more predictive set of models, which are able to describe relevant phenotypes upon enzyme perturbations. The well-known flux balance analysis method (FBA) has been widely used for large-scale analysis of metabolic networks. In[13] it was extended in order to incorporate transcriptional regulation (regulatory FBA, or rFBA); further developments also included signal transduction networks (integrated FBA or iFBA[14], and integrated dynamic FBA or idFBA[15]). Another integration of FBA with kinetic information was carried out by Smallbone et al. In[16] they presented a method for building a parameterized genome-scale kinetic model of a metabolic network, based solely on the knowledge of reaction stoichiometries. Fluxes were estimated by flux balance analysis and allowed to vary dynamically according to lin-log kinetics. This method was applied[17] to a model of S. cerevisiae comprising 956 metabolic reactions and 820 metabolites. Kotte et al[18] modeled the coupling of enzymatic and transcriptional regulation of E. coli’s central metabolism using differential equations. The resulting medium-scale model is able to explain, from the interplay of known interactions, the adjustments of metabolic operation between glycolytic and gluconeogenic carbon sources. Another interesting contribution to the problem of building large-scale mathematical models of biological systems can be found in[19]. It extends an ordinary differential equation (ODE) model of ErbB signaling pathways to include 28 proteins, 499 ODEs, 828 reactions, and 229 parameters. The parameter space size is reduced to 75, following an initial sensitivity analysis based on estimates of nominal parameter values. Parameter estimation is then carried out by means of extensive computations using a stochastic method (simulated annealing).
Calibrating such large-scale models is a very difficult task, mainly due to three characteristics of these problems:
 1.
They are highly nonlinear, which creates multimodality, so standard local methods (e.g. Levenberg-Marquardt or Gauss-Newton) converge to local solutions. This must be overcome with the use of global methods capable of finding the optimum in a rugged landscape.
 2.
There is a large number of parameters to be estimated. This is an especially problematic issue, since the necessary computational effort increases very rapidly with the problem size. The increase may be exponential, thus preventing methods that work well for a limited number of parameters from being applied to realistically sized problems. This means that global deterministic methods, which guarantee finding the global optimum, fail to provide the solution in reasonable computing times. Hence, stochastic methods must be used instead.
 3.
The information content of the available experimental data is frequently scarce, which might cause an identifiability problem (i.e., different combinations of parameter values may produce similar model outputs). An example illustrating this point can be found in the aforementioned work of Chen et al [19], where the estimation of parameters yielded the result that the model is nonidentifiable.
Due to these difficulties, no full and proper model calibration has been performed so far in the large-scale kinetic models referenced above. In many cases, a large subset of parameters were taken, where available, from previous models or databases, but this procedure, although practical for some applications, is not equivalent to a proper calibration of the large-scale model. Our main objective in this paper is to present a novel metaheuristic which exploits cooperative parallelism in order to surmount the difficulties mentioned above. Parallel implementations of global optimization methods have been used recently in systems biology (see[21] and references therein). Here we have improved this concept by incorporating cooperation of individual search threads.
The novelty of our approach has the following two main pillars:

The use of an efficient global optimization method for solving largescale parameter estimation problems, based on extensions of the scatter search metaheuristic[22–25].

A parallel implementation of the algorithm incorporating a cooperative strategy, which means that there is an exchange of information between a set of individual optimization programs, or threads, running in parallel. As a result, this cooperation not only speeds up the algorithm, but also alters its systemic properties, thus yielding performance improvements more than proportional to the increase in computing power.
The capabilities of these novel methods are illustrated considering two case studies based on E. coli, which are described below.
This paper is structured as follows: after the presentation of the problem statement, we discuss the need for global optimization methods, and then we present a cooperative parallel method which is able to handle the challenges arising from this class of problems. Results for two different case studies are then presented and discussed, finally arriving at a set of conclusions.
Problem statement
Given a model of a nonlinear dynamic system and a set of experimental data, the parameter estimation problem consists of finding the vector of decision variables p (unknown model parameters) that minimizes a cost function that measures the goodness of the fit of the model predictions with respect to the data, subject to a number of constraints. The output state variables that are measured experimentally are called observables. The cost function is a weighted least-squares measure of the form

$$J(p)=\sum_{\epsilon =1}^{{n}_{\epsilon}}\sum_{o=1}^{{n}_{o}^{\epsilon}}\sum_{s=1}^{{n}_{s}^{\epsilon ,o}}{\left({ym}_{s}^{\epsilon ,o}-{y}_{s}^{\epsilon ,o}(p)\right)}^{T}\,W\,\left({ym}_{s}^{\epsilon ,o}-{y}_{s}^{\epsilon ,o}(p)\right)$$

where ${n}_{\epsilon}$ is the number of experiments, ${n}_{o}^{\epsilon}$ is the number of observables per experiment, and ${n}_{s}^{\epsilon ,o}$ is the number of samples per observable per experiment. The measured data are denoted as ${ym}_{s}^{\epsilon ,o}$ and the corresponding model predictions as ${y}_{s}^{\epsilon ,o}(p)$. Finally, W is a scaling matrix used to balance the contributions of the observables (usually by scaling each temporal series with respect to its maximum value).
The minimization is subject to the system dynamics and to additional constraints:

$$f(\dot{x},x,p,t)=0,\quad x({t}_{0})={x}_{0},\quad y=g(x,p)$$
$$h_{eq}(x,y,p)=0,\quad h_{in}(x,y,p)\le 0,\quad {p}^{L}\le p\le {p}^{U}$$

where g is the observation function, x is the vector of state variables with initial conditions x_{0}, f is the set of differential and algebraic equality constraints describing the system dynamics (that is, the nonlinear process model), h_{ eq } and h_{ in } are equality and inequality constraints that express additional requirements for the system performance, and p^{ L } and p^{ U } are lower and upper bounds for the parameter vector p.
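The weighted least-squares cost described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `simulate` function, the data layout, and the use of a per-observable weight vector (in place of a full scaling matrix W) are simplifying assumptions.

```python
import numpy as np

def lsq_cost(p, experiments, simulate, w):
    """Weighted least-squares cost J(p), summed over experiments,
    observables and samples.

    p           : parameter vector
    experiments : list of dicts, each holding measured data 'ym'
                  (array of shape n_samples x n_observables)
    simulate    : hypothetical user function simulate(p, exp) returning
                  model predictions with the same shape as exp['ym']
    w           : per-observable weights (e.g. 1/max of each time series)
    """
    J = 0.0
    for exp in experiments:
        residual = exp['ym'] - simulate(p, exp)  # data minus prediction
        J += np.sum((w * residual) ** 2)         # scaled squared residuals
    return J
```

In practice `simulate` would integrate the ODE model for the experiment's conditions; here it is left as a user-supplied callable so the cost structure stays visible.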
As was mentioned in the Background section, in systems biology models this problem is often multimodal (nonconvex), due to the nonlinear and constrained nature of the system dynamics. Hence, standard local methods usually fail to obtain the global optimum. As an alternative, one may choose a multistart strategy, where a local method is used repeatedly, starting from a number of different initial guesses for the parameters. However, this approach is usually not efficient for realistic applications, and global optimization (GO) techniques, such as the ones presented in the next section, need to be used instead.
Methods
To efficiently solve the calibration problem there is a need for global optimization methods whose performance does not deteriorate when the number of parameters to be estimated is very large (as occurs with most methods). Global optimization methods can be divided into deterministic and stochastic. Deterministic global optimization methods guarantee that the solution is the global optimum, but the computational effort they require can make them unaffordable for large-scale problems. Stochastic global optimization methods, on the other hand, do not guarantee the global optimality of the solution, but they are frequently capable of finding excellent solutions in reasonable computation times. A particularly efficient class of stochastic global optimization methods are the so-called metaheuristic approaches, which combine mechanisms for exploring the search space and exploiting previously obtained knowledge. Here we propose a method, Cooperative enhanced Scatter Search (CeSS), based on extensions and cooperative parallelization of the scatter search metaheuristic[22–25].
Global optimization via enhanced scatter search
Scatter search can be classified as an evolutionary optimization method which, like many other metaheuristics, arose in the context of integer optimization[26]. Several adaptations to continuous problems have been developed in recent years. The application of the method to parameter estimation in systems biology problems has provided excellent results[22].
Scatter search is a population-based algorithm which, compared with other methods like genetic algorithms, makes use of a low number of population members, called the “Reference Set” (RefSet) in this context. Besides, scatter search includes the so-called “improvement method”, which usually consists of a local search to speed up the convergence to optimal solutions.
The method starts by creating an initial population of diverse solutions within the search space (Figure 1a). From them, the best solutions in terms of quality are selected to create the initial RefSet (in red in Figure 1b). The rest of the solutions in the RefSet are chosen randomly (in green in Figure 1b), although in some advanced implementations they may be chosen following a criterion of maximum diversity to cover a broader area of the search space. Once the RefSet solutions have been selected, the remaining diverse solutions generated in the first step are deleted and the so-called “subset generation method” starts. It consists of combining sets of RefSet solutions to create new ones. In our scheme, we combine every solution in the RefSet with the other RefSet members by generating hyperrectangles defined by their relative position and distance, and creating new solutions inside them. This step is illustrated in Figure 1c. Each RefSet member creates one hyperrectangle with every other RefSet member, thus generating b−1 new solutions (where b is the RefSet size) per RefSet member. After this procedure, all the RefSet members have a set of offspring solutions around them covering different distances and directions (Figure 1d). If the best offspring solution outperforms the parent solution (which is the case in Figure 1d, with the best offspring solution represented as a star), the replacement (or update) is carried out. Otherwise, the same RefSet member stays in the population in the next iteration. Although the procedure is illustrated for just one RefSet member, the same scheme applies to every solution in the RefSet. After the RefSet update, the algorithm applies the “solution combination method” again, repeating the process until a stop criterion is met.
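The combination step just described can be sketched as follows. This is a simplified illustration, and one assumption should be noted: the hyper-rectangle is taken here as the plain axis-aligned box spanned by the two parent solutions, whereas eSS shapes it according to the relative position, distance, and quality of the parents.

```python
import numpy as np

def combine_refset_member(parent, refset_others, rng):
    """Create one offspring per (parent, other) pair by sampling a point
    uniformly inside the hyper-rectangle spanned by the two solutions."""
    offspring = []
    for other in refset_others:
        lo = np.minimum(parent, other)   # lower corner of the box
        hi = np.maximum(parent, other)   # upper corner of the box
        offspring.append(rng.uniform(lo, hi))
    return offspring
```

For a RefSet of size b, calling this for each member yields the b−1 offspring per member mentioned in the text.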
In this work, a novel extended implementation of the enhanced scatter search metaheuristic (eSS)[24, 25], has been used. Its pseudocode is shown in Algorithm 1.
Algorithm 1
Basic pseudocode of eSS
Set parameters: dim_refset, local.n2, balance, ndiverse
Initialize n_{ stuck }, neval
Create set of ndiverse solutions
Generate initial RefSet with dim_refset solutions with high quality and random elements
repeat
Sort RefSet by quality [x^{1},x^{2},…,x^{ dim_refset }] so that f(x^{ i }) ≤ f(x^{ j }) where i,j∈[1,2,…,dim_refset] and i < j
if max$\left(\mathit{\text{abs}}\left(\frac{{x}^{i}-{x}^{j}}{{x}^{j}}\right)\right)\le \epsilon $ with i < j then
Replace x^{ j }by a random solution
end if
for i = 1 to dim_refset do
Combine x^{ i }with the rest of population members to generate a set of dim_refset new solutions,
offspring ^{ i }
${x}_{\text{off}}^{i}=$ best solution in offspring^{ i }
if${x}_{\text{off}}^{i}$ outperforms x^{ i }then
Label x^{ i }
x_{temp} = x^{ i }
improvement = 1
∧ = 1
while $f\left({x}_{\text{off}}^{i}\right)<f\left({x}_{\text{temp}}\right)$ do
Create a new solution, x_{new}, in the hyperrectangle defined by$\left[{x}_{\text{off}}^{i}-\frac{{x}_{\text{temp}}-{x}_{\text{off}}^{i}}{\wedge},{x}_{\text{off}}^{i}\right]$
${x}_{\text{temp}}={x}_{\text{off}}^{i}$
${x}_{\text{off}}^{i}={x}_{\text{new}}$
improvement = improvement + 1
if improvement = 2 then
∧ = ∧/2
improvement = 0
end if
end while
end if
end for
offspring=$\left(\begin{array}{c}{\mathbf{\text{offspring}}}^{1}\\ {\mathbf{\text{offspring}}}^{2}\\ \dots \\ {\mathbf{\text{offspring}}}^{n}\end{array}\right)$ with n = dim_refset
if neval ≥ local.n2 and at least one local solution has been found then
Sort offspring solutions by quality [x_{1,1},x_{2,1},…,x_{m,1}] where m = n(n−1) and f(x_{i,1}) ≤ f(x_{j,1}) where i,j∈[1,2,…,m] and i < j
Compute the minimum distance between each element i ∈ [1,2,…,m] in offspring and all the local optima found so far, d(x_{ i })
Sort offspring solutions by diversity [x_{1,2},x_{2,2},…,x_{m,2}] with d(x_{i,2}) ≥ d(x_{j,2}) where i,j∈[1,2,…,m] and i < j
Choose z as the offspring member balancing quality and diversity according to balance
Perform a local search from z to obtain z^{∗}
if z^{∗} is a new local optimum then
Update list of found local optima adding z^{∗}
end if
if f(z^{∗}) < fbest then
Update xbest, fbest
end if
neval = 0
end if
neval = neval + function evaluations of current iteration
Replace labeled population members by their corresponding${x}_{\text{off}}^{i}$ and reset their corresponding n_{ stuck }(i)
Add one unit to the corresponding n_{ stuck }(j) of the not labeled population members
if any of the n_{ stuck }values > nchange then
Replace those population members by random solutions and reset their n_{ stuck }values
end if
until stopping criterion is met
Apply final local search over xbest and update it
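The inner while loop of Algorithm 1 (the intensification that keeps moving past an improving offspring) can be sketched in Python as follows. This is our reading of the pseudocode, not the reference implementation; uniform sampling inside the extension hyper-rectangle and the clipping to the search box are assumptions.

```python
import numpy as np

def go_beyond(f, x_parent, x_off, lb, ub, rng):
    """While the new point keeps improving, sample inside the box that
    extends beyond it (away from the parent), halving the extension
    parameter Lambda after every two consecutive improvements."""
    x_temp = np.asarray(x_parent, dtype=float)
    x_off = np.asarray(x_off, dtype=float)
    improvement, lam = 1, 1.0
    while f(x_off) < f(x_temp):
        far = x_off - (x_temp - x_off) / lam        # corner beyond x_off
        lo, hi = np.minimum(far, x_off), np.maximum(far, x_off)
        x_new = np.clip(rng.uniform(lo, hi), lb, ub)
        x_temp, x_off = x_off, x_new                # advance along direction
        improvement += 1
        if improvement == 2:
            lam /= 2.0                              # enlarge the extension box
            improvement = 0
    return x_temp                                   # last improving point
```

Dividing Lambda by two makes the extension box larger, so the search takes progressively bolder steps as long as the promising direction keeps paying off.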
The innovative mechanisms implemented in different parts of the method, as well as the appropriate local search selection, make the algorithm especially suitable for solving large-scale optimization problems. These mechanisms, which address the most important question in global optimization (i.e., the balance between intensification, or local search, and diversification, or global search), are described in the following points:

Population-based algorithms with a large population size can become inefficient due to the large number of function evaluations they perform in each iteration. On the other hand, if the population size is too small, the algorithm may converge prematurely to suboptimal solutions. The enhanced scatter search method uses a small population size, even for large-scale problems, while allowing more search directions than classical scatter search designs thanks to the hyperrectangle-based combination scheme. This preserves diversity in the search without increasing the number of evaluations per iteration.

Another key aspect in population-based algorithms is the population update scheme. (μ + λ) strategies[27] (i.e., where members of the following population are selected from the set including both old population members and new offspring solutions) may tend to prematurely stagnate the population in suboptimal solutions. (μ, λ) strategies (i.e., where members of the following population are selected only among the new offspring solutions) may need a high number of function evaluations to converge. On the other hand, a (1 + 1) strategy applied to every population member, as depicted in Figures 1a-d, can take advantage of both mechanisms. This is the strategy used in eSS and, again, provides a good balance between intensification (i.e., local search) and diversification (i.e., global search).

The specific intensification mechanisms are crucial in large-scale optimization in order to reduce the computational times as much as possible. eSS implements an intensification mechanism already in the global phase, which exploits the promising directions defined by a pair of solutions in the RefSet. Besides, the use of a local search method to accelerate the convergence is mandatory in this type of application. Gradient-based methods are the most widely used local methods. However, for dynamic problems they might be inefficient due to some characteristics of this type of problem (see[28] for more details). Furthermore, the calculation of the numerical sensitivities carried out by these methods may become unaffordable when the problem dimension is too high. A heuristic local search method seems to be the most convenient choice when dealing with large-scale problems because, even if local convergence might not be ensured, avoiding the calculation of sensitivities or search directions may save a large amount of computation time. For this reason, and after extensive initial tests, we have selected a heuristic method known as “Dynamic Hill Climbing”[29], available in the eSS framework. This method provided the best results for the problems tested.

Even if intensification strategies should be favoured in large-scale optimization, due to the limited computational time usually available in real applications, diversification strategies should not be neglected, because stagnation of the search may appear even in early stages of the process. In this regard, eSS implements diversification strategies which make use of memory to infer whether a solution is stagnated in a local optimum or is too close to previously found solutions. When such solutions are identified, they are automatically replaced to avoid spending computational effort searching in areas already explored.
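This memory-based replacement can be sketched as follows, mirroring the n_stuck/nchange bookkeeping of Algorithm 1. The function name and the uniform re-seeding are our own illustrative choices.

```python
import numpy as np

def refresh_stagnated(refset, n_stuck, nchange, lb, ub, rng):
    """Replace RefSet members whose stagnation counter n_stuck exceeds
    nchange by fresh random solutions, resetting their counters."""
    for i in range(len(refset)):
        if n_stuck[i] > nchange:
            refset[i] = rng.uniform(lb, ub)  # re-seed in the search box
            n_stuck[i] = 0
    return refset, n_stuck
```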
Cooperative optimization
A common way of reducing the computational cost of a given algorithm is to parallelize it. This entails performing several of its calculations simultaneously on different processors. Except for some special cases, usually called “embarrassingly parallel”, it is not trivial to separate a problem into parallel tasks. Here we are interested in the parallelization of a metaheuristic optimization algorithm, the enhanced scatter search method described in the previous section.
Before describing the proposed methodology, we clarify the use of a few terms in order to avoid confusion. We consider a context where a number of processors are available for computing and can work simultaneously (in parallel). All of the processors are assumed to be physically identical, that is, they have the same hardware characteristics. A program running on a processor is called a thread; we use both terms interchangeably. In our methodology we use a master processor, whose task is to coordinate the remaining processors (slaves). All of the slave processors run the same program, which means they perform similar calculations, but using different data sets and different settings.
There are not many examples of research in the area of parallelization of metaheuristics exploiting cooperation in the way we present here. In[30] three different parallelizations of the scatter search were proposed and compared. In the first one, each local search was parallelized. In the second one, several local searches were carried out in parallel. In these two cases it was possible to reduce the computational time. A third option was to run the algorithm for different populations in parallel, but without any communication between threads.
More recently, a more ambitious approach was presented in[31], where different kinds of metaheuristics were combined in order to profit from their complementary advantages. Information was shared between algorithms, and the sharing was dynamically tuned, so that the most successful algorithms were allowed to share more information. For a detailed review of the state of the art in parallelization of metaheuristics, see[32].
Sharing information can be seen as a way of cooperating. Such cooperation produces more than just speed-up: it can change the systemic properties of an algorithm, and therefore its macroscopic behaviour. This was acknowledged in[33], where four parameters that may affect this behaviour were identified: (i) the information gathered and made available for sharing; (ii) the identification of the programs that share information; (iii) the frequency of information sharing among the sequential programs; and (iv) the number of concurrent programs. In the strategy presented here, these four aspects are defined as follows:
 1.
Information available for sharing: the best solution found and, optionally, the reference set (RefSet), which contains information about the diversity of solutions.
 2.
Threads that share information: all of them.
 3.
Frequency of information sharing: the threads exchange information at a fixed interval τ.
 4.
Number of concurrent programs: η.
Each of the η threads has a fixed degree of “aggressiveness”. “Conservative” threads emphasize diversification (global search) and are used to increase the probability of finding a feasible solution, even if the parameter space is “rugged” or weakly structured. “Aggressive” threads emphasize intensification (local search) and speed up the calculations in “smoother” areas. Note that both conservative and aggressive threads should be able to locate the globally optimal solution of the class of problems considered here, but their relative efficiency will vary depending on the region of the parameter space where the individual threads are operating. In this way, we achieve a synergistic gain. Communication, which takes place at fixed time intervals, enables each thread to benefit from the knowledge gathered by the others. This knowledge may include not only information about the best solution found so far, but also about the sets of diverse parameter vectors that may be worth trying to improve the solution. Thus this strategy has several degrees of freedom that have to be fixed: the time between communications (τ), the number of threads (η), and the strategy adopted by each thread (Ψ). These adjustments should be chosen carefully depending on the particular problem to be solved. In the following lines we give some guidelines for doing this.
Time between information sharing, τ
The time between information sharing is defined as

$$\tau =\mathrm{log}_{10}\left({n}_{\mathit{\text{eval}}}\right)$$

where n_{ eval } is the maximum number of function evaluations allowed between cooperation instants. Therefore τ, as defined above, is a dimensionless and machine-independent parameter that sets the duration of the intervals between information sharing. The logarithm is used for the convenience of avoiding very large numbers. For all the problems considered in this study, we have empirically found that a value in the interval 2.5 < τ < 3.5, with a default value of 2.5, results in a good trade-off between efficiency and robustness of the cooperative strategy.
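Assuming the relation between τ and the evaluation budget is the base-10 logarithm described above (an assumption of this sketch), the budget between sharing instants is recovered as 10^τ; the helper name below is ours.

```python
def evals_between_sharing(tau):
    """Maximum number of function evaluations between two cooperation
    instants, assuming the dimensionless parameter tau = log10(n_eval)."""
    return int(round(10.0 ** tau))
```

So the recommended interval 2.5 < τ < 3.5 corresponds to roughly a few hundred to a few thousand evaluations between sharing instants.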
Number of concurrent threads, η
As η increases, the expected speed-up increases too. However, this increase cannot be linear for an arbitrarily large number of threads, for several reasons. One reason is that, as the number of threads increases, so does the communication overhead. More importantly, the risk of overlapping (that is, carrying out similar searches in different threads) also increases, due to the relatively small size of the RefSet. Hence, when the number of threads is very large, the resulting speed-up is smaller than the increase in computational effort. For the problems considered here, we have empirically found that η = 10 led to good results, and it can thus be recommended as a default value.
Settings (strategy) for each thread, Ψ
For cooperation to be efficient, every thread running in parallel must perform different calculations. Due to the random nature of metaheuristic algorithms, this can be achieved simply by choosing different random initializations, or “seeds”, for each thread. This results first in different sets of initial guesses, and subsequently in the exploration of different points of the search space. However, this is not the only difference that may be made between threads: we can also select a different behaviour (more or less aggressive) for each of the threads. An “aggressive” thread performs frequent local searches, trying to refine the solution very quickly, and keeps a small reference set of solutions. This means that it will perform well in problems with parameter spaces that have a smooth shape. “Conservative” threads, on the other hand, have a large reference set and perform local searches only sporadically. This means that they spend more time combining parameter vectors and exploring the different regions of the parameter space. Thus, they are more robust, and hence appropriate, for problems with rugged parameter spaces. Since the exact nature of the problem to consider is always unknown, it is advisable to choose a range of settings that yields conservative, aggressive, and intermediate threads. For the eSS algorithm this can be obtained by tuning the following settings:

Number of elements in the Reference Set (dim_refset).

Minimum number of iterations of the eSS algorithm between two local searches (local.n2).

Balance between intensification and diversification in the selection of initial points for the local searches (balance).

Number of diverse solutions initially generated (ndiverse).
All these settings have qualitatively the same influence on the algorithm’s behaviour: large values lead to conservative threads; small values, to aggressive threads. The four parameters that define the degree of aggressiveness of each of the η threads can be stored in an array Ψ_{η×4}. By default, we recommend using a broad spectrum of aggressiveness, with values in the ranges 0.5·n_{ par } < dim_refset < 20·n_{ par }; 0 < local.n2 < 100; 0 < balance < 0.5; and 5·n_{ par } < ndiverse < 20·n_{ par }.
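One possible way to spread the η threads over this spectrum is to interpolate each setting linearly between its aggressive and conservative extremes. This parameterization is our own sketch; eSS may sample these ranges differently.

```python
import numpy as np

def make_thread_settings(eta, n_par):
    """Build the eta x 4 array Psi = [dim_refset, local_n2, balance,
    ndiverse], from the most aggressive thread (small values, row 0)
    to the most conservative one (large values, last row), using the
    ranges suggested in the text."""
    frac = np.linspace(0.0, 1.0, eta)   # 0 = aggressive, 1 = conservative
    dim_refset = np.round((0.5 + 19.5 * frac) * n_par)
    local_n2 = np.round(100.0 * frac)
    balance = 0.5 * frac
    ndiverse = np.round((5.0 + 15.0 * frac) * n_par)
    return np.column_stack([dim_refset, local_n2, balance, ndiverse])
```

Here a thread is uniformly aggressive or conservative across all four settings; mixing the settings independently per thread would be another valid design.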
Algorithm 2
Basic pseudocode of the cooperative strategy
MASTER, PROCESSOR 0: Initialization
Set parameters that are common for all threads: τ, η, n_{ iters }
Set array of aggressiveness parameters: Ψ_{η×4} = {dim_refset_{η×1},ndiverse_{η×1},local.n2_{η×1},balance_{η×1}}
Initialize global reference set array: Global_{ ref } = []
for i = 1 to n_{ iters }do
for j = 1 to η do
SLAVE, PROCESSOR j (concurrent processing): run optimization according to Algorithm 1
(Thread j uses aggressiveness settings vector Ψ_{ j,: } and initial points Global_{ ref })
end for
if CPUtime ≥ τ then
for j = 1 to η do
Read results of thread j
if ref_{ j }∉ Global_{ ref }then
Global_{ ref }= [Global_{ ref },ref_{ j }]
end if
end for
end if
end for
Final solution = best solution in Global_{ ref }
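The master-slave loop of Algorithm 2 can be sketched serially as follows. This is a heavily simplified stand-in: the η "threads" run one after another instead of concurrently, and a plain random local search replaces eSS; only the cooperation pattern (pooling results into a shared reference set that re-seeds every thread) is the point of the sketch.

```python
import numpy as np

def cooperative_search(f, lb, ub, eta=4, n_epochs=5,
                       evals_per_epoch=200, seed=0):
    """Serial sketch of the cooperative strategy: eta searchers run
    between sharing instants; at each instant their best points are
    pooled into a global reference set, which re-seeds every searcher."""
    rng = np.random.default_rng(seed)
    dim = len(lb)
    global_ref = [rng.uniform(lb, ub) for _ in range(eta)]
    for _ in range(n_epochs):
        results = []
        for _ in range(eta):                  # slaves (concurrent in CeSS)
            best = min(global_ref, key=f)     # start from shared knowledge
            for _ in range(evals_per_epoch):  # stand-in for one eSS epoch
                step = rng.normal(0.0, 0.1, dim) * (ub - lb)
                cand = np.clip(best + step, lb, ub)
                if f(cand) < f(best):
                    best = cand
            results.append(best)
        # master: merge thread results into the global reference set
        global_ref = sorted(global_ref + results, key=f)[:eta]
    return min(global_ref, key=f)
```

In the real method each slave would also contribute its RefSet (not just its best point), and the exchange would happen through inter-process communication rather than a shared list.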
Results and discussion
As a preliminary validation, we first applied CeSS to a problem taken from the competition on Large Scale Global Optimization of the 2012 IEEE World Congress on Computational Intelligence[34]. Testing the performance of our method with this problem has two advantages: first, the benchmark is well known in the large-scale global optimization community, which facilitates the critical examination of the results; and second, the objective function selected from the benchmark has a smaller computational cost than the ones from our models (despite having one thousand parameters), which allows us to carry out more extensive tests in less time. The results are shown in the supplementary material, see Additional file 1. From these results we conclude that eSS performs well on the benchmark problem, and that its performance is considerably improved by using its cooperative version, CeSS. This confirms the validity of the proposed approach for large-scale global optimization problems.
Overview of model features
                    Model 1     Model 2
Levels              Metabolic   Metabolic, transcriptional
Parameters          918         178
States              115         47
Observables         115         47
Measures            1150        7614
Static / Dynamic    Static      Dynamic
Model 1: E. coli CCM
where u is an enzyme level (mmol), ${k}_{A}^{M}$, ${k}_{B}^{M}$ and ${k}_{C}^{M}$ are reactant constants (mM), and k^{±} are turnover rates (s^{−1}). The CM rate law resembles the convenience kinetics, with a slight difference for molecularities m^{±} ≠ 1 (in convenience kinetics, the denominator term for product C would read $1+\frac{c}{{k}_{C}^{M}}+{\left(\frac{c}{{k}_{C}^{M}}\right)}^{2}$).
The resulting model has 918 parameters: internal metabolite concentrations; activation, inhibition, and Michaelis-Menten constants; and velocity and equilibrium constants. Their values are given in the supplementary material, see Additional file 2. They are estimated by least-squares minimization of the difference between the model predictions and artificially generated data for internal metabolite concentrations (49) and fluxes (66), yielding a state vector with 115 elements.
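The cost function for this model is then a plain sum of squared residuals over the 115 measured quantities. A minimal sketch, with variable names and the two example entries ours rather than taken from the model:

```python
def least_squares_cost(predicted, measured):
    """Sum of squared residuals between model predictions and the
    artificial data (49 concentrations + 66 fluxes = 115 entries
    for Model 1); both arguments map quantity names to values."""
    return sum((predicted[k] - measured[k]) ** 2 for k in measured)

# hypothetical two-entry example: one concentration, one flux
data = {"glucose": 1.2, "v_pgi": 0.8}
cost = least_squares_cost({"glucose": 1.4, "v_pgi": 0.8}, data)
```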
Suspecting that a more sophisticated method may be able to find a better solution, we test our enhanced Scatter Search (eSS) algorithm. Since the multistart calculations were carried out on a single processor, with a total running time of 10 days, it is fair to compare the results with those obtained by 10 processors in one day. We thus select 10 different settings for the eSS algorithm and launch the corresponding 10 optimizations. Since eSS provides the possibility of launching local searches, we select the local method that yields the best results with the multistart procedure, DHC. After one day, the best objective function value achieved is f_{best} = 7.86, which represents a reduction of 73% with respect to what was obtained with the multistart of local searches.
Model 2: E. coli metabolic model including enzymatic and transcriptional regulation
The model calibrated in the previous subsection was purely metabolic. Here we consider a previously published E. coli model[18] that also takes into account the enzymatic and transcriptional regulation layer. It consists of 47 ODEs and 193 parameters (affinity constants, specific activities, Hill coefficients, growth rates, expression rates, etc.), of which 178 are considered unknown and need to be estimated. We have reformulated the model to use it in an optimization context, where the objective function to be minimized is the difference between the simulated concentration profiles obtained with the nominal and the estimated parameters. Upper and lower bounds for the parameters are fixed to values 10 times larger and 10 times smaller than the nominal values, except for the Hill coefficients, which are assumed to lie between 0.5 and 4. More information is included as supplementary material, see Additional file 3.
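The bound construction described above can be sketched as follows; the function name and the example parameters are hypothetical, not taken from the model:

```python
def parameter_bounds(nominal, hill_coefficients):
    """Bounds as described in the text: [nominal/10, 10*nominal] for
    ordinary parameters, and a fixed [0.5, 4] for Hill coefficients."""
    lower, upper = {}, {}
    for name, value in nominal.items():
        if name in hill_coefficients:
            lower[name], upper[name] = 0.5, 4.0
        else:
            lower[name], upper[name] = value / 10.0, value * 10.0
    return lower, upper

# hypothetical parameters: an affinity constant and a Hill coefficient
lb, ub = parameter_bounds({"K_glc": 2.0, "n_hill": 2.0}, {"n_hill"})
```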
This model was created to reproduce the way in which E. coli adapts to changing carbon sources. With this aim, it is subjected to a sequence of three consecutive environments, where the carbon source is first glucose, then acetate, and finally a mixture of both. Under these conditions, the 47 concentration profiles are sampled every 1000 seconds, for a total of 162 time points (45 hours). Therefore the overall number of available samples is 47 × 162 = 7614. The equations are integrated using LSODE (Livermore Solver for Ordinary Differential Equations)[36], which is suited for stiff systems. Integrating these equations is a computationally expensive task; one evaluation of the objective function takes several seconds.
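Structurally, each evaluation of this objective therefore couples an ODE simulation with a least-squares comparison of the sampled profiles. The sketch below uses a single toy decay ODE and a forward-Euler integrator purely to show that structure; the actual model integrates 47 stiff ODEs with LSODE, for which Euler would be a poor choice:

```python
def simulate(params, x0, t_end, dt):
    """Forward-Euler integration of a toy decay ODE dx/dt = -k*x.
    (Illustrative only; stiff systems require an implicit solver
    such as LSODE, as used in the paper.)"""
    k = params["k"]
    x, trajectory, t = x0, [], 0.0
    while t < t_end:
        trajectory.append(x)
        x += dt * (-k * x)  # one explicit Euler step
        t += dt
    return trajectory

def dynamic_cost(params, nominal, x0=1.0, t_end=10.0, dt=0.01, every=100):
    """Least squares between profiles simulated with the estimated and
    the nominal parameters, sampled at regular intervals (cf. the
    1000-second sampling grid used for the real model)."""
    estimated = simulate(params, x0, t_end, dt)[::every]
    reference = simulate(nominal, x0, t_end, dt)[::every]
    return sum((a - b) ** 2 for a, b in zip(estimated, reference))
```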
It must be noted that in[18] a divide-and-conquer approach was adopted for estimating the parameter values of the rate equations. In this approach, the large-scale optimization problem is divided into a set of small subproblems, where only a few parameters are estimated at a time. This reduces the complexity of the task and, consequently, the computation times. However, it is an ad hoc procedure that cannot be applied to most systems biology problems. Here we adopt a general-purpose, large-scale global optimization approach that does not require any special structure in the model equations.
Next, the enhanced Scatter Search (eSS) algorithm is tested. For this particular problem we decided to use eSS with DHC as the local solver because, once again, it offers the best balance between solution quality and computational time. After selecting 10 different settings and letting the corresponding 10 optimizations run for 24 hours, the best objective function value achieved is f_{best} = 2.37·10^{2} and the worst is f_{best} = 1.48·10^{3}. Thus the eSS algorithm clearly outperforms any local method.
Finally, we would like to comment on the default values recommended for the search parameters of the cooperative strategy. The main potential disadvantage of using such default settings for a new problem is that the method could perform worse in two extreme situations: easy (mildly nonconvex) problems, where a simple multistart local method could perform very well, and extremely hard (highly nonconvex) problems, where there is no structure at all and local searches do not help. Since for each new problem we do not have any a priori information about the topology of the search space, we simply recommend a broad spectrum of values for the aggressiveness of the threads, so that even in these extreme situations the cooperative strategy would be able to solve the problem. Of course, any a posteriori information about the topology, obtained after a few runs with the default settings, can be exploited by tuning Ψ. This reasoning applies to new arbitrary problems; we should add that for the particular large-scale parameter estimation problems considered above, the default recommended values have shown good performance.
Conclusions
Summary
Typically, systems biology models have a large number of parameters and are highly nonlinear. Furthermore, the information content of the available experimental data is frequently scarce. As a consequence, it is hard to find the set of parameter values that gives the best fit to experimental data. This task, known as model calibration, is commonly formulated as an optimization problem, which in these cases must be solved with global optimization techniques.
Here we have presented a new method for solving this problem, called Cooperative enhanced Scatter Search (CeSS), which is specifically designed to profit from a parallel environment with several processors. Each of the processors executes a program (or “thread”) which implements the enhanced Scatter Search (eSS) algorithm[24, 25]. This is a state of the art metaheuristic capable of competing successfully with other global optimization methods. The key feature of the strategy presented here is the cooperation between threads, which means that they exchange information at certain fixed instants. This information consists of the reference set of solutions they have found so far, which depends on the best solution found, the shape of the solution space, and the settings of each algorithm.
We have tested this strategy with two large-scale E. coli models of different sizes and characteristics. The first one models the central carbon metabolism (CCM) using a recently proposed common modular rate law[11]. The second is a previously published model that combines metabolism with a transcriptional regulatory layer[18]. In both cases the presented method clearly outperformed several state of the art techniques. The performance and capabilities of the method have also been evaluated using benchmark problems of large-scale global optimization, with excellent results.
Insights
The results for the cooperative strategy presented here show how cooperation between individual parallel search threads modifies the systemic properties of the individual algorithms, improving their performance and outperforming other competitive methods. Importantly, it is a general-purpose framework that can be easily applied to any model calibration problem. Furthermore, the underlying cooperative parallel mechanism is relatively simple and can be extended to incorporate other global and local search solvers, as well as specific structural information for particular classes of problems.
Declarations
Acknowledgements
We are grateful for the financial aid received from the EU through the project “BioPreDyn” (EC FP7-KBBE-2011-5, grant number 289434), from the Ministerio de Economía y Competitividad (and the FEDER) through the projects “MultiScales” (DPI2011-28112-C04-03 and DPI2011-28112-C04-04), and from the CSIC intramural project “BioREDES” (PIE 201170E018). We acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI). We thank Wolfram Liebermeister for supplying the metabolic model of E. coli (“Model 1”).
Authors’ Affiliations
References
 van Riel N: Dynamic modelling and analysis of biochemical networks: mechanism-based models and model-based experiments. Briefings Bioinf 2006, 7(4):364. 10.1093/bib/bbl040
 Stelling J: Mathematical models in microbial systems biology. Curr Opin Microbiol 2004, 7(5):513-518. 10.1016/j.mib.2004.08.004
 Terzer M, Maynard N, Covert M, Stelling J: Genome-scale metabolic networks. Wiley Interdisciplinary Rev: Syst Biol Med 2009, 1(3):285-297. 10.1002/wsbm.37
 Banga J, Balsa-Canto E: Parameter estimation and optimal experimental design. Essays Biochem 2008, 45:195. 10.1042/BSE0450195
 Jaqaman K, Danuser G: Linking data to models: data regression. Nat Rev Mol Cell Biol 2006, 7(11):813-819. 10.1038/nrm2030
 Ashyraliyev M, Fomekong-Nanfack Y, Kaandorp J, Blom J: Systems biology: parameter estimation for biochemical models. FEBS J 2008, 276(4):886-902.
 Balsa-Canto E, Alonso A, Banga JR: An iterative identification procedure for dynamic modeling of biochemical networks. BMC Syst Biol 2010, 4:11. 10.1186/1752-0509-4-11
 Jamshidi N, Palsson B: Mass action stoichiometric simulation models: incorporating kinetics and regulation into stoichiometric models. Biophys J 2010, 98(2):175-185. 10.1016/j.bpj.2009.09.064
 Liebermeister W, Klipp E: Bringing metabolic networks to life: convenience rate law and thermodynamic constraints. Theor Biol Med Modell 2006, 3:41. 10.1186/1742-4682-3-41
 Liebermeister W, Klipp E: Bringing metabolic networks to life: integration of kinetic, metabolic, and proteomic data. Theor Biol Med Modell 2006, 3:42. 10.1186/1742-4682-3-42
 Liebermeister W, Uhlendorf J, Klipp E: Modular rate laws for enzymatic reactions: thermodynamics, elasticities, and implementation. Bioinformatics 2010, 26(12):1528-1534. 10.1093/bioinformatics/btq141
 Tran L, Rizk M, Liao J: Ensemble modeling of metabolic networks. Biophys J 2008, 95(12):5606-5617. 10.1529/biophysj.108.135442
 Covert M, Knight E, Reed J, Herrgard M, Palsson B: Integrating high-throughput and computational data elucidates bacterial networks. Nature 2004, 429(6987):92-96. 10.1038/nature02456
 Covert M, Xiao N, Chen T, Karr J: Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli. Bioinformatics 2008, 24(18):2044. 10.1093/bioinformatics/btn352
 Lee JM, Gianchandani E, Eddy J, Papin J: Dynamic analysis of integrated signaling, metabolic, and regulatory networks. PLoS Comput Biol 2008, 4(5):e1000086. 10.1371/journal.pcbi.1000086
 Smallbone K, Simeonidis E, Broomhead D, Kell D: Something from nothing: bridging the gap between constraint-based and kinetic modelling. FEBS J 2007, 274(21):5576-5585. 10.1111/j.1742-4658.2007.06076.x
 Smallbone K, Simeonidis E, Swainston N, Mendes P: Towards a genome-scale kinetic model of cellular metabolism. BMC Syst Biol 2010, 4:6. 10.1186/1752-0509-4-6
 Kotte O, Zaugg J, Heinemann M: Bacterial adaptation through distributed sensing of metabolic fluxes. Mol Syst Biol 2010, 6:355.
 Chen W, Schoeberl B, Jasper P, Niepel M, Nielsen U, Lauffenburger D, Sorger P: Input-output behavior of ErbB signaling pathways as revealed by a mass action model trained against dynamic data. Mol Syst Biol 2009, 5:239.
 Banga JR: Optimization in computational systems biology. BMC Syst Biol 2008, 2:47. 10.1186/1752-0509-2-47
 Jostins L, Jaeger J: Reverse engineering a gene network using an asynchronous parallel evolution strategy. BMC Syst Biol 2010, 4:17. 10.1186/1752-0509-4-17
 Rodriguez-Fernandez M, Egea J, Banga J: Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinf 2006, 7:483. 10.1186/1471-2105-7-483
 Egea JA, Rodríguez-Fernández M, Banga JR, Martí R: Scatter search for chemical and bioprocess optimization. J Global Optimization 2007, 37(3):481-503. 10.1007/s10898-006-9075-3
 Egea JA, Balsa-Canto E, García MSG, Banga JR: Dynamic optimization of nonlinear processes with an enhanced scatter search method. Ind & Eng Chem Res 2009, 48:4388-4401. 10.1021/ie801717t
 Egea JA, Martí R, Banga JR: An evolutionary method for complex-process optimization. Comput Operations Res 2010, 37(2):315-324. 10.1016/j.cor.2009.05.003
 Glover F, Laguna M, Martí R: Fundamentals of scatter search and path relinking. Control and Cybernetics 2000, 39(3):653-684.
 Beyer HG, Schwefel HP: Evolution strategies - a comprehensive introduction. Nat Comput 2002, 1:3-52. 10.1023/A:1015059928466
 Banga JR, Moles CG, Alonso AA: Global optimization of bioprocesses using stochastic and hybrid methods. Front Global Optimization 2003, 74:45-70.
 De La Maza M, Yuret D: Dynamic hill climbing. AI Expert 1994, 9:26-26.
 García-López F, Melián-Batista B, Moreno-Pérez JA, Moreno-Vega M: Parallelization of the scatter search for the p-median problem. Parallel Computing 2002, 29(5):575-589.
 Vrugt JA, Robinson B: Improved evolutionary optimization from genetically adaptive multimethod search. Proc Natl Acad Sci USA 2007, 104(3):708-711. 10.1073/pnas.0610471104
 Crainic T, Toulouse M: Parallel strategies for meta-heuristics. Handb Metaheuristics 2010, 57:497-541.
 Toulouse M, Crainic TG, Sansó B: Systemic behavior of cooperative search algorithms. Parallel Comput 2004, 30:57-79. 10.1016/j.parco.2002.07.001
 IEEE World Congress on Computational Intelligence (CEC@WCCI-2012): Competition on Large Scale Global Optimization. 2012. http://staff.ustc.edu.cn/~ketang/cec2012/lsgo_competition.htm
 Kanehisa M, Goto S: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 2000, 28:27. 10.1093/nar/28.1.27
 Hindmarsh AC: LSODE and LSODI, two new initial value ordinary differential equation solvers. ACM SIGNUM Newsletter 1980, 15(4):10-11. 10.1145/1218052.1218054
 EMBL-EBI: BioModels Database. http://www.ebi.ac.uk/biomodelsmain/
 Yang Z, Tang K, Yao X: Large scale evolutionary optimization using cooperative coevolution. Inf Sci 2008, 178(15):2985-2999. 10.1016/j.ins.2008.02.017
 Yang Z, Tang K, Yao X: Multilevel cooperative coevolution for large scale optimization. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2008, IEEE World Congress on Computational Intelligence). IEEE; 2008:1663-1670.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.