Identifying optimization strategies for increasing strain productivity should be possible by applying optimization methods to detailed kinetic models of the target metabolism. Thus, a rational approach would pinpoint the changes to be done - e.g. by modulating gene expression - in order to achieve the desired biotechnological goals [1–4]. To build such models we can either start from a detailed description of the underlying processes (bottom-up strategy) or we can fit kinetic models to experimental data obtained in vivo (top-down strategy).
The bottom-up approach was the original strategy for model building in the biological sciences. Bottom-up kinetic models require information that is seldom available, despite the increasing amount of kinetic data contained in a growing set of databases (for example see [5, 6] and the webpage http://kinetics.nist.gov/kinetics/index.jsp). Even in the best case scenarios where kinetic data are available, the data have often been obtained in different labs and under in vitro conditions that are hardly ever comparable or representative of the situation in vivo. In addition, models built using this strategy often fail to adequately reproduce the known behavior of the target system [7–10]. With the accumulation of time-series data, which were originally generated from the accurate measurement of transient responses, top-down modeling became viable as an alternative to the bottom-up strategy . However, top-down modeling also faces important difficulties. For example, regulatory interactions between metabolites and enzymes are very poorly characterized and most metabolic maps lack such crucial information. Therefore, for a given network structure (i.e. a stoichiometric description) obtained from databases, a large number of alternative regulatory patterns may exist that account for the observed experimental data . Model discrimination among the alternative regulatory patterns requires appropriate experimental design. However, this is seldom considered when performing the time series measurements. Last, but not the least, parameter identifiability in highly non-linear models can be problematic (for a review see ).
An additional issue that is common to models built using both strategies is that such detailed kinetic models include non-convexities that lead to the existence of multiple local optima in which standard non-linear optimization algorithms may get trapped during the search. Several stochastic and deterministic global optimization methods have been proposed to overcome this limitation . Deterministic methods, which are the only ones that can rigorously guarantee global optimality, rely on the use of convex envelopes or underestimators to formulate lower-bounding convex problems that are typically combined with spatial branch and bound strategies. Most of these methods are general purpose and assume special structures in the continuous terms of the mathematical model. Because of this, they can encounter numerical difficulties in specific metabolic engineering systems that require the optimization of kinetic models with a large number of non-convexities of different nature.
Given all these issues, it is hardly surprising that linear stoichiometric models have emerged as the most popular tool to analyze genome-wide metabolic networks using optimization techniques. Linear optimization problems can be solved using very fast and efficient algorithms [15, 16] that are implemented in almost every kind of computer, ranging from laptops to cloud computing centers. In addition, such models require a relatively small amount of information.
The possibility of condensing information about a very large network in a compact form enabled stoichiometric models to provide interesting insights in many different cases. However, the apparent simplicity in building and analyzing stoichiometric models comes at the cost of neglecting regulatory signals, metabolite levels and dynamic constraints. Accounting for these features in a dynamic way requires using more detailed, non-linear, mathematical models [17, 18].
These models go a step further than stoichiometric models by incorporating regulatory influences through a set of ordinary differential equations that can account for the system's dynamics. Building such models is often impossible because the appropriate functional form that needs to be used to describe the dynamical behavior of specific processes is in general unknown. Modeling strategies based on systematic approximated kinetic representations, such as power-laws [19–22], Saturating and Cooperative , or convenience kinetics , side-step this issue by providing uniform forms that are guaranteed to be accurate over a range of conditions and reduce the amount of information required to build the models. Because of the regularity in the form of the mathematical function, models based on approximate formalisms can be automatically built from the reaction scheme of the target system. The model parameters can subsequently be estimated from experimental data using different procedures [13, 25].
Although building and analyzing of comprehensive genome-wide detailed models is still not viable in most cases (see however [26, 27]), developing ways to extend large scale optimization analysis to larger and more realistic non-linear kinetic models is an important part of the future of systems biology . In fact, the optimization of certain types of non-linear problems can already be solved very efficiently and geometric programming problems with up to 1,000 variables and 10,000 constraints can be solved in minutes on a personal computer.
Efficient global optimization techniques are available for power-law models [1, 28–30], either in S-system form or in Generalized Mass Action (GMA) form (for a review see ). In the case of S-system models, a simple logarithmic transformation brings the model to a linear form . In the case of GMA models, the problem can be efficiently solved using branch-and-bound [28, 32] and outer-approximation techniques [29, 30].
The usefulness of the global optimization techniques developed for GMA models has been shown in the analysis of the adaptive response of yeast to heat shock [29, 33]. In essence, starting with a GMA model and considering a set of constraints on flux and metabolite values, we can obtain: (i) The pattern of enzyme activities that maximizes a given objective, (ii) The region of feasible changes in enzyme activities so that the model fulfills a set of constraints on fluxes, metabolites, maximum allowable change in activity, etc., and (iii) A heat map of how the objective function changes within the feasible region. These results share some similarities with those produced with stoichiometric models, but incorporate many additional features.
Based on ideas similar to those that led to the development of the power-law formalism, Sorribas et al.  proposed a new Saturable and Cooperative (SC) formalism, that extends the power-law representation to include cooperativity and saturation. Although models built using this new formalism loses some of the simplicity inherent to the analysis of S-systems and GMA models, they tend to be accurate over a wider range of conditions than both the S-System and GMA representations . Thus, it is important to enlarge the scope of global optimization methods developed for power-law models in order to deal with the SC formalism and analyze under which situations the later models behave better than the former.
Optimization of SC models faces a number of practical problems common to kinetic non-linear models [34, 35]. To sidestep these problems, and in order to be able to use global optimization methods developed for power-law models, we will use a technique called recasting. Recasting permits the exact transformation of a continuous non-linear model with an arbitrary form into a canonical GMA model [36, 37]. This transformation is typically performed by increasing the number of variables of the original model. Through this technique, arbitrary non-linear models can be represented using a canonical form such as GMA or S-system that can be used for simulation and optimization purposes, which opens the door for effectively extending the optimization and feasibility analysis originally devised for GMA models to other detailed kinetic models.
In this paper, and as a first step to define a framework for optimization of non-linear models with arbitrary form and extend FBA and related approaches to detailed kinetic models, we shall show the practical utility of recasting SC models into GMA models for optimization purposes. This technique is similar to the symbolic reformulation algorithm proposed by Smith and Pantelides . Our method, however, focuses on obtaining a power-law representation that greatly facilitates global optimization, instead of continuing with the recasting until converting the model to a standard form containing linear constraints and a set of nonlinearities corresponding to bilinear product, linear fractional, simple exponentiation and univariate function terms. After recasting the model to the canonical form, we can apply any of the optimization strategies we have presented for GMA models [29, 32] to obtain the global optimum of the original SC problem.