Optimization strategies for metabolic networks
© Domingues et al. 2010
Received: 5 March 2010
Accepted: 13 August 2010
Published: 13 August 2010
Skip to main content
© Domingues et al. 2010
Received: 5 March 2010
Accepted: 13 August 2010
Published: 13 August 2010
The increasing availability of models and data for metabolic networks poses new challenges in what concerns optimization for biological systems. Due to the high level of complexity and uncertainty associated to these networks the suggested models often lack detail and liability, required to determine the proper optimization strategies. A possible approach to overcome this limitation is the combination of both kinetic and stoichiometric models. In this paper three control optimization methods, with different levels of complexity and assuming various degrees of process information, are presented and their results compared using a prototype network.
The results obtained show that Bi-Level optimization lead to a good approximation of the optimum attainable with the full information on the original network. Furthermore, using Pontryagin's Maximum Principle it is shown that the optimal control for the network in question, can only assume values on the extremes of the interval of its possible values.
It is shown that, for a class of networks in which the product that favors cell growth competes with the desired product yield, the optimal control that explores this trade-off assumes only extreme values. The proposed Bi-Level optimization led to a good approximation of the original network, allowing to overcome the limitation on the available information, often present in metabolic network models. Although the prototype network considered, it is stressed that the results obtained concern methods, and provide guidelines that are valid in a wider context.
Current metabolic engineering processes allow to manipulate metabolic networks to improve the desired characteristics of biochemical systems . These manipulations may lead to the maximization of the normal product yield or redirect the production to a flux that was residual or non-significant in the original network. The high level of uncertainty in metabolic network models knowledge makes it extremely difficult to determine what are the required manipulations needed to attain a given objective. Since an heuristic approach to such problems does not allow to explore the maximum potential of metabolic engineering, two approaches are usually considered when modeling metabolic networks. Kinetic models describe the complete dynamics of the network, and have proven useful to implement optimization and control over the network, such as in . The creation of reliable kinetic models involves the estimation of parameters, the complexity of this task increasing with the size of the network considered.
The second approach models the networks on the basis of reaction stoichiometry. Although easier to obtain, these models lack the ability to directly predict the dynamics of the system.
Several techniques have been proposed to optimize and infer network characteristics from these models. In  a platform that combines many of these methods is presented. Flux Balance Analysis (FBA) allows the determination of the optimal flux distribution on a network described in terms of the stoichiometry of the reactions and yields reliable results in the study of metabolic systems [4–6]. A review of the method can be seen in .
When optimizing a metabolic network for a given objective two distinct problems must be addressed. The first is to find which branch or branches must be manipulated. The second is to determine what type of alterations must be done. Strategies such as OptKnock  and the work in  address the first problem. In this work a strategy for the second problem is described.
The simulation and engineering of metabolic network models typically involves complex optimization procedures. Geometric Programming (GP), one of the techniques used in this paper, is a powerful mathematical optimization tool that can be applied to problems where the objective and constraint functions have a special form . GP is of particular interest because it can solve large scale problems with extreme efficiency and reliability . Furthermore it has been shown that a problem formulated in S-Systems form can be solved with GP after a minimum adaptation .
A common optimization problem is the maximization of the final concentration of a metabolite whose formation competes with the natural objective of the cell (e.g. maximization of biomass). In this work, a prototype network with such behavior is taken as example and the corresponding optimization problem is solved with three alternative methods.
It is stressed that the emphasis of this work is on the methods and not the specific network considered. The key point of the paper consists in establishing properties of a number of optimization methods that may serve as guidelines when considering more complex networks.
This will be further explained in the next section.
An overall view of the problem considered and paper contributions is first presented. Details may then be seen in subsequent sections.
The problem to consider consists in finding a control function, defined over a finite interval of operation time, such that the final concentration of a desired product is maximum. This product is yielded by a metabolic network that, depending on the control function, either produces it or a product that favors cell growth. In order to settle ideas, assume that the control variable u is such that it is constrained to be in the interval [0, 1], with u = 0 corresponding to only production of cell growth product and no production of the desired product, and u = 1 corresponds to the inverse situation. Values of u in between 0 and 1 correspond to a mixed production in a way that depends on the network dynamics.
Since the optimization is with respect to a time function, this is an in finite dimensional problem. However we prove in this paper, using Pontryagin's Maximum Principle , that the optimal control only assumes values of 0 and 1. This is a priori assumed by other authors  and receives now a solid justification. It is a result valid for similar metabolic network problems that aim at optimizing a final yield (e.g. a concentration at the end of the optimization time interval, such as in ) and such that the control enters linearly in the network equations.
The significance of this result consists in the fact that, instead of searching the optimal control among piecewise continuous functions assuming values between 0 and 1, one only has to look functions assuming the extreme values of 0 and 1.
Furthermore, in the case study considered, it is shown that the optimum has only one switch between 0 and 1. Therefore the search for the optimum is reduced to find the switching instant, t reg , that leads to the maximum final yield. Considering the structure of the metabolic network, this is intuitive: the optimum is achieved by first applying all cell resources to population growth and, after t reg , to redirect them to desired production. If t reg is too small, the desired production rate is higher during more time, but the cell population to which it applies is small. If t reg is too big, there are many cells to produce, but they only act during a small time interval. Hence, there is an optimum value for t reg .
As mentioned in the Background section, a major problem is the high level of uncertainty in the knowledge about metabolic network dynamics. In this respect we consider different optimization algorithms that assume various degrees of information about the system to be optimized.
The first is direct optimization. This assumes complete knowledge about the system and is included to establish a benchmark with which other methods may be compared.
The other two methods are variants of a bi-level algorithm designed in order to accommodate missing information on the network kinetics. Both cases differ from the type of inner-optimization: Geometric Programming in one case and Linear Programming in the other. Both methods lead to good approximations of the optimal control, with a slight advantage of the one relying on Geometric Programming.
Here the states x i , i = 1,..., 5 are metabolite concentrations at the network nodes, v i , i = 1,..., 4 are fluxes associated to the metabolic network branches and k is a constant parameter that represents the uptake of x 1. In the equations, u represents a control function that allows to redirect the flux between the branches x 2 → x 3 and x 2 → x 4. Assuming that x 3 represents a precursor of the cellular objective (such as growth) and x 5 the desired product, if u(t) is biased towards the branch of v 2 this yields the formation of x 3 but little or no production of x 5. If u(t) is biased towards the branch of v 3 the production of x 5 will be affected by the low concentration of x 3 (since there is a forward feedback). Thus, there is an optimal profile for u(t) to maximize the concentration of x 5 at the final time t final .
Parameters used in the prototype network.
Direct optimization uses model (2) with the set of parameters from Table 1.
It is clear from Figure 2 that there is an optimal value for the time of regulation that maximizes the yield of x 5. For the network considered, the optimal time of regulation is t reg = 9s. If u(t) switches from 0 to 1 before t reg the formed biomass will not be enough to maximize x 5(t final ). On the other hand, if u(t) switches from 0 to 1 after t reg , there will be more biomass but there time will not be enough time to produce the maximum possible amount of x 5.
The optimal time of regulation obtained with both GP and LP on the inner optimization was t reg = 9. As shown mathematically in the methods section, the optimal control function is either 0 or 1, provided that the dynamics depends linearly on the control and the cost to optimize has only a final term.
For a class of networks in which the yield of the product that favors cell population growth (the "natural" product) competes with the desired product yield, with the manipulated variable affecting linearly the fluxes, it has been shown that the optimal control assumes only extreme values. While the implementation of this optimal control poses no challenge on in silico metabolic networks, on real metabolic networks complex bioengineering skills are required. Gene knockout manipulations do not adequate to this kind of control problem due to the long time scale associated with these techniques. The manipulation of specific enzyme levels, controlled by modulating the expression of the corresponding genes using promoter systems and inducers, is a possible solution to this kind of control problem .
The use of a bi-level optimization strategy, that maximizes the natural product in the inner level by manipulating the fluxes, leads to a good approximation to the optimal solution, with the advantage of not requiring the full knowledge of the network model. Real networks are extremely complex and exhibit relations between metabolites that are not always expected or fully understood. This gives emphasis to the need of good in silico models and also to the determination of the exact branches to be modified when optimizing a network. Although the example network used is very simple, it has proved to be useful to test the optimization strategies but a more complex network should be used to confirm that the strategy can be scaled to a larger network.
is maximized under the constraint that u(t) ∈ [0, 1], ∇t ≥ 0.
The solution of the optimization problem is obtained using different approaches. Before accomplishing this task, Pontryagin's Maximum Principle is invoked to establish a particular form of the optimal control function for the class of problems at hand.
The control function is now optimized in order to obtain a maximum yield of biomass at the end of the run-time (t final ). Three different methods, assuming various levels of information about the network, are considered in order to attain this goal.
The first method, direct optimization, is used as a benchmark to compare the results of the other methods. The last two methods rely on a Bi-level optimization and illustrate a possible solution to the optimization problem when the information about the network is incomplete.
The first method, Direct Optimization, is used mainly as a benchmark, to compare the results of the following methods. Since it is assumed that all the information about the network kinetics is known, the system of differential equations, described in (2) is used. Given a function that receives t reg as input and outputs the final yield of x 5, this optimization tests all the possible values of t reg and returns the function J(t reg ) = x 5(t final ). The value of t reg that results on a maximum product yield is then determined by solving a simple optimization problem.
The optimization was tested with two MATLAB functions: fmincon, from the standard optimization toolbox, that finds the minimum of a constrained nonlinear multi variable function, and simannealingSB from Systems Biology Toolbox  that performs simulated annealing optimization.
The Bi-Level optimization algorithm was structured so as to accommodate missing information on the network kinetics. The boxed metabolites and fluxes from Figure 1 are a part of the network that might not be fully described in terms of kinetics. In this approach the missing kinetic information is replaced by stoichiometric data and flux balance analysis is used to obtain the proper flux distribution. Then, an inner optimization determines the fluxes during the batch time. The first step of the inner optimization process is to define the initial conditions of the input x 1 and outputs x 3, x 5. A valid distribution for the fluxes v 1, v 2, v 3 and v 4 is then obtained.
In this algorithm implementation, the inner optimization problem determines the profile of the metabolites, instead of fluxes, due to the nature of the equations. The metabolite concentrations are calculated at the beginning of each time interval, solving a Geometric Programming problem, and used with (2) to integrate the values of x 1, x 3 and x 5 during that interval.
Figure 1 shows a regulation from x 3 (Biomass) to flux v 3. Since stoichiometric models do not account for feedbacks, the effect of x 3 can not be integrated directly in the equations. Assuming that the forward feedback leads to an over expression of flux v 3, then a valid solution is to model the forward feedback as a variation of the constraints applied to flux v 3. Setting flux v 2 (precursor of Biomass formation) as the objective function, the FBA problem is solved with the previous equations to obtain a valid and unique flux distribution at each time step. In the context of the inner-optimization, these fluxes are then used to calculate the values of the input/outputs.
A general tool to solve dynamic optimization problems such as the one considered here is Pontryagin's Maximum Principle PMP .
where F : ℜ n × ℜ → ℜ n , U is the set of valid control inputs and T is the final time, assumed here to be constant.
Where ψ is the cost associated with the terminal condition of the system and L the Lagrangian.
According to PMP, a necessary condition for the optimal control is that, along the optimal solution for the state x, co-state λ and control u the Hamiltonian H is maximum with respect to u .
where ϕ(λ, x) is a function that does not explicitly depend on u. Since, according to (8), the Hamiltonian is linear in u, its maximum is obtained at the boundary of the admissible control set U.
Hence, this shows that, for the metabolic network (1), the control that optimizes (3) only assumes the values u = 0 or u = 1.
as shown before in (3).
The network is described by the system of ordinary differential equations in (2), if we consider the state model in the form of f(x, u), where u is the control function, calculating f x (x, u) is straightforward.
Since L = 0 the Hamiltonian is given by λ T f(x).
that depends linearly on the control function u, as expected.
This expression is the one that determines the number of switches between 0 and 1 of the control variable.
The work reported in this paper was performed within the project DynaMo - Dynamical modeling, control and optimization of metabolic networks, supported by FCT (Portugal) under contract PTDC/EEA-ACR/69530/2006.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.