### The optimization problem

The optimization problem consists, for a nonlinear state model of a metabolic network such as (2), in selecting *u*(*t*) for *t* ∈ [0, *t*_{final}] such that:

J\left(u\right)={x}_{\text{5}}\left({t}_{final}\right)

(3)

is maximized under the constraint that *u*(*t*) ∈ [0, 1], ∀*t* ≥ 0.

The solution of the optimization problem is obtained using different approaches. Before accomplishing this task, Pontryagin's Maximum Principle is invoked to establish a particular form of the optimal control function for the class of problems at hand.

### Optimization

The control function is now optimized in order to obtain a maximum yield of biomass at the end of the run-time (*t*_{final}). Three different methods, assuming various levels of information about the network, are considered in order to attain this goal.

The first method, direct optimization, is used as a benchmark against which to compare the results of the other methods. The remaining two methods rely on bi-level optimization and illustrate a possible solution to the optimization problem when the information about the network is incomplete.

#### Direct optimization

The first method, Direct Optimization, is used mainly as a benchmark against which to compare the results of the following methods. Since it is assumed that all the information about the network kinetics is known, the system of differential equations described in (2) is used. Given a function that receives *t*_{reg} as input and outputs the final yield of *x*_{5}, this optimization tests all the possible values of *t*_{reg} and returns the function *J*(*t*_{reg}) = *x*_{5}(*t*_{final}). The value of *t*_{reg} that results in a maximum product yield is then determined by solving a simple optimization problem.

The optimization was tested with two MATLAB functions: *fmincon*, from the standard Optimization Toolbox, which finds the minimum of a constrained nonlinear multivariable function, and *simannealingSB*, from the Systems Biology Toolbox [17], which performs simulated annealing optimization.
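As an illustration of this benchmark, the sketch below implements the same idea in Python rather than MATLAB: simulate an S-system model in the style of (2) under a bang-bang control that switches at *t*_{reg}, and search over the switching time for the maximum final yield of *x*_{5}. All parameter values, initial conditions and the switching direction are placeholder assumptions, not the values used in the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize_scalar

# Illustrative S-system parameters and initial conditions (placeholders,
# not the paper's values).
K = 1.0
A2 = A3 = A4 = A5 = 1.0          # alpha_i
B1 = B2 = B4 = 1.0               # beta_i
G21 = G32 = G42 = G43 = G54 = 0.5
H11 = H22 = H23 = H44 = 1.0      # linear degradation keeps states positive
T_FINAL = 10.0
X0 = [1.0, 1.0, 1.0, 1.0, 0.0]

def dynamics(t, x, t_reg):
    """State model in the style of (2); u switches from 0 to 1 at t_reg."""
    u = 0.0 if t < t_reg else 1.0
    x1, x2, x3, x4, x5 = np.maximum(x, 0.0)  # guard fractional powers
    return [K - B1 * x1**H11,
            A2 * x1**G21 - B2 * x3**H23 * x2**H22,
            A3 * x2**G32 * (1.0 - u),
            A4 * x3**G43 * x2**G42 * u - B4 * x4**H44,
            A5 * x4**G54]

def yield_x5(t_reg):
    """J(t_reg) = x5(t_final) for a given switching time."""
    sol = solve_ivp(dynamics, (0.0, T_FINAL), X0, args=(t_reg,), max_step=0.1)
    return sol.y[4, -1]

# Scalar search over t_reg (playing the role of fmincon / simannealingSB).
res = minimize_scalar(lambda t_reg: -yield_x5(t_reg),
                      bounds=(0.0, T_FINAL), method="bounded")
best_t_reg, best_yield = res.x, -res.fun
```

Any derivative-free scalar optimizer (or simulated annealing, as in *simannealingSB*) can replace `minimize_scalar` here; the cost surface is one-dimensional in *t*_{reg}, so the search is cheap.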

#### Bi-Level Optimization algorithm structure

The Bi-Level optimization algorithm was structured so as to accommodate missing information on the network kinetics. The boxed metabolites and fluxes from Figure 1 are a part of the network that might not be fully described in terms of kinetics. In this approach the missing kinetic information is replaced by stoichiometric data and flux balance analysis is used to obtain the proper flux distribution. Then, an inner optimization determines the fluxes during the batch time. The first step of the inner optimization process is to define the initial conditions of the input *x*_{1} and outputs *x*_{3}, *x*_{5}. A valid distribution for the fluxes *v*_{1}, *v*_{2}, *v*_{3} and *v*_{4} is then obtained.

After obtaining the flux distribution, new values for the input/outputs can be calculated by integrating their expressions in the considered time interval. During this time interval the function *u*(*t*) and the values of *v*_{1}, *v*_{2}, *v*_{3} and *v*_{4} are kept constant. This process is repeated along a time grid from *t* = 0 to *t* = *t*_{final}. The time interval for the integration was defined to be 1 second. The inner optimization process allows us to obtain the product yield, *x*_{5}(*t*_{final}), given a certain *u*(*t*), taking into account a valid approximation of the network dynamics over the simulation time. The detailed fluxogram of the inner optimization is shown in Figure 6.

The bi-level optimization algorithm can be represented schematically as in Figure 7.

#### Inner-optimization using Geometric Programming

In the first implementation of the Bi-Level optimization algorithm the dynamics of the boxed metabolites from Figure 1 are used but, following the algorithm structure, steady state is assumed. Thus, {\dot{x}}_{2} and {\dot{x}}_{4} from (2) become:

\begin{array}{l}\frac{d{x}_{2}}{dt}={\alpha}_{2}{x}_{1}^{{g}_{21}}-{\beta}_{2}{x}_{3}^{{h}_{23}}{x}_{2}^{{h}_{22}}=0\\ \frac{d{x}_{4}}{dt}={\alpha}_{4}{x}_{3}^{{g}_{43}}{x}_{2}^{{g}_{42}}u-{\beta}_{4}{x}_{4}^{{h}_{44}}=0\end{array}

(4)

In this algorithm implementation, the inner optimization problem determines the profile of the metabolites, instead of fluxes, due to the nature of the equations. The metabolite concentrations are calculated at the beginning of each time interval, solving a Geometric Programming problem, and used with (2) to integrate the values of *x*_{1}, *x*_{3} and *x*_{5} during that interval.
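Because each equation in (4) balances a single production monomial against a single degradation monomial, the steady-state conditions can in fact be solved in closed form for *x*_{2} and *x*_{4}; in a general posynomial model this step would require a full Geometric Programming solver. The sketch below shows the closed-form special case; the parameter values are assumptions for illustration only.

```python
def steady_state_x2_x4(x1, x3, u, p):
    """Solve the monomial balances (4) for x2 and x4 in closed form.

    p is a dict of S-system parameters.  Setting each production monomial
    equal to its degradation monomial and isolating the unknown gives:
      x2 = (alpha2 * x1^g21 / (beta2 * x3^h23))^(1/h22)
      x4 = (alpha4 * x3^g43 * x2^g42 * u / beta4)^(1/h44)
    """
    x2 = (p["alpha2"] * x1**p["g21"]
          / (p["beta2"] * x3**p["h23"])) ** (1.0 / p["h22"])
    x4 = (p["alpha4"] * x3**p["g43"] * x2**p["g42"] * u
          / p["beta4"]) ** (1.0 / p["h44"])
    return x2, x4

# Illustrative parameters (assumed, not the paper's values).
p = dict(alpha2=1.0, beta2=0.5, alpha4=2.0, beta4=1.0,
         g21=0.5, g42=0.5, g43=0.5, h22=0.5, h23=0.5, h44=0.5)
x2_ss, x4_ss = steady_state_x2_x4(x1=1.0, x3=2.0, u=1.0, p=p)
```

The returned concentrations can then be substituted into (2) to integrate *x*_{1}, *x*_{3} and *x*_{5} over the current time interval, as described above.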

#### Inner-optimization using Linear Programming

In the second implementation it is assumed that only stoichiometric information is available for the reactions inside the box of Figure 1. Assuming steady state, the equations of {\dot{x}}_{2} and {\dot{x}}_{4} become:

\begin{array}{l}\frac{d{x}_{2}}{dt}={v}_{1}-{v}_{2}\times \left(1-u\right)-{v}_{3}\times u=0\\ \frac{d{x}_{4}}{dt}={v}_{3}\times u-{v}_{4}=0\end{array}

(5)

Figure 1 shows a regulation from *x*_{3} (Biomass) to flux *v*_{3}. Since stoichiometric models do not account for regulatory feedback, the effect of *x*_{3} cannot be integrated directly in the equations. Assuming that this feed-forward regulation leads to an overexpression of flux *v*_{3}, a valid solution is to model it as a variation of the constraints applied to flux *v*_{3}. Setting flux *v*_{2} (precursor of Biomass formation) as the objective function, the FBA problem is solved with the previous equations to obtain a valid and unique flux distribution at each time step. In the context of the inner optimization, these fluxes are then used to calculate the values of the input/outputs.
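One such FBA step can be posed as a small linear program. The sketch below uses `scipy.optimize.linprog` with the balances (5) as equality constraints; the flux bounds, including the *x*_{3}-dependent upper bound on *v*_{3} that stands in for the regulation, are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def fba_step(u, x3, v1_max=10.0):
    """One inner-optimization FBA step: maximize v2 subject to the
    steady-state balances (5).  Flux order: [v1, v2, v3, v4].

    The feed-forward regulation of v3 by biomass x3 is modelled, as an
    assumption, by letting the upper bound on v3 grow with x3.
    """
    # Equality constraints from (5):
    #   v1 - (1 - u) * v2 - u * v3 = 0
    #   u * v3 - v4 = 0
    A_eq = np.array([[1.0, -(1.0 - u), -u, 0.0],
                     [0.0, 0.0, u, -1.0]])
    b_eq = np.zeros(2)
    bounds = [(0.0, v1_max),          # v1: limited substrate uptake
              (0.0, None),            # v2: precursor of biomass (objective)
              (0.0, 1.0 + 0.1 * x3),  # v3: bound raised by biomass level
              (0.0, None)]            # v4
    # linprog minimizes, so negate v2 to maximize it.
    res = linprog(c=[0.0, -1.0, 0.0, 0.0], A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x

v = fba_step(u=0.5, x3=2.0)  # [v1, v2, v3, v4]
```

The returned flux vector satisfies both balances in (5) at the chosen *u*, and is recomputed at each point of the time grid as *u*(*t*) and the state evolve.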

#### Pontryagin's Maximum Principle

A general tool to solve dynamic optimization problems such as the one considered here is Pontryagin's Maximum Principle (PMP) [13].

Let *x* be the state of a dynamical system with control inputs *u* such that:

\dot{x}=F\left(x,u\right),x\left(0\right)={x}_{0},u\left(t\right)\in U,t\in \left[0,T\right]

(6)

where *F* : ℜ^{n}× ℜ → ℜ^{n}, *U* is the set of valid control inputs and *T* is the final time, assumed here to be constant.

The control function *u* must be chosen in order to maximize the functional *J*, defined by:

J(u)=\psi ({x}_{i}(T))+{\displaystyle {\int}_{0}^{T}L(x(t),u(t))\,dt}

(7)

where *ψ* is the cost associated with the terminal condition of the system and *L* is the Lagrangian.

According to PMP, a necessary condition for the optimal control is that, along the optimal solution for the state *x*, co-state *λ* and control *u*, the Hamiltonian *H* is maximum with respect to *u* [13].

Comparing the cost (3) with the generalized case (7) and taking into consideration that, in the case at hand, given by (1), the dynamics vector field depends linearly on the control, it follows that

H(\lambda ,x,u)={H}_{0}(\lambda ,x)+\varphi (\lambda ,x)u

(8)

where {H}_{0}(*λ*, *x*) and *ϕ*(*λ*, *x*) are functions that do not explicitly depend on *u*. Since, according to (8), the Hamiltonian is linear in *u*, its maximum is obtained at the boundary of the admissible control set *U*.

Hence, this shows that, for the metabolic network (1), the control that optimizes (3) only assumes the values *u* = 0 or *u* = 1.

In the case at hand, we are interested in maximizing the final value of the state *x*_{5}. Since the Lagrangian *L* is zero, (7) becomes *J*(*u*) = *ψ*(*x*_{i}(*T*)). Thus, the functional *J* to be maximized is:

\psi (x(T))={x}_{5}({t}_{final})

(9)

as shown before in (3).

Taking into account that *L* = 0, the adjoint equations reduce to

\dot{\lambda}=-{f}_{x}^{T}\lambda

(10)

The network is described by the system of ordinary differential equations in (2). If we consider the state model in the form *f*(*x*, *u*), where *u* is the control function, calculating *f*_{x}(*x*, *u*) is straightforward.

Thus

\begin{array}{ccc}{\dot{\lambda}}_{1}& =& {\beta}_{1}{h}_{11}{x}_{1}^{{h}_{11}-1}{\lambda}_{1}-{\alpha}_{2}{g}_{21}{x}_{1}^{{g}_{21}-1}{\lambda}_{2}\\ {\dot{\lambda}}_{2}& =& \begin{array}{l}{\beta}_{2}{x}_{3}^{{h}_{23}}{h}_{22}{x}_{2}^{{h}_{22}-1}{\lambda}_{2}-({\alpha}_{3}{g}_{32}{x}_{2}^{{g}_{32}-1})(1-u){\lambda}_{3}\\ -{\alpha}_{4}{x}_{3}^{{g}_{43}}u{g}_{42}{x}_{2}^{{g}_{42}-1}{\lambda}_{4}\end{array}\\ {\dot{\lambda}}_{3}& =& {\beta}_{2}{x}_{2}^{{h}_{22}}{h}_{23}{x}_{3}^{{h}_{23}-1}{\lambda}_{2}-{\alpha}_{4}u{x}_{2}^{{g}_{42}}{g}_{43}{x}_{3}^{{g}_{43}-1}{\lambda}_{4}\\ {\dot{\lambda}}_{4}& =& {\beta}_{4}{h}_{44}{x}_{4}^{{h}_{44}-1}{\lambda}_{4}-{\alpha}_{5}{g}_{54}{x}_{4}^{{g}_{54}-1}{\lambda}_{5}\\ {\dot{\lambda}}_{5}& =& 0\end{array}

(11)

The terminal conditions for the co-states *λ* are

\lambda (T)=\frac{\partial \psi}{\partial x}{|}_{x=x(T)}={\left[\begin{array}{ccccc}0& 0& 0& 0& 1\end{array}\right]}^{T}

(12)
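The co-state system (11) is integrated backwards in time from the terminal condition (12). As a minimal sketch of the mechanics, the code below does this with all exponents set to 1 and the state frozen at a nominal value; both simplifications, and all numeric values, are assumptions for illustration only (in practice the stored state trajectory would be interpolated at each *t*).

```python
from scipy.integrate import solve_ivp

# Illustrative parameters (assumed, not the paper's values); all exponents
# are set to 1 so the adjoint system (11) stays well behaved.
a2 = a3 = a4 = a5 = b1 = b2 = b4 = 1.0
g21 = g32 = g42 = g43 = g54 = h11 = h22 = h23 = h44 = 1.0
T = 5.0

def adjoint_rhs(t, lam, x, u):
    """Co-state equations (11), evaluated for a frozen state x and control u."""
    x1, x2, x3, x4, x5 = x
    l1, l2, l3, l4, l5 = lam
    dl1 = b1 * h11 * x1**(h11 - 1) * l1 - a2 * g21 * x1**(g21 - 1) * l2
    dl2 = (b2 * x3**h23 * h22 * x2**(h22 - 1) * l2
           - a3 * g32 * x2**(g32 - 1) * (1 - u) * l3
           - a4 * x3**g43 * u * g42 * x2**(g42 - 1) * l4)
    dl3 = (b2 * x2**h22 * h23 * x3**(h23 - 1) * l2
           - a4 * u * x2**g42 * g43 * x3**(g43 - 1) * l4)
    dl4 = b4 * h44 * x4**(h44 - 1) * l4 - a5 * g54 * x4**(g54 - 1) * l5
    return [dl1, dl2, dl3, dl4, 0.0]   # lambda_5 is constant

# Backward integration from the terminal condition (12): lambda(T) = [0,0,0,0,1].
x_nominal = [1.0, 1.0, 1.0, 1.0, 0.0]  # placeholder for the state trajectory
sol = solve_ivp(adjoint_rhs, (T, 0.0), [0.0, 0.0, 0.0, 0.0, 1.0],
                args=(x_nominal, 1.0), max_step=0.05)
lam0 = sol.y[:, -1]  # co-state at t = 0
```

Note that *λ*_{5} stays at 1 over the whole horizon, which is what propagates the terminal cost *x*_{5}(*t*_{final}) back through *λ*_{4} and, through it, into the switching function.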

Since *L* = 0, the Hamiltonian is given by *λ*^{T}*f*(*x*, *u*).

Substituting in this expression and after some manipulation, the Hamiltonian becomes:

\begin{array}{l}H(\lambda (t),x(t),u(t),t)=\\ {\lambda}_{1}\left(k-{\beta}_{1}{x}_{1}^{{h}_{11}}\right)+{\lambda}_{2}\left({\alpha}_{2}{x}_{1}^{{g}_{21}}-{\beta}_{2}{x}_{3}^{{h}_{23}}{x}_{2}^{{h}_{22}}\right)+{\lambda}_{5}{\alpha}_{5}{x}_{4}^{{g}_{54}}+\\ {\lambda}_{3}{\alpha}_{3}{x}_{2}^{{g}_{32}}-{\lambda}_{4}{\beta}_{4}{x}_{4}^{{h}_{44}}+({\lambda}_{4}{\alpha}_{4}{x}_{3}^{{g}_{43}}{x}_{2}^{{g}_{42}}-{\lambda}_{3}{\alpha}_{3}{x}_{2}^{{g}_{32}})u\end{array}

(13)

that depends linearly on the control function *u*, as expected.

The derivative of the Hamiltonian with respect to the control function is:

\begin{array}{ccc}{H}_{u}& =& {\lambda}^{T}{f}_{u}\\ & =& -{\lambda}_{3}{\alpha}_{3}{x}_{2}^{{g}_{32}}+{\lambda}_{4}{\alpha}_{4}{x}_{3}^{{g}_{43}}{x}_{2}^{{g}_{42}}\end{array}

(14)

The sign of this expression is what determines the switches of the control variable between 0 and 1.
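The resulting bang-bang law can be stated compactly: since *H* is linear in *u*, the maximizing control is *u* = 1 when *H*_{u} > 0 and *u* = 0 when *H*_{u} < 0. A minimal sketch, with the co-state and state values in the usage example chosen purely for illustration:

```python
def switching_control(lmbda, x, alpha3, alpha4, g32, g42, g43):
    """Bang-bang law from (14): H is linear in u, so the maximizing
    control sits at the boundary of U = [0, 1] given by the sign of H_u.

    lmbda and x are 5-vectors; lmbda[2], lmbda[3] are lambda_3, lambda_4
    and x[1], x[2] are x_2, x_3 (0-based indexing).
    """
    H_u = (-lmbda[2] * alpha3 * x[1]**g32
           + lmbda[3] * alpha4 * x[2]**g43 * x[1]**g42)
    return 1.0 if H_u > 0.0 else 0.0

# Illustrative evaluations (values assumed, not from the paper).
u_on = switching_control(lmbda=[0.0, 0.0, 1.0, 2.0, 1.0],
                         x=[1.0, 1.0, 1.0, 1.0, 0.0],
                         alpha3=1.0, alpha4=1.0, g32=0.5, g42=0.5, g43=0.5)
u_off = switching_control(lmbda=[0.0, 0.0, 1.0, 0.2, 1.0],
                          x=[1.0, 1.0, 1.0, 1.0, 0.0],
                          alpha3=1.0, alpha4=1.0, g32=0.5, g42=0.5, g43=0.5)
```

Evaluating *H*_{u} along candidate state/co-state trajectories in this way is what locates the switching times of the optimal control.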