Dynamic optimization of biological networks under parametric uncertainty

Background

Micro-organisms play an important role in various industrial sectors (including the biochemical, food and pharmaceutical industries). A profound insight into the biochemical reactions inside micro-organisms enables improved biochemical process control. Biological networks are an important tool in systems biology for incorporating knowledge at the microscopic level. Biochemical processes are typically dynamic, and cells often have more than one objective; these objectives are typically conflicting, e.g., minimizing energy consumption while maximizing the production of a specific metabolite. Multi-objective optimization is therefore needed to compute the trade-offs between those conflicting objectives. In model-based optimization, one of the inherent problems is the presence of uncertainty. In biological processes, this uncertainty can arise from, e.g., inherent biological variability. Not taking this uncertainty into account can lead to the violation of constraints and erroneous estimates of the actual objective function(s). To account for the variance in model predictions and to compute a prediction interval, this uncertainty should be taken into account during process optimization. This leads to a challenging optimization problem under uncertainty, which requires a robustified solution.

Results

Three techniques for uncertainty propagation, linearization, sigma points and polynomial chaos expansion, are compared for the dynamic optimization of biological networks under parametric uncertainty. These approaches are compared in two case studies: (i) a three-step linear pathway model in which the accumulation of intermediate metabolites has to be minimized, and (ii) a glycolysis-inspired network model in which a multi-objective optimization problem is considered: the minimization of the enzymatic cost and the minimization of the end time before reaching a minimum extracellular metabolite concentration.
A Monte Carlo simulation procedure has been applied to assess the constraint violations. For the multi-objective case study, one Pareto point has been considered for this assessment; however, the analysis can be performed for any Pareto point.

Conclusions

The different uncertainty propagation strategies each offer a robustified solution under parametric uncertainty. When trading off computation time against the robustness of the obtained profiles, the sigma points and polynomial chaos expansion strategies score better in reducing the percentage of constraint violations. This has been investigated for both a normal and a uniform parametric uncertainty distribution. The polynomial chaos expansion approach allows prior knowledge of the parametric uncertainty distribution to be taken into account directly.

Electronic supplementary material

The online version of this article (doi:10.1186/s12918-016-0328-6) contains supplementary material, which is available to authorized users.


Introduction
In this case study the terminal constraint, which expresses that the concentration of S4 at time tf should exceed 0.90 mM, is robustified. Since this constraint only involves a single bound that has to be exceeded, the 95% confidence region is covered when a (one-sided) backoff parameter of α S4 = 1.65 is chosen. The intermediate accumulation objective function is not robustified for this case study (i.e., no backoff parameter α J2 is applied). The effect of assuming three uncertain parameters k1, k2 and k3, characterized by a normal distribution (unless stated otherwise) with mean value 1 and 20% relative standard deviation, is studied.
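The one-sided back-off described above can be sketched as follows: instead of imposing the bound on the nominal prediction, it is imposed on the expected value minus α times the predicted standard deviation. This is a minimal sketch under a normal approximation; the function name and the illustrative numbers are not taken from the paper.

```python
import math

def robustified_terminal_constraint(mean_S4, var_S4, bound=0.90, alpha=1.65):
    """Feasible (>= 0) when the expected value minus alpha standard
    deviations still exceeds the bound: a one-sided back-off so that
    roughly 95% of realizations satisfy S4(tf) >= bound."""
    return mean_S4 - alpha * math.sqrt(var_S4) - bound

# Illustrative numbers (not from the paper): mean 0.95 mM, std 0.02 mM
print(robustified_terminal_constraint(0.95, 0.02**2) >= 0.0)  # True
```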

Parametric uncertainty distributions
In the polynomial chaos expansion approach, a priori information on the parametric uncertainty distribution can be taken into account directly via the orthogonal polynomials. Therefore it is investigated whether exploiting this a priori knowledge of the parametric uncertainty distribution in the robustified optimization provides a significant advantage. Two different parametric uncertainty distributions are studied: a normal and a uniform parametric uncertainty distribution. As a remark, this should also be possible for the sigma points approach; however, the implementation is nontrivial and no references could be found in the literature. For a normal parametric uncertainty distribution, standard (or normalized) Hermite polynomials are used, as shown in Table 1.

Table 1: Normalized Hermite polynomials up to third order, with ξ a standard normally distributed variable.

Order | Hermite polynomial
0 | 1
1 | ξ
2 | (ξ² − 1)/√2
3 | (ξ³ − 3ξ)/√6

For a uniform parametric uncertainty distribution, another type of polynomials is used. Assume that the parameters θi ∈ θ are uniformly distributed with relative standard deviation σθi,rel and mean θi,nom. The interval [θi,min, θi,max] covered by the uniform distribution is then computed as follows. Traditionally, Legendre polynomials are chosen in the literature [1]. However, these Legendre polynomials only hold on the interval [0, 1]. Since the parameters considered in this work are not all in the interval [0, 1], it is chosen to derive orthogonal polynomials directly from the definition of orthogonal polynomials. For a uniform parametric uncertainty distribution the probability density function is constant, 1/(θi,max − θi,min), on the interval and zero outside it. The interval is assumed to be symmetric around the nominal parameter value θi,nom: θi,min = θi,nom − ∆θi and θi,max = θi,nom + ∆θi. Using the definition of the variance of a uniform distribution,

σθi² = (θi,max − θi,min)²/12 = ∆θi²/3,    (2)

∆θi is derived as ∆θi = √3 σθi. Hence, with σθi = σθi,rel θi,nom, this becomes ∆θi = √3 σθi,rel θi,nom.
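The derivation above can be sketched numerically: monic polynomials orthogonal with respect to the uniform density on [θi,min, θi,max] are built by Gram–Schmidt using the raw moments of the distribution, and the roots of the next-higher-order polynomial then serve as PCE sampling points. The function names below are ours, not the paper's, and numpy is assumed; this is an illustrative sketch only.

```python
import numpy as np

def uniform_moments(a, b, kmax):
    # Raw moments E[theta^k], k = 0..kmax, of a uniform distribution on [a, b]
    return np.array([(b**(k + 1) - a**(k + 1)) / ((k + 1) * (b - a))
                     for k in range(kmax + 1)])

def orthogonal_polys_uniform(a, b, order):
    """Monic polynomials orthogonal w.r.t. the uniform density on [a, b],
    built by Gram-Schmidt on 1, x, x^2, ...; coefficients lowest-order first."""
    m = uniform_moments(a, b, 2 * order)
    def inner(p, q):
        return sum(pi * qj * m[i + j]
                   for i, pi in enumerate(p) for j, qj in enumerate(q))
    polys = []
    for n in range(order + 1):
        p = np.zeros(n + 1)
        p[n] = 1.0                      # start from the monic monomial x^n
        for q in polys:
            p[:len(q)] -= (inner(p, q) / inner(q, q)) * q
        polys.append(p)
    return polys

# 20% relative standard deviation around a nominal value of 1:
# interval [1 - sqrt(0.12), 1 + sqrt(0.12)]
d = np.sqrt(0.12)
polys = orthogonal_polys_uniform(1 - d, 1 + d, 3)
# PCE1 sampling points = roots of the second-order orthogonal polynomial
print(np.sort(np.roots(polys[2][::-1])))  # ≈ [0.8, 1.2]
```

For this interval the second-order polynomial works out to (θ − 1)² − 0.04, so its roots 0.8 and 1.2 lie exactly one standard deviation from the nominal value.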

Trade-off between end time and intermediate accumulation
The multi-objective optimization problem consists of minimizing the time needed to reach at least 0.90 mM of product S4 and minimizing the accumulation of intermediates in the linear pathway. It is assumed that the final time tf cannot exceed 10 s. In Figure 1 the Pareto fronts are shown for the linearization, sigma points, PCE1 and PCE2 approaches, respectively. It is already clear from the nominal problem that reducing the time needed to reach a level of 0.90 mM of S4 leads to an increase in the intermediate accumulation, and vice versa. In Figure 1 there is a clear trend: the Pareto fronts recede from the nominal optimal solution with increasing backoff parameter value α S4 (denoted by α in Figure 1). Both anchor points shift away from the nominal optimal solution. This is the price in performance (i.e., minimum intermediate accumulation) that has to be paid to ensure a minimum concentration of 0.90 mM of S4. The objective function values are worse than in the case without parametric uncertainty, but in practice this uncertainty is present and should be taken into account.
In Figure 2 the Pareto fronts obtained with linearization, sigma points, PCE1 and PCE2 are plotted together for α S4 = 1.65 and α S4 = 1.96, corresponding to levels of 5% and 2.5% constraint violations for a normal distribution. This allows a comparison between the different approximation techniques for uncertainty propagation in multi-objective dynamic optimization under uncertainty. A first observation for this case study is that the linearization, sigma points and second-order polynomial chaos expansion (PCE2) methods take less backoff from the nominal solution than the first-order polynomial chaos expansion approach (PCE1). The sigma points and polynomial chaos expansion methods use more sampling points and thus result in a better approximation. A second observation in Figure 2 is that the Pareto fronts obtained with the sigma points approach and the second-order polynomial chaos expansion method nearly coincide. This is expected, since the sigma points method and the second-order polynomial chaos approach are very similar in implementation: the sigma points are a subset of the PCE sampling points for the second-order polynomial chaos expansion, as shown in Table 2.

Table 2: Sampling points of the sigma points approach and the PCE2 approach for Case 1 and a normal parametric uncertainty distribution.
In the case of a normal parametric uncertainty distribution, the expected value in the sigma points method is calculated in the same way as in PCE2; for the calculation of the variance, however, PCE2 also includes the sampling points that are absent from the sigma points approach. For the terminal constraint, the sigma points approach and PCE2 thus use the same expression for the expected value. This motivates why the Pareto fronts of the sigma points approach and PCE2 are very similar.
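The subset relation can be illustrated for the three uncertain parameters (mean 1, 20% relative standard deviation). The sketch below assumes the standard unscented choice κ = 0, so the sigma-point axis offsets are ±√n·σ = ±√3·σ for n = 3, and a full tensor grid of the third-order Hermite roots {−√3, 0, +√3} for the PCE2 points; both choices are our assumptions for illustration and are not copied from the paper's Table 2.

```python
import itertools
import numpy as np

n, nom, sigma = 3, 1.0, 0.2          # three uncertain parameters k1, k2, k3

# Sigma points (kappa = 0): nominal point plus +/- sqrt(n)*sigma on each axis
sp = [np.full(n, nom)]
for i in range(n):
    for s in (+1.0, -1.0):
        p = np.full(n, nom)
        p[i] += s * np.sqrt(n) * sigma
        sp.append(p)

# PCE2 collocation points: tensor grid of the roots {-sqrt(3), 0, +sqrt(3)}
# of the third-order Hermite polynomial xi^3 - 3*xi
roots = np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])
pce2 = [nom + sigma * np.array(c) for c in itertools.product(roots, repeat=n)]

subset = all(any(np.allclose(s, q) for q in pce2) for s in sp)
print(len(sp), len(pce2), subset)  # 7 27 True
```

The 7 sigma points are exactly the centre and axis points of the 27-point tensor grid, which is why the two methods agree on the expected value while PCE2 uses the remaining points for the variance.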

Minimization of intermediate accumulation
In what follows the emphasis is on a single-objective case, for instance one of the Pareto points; however, this analysis can be done for every point in the Pareto front and any level set. The single-objective optimization of the intermediate accumulation is considered, with the final time fixed at 10 seconds. This analysis is done to investigate the approximation techniques for uncertainty propagation in more depth.
Computational aspects. A first aspect is the computational cost of including uncertainty in dynamic optimization. The number of required states and variables, together with the CPU time for the different approximation techniques for uncertainty propagation, are presented in Table 3. The CPU times are reported for the largest backoff parameter used in this work, i.e., α S4 = 1.96, to give an upper bound on the computation times. The results in Table 3 confirm that taking uncertainty into account leads to an increased computation time. An inherent property of the considered uncertainty propagation techniques is that the number of states grows with the number of uncertain parameters; the increase in computation time for this case study is thus related to the increase in the size of the optimization problem. The linearization and PCE1 approaches have a similar computation time and are the fastest of the considered approximation techniques, followed by the sigma points approach and the PCE2 approach.

Control profiles. The optimal controls (i.e., the enzyme concentrations e1, e2 and e3) determined with the approximation techniques for uncertainty propagation for α = 1.65 are illustrated in Figure 3. Qualitatively the control profiles look similar, but there is some backoff from the nominal optimal control. The sequential activation of the controls (i.e., the successively increasing and decreasing enzyme concentrations ei) is qualitatively the same as in the nominal case. However, the peaks occur earlier in the robustified case than in the nominal one, to ensure that the activation is sufficient and early enough.

General trends. As stated in the previous subsection on multi-objective optimization under uncertainty, there are general trends observed between the different methods.
These general trends (e.g., the similarities between the sigma points approach and PCE2) are investigated in more depth for the single-objective optimization case. First the expected value and 95% confidence bound of S4 (based on α S4 = 1.65) are compared in Figure 4. From Figure 4 it can be seen that the expected state and 95% confidence bounds for S4 are very similar when computed with the linearization, sigma points and PCE2 approaches; the 95% confidence bound of the PCE1 approach differs slightly from the others. All lower bounds are at 0.90 mM, as required by the constraint imposed in the implementation of the approximation techniques, and the expected value is always above the required 0.90 mM. It should also be stressed that the time at which the bound of 0.90 mM for the expected state S4 is reached lies between 5 and 6 seconds, which is substantially shorter than the fixed end time of 10 seconds.

Figure 4: Comparison of the expected state S4 and its 95% confidence bound calculated with linearization, sigma points, PCE1 and PCE2 with the nominal case (α = 1.65), in the case of 3 uncertain parameters k1, k2 and k3.

Monte Carlo simulations
In order to investigate the performance of the different approximation techniques for uncertainty propagation with respect to constraint violations, a Monte Carlo simulation with 1000 realizations (i.e., randomly generated parameter samples from the parametric uncertainty distribution) has been performed for the four approaches and compared with the nominal case. The number of constraint violations, the expected value of S4 and the variance of S4 for the different approaches and values of α S4 are shown in Table 4. From these simulations it is observed that all four methods reduce the number of constraint violations significantly: from 50.9% in the nominal case down to 2.0% for PCE2 when a backoff parameter value of α = 1.96 is chosen.
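The Monte Carlo assessment can be sketched as follows. The pathway model below is a stand-in (a simple mass-action chain with nominal unit rate constants and a fixed end time of 6 s, chosen only for illustration); the paper's exact model and optimized enzyme profiles are not reproduced. The relevant part is the procedure: sample 1000 parameter realizations and count how often the terminal constraint S4(tf) ≥ 0.90 mM is violated.

```python
import numpy as np

rng = np.random.default_rng(0)

def terminal_S4(k, tf=6.0, dt=0.01):
    """Stand-in three-step linear pathway S1 -> S2 -> S3 -> S4 with
    mass-action rates k[i] * S[i] (explicit Euler); NOT the paper's model."""
    S = np.array([1.0, 0.0, 0.0, 0.0])
    for _ in range(int(tf / dt)):
        flux = k * S[:3]            # fluxes of the three conversion steps
        S[:3] -= flux * dt          # each species loses its outgoing flux
        S[1:] += flux * dt          # and gains the flux of the upstream step
    return S[3]

# 1000 realizations: each k_i normal with mean 1, 20% relative std deviation
K = rng.normal(1.0, 0.2, size=(1000, 3))
viol = np.mean([terminal_S4(k) < 0.90 for k in K])
print(f"fraction of terminal-constraint violations: {viol:.3f}")
```

In the paper the same counting is applied to trajectories simulated with the robustified optimal controls of each approximation technique.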
In practice, the most interesting backoff parameter values are 1.65 and 1.96, corresponding to 5% and 2.5% violations in the case of a normal distribution. Here it is clear that the first-order polynomial chaos expansion method is superior to the linearization and sigma points approaches with respect to constraint violations. A lower backoff parameter value results in too many constraint violations. The other values of α are computed to show the increase in constraint violations and the evolution of the variance and objective function when more backoff is introduced. From the results in Table 4 it is clear that the PCE1 method scores best with respect to constraint violations, followed by PCE2 and sigma points. The performance of the sigma points method and the second-order polynomial chaos expansion is, as shown throughout this case study, very similar.
If an uncertainty distribution is assumed for the constraint, the level of constraint violations should correspond exactly to the chosen confidence level. A violation level that is too low is not desirable either, since in that case the uncertainty is not propagated correctly and the solution is overly conservative.
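The correspondence between the backoff parameter and the one-sided violation level follows from the standard normal quantile function:

```python
from statistics import NormalDist

# One-sided backoff parameter alpha for a target violation probability p:
# alpha = Phi^{-1}(1 - p), with Phi the standard normal CDF
for p in (0.05, 0.025):
    alpha = NormalDist().inv_cdf(1 - p)
    print(f"p = {p:<5}  alpha = {alpha:.2f}")
# p = 0.05 gives alpha ~ 1.64 (rounded to 1.65 in the text);
# p = 0.025 gives alpha ~ 1.96
```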
Furthermore, the expected values and variances of S4 for a backoff parameter of α = 1.96, as predicted with the approximation techniques in Table 3, can be compared with the empirically calculated expected values and variances obtained by Monte Carlo simulation in Table 4. For the sigma points, PCE1 and PCE2 approaches these predicted expected values and variances are an accurate estimate of the ones obtained by Monte Carlo simulation; for the linearization approach this is not the case. Moreover, the expected value of the intermediate accumulation objective function is not accurately predicted. This objective function is not robustified in this case study, as the variance on the objective function is not taken into account. Hence the predictions of the expected value of the objective function are not accurate when compared with the Monte Carlo results.

Uniform distribution
Since prior knowledge of the parametric uncertainty distribution can be taken into account directly in the polynomial chaos expansion approach, the case of a uniformly distributed parametric uncertainty is studied, with the nominal parameter value as expected value and a 20% relative standard deviation. The orthogonal polynomials are derived via the definition of orthogonal polynomials. The orthogonal polynomials and their roots for a uniformly distributed parameter θi ∈ [1 − √0.12, 1 + √0.12] are presented in Table 5. The PCE sampling points for the first- and second-order polynomial chaos expansions are determined by computing the roots of the second-order polynomial and the roots of the third-order polynomial, respectively.
Based on this table, it is clear that the PCE sampling points of PCE1 for a uniform parametric uncertainty distribution are the same as those for a normal parametric uncertainty distribution. It can be proven for both case studies that the first-order polynomial chaos expansion is the same for a normal and a uniform parametric uncertainty distribution with the corresponding expected value θi,nom and relative standard deviation σθi,rel.
A Monte Carlo simulation procedure with 1000 noise realizations from the uniform distribution has been followed and the results are summarized in Table 6. Considering the constraint violations, none of the approaches exceeds the upper bounds of 5% and 2.5% for backoff parameter values of 1.65 and 1.96, respectively. Based on Table 6, the second-order polynomial chaos expansion for a normal and for a uniform parametric uncertainty distribution can be compared in terms of performance. There is less backoff in the objective function when a uniform parametric uncertainty distribution is assumed, and the variance of the second-order polynomial chaos expansion is also reduced with respect to the assumption of a normal parametric uncertainty distribution. The percentage of constraint violations, on the other hand, is slightly higher when a uniform distribution is considered. However, for this case study the difference in performance between a uniform and a normal parametric uncertainty distribution for the polynomial chaos expansion is small. Including the additional information on the parametric uncertainty distribution can thus be useful, but gathering this information is quite labor-intensive and does not lead to a drastic improvement in performance for this case study.