### Algorithm

#### Thermodynamic constraints

The directionality of the net flux of a chemical reaction and the change of Gibb's free energy are related to each other by the consensus rule

sgn(*v*) = -sgn (ΔG_{r}), (1)

where 'sgn()' is the sign function, ΔG_{r} denotes the change of Gibb's energy of the reaction and *v* is the net flux (rate) through the reaction. Actual changes of Gibb's energy can be calculated from changes of standard Gibb's energy (where each reactant has a concentration of 1 M) according to

\Delta {\text{G}}_{\text{r}}=\Delta {\text{G}}_{\text{r}}^{0}+\text{R}T{\displaystyle \sum _{M\in \text{P}}\mathrm{ln}[M]-\text{R}T}{\displaystyle \sum _{M\in \text{S}}\mathrm{ln}[M]},

(2)

where [*M*] is the active concentration (activity) of the metabolite *M*, S and P denote the sets of substrates and products of the reaction, respectively. R is the universal gas constant and *T* is the absolute temperature. The change of the standard Gibb's energy is related to the equilibrium constant *K*_{eq} of the reaction by

\Delta {\text{G}}_{\text{r}}^{0}=-\text{R}T\mathrm{ln}{K}_{\text{eq}}.

(3)

It has to be noted that standard Gibb's energy changes depend on temperature, pH value, and ion strength and thus may significantly differ from those determined under specific in vitro conditions. For a metabolic network, (eq. 2) reads in vector notation \frac{\overline{\Delta {\text{G}}_{\text{r}}}}{\text{R}T}=\frac{\overline{\Delta {\text{G}}_{\text{r}}^{\text{0}}}}{\text{R}T}+S\text{C}, where \overline{\Delta {\text{G}}_{\text{r}}} is the column vector of the ΔG_{r} values for all reactions of the network, \overline{\Delta {\text{G}}_{\text{r}}^{\text{0}}} is the column vector of the \Delta {\text{G}}_{\text{r}}^{\text{0}} values, C is the column vector of the natural logarithms of the active metabolite concentrations. (Concentrations are assumed to be strictly positive.) **S** is the stoichiometric matrix of the system, where rows refer to reactions and columns refer to metabolites. A positive or negative matrix element of **S** represents the stoichiometric coefficient with which the metabolite indicated by the column number appears as a product or substrate of the reaction indicated by the row number. Changes of the standard Gibb's energies of reactions can be additively composed of changes of the standard Gibb's energies of the formation of their reactants [28]:

\overline{\Delta {\text{G}}_{\text{r}}^{\text{0}}}=S\overline{\Delta {\text{G}}_{\text{f}}}.

(4)

Owing to the first law of thermodynamics the values of the standard Gibb's free energy changes are not independent from each other but have to obey the principle of micro-reversibility dictating the sum of standard free energy values in a closed system to be zero. In several flux balance studies [16, 17, 29–32] this criterion has been referred to as generalization of Kirchhoff's loop law which [see Additional file 3]. The problem is that experimentally determined values for the changes of standard Gibb's energies are not consistent with the principle of micro-reversibility per se because of experimental errors. Therefore, we add correction terms (forming the vector E) to all observed values of standard Gibb's energy changes and determine minimal corrections necessary to assure the principle of micro-reversibility. The corresponding optimization problem reads

\begin{array}{ll}Minimize\hfill & \Vert \text{E}\Vert \hfill \\ Subjectto\hfill & {\overline{\Delta {\text{G}}_{\text{r}}^{\text{0}}}}^{\ast}=S{\overline{\Delta {\text{G}}_{\text{f}}}}^{\ast}+\text{E},\hfill \end{array}

(5)

where ||E|| is the 2-norm of the vector E, and {\overline{\Delta {\text{G}}_{\text{f}}}}^{\ast} are hypothetic Gibb's free energy changes of formation. {\overline{\Delta {\text{G}}_{\text{r}}^{\text{0}}}}^{\ast}-\text{E} is then used as the modified vector of standard Gibb's energy changes fulfilling the condition of micro-reversibility.

#### Constraints on metabolite concentrations

In case that the metabolite concentrations might assume arbitrary non-negative values it would be always possible to let a chemical reaction proceed in either forward or backward direction. Hence, including information on metabolite concentrations as additional constraints in FBA makes only sense if the concentration of the metabolites can be restricted to a feasible range. If the concentration of a metabolite is known, we use this value as set-point which should be approximated as best as possible by the calculated metabolite concentration. Thus, we add the term

{\displaystyle \sum _{m\in \text{W}}{({c}_{m}-{s}_{m})}^{2}}

(6)

to the objective function where W denotes the set of metabolites *m* for which a set-point (logarithmic) concentration value *s*_{
m
}is available and *c*_{
m
}is the component of C related to metabolite *m*. If the concentration of a metabolite is not exactly known but can be restricted to a narrower concentration range based on metabolite profiling (for *E. coli* such a profile has been summarized by Kümmel et al. [14]) we use this information to define so-called soft concentration bounds denoted by *c*_{low} and *c*_{high}. In case that such physiologically feasible concentration range has not been reported yet we set the lower and upper soft bound close to the minimum and maximum of all known cellular metabolite concentrations. However, it may happen that a non-trivial flux distribution can only be found if the concentration of some metabolites drops off the range defined by the soft bounds. This may be due to the improper choice of soft bounds for some metabolites resulting from large experimental errors in the determination of cellular metabolite concentrations or the (unknown) binding of metabolites to macromolecular structures lowering their effective free concentrations inside the cell. Therefore, concentrations lying outside the range of the soft bounds are allowed in our algorithm but are penalized in the optimization criterion:

s(c)=\{\begin{array}{ll}{c}_{\text{low}}-c\hfill & \text{if}c{c}_{\text{low}}\hfill \\ 0\hfill & \text{if}{c}_{\text{low}}\le c\le {c}_{\text{high}}\hfill \\ c-{c}_{\text{high}}\hfill & \text{if}c{c}_{\text{high}}\hfill \end{array}

(7)

Here *c* is the (logarithmic) active concentration of the metabolite. The penalty function for the whole concentration vector is

s(\text{C})={\displaystyle \sum _{c\text{componentofC}}s(c)}.

(8)

As in Henry et al. [23] we introduce a second type of bounds, so-called hard bounds, to exclude metabolite concentrations which are impossible from the biochemical point of view. The combined effect of set-point values, soft and hard concentration bounds on the scoring function of the optimization algorithm is shown in figure 1.

Metabolic network models may contain reactions which are simplified in a way that reactants are dropped from the reaction formula. For example, the oxidation of glutathione (GSH) to glutathione disulfide (GSSG) is usually written as an overall reaction 2GSH → GSSG. Actually, this reaction should read 2GSH + R-OOH → GSSG + R-OHH_{2}O where R-OOH represents a large group of not further specified hydroperoxides that can be detoxified by the glutathione system. For these lumped reactions it is impossible to give a realistic \Delta {\text{G}}_{\text{r}}^{\text{0}} value. For other reactions a \Delta {\text{G}}_{\text{r}}^{\text{0}} value is simply not known (e. g. for 37 reactions in *E. coli* [23]). For such reactions the consensus rule (eq. 1) is not applied.

#### Setting up the constrained optimization problem

In FBA, formulation of the optimization problem requires to define the following three elements: (i) a physiologically meaningful scoring function to evaluate flux distributions, (ii) the steady-state conditions for all internal metabolites valid for the time-scale of interest (e. g. the time-scale of growth) and (iii) further constraints taking into account biochemical knowledge as, for example, maximal enzyme capacities limiting the flux rates [10] or thermodynamic constraints on flux directions as those discussed above. The steady-state condition can be formulated as

**S'** V = 0 (9)

where **S'** derives from the full stoichiometric matrix **S** of the network upon deletion of all columns referring to those metabolites which are exchanged with the external environment and thus need not to be balanced. According to the principle of minimal fluxes [8, 9] we set up the scoring function as the sum of the absolute values of all reaction fluxes |V| while assigning fixed values *L*_{
j
}, *j* ∈ *J* to all output fluxes, which are directly linked to cellular functions, the so-called target fluxes, the set of which is denoted by *J*. Adding the weighted terms (eq. 6) and (eq. 8) to the scoring function and including the constraint (eq. 9) the complete optimization problem is written as

\begin{array}{ll}Minimize\hfill & \left|\text{V}\right|+{\lambda}_{1}s(\text{C})+{\lambda}_{2}{\displaystyle \sum _{m\in \text{W}}{({c}_{m}-w(m))}^{2}}\hfill \\ Subjectto\hfill & {S}^{\prime}\text{V}=0\hfill \\ \begin{array}{cc}0\le {v}_{j}+\alpha {d}_{j}\le \alpha ,& 1\le j\le n\end{array}\hfill \\ \begin{array}{cc}0\le -\Delta {\text{G}}_{\text{r}}^{j}+\alpha {d}_{j}\le \alpha ,& 1\le j\le n\end{array}\hfill \\ \overline{\Delta {\text{G}}_{\text{r}}}/(\text{R}T)=\overline{\Delta {\text{G}}_{\text{r}}^{0}}/(\text{R}T)+S\text{C}\hfill \\ \text{C}\in \u212d\hfill \\ \begin{array}{cc}{v}_{j}={L}_{j},& j\in J.\end{array}\hfill \end{array}

(10)

\u212d is a vector of ranges defined by the hard concentration bounds. V is the vector of flux rates and *v*_{
j
}is the *j*-th component of V. *λ*_{1}, *λ*_{2} ∈ ℝ^{+} are empirical factors weighting the relative contribution of the various penalty scores relative to the scoring function of fluxes. (For our computations we have chosen *λ*_{1} = 100, *λ*_{2} = 0.01 putting a lower weight to the attainment of set-point concentration values than to the restriciton of the metaboite concentration values to physiologically feasible soft bounds.) *n* is the number of reactions, and for any 1 ≤ *j* ≤ *n*, *d*_{
j
}is a binary variable. *α* is set to a positive number which is larger than any possible flux value and larger than any possible Gibb's energy value, and it can easily be shown that the constraints 0 ≤ *v*_{
j
}+ *αd*_{
j
}≤ *α* and 0 ≤ -\Delta {\text{G}}_{\text{r}}^{j} + *αd*_{
j
}≤ *α* are equivalent to *v*_{
j
}≠ 0 → sgn(*v*_{
j
}) = -sgn (\Delta {\text{G}}_{\text{r}}^{j}). Intentionally, for a zero flux through a reaction the change of Gibb's free energy is not constrained because it might be substantially different from zero if the corresponding enzyme is missing or inhibited. The optimization problem corresponds to a mixed integer (boolean) linear program with quadratic scoring function.

We call a flux distribution obtained by solving the above optimization problem (eq. 10) thermodynamically realizable and refer to it in the following as **TR-fluxmin**, i.e. **t** hermodynamically **r** ealizable **flux**-**min** imized solution. If the maximization of biomass is used as flux objective, the sum of internal fluxes appearing in the objective function is replaced by the negative biomass production rate.

### Testing

#### Application to a metabolic network of human erythrocytes

The method described above was applied to a metabolic network of human red blood cells [24, 33] for which stationary flux distributions have already been calculated in our previous work [8]. The network comprises basically two cardinal metabolic pathways of this cell: glycolysis including the so-called 2,3-bisphosphoglycerate shunt, and the pentose phosphate cycle dividing into an oxidative and a non-oxidative part. The network consists of 27 biochemical reactions, 5 transport processes and 32 metabolites (see figure 2 and the supplementary material for the complete description of the model). The orientation of the arrows in the reaction scheme corresponds to the net direction of the reaction flux at standard concentrations. Standard Gibb's energies have been derived from the equilibrium constants contained in the kinetic model [24, 33]. The functionally essential target reactions that have to be maintained by the network are the following: (i) formation of 2,3-bisphospho-D-glycerate (2,3P_{2}G, reaction #9) required to modulate oxygen affinity of hemoglobin, (ii) ATP-utilization (ATPase, #16), which is mostly spent on the Na^{+}/K^{+}-ATPase to build up the Na^{+}/K^{+}-gradient across the plasma membrane, (iii) oxidation of GSH (GSHox, #21) to prevent oxidative damage of cellular proteins and lipids, (iv) synthesis of PRPP (PRPPS, #26) required for the salvage of adenine nucleotides. The magnitude of these 4 target reactions depends on the specific external conditions of the cell as, for example, osmolarity of the blood or preservation medium, oxidative stress caused by reactive oxygen species, or lowering of the oxygen tension during hypoxia. In our calculations the flux values for these 4 target reactions were chosen as reported for the normal in vivo state of erythrocytes: DPGM = 0.49 mmol/h, ATPase = 2.38 mmol/h, GSHox = 93 *μ* mol/h, PRPPS = 26 *μ* mol/h. With these values for the target fluxes, the comprehensive kinetic model [24, 33] yielded metabolite concentrations as shown in figure 2. These values are in good concordance with experimentally determined concentrations and thus will be referred to in the following as 'observed' concentrations.

Using the same values of standard Gibb's free reaction energy changes as used in the kinetic model and putting the set-point values of the metabolite concentrations to the 'observed' ones, the TR-fluxmin solution of the optimization problem turns out to be identical with the flux-minimized solution determined by our previous approach [8]. A detailed description of the model and the solution mentioned below can be found in the supplements [see Additional file 1].

##### Perturbation analysis

To investigate the impact of errors in the observed metabolite concentrations on the predicted flux distribution the concentration values given in figure 2 were perturbed by multiplying them with a random factor obeying an exponential normal distribution with controlled standard deviation (see caption of figure 3). The hard concentration bounds were chosen as follows: 0.1 *μ* M...100 mM (glucose), 0.1 *μ* M...25 mM (CO_{2} and phosphate), and 0.01 *μ* M...10 mM (28 remaining compounds). Calculation of the flux distribution with randomly altered set-point concentration values was repeated in 1000 trials. Surprisingly, for smaller perturbations the reference solution (= TR-fluxmin for 'true' set-point values) was retained in all trials. But perturbations with a standard deviation of 2 (corresponding to an average factor 7.4 in the change of the concentration values) resulted in a second, slightly different, flux distribution in some trials. For perturbations with standard deviation of 6 (corresponding to an average 400-fold change in the concentration values) this second alternative flux distribution dominated and on top a third alternative flux distributions was obtained in a significant number of trials (see figure 3).

In a second perturbation analysis, the standard Gibb's energy change values were randomly altered in a similar manner. Also here, there was an increasing tendency towards the two alternative flux distributions found before when increasing the magnitude of perturbations (see figure 4). These alternative flux distributions already occurred at a standard deviation of 3 (corresponding to an average deviation of 7.7 kJ/mol of the standard Gibb's energy changes) and their relative share became dominant at a standard deviation value of 6 (corresponding to an average deviation of 15.5 kJ/mol of the standard Gibb's energy changes).

##### Inspection of alternative flux distributions

The three alternative flux distributions obtained in the perturbation studies differ in the uptake flux of glucose (1.50495 mmol/h for the unperturbed network; 1.5015 mmol/h and 1.4972 mmol/h for the two alternative flux distributions) and the fluxes in the non-oxidative pentose phosphate pathway converting ribulose-5-phosphate (Ru5P) into fructose-5-phosphate (Fru6P) and glyceraldehyde 3-phosphate (GraP). The reference solution predicts this pathway to proceed in forward direction thereby forming 20.7 *μ* mol/h ribulose-5-phosphate. In the second solution (glucose uptake 1.5015 mmol/h) this pathway is not used at all whereas in the third flux distribution (glucose uptake 1.4972 mmol/h) it is used in backward direction producing 25.8 *μ* mol/h ribulose-5-phosphate. Interestingly, the latter flux distribution is also obtained for the unperturbed network if the maximization of biomass production used as flux criterion. Notably, all three different flux distributions obtained as solution of the minimization problem (eq. 10) for randomly altered thermodynamic parameters and set-point concentrations are feasible from the kinetic view point, i.e. the kinetic model of the erythrocyte metabolism yields a stable stationary solution.

##### Effect of external concentrations

Our algorithm allows assessing how the predicted flux distributions are affected by changes in the concentration of external metabolites. In vivo, such a situation may occur if some essential fuels for the cellular metabolism are depleted, for example, due to a reduced blood flow through vessels with severe atherosclerotic stenoses, or some end products of the cellular metabolism accumulate because of a reduced excretion capacity of the body. For example, in case of strong physical exercise the concentration of lactate in human blood may rise to values as high as 19.5 mM (in blood plasma) respectively 7.0 mM (in erythrocyte cytoplasm) [34] indicating that the lactate production by the anaerobic skeletal muscle clearly exceeds its rate of re-conversion to glucose in the liver and its utilization rate in the heart muscle. To investigate the consequences of such high blood lactate levels for the metabolism of red cells we calculated thermodynamically realizable flux distributions at gradually increasing concentration of external lactate. For all metabolites except external lactate, the hard bounds were put to ± 25% and the soft bounds to ± 10% deviation from of the 'observed' concentration values. For external lactate concentrations up to a critical value of 12.4 mM our algorithm predicted a thermodynamically realizable flux distribution. For concentrations higher than 12.4 mM no stationary flux distribution solution was found. Increasing gradually the concentration of external lactate up to the critical value of 12.4 mM, the concentrations of pyruvate, NAD^{+} and NADH tended towards the hard bounds to ensure the flux through the lactate dehydrogenase (EC:1.1.1.27) to be directed towards formation of lactate. Our find of a metabolic threshold effect with respect to blood lactate levels corresponds well with clinical observations. At lactate levels higher than 4 mM a reduced deformability of erythrocytes is observed, which may account for the exercise-induced arterial hypoxemia occuring in athletes [35]. Decreasing deformability of erythrocytes is a clear indication for a severely perturbed metabolism of the cell.

#### Application to a metabolic network of *E. coli*

To check the applicability of our algorithm to genome-scale metabolic networks comprising hundreds of reactions and metabolites, we performed the same type of analysis as described above with respect to the metabolic network iJR904 of the bacterium *E. coli* reconstructed by Palsson and co-workers [25]. In this model a minimal medium composed of glucose, ammonium, sulfate, oxygen, phosphate is sufficient for growth according to the biomass creation formula associated with the model. Experimental flux data for *E. coli* has been determined by Emmerling et al. [36] which correspond to 17 internal fluxes of the iJR904 network (using the projection of Segre et al. [37] onto the iJE660a network of *E. coli* [38].) The thermodynamic properties of the iJR904 network, consisting of 659 metabolites and 931 reactions, have been analyzed previously [14, 15, 20, 23, 32]. Since experimentally determined Gibb's free energies are available only for a minor fraction of reactions [20, 39] we use computed values given by Henry et al. [23].

These values were obtained by a slightly modified version of the group contribution method [40, 41]. Physiological concentration ranges were available for 22 internal metabolites (given in Kümmel et al. [14]) and 10 external metabolites (given in Henry et al. [23]). For the other metabolites generic concentration bounds were used based on typical cellular concentration ranges reported in the literature: 20 *μ* M-0.5 mM (soft bounds), 5 *μ* M-2 mM (hard bounds). Further details of the model are given in the supplement [see Additional file 2].

We calculated the flux distribution in this network according to the proposed optimization principle (eq. 10) using as flux objective the maximization of the biomass production. No a priori assumptions were made with respect to the directionality of reactions with two exceptions: The direction of the exchange fluxes was fixed according to the experimental conditions [36] and the direction of 37 internal reactions for which no Gibb's energy value was given in Henry et al. [23] was also fixed [see Additional file 2, archive member 'Ecoli-model.txt', section 'reactions excluded from the TR-property', to see which]. As shown in Fig. 5) (case: 'TR-biomax', data points symbolized by blue triangles) the thermodynamically realizable solution provided a good concordance with observed flux values available for 17 internal reactions. To check the influence of the thermodynamic side constraints on the quality of the flux distribution, we omitted the condition of thermodynamic realizability from the optimization algorithm, again making no a priori assumptions on flux directionalities. In this case ('biomax, fully reversible', data symbolized by red squares in Fig. 5)) the concordance between predicted and observed flux values diminished significantly. This example shows that our algorithm may significantly improve the reliability of flux predictions even if the concentration range of most metabolites is only roughly restrained.

In a third calculation we again omitted the condition of thermodynamic realizability from the optimization algorithm but instead used the heuristic classification of reactions into reversible and irreversible ones as outlined in [25] (case: 'biomass, heuristic irreversibilities', data points symbolized by green diamonds). The obtained flux distribution also yielded a reasonably good concordance between predicted and observed flux values. Notably, this 'classical' variant of FBA gave no better predictions of the observed fluxes than the TR-solution obtained with our algorithm. This qualifies our method as a valuable flux predictor for large-scale networks without the need to apply heuristic rules for the assignment of flux directionalities.

##### Perturbation analysis

Using the same perturbation analysis as outlined above for the erythrocyte network we investigated the impact of alterations in the values of the Gibb's free standard energies on the predicted flux distributions. Such an analysis is of importance as the values of standard Gibb's free energy changes computed by the group contribution method may generally exhibit a large degree of uncertainty [42].

Compared with the findings for the erythrocyte network, much smaller perturbations already resulted in a multitude of alternative flux distributions (see figure 6). Thus, the higher the complexity of the network the more susceptible is the predicted flux distribution is to the choice of the standard Gibb's energies. Closer inspection of the predicted alternative flux distributions showed that the main differences are concentrated in some distinct parts of the networks. We found the largest variability of predicted fluxes for the exchange of CO_{2}, the 3-reaction pathway leading from acetaldehyde and CoA to formation of ATP from ADP via acetyl-phosphate as intermediate, and in the import of *a*-ketoglutarate. The possible fluxes through these reactions appear to be strongly determined by thermodynamic constraints and thus are difficult to predict given low accuracy of thermodynamic data.