Fluxomers: a new approach for 13C metabolic flux analysis
© Srour et al; licensee BioMed Central Ltd. 2011
Received: 10 August 2010
Accepted: 16 August 2011
Published: 16 August 2011
Skip to main content
© Srour et al; licensee BioMed Central Ltd. 2011
Received: 10 August 2010
Accepted: 16 August 2011
Published: 16 August 2011
The ability to perform quantitative studies using isotope tracers and metabolic flux analysis (MFA) is critical for detecting pathway bottlenecks and elucidating network regulation in biological systems, especially those that have been engineered to alter their native metabolic capacities. Mathematically, MFA models are traditionally formulated using separate state variables for reaction fluxes and isotopomer abundances. Analysis of isotope labeling experiments using this set of variables results in a non-convex optimization problem that suffers from both implementation complexity and convergence problems.
This article addresses the mathematical and computational formulation of 13C MFA models using a new set of variables referred to as fluxomers. These composite variables combine both fluxes and isotopomer abundances, which results in a simply-posed formulation and an improved error model that is insensitive to isotopomer measurement normalization. A powerful fluxomer iterative algorithm (FIA) is developed and applied to solve the MFA optimization problem. For moderate-sized networks, the algorithm is shown to outperform the commonly used 13CFLUX cumomer-based algorithm and the more recently introduced OpenFLUX software that relies upon an elementary metabolite unit (EMU) network decomposition, both in terms of convergence time and output variability.
Substantial improvements in convergence time and statistical quality of results can be achieved by applying fluxomer variables and the FIA algorithm to compute best-fit solutions to MFA models. We expect that the fluxomer formulation will provide a more suitable basis for future algorithms that analyze very large scale networks and design optimal isotope labeling experiments.
Metabolism is the complete set of chemical reactions taking place in living cells. These chemical processes form the basis of all life, allowing cells to grow, reproduce, maintain their structure and respond to environmental changes. Metabolic reactions are divided into groups called metabolic pathways, which are typically constructed heuristically according to their connectivity and presumed function . Each metabolic pathway is characterized by a set of chemical reactions that transform substrates into end products while generating intermediate byproducts. Due to its importance in medicine and biotechnology, metabolic pathway research has become a highly active field of investigation .
Initially, the structure of metabolic pathways was examined by identifying their intermediate compounds. Subsequently, the various biochemical reactions connecting these compounds were mapped. Due to the success of this research, the topological structure of many metabolic pathways is nowadays fully documented . The next step in the progression of metabolic pathway research involves quantification of the rates of these various chemical reactions, also known as "fluxes". The values of these rates are affected by various environmental conditions and can change rapidly in response to perturbations. Nevertheless, if the environmental parameters are held fixed and stable, the network can attain a steady state in which the concentrations of all network metabolites are assumed constant over time. This, of course, implies that the rates of their input and output reactions must balance. The latter imposes a set of linear constraints on the metabolic fluxes, known as "stoichiometric balance equations" . Unfortunately, since the number of unknown fluxes typically exceeds the number of independent stoichiometric balances, these constraints are insufficient to completely identify the metabolic network. In order to overcome this lack of information, additional constraints must be provided to the stoichiometric mathematical model to estimate the values of the network fluxes .
Various experimental techniques have been developed to enable measurement of intracellular metabolic fluxes, either directly or indirectly. One of these approaches makes use of isotope labeling experiments. In this method, the metabolic system is administered a known amount of an isotopically labeled substrate (such as glucose labeled with 13C at specific atom positions). By measuring the resulting labeling patterns of intracellular metabolites after steady state has been achieved, additional flux information is obtained.
One major drawback of this experimental approach is the high complexity and computational intensity of the metabolic flux analysis (MFA) required to interpret these labeling measurements. In their series of articles, Wiechert et al. [6–9] constructed a systematic approach for performing this analysis. They show that measurements of the relative abundance of various isotope isomers, also known as "isotopomers", contain enough information to fully identify the metabolic fluxes of the network. Formulating the problem using isotopomer variables (or a transformed set of isotopomer variables referred to as "cumomers"), Wiechert et al. posed the flux estimation problem as a non-convex least-squares minimization, assuming random error is added to their isotopomer measurements. The resulting high-dimensional non-convex problem suffers from various drawbacks, such as slow convergence and notable probability of attaining local minima. Several optimization algorithms have been developed in order to address these drawbacks. Early approaches used iterative parameter-fitting algorithms , evolutionary algorithms  and simulated annealing . Furthermore, several investigations have been conducted in order to assess the accuracy of these results [9, 12, 13]. Recently, a novel method to decompose the metabolic network into Elementary Metabolite Units (EMUs) was introduced  and implemented into the OpenFLUX software package . This decomposition effectively reduces the size of the optimization problem by efficiently simulating only those isotopomers that contribute to the measurement residuals. Nevertheless, all of these algorithms suffer from several major drawbacks due to the standard isotopomer-flux variables used in formulating the optimization problem:
Presence of unstable local minima: due to the non-convex nature of the objective function.
Complex mathematical representation and computational implementation. This results in the need for ad-hoc algorithms and mathematical analysis, and long running times are required for reliable convergence.
The OpenFLUX implementation, for example, may require several dozens of convergence iterations with various initial values in order to achieve acceptable probability of obtaining the optimal set of fluxes in any one of its attempts. In addition, due to the chosen objective function, it is also commonly required to estimate scaling factors for each isotopomer measurement, because of the fact that the available experimental techniques are only capable of measuring isotopomer fractions up to a proportional scaling factor (see Mollney et al.  for further details).
This article introduces a new set of variables for simulating 13C isotope labeling experiments. The main idea underlying this reformulation is that, instead of treating fluxes and isotopomer variables separately, we identify a set of "isotopically labeled fluxes" as our state variables of interest. We refer to these variables as fluxomers. Fluxomers combine flux variables with isotopomer variables and consequently reduce the complexity and nonlinearity of the original isotopomer balance equations. In this article, we show that by reformulating the flux estimation problem in terms of fluxomer variables, it is possible to construct an algorithm that has the following key benefits:
Provides efficient computation of all isotopomers in a metabolic pathway
Is robust to measurement noise (i.e., suppresses the effects of measurement errors) and initial conditions
Eliminates the need for measurement scaling factor estimation
Poses the problem using simple mathematical expressions, allowing the use of generic optimization algorithms
The rest of the article is constructed as follows. The Results and Discussion section illustrates the advantage of our approach via simulation results comparing fluxomer variables to the commonly used cumomer approach and the more recently introduced EMU approach. The Methods section presents the detailed formulation of the fluxomers optimization problem and the fluxomers iterative algorithm (FIA) that provides a reliable and efficient method for solving it. All source code and executables for our algorithms are freely available at the author's website .
We compared our FIA algorithm to the widely used MFA software 13CFLUX , which relies on the cumomer approach, and to the more recent Open-FLUX  software, which is based on the EMU  approach. In order to compare the methods, we conducted flux estimations for various well-studied metabolic pathways. Our first example is based upon the tutorial which Wiechert et al. provide with their 13CFLUX software: the Embden-Meyerhof and Pentose Phosphate metabolic pathways of Escherichia coli . This example compares the running time and robustness of both algorithms in response to input noise. Our second example compares the results and performance of FIA to both an adhoc method and the OpenFLUX algorithm for the analysis of lysine production by C. glutamicum, as described by Becker et al.  and Quek et al. .
EMP & PPP simulation data
Label input data
Comparison of FIA with 13CFLUX for the simple E.coli metabolic network
Algorithm running time comparison for FIA vs. 13CFLUX
In this section we examine the analysis of the central metabolism of two lysine-overproducing strains of Corynebacterium glutamicum: ATCC 13032 (lysCfbr) and its PEFTUfbp mutant. Both express feedback-resistant isoforms of the aspartokinase enzyme lysC, while the latter is additionally engineered to overexpress the glycolytic enzyme fructose-1,6-bisphosphatase. The example is based upon the measurements provided by Becker et al. , who implemented an ad-hoc program to estimate the values of various metabolic fluxes. In their more recent article introducing the OpenFLUX software package , Quek et al. chose to compare their results to those of Becker et al. Therefore, we will expand upon their comparison using our FIA implementation. The input file for FIA was constructed using the measurements and pathway structure given in  and . As described in , the published mass isotopomer fractions were modified for mass interference from non-carbon backbone isotopes using the molecular formula of the amino acid fragments. FIA supports automatic generation of the naturally occurring isotopes correction matrix when the measured molecular formulas are supplied. This adjusts the measured fluxomers vector appearing in the objective function during the process of optimization. If necessary, it is possible not to use this feature but instead to directly supply the algorithm with the corrected measurement values.
When comparing the running times of FIA with OpenFLUX, the different algorithmic approaches of the two must be kept in mind. While OpenFLUX requires the user to supply it with sets of free fluxes, FIA requires no free fluxes nor initial values. Open-FLUX rapidly evaluates dozens of different optimization cycles with random initial values and seeks the best fitting result among them, while FIA uses only one single (longer) run. As such, the convergence probability of OpenFLUX depends on the number of attempts and random values generated during its operation, while the FIA results do not depend on any random value. Furthermore, in its analysis, EMU based algorithms evaluate only the fluxes necessary for measurement comparison, and thus their running time depends both on the metabolic network structure and the amount and location of the given measurements. FIA, on the other hand, can supply the entire set of metabolic fluxes at any given time, with no additional computation requirement (which depends mainly on the network structure).
Relative mass isotopomer fractions comparison for wild-type and mutant C. glutamicum
Sum of weighted residuals
Metabolic fluxes comparison for wild-type and mutant C. glutamicum
Glucose 6-phosphate isomerase
Glucose 6-phosphate dehydrogenase
Glyceraldehyde 3-phosphate dehydrogenase
Pyruvate carboxylase - carboxykinase
We can also run the same optimization, but weight the given flux measurements by their variances. When running this optimization using OpenFLUX (again using 50 iterations), the amount of time was greatly increased, and ended in around 48 minutes. For FIA, on the other hand, the running time was the same as before, thus about 6 minutes. Comparing the results of the algorithms, OpenFLUX suffered from severe convergence problems. Most of its iterations ended without converging at all, while those that did converge yielded useless results, far from the measurements. FIA, on the other hand, succeeded in converging for all scenarios. For the wildtype lysine producing pathway, the results were very close to the ones before (since the fluxes and measurements were quite accurate). For the mutant example, which was less accurate, a reduction of the residual value was achieved by small changes to the measured fluxes. fluxes and residual values can be examined in Table 4 and Table 5.
We now show that FIA can easily use incomplete or non-normalzied measurements by examining its performance in the example above. The supplied MS measurements were normalized to the n +1 backbone carbon atoms of the measured metabolites. Instead of using the supplied normalized data, we multiply each set of metabolite measurements by a random constant number. By doing so, we simulate the case in which only the first 3 (2 for GLY) MS peaks were measured, and had not been normalized. The original and supplied non-normalized measurement values can be found in Table 4. Note that the values were corrected by the molecular formulas of the measured fragments (again, can be automatically performed by FIA). In the absence of normalized data, FIA gave estimated fluxes very close to the previous cases, with very low residual values, as can be seen in Table 5. The running time of the algorithm was not affected by the change.
The main contribution of this article is the introduction of fluxomers--a new set of state variables used to simulate 13C metabolic tracer experiments. The fluxomers approach allows the central optimization problem of MFA to be reformulated as a sequence of quadratic programs, which form the basis of the fluxomers iterative algorithm (FIA). Both fluxomers and FIA result in several important benefits compared to flux-isotopomer variables. Among these advantages are (i) a reduction in algorithm running time required for simulation of isotopomer distributions and metabolic flux estimation, (ii) reduced sensitivity to measurement noise and initial flux values and (iii) availability of complete isotopomer information for a given network (as opposed to the EMU approach, which only supplies partial information) without the need for user specification of free fluxes or initial flux values. Additionally, the error model used by the FIA algorithm has the advantage that it depends solely upon isotopomer ratios rather than complete isotopomer fractions, and therefore it eliminates the need to estimate a normalization factor for each measured isotopomer distribution. Our current results show significant improvements even with regards to simplistic tracer experiments (the running times have been improved by an order of ×3 to ×20 compared to the 13CFLUX algorithm, and about ×2 to ×8 compared to the OpenFLUX implementation). It is important to note that the total time required to obtain an MFA solution is controlled both by (i) the time of each iteration and (ii) the number of optimization iterations that are required to achieve a reliable solution. While a single OpenFLUX iteration is certainly faster than a single iteration of FIA, the FIA algorithm was expressly constructed to provide high reliability in achieving the optimal solution. Therefore, FIA was able to consistently find a better optimal solution in less total time in comparison to the other algorithms examined. Furthermore, extending the fluxomers formulation to other global optimization techniques is straightforward. We expect that reformulating more sophisticated MFA problems--for example, involving optimal experimental design or large-scale metabolic networks--in terms of fluxomer variables will lead to dramatic enhancements of algorithmic efficiency and robustness.
Traditional MFA approaches construct distinct variables for each flux and for each possible labeling state (isotopomer) associated with all metabolites in the network. Fluxomers, on the other hand, are a composite of these two and therefore allow the network state to be described using only one variable type.
Definition 1 (Fluxomer)A fluxomer is the rate that a metabolic reaction transfers labeling from one or more specific substrate isotopomers into product isotopomers.
Taking each fluxomer to be a transformation from one set of labeled atoms into others, we can write its labeling state as an array of binary elements representing the state of each atom it consumes (0 representing an unlabeled atom and 1 representing a labeled atom). Thus, f i (1001) is a fluxomer of reaction i consuming 4 atoms, with its first and last atoms labeled and two middle atoms unlabeled. When using x as an index for one (or more) of the atoms, we denote a sum of fluxomers where the indicated atom can be either labeled or unlabeled (e.g., f i (1x01) is the sum of f i (1001) and f i (1101)). See Figure 3b for a detailed example.
where g(u) is a function of u alone, and h is a constant vector. Thus, for this simple case we can solve for the outgoing fluxomer f 5(01) directly in terms of the fluxomers entering pool B and the total fluxes f 2 and f 5 leaving pool B.
As before, the outgoing fluxomer f 4(0001) can be expressed solely in terms of g--a pure function of u (always a rational function of outgoing fluxes)--and a product of linear projections of x.
where g is a function , and (H 1 , H 2 ) are two m × m matrices. This equation allows for the output fluxomers emanating from a specific metabolite pool to be expressed in terms of the total flux vector u and the fluxomers entering the pool. This enables each outgoing fluxomer to be solved "locally" for the incoming fluxomers. Note that this local calculation does not involve any matrix inversions or other expensive computational procedures. If there are no recycle loops in the network so that all possible paths through the network are non-selfintersecting, this equation can be used to solve sequentially for all "downstream" fluxomers in terms of previously calculated "upstream" fluxomers. In the presence of recycle loops an iterative approach can be constructed to solve for the fluxomers while still avoiding repeated matrix inversions.
In the following, we develop a systematic method for expressing measured isotopomer variables using fluxomer notation. The final result of the analysis shows that isotopomer measurements can be written simply as the norm of a linear transformation of fluxomers, thus Err ~ ||Ax||2. First, we briefly summarize the available isotopomer measurements provided by Nuclear Magnetic Resonance (NMR) and Mass Spectrometry (MS) methods. We then discuss the mathematical modeling of these measurements using fluxomer variables.
MFA experiments are typically carried out by (i) introducing a labeled substrate into a cell culture at metabolic steady state, (ii) allowing the system to reach an isotopic steady state, and (iii) measuring isotopomer abundances of metabolic intermediates and byproducts using either MS or NMR analysis. These two measurement techniques provide qualitatively different information about isotopic labeling.
1H NMR: Measures the fractional 13C enrichment of each proton-bound carbon atom, irrespective of the labeling of its neighboring carbon atoms. Both 12C and 13C atoms are distinguishable in the same spectrum, and therefore the peak areas corresponding to different carbon isotopes can be normalized directly.
13C NMR: Quantifies isotopomers based on the presence of multiplet peaks (e.g., doublets, triplets, doublet doublets, etc.) in the spectrum caused by two or more neighboring 13C atoms. Because 12C atoms are undetectable by 13C NMR, it is impossible to quantify the overall fraction of each isotopomer unless 1H NMR spectra are simultaneously obtained. Instead, only the relative ratio of different isotopomers can be assessed by 13C NMR.
MS: This technique is usually preceded by some form of chromatographic separation (GC or LC) to resolve mixtures into their individual components. These components are then ionized and fragmented in the MS ion source. The ionized particles are separated according to their masses by an electromagnetic filter, and a detector measures the relative abundance of each mass isotopomer. These abundances can be normalized to a fractional scale if all MS peaks corresponding to a particular ion are simultaneously measured.
Previous studies based on flux-isotopomer variables have modeled the measurement error as Gaussian noise added to the fractional isotopomer enrichments. Therefore, if is the vector of measured isotopomer fractions, this model states that , where e is the Gaussian error term. However, a more accurate error model would add the measurement noise directly to the physically measured values. The motivation for the traditionally chosen error model is its relative simplicity when expressed using flux-isotopomer variables. Furthermore, since some isotopomers of a specific metabolite may be unmeasurable, the isotopomer fractions cannot be experimentally determined in many cases. This implies the need for an alternative error model that avoids these shortcomings.
where T and V are transformation matrices needed to convert fluxomers to isotopomer measurements and the diag operator converts its vector argument into a diagonal matrix. The resulting expression is both a simple sum of Gaussian vectors and affine in x.
The advantage of this objective function is that it only depends upon the relative isotopomer intensities in the vector but does not depend upon how these intensities have been normalized (as long as the transformation matrix T is constructed accordingly). This eliminates the need to estimate optimal normalization factors that are required by previous algorithms in order to convert experimental measurements into isotopomer fractions. This is true for both MS and NMR measurements, either when conducted alone or used together in the same experiment.
where A selects the measured elements of the fluxomers vector x(u), b contains their associated values, and S is the stoichiometric matrix of the reaction network. Note that b may contain non-zero elements only when associated with measurements of absolute flux values. For isotopomer measurements, the associated elements of b are zero.
Eq. 14 can be solved using various non-convex global optimizing techniques. These optimizers typically require the user to provide subroutines for computing the value of the objective function and its first derivatives at various points along their convergence path. Furthermore, evaluation of the function x(u) and its derivatives are the main (practically only) time-consuming procedures when solving the optimization problem in eq. 14. The mathematical formulation of eq. 14 is similar to the optimization problem resulting when using the labels and fluxes variables, with one exception - the implicit formula for x(u). As shown above, using fluxomers we are able to formulate the propagation equation (and thus solving x(u)) as a multiplication of homogeneous functions of fluxes, and second order functions of fluxomers. Using labels and fluxes, formulating the same equation results in a sum of functions of the same structure, and the homogeneous separation property vanishes. The following sections exploit this unique property of the fluxomers propagation equation in order to achieve great reduction in the system computational complexity, leading to the FIA algorithm.
This section deals with the evaluation of x(u) along with its gradient using the fluxomer formulation. First, we show that x(u) can be calculated iteratively while avoiding repeated matrix inversions. Then, we demonstrate how the number of iterations can be reduced using a Newton-type gradient-based algorithm. Finally, we explain how it is possible to greatly increase the sparsity of the system using a simple linear transformation of variables, which further reduces the number of iterations needed for convergence.
where x t is the fluxomer vector at iteration t and x t+1is the fluxomer vector at iteration t + 1. In order to simulate the steady-state labeling, we initialize the system with the vector x 0 in which only the input fluxomers are labeled and all others are unlabeled. By recursively substituting x back into the equation, steady state is eventually reached and the final value of x is obtained. (This equation represents a non-linear time-invariant Markov chain.) For the Embden-Meyerhof and Pentose Phosphate Pathway example in the Results and Discussion section, it takes a few hundred iterations to achieve complete stability of the solution (maximal fluxomer value change on the order of 1e-17). Algorithm convergence for a given input vector is retrieved exactly as in the real biological system, and thus a unique solution always exists (for realistic metabolic networks).
In order to determine the root of the propagation equation, FIA starts with an iteration or two using Newton's correction and then continues with the simple "natural" approach. Applying this method to the Embden-Meyerhof and Pentose Phosphate Pathway example in the Results and Discussion section, only a few dozen iterations are now needed. In the next section we show how to reduce both the number of variables and the number of iterations required for convergence by another order of magnitude, without affecting system convergence stability.
The following section introduces a mathematical approach for reducing the number of nonzero elements in our system. Variable reduction techniques such as the recently developed Elementary Metabolite Unit (EMU) network decomposition  were developed for application to systems that are modeled using flux-isotopomer variables. Fluxomers and the FIA algorithm, as opposed to prior approaches, allow us to effectively reduce the number of system variables using a simple linear transformation on x. Our main goal here is to find a transformation for the fluxomer vector x, y = Kx that:
From eq. 16 we see that the greatest expense is due to inversion of a sum of two linear transformations (H 1 and H 2 ) of x. From the iterative propagation equation, eq. 15, we see that x is iteratively calculated by computing the product of the same two matrices. Had it been possible to find a sparse, close-to-diagonal representation for both H 1 and H 2 by simply multiplying them by the matrix from the right, both problems would be solved.
1. Reduced computational complexity -- note that the derivative depends upon the matrices H 1 and H 2 which have already been factored, and thus solving Newton's step is straightforward.
2. Fewer iterations needed for convergence.
As a matter of fact, this transformation reduced the number of iterations needed for convergence of the simple E. coli example to a total of 5.
Keeping in mind that F x (x, u) is in its reduced form due to our variable transformation, solving the equation can be accomplished efficiently.
with A a matrix that selects the measurable elements of u, the meaured elements of u (if they exist), 0 a vector of zeros, and u b a vector of the flux upper bounds. The regularization factor λ starts with some large value, and if necessary is reduced until the optimal value of s is greater than 0. Note that when λ → 0 the problem reduces to finding a feasibile solution of u, and thus always has a solution (for well-structured networks).
Summarizing the above discussion leads to the following algorithm for efficient solution of the MFA optimization problem using fluxomers:
i. Matrix preparation (run once per network):
0. Calculate using LU (PQ) factorization.
When requested by the optimizer, return x(u) and its first derivative:
1. Set y 0 = y init .
2. Set i = 1.
y i = U H [g(u) ○ (L H1 y i - 1 ) ○ (L H2 y i - 1 )].
4. If ||y i - y i - 1||2 > ε N
(b) Set y i = y i - r according to Newton's method.
5. If ||y i - y i - 1||2 > ε e go to 3.
6. Calculate x = [g(u) ○ (L H1 y i ) ○ (L H2 y i )].
7. Solve .
The supplied software uses a variant of the "sequential least-squares" algorithm [24, 25] for solving the non-convex optimization problem in eq. 14. This essentially breaks the problem into a sequence of convex optimization problems for which the solution can be readily computed. Note that other algorithms can be easily used with the same procedures described above.
JDY is supported by an NSF CAREER Award (CBET-0955251) and by the Vanderbilt Discovery Award program.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.