- Methodology article
- Open access
- Published:
A possibilistic framework for constraint-based metabolic flux analysis
BMC Systems Biology volume 3, Article number: 79 (2009)
Abstract
Background
Constraint-based models allow the calculation of the metabolic flux states that can be exhibited by cells, standing out as a powerful analytical tool, but they do not determine which of these are likely to be existing under given circumstances. Typical methods to perform these predictions are (a) flux balance analysis, which is based on the assumption that cell behaviour is optimal, and (b) metabolic flux analysis, which combines the model with experimental measurements.
Results
Herein we discuss a possibilistic framework to perform metabolic flux estimations using a constraint-based model and a set of measurements. The methodology is able to handle inconsistencies, by considering sensors errors and model imprecision, to provide rich and reliable flux estimations. The methodology can be cast as linear programming problems, able to handle thousands of variables with efficiency, so it is suitable to deal with large-scale networks. Moreover, the possibilistic estimation does not attempt necessarily to predict the actual fluxes with precision, but rather to exploit the available data – even if those are scarce – to distinguish possible from impossible flux states in a gradual way.
Conclusion
We introduce a possibilistic framework for the estimation of metabolic fluxes, which is shown to be flexible, reliable, usable in scenarios lacking data and computationally efficient.
Background
Systems biology states that, in order to quantitatively understand and predict the cell behaviour, its constitutive components and their interactions must be studied as a whole system [1, 2]. Metabolic networks are a paradigmatic example of this aim because, even incomplete as they may be, they are the best characterized cellular networks [3]. In recent times, the information embedded in metabolic networks is being used to assemble constraint-based models under the pseudo steady-state assumption, thus not requiring the knowledge of kinetic parameters, which are still rarely known [3, 4]. Constraint-based models allow the calculation of the possible metabolic states or "behaviours" that can be exhibited by the cell; however, they do not predict which of these are likely under given circumstances. One approach to perform these predictions is flux balance analysis (FBA), which is based on the assumption that cell behaviour has evolved to be optimal in a certain sense [5, 6]. It has been shown that FBA is able to predict the actual fluxes [7–9], but this requires to identify which are the relevant objectives for different conditions [7, 10]. As an alternative, one could perform a metabolic flux analysis (MFA) which, generally speaking, is the exercise of estimating the fluxes shown by cells by combination of a constraint-based model and the set of available experimental measurements.
In order to estimate the intracellular fluxes, traditional metabolic flux analysis (TMFA) employs only measurements of uptake and production rates (i.e. influxes into and outfluxes from cells) that are stoichiometrically balanced [11]. This purely stoichiometric approach has some limitations, but most of them can be overcome with simple extensions, as it will be shown below.
One typical difficulty to be tackled by MFA is that the available measurements may be insufficient to estimate the intracellular fluxes, particularly in large-scale networks, because there may be different flux distributions compatible with the available measurements. To face this situation, intracellular information obtained from stable isotope tracer experiments has been incorporated in many studies (13C-MFA) [12–14]. Yet, data from isotope tracer experiments will not be considered in this work. Instead, we follow a constraint-based modeling approach, in the sense that we do not attempt necessarily to predict the actual fluxes with precision, but rather to distinguish "most possible" from "impossible" flux states, based on a suitable definition of "possibility", a constraint-based model and the available measurements, which in most cases do not include isotopic data.
Another option to face a lack of measurements is the use of some rational hypotheses to chose one flux distribution among those that are compatible with the measurements. For instance, Nookaew et al. have proposed to estimate the intracellular fluxes based on the assumption that cells are likely to use as many pathways as possible to maintain robustness and redundancy [15]. Related hypotheses have been formulated using the concept of elementary modes [16, 17]. The assumption of optimal cell behavior typically used in FBA could be also used (e.g. [7]). It will be shown that the methodology we propose is able to detect these flux distribution that are equally possible (or similarly possible), but for the sake of simple exposition we will not use any hypothesis herein. However, the possibilistic framework might be extended to incorporate hypotheses, as discussed in the conclusions section.
In this context, the paper discusses the use of a possibilistic framework for metabolic flux analysis.
Uncertainty, lack of measurements and model imprecision will be handled introducing the notion of "degree of possibility". Then, an efficient optimization-based approach will be employed to query the most possible fluxes and their possibility distributions. The methodology is based on a re-interpretation of the consistent causal reasoning paradigm [18] as an equivalent problem of feasibility subject to equality and inequality constraints; preferences under uncertain knowledge are incorporated by transforming the feasibility problem into a linear optimisation one, which may be interpreted in possibilistic terms. The optimisation approach to logic reasoning has been previously explored by the research group to which the authors belong in [19–21], and this paper applies it to MFA.
The main features of the possibilistic framework introduced in the paper are the following: (i) it is based on a constraint-based model and not only on stoichiometric balances, (ii) it considers measurements uncertainty in a flexible way (e.g. non-symmetric error or a band of uncertainty due to systemic error) and (iii) even model imprecision, (iv) it provides possibility distributions (and intervals) which are more informative than point-wise estimations when multiple flux values might be reasonably possible, (v) it is reliable even if only a few fluxes are measurable, (vi) it has the ability to detect, and handle, inconsistencies between measurements and model, and furthermore (vii) with high computational efficiency.
The structure of the paper is as follows: preliminaries on possibility, optimization and metabolic flux analysis are first addressed. Then, the basics of Possibilistic MFA and some refinements are discussed; the framework is illustrated with simple examples and a well-know model of C. glutamicum. The paper is closed with a summary and a discussion on future work.
Preliminaries: possibility and optimization
In an abstract ideal situation, many estimation problems in science and engineering can be cast as estimating some decision variables δ given the known values of a set of other ones m (possibly, measurements) and a model expressed as a set of equality and inequality constraints (involving decision variables, measurements and some model parameters). Then, the valid estimations will be the feasible solutions of a constraint satisfaction problem [22, 23].
However, in many practical cases, the measurements are imprecise and the model parameters and constraints are also not accurate, so real data violates them. This is the reason why most real-life models should include uncertainty. The most basic representation of uncertainty would be giving interval values to measurements and model parameters. Refinements of the uncertainty representation give rise to probabilistic [23–25] and possibilistic [26–28] frameworks.
Probabilistic frameworks have an underlying interpretation in terms of the frequency in which some flux conditions appear; on the other hand, possibilistic frameworks measure the degree of compliance (consistency) of some decision variables with some (soft) modeling constraints. In this sense, the basic assumptions of both paradigms of inference under uncertainty are different.
In the following subsections the possibilistic framework will be described. Afterwards, this section of preliminaries will be closed discussing the relationship between probability and possibility and justifying the use the possibilistic framework.
Soft constraint satisfaction problems: a possibilistic approach
As explained above, the possibilistic framework is the chosen representation for the problem under study, following the ideas in [29], where possibilistic constraint satisfaction problems (CSP) are presented. There, the authors introduce constraints which are satisfied to a degree, transforming the feasibility/infeasibility of a potential solution into a gradual notion: given a CSP with multiple solutions δ ∈ Δ (where Δ denotes the search space over which feasible values for the decision variables will be searched), a function π : Δ → [0, 1] was suggested in order to represent preference or priority as a "consistency degree". The meaning of π(δ) = 1 would indicate that δ is in full agreement with the model and measurement constraints; the meaning of π(δ) = 0 indicates that δ is in "absolute, total contradiction" with the problem constraints, and never should be considered a feasible value. Intermediate values would denote values of decision variables which "somehow mildly" violate the problem constraints but could be considered "partially possible" from the "practical" knowledge of the "expert" modeller who defined π. The higher the value of π(δ), the higher the accordance with the problem constraints should be (subjectively interpreted as a higher "possibility" of the decision variable choice δ). Given the here outlined subjective meaning of π, it is denoted in literature as possibility distribution. The possibilistic calculus [27, 29] refers then, to computations with possibility distributions from a series of axioms. Basic ideas on it will be outlined below in this section. A simple example now illustrates the basic idea.
Example
Consider a flux balance {f1 = f2}, stating equality between two flows, f1 and f2, supposedly measured in a biological or chemical reaction. The measurements m a = (5, 7) and m b = (5, 5.1) are infeasible, whereas m c = (5, 5) is feasible. However, it seems clear that the subjective "possibility" of m b is higher than that of m a ; m b can be thought to be quite reasonable in practice due to measurement errors.
The idea can be easily formalised for further computations by defining a possibility distribution, for instance, . In this way, potential solutions can be ranked: π(m a ) = 0.018, π (m b ) = 0.99 and π(m c ) = 1. The search space in which to define the possibility, Δ, could be defined as, say, Δ = {(δ1, δ2)|0 ≤ δ i ≤ 10}.
Usually, the function π(δ) is built by "conjunction" of possibility functions of individual relations π i (δ i ) (expressing user-defined preference or priority on each individual constraint, in many cases in a problem-dependent way). Such conjunction will be latter discussed in this section. The best CSP solutions are defined to be those which satisfy the global problem to the maximal degree.
In this way, once the user has defined such function expressing how a particular combination of system variables is "consistent" with its model, the basic idea on possibilistic calculus is, given a subset of the system variables (assumed as known or measured), estimate the "most possible" values of all the remaining variables via an optimization problem. The close relationship between possibilistic calculus and optimisation is discussed in the subsection below.
Possibility theory
The basic building block of possibility theory is a user-defined possibility distribution π : Δ → [0, 1]. This defines the possibility of each "point" δ in Δ. A consistent problem formulation is defined to be the one in which there exists at least one point with possibility equal to one.
The second building block are events, formally defined as subsets of Δ, in order to address problems such as, in the above example, determining the possibility of event A = {(f1, f2) ∈ Δ | 0 ≤ f1 ≤ 3, 4 ≤ f2 ≤ 10}.
Possibility calculus as optimization
By definition, the possibility of an event A (subset of Δ) is computed via:
and, obviously, given two events A and B, A ⊂ B entails π(A) ≤ π(B).
Hence, possibility computations are optimisation problems (Cf. with probability computations, which are integration problems).
For a multidimensional Δ = Δ1 × Δ2, δ = (δ1, δ2) ∈ Δ, the marginal possibility distribution of δ1 is defined as:
i.e., the possibility of the event {δ1 = }.
Optimization as possibility calculus
Conversely, consider a cost function J : Δ → R+ (i.e., verifying J(δ) ≤ 0 for all δ ∈ Δ), so that there exists δ0 ∈ Δ such that J(δ0) = 0. Then, a consistent possibility distribution may be defined on Δ via:
and the possibility of an event A is given by replacing the possibility definition (3) in (1), resulting in:
In the next sections, abusing the notation, an event A will be usually described by a set of constraints on the decision variables δ.
In this way, numeric constrained optimisation problems may be subjectively interpreted in possibilistic terms: the cost J(δ) will be interpreted as the log-possibility of δ and, by definition, unfeasible values of decision variables will be assigned zero possibility.
Let us now review some other relevant definitions and issues in possibilistic calculus.
Necessity
In order to assert that an event A is necessarily true (in our context, that all problem solutions belong to A), saying that A is "possible" may be not enough: it must also be true that the complementary event "not A" is not possible. This motivates the introduction of a necessity measure:
In a binary setting, all solutions belong to a subset A if and only if π(A) = N(A) = 1; there exist solutions in A (and solutions outside A) if π(A) = 1 but N(A) = 0, and there are no solutions in A if π(A) = 0.
Extending the measures π(A), N(A) to [0,1] provides a natural gradation of such concepts: π(A) = 0.95, N(A) = 0.1 would indicate that there are very possible solutions in A, but not all of them are in there (there are solutions with possibility 1-0.1 = 0.9 outside A).
Interactivity and possibilistic conjunction
The possibilistic analog to statistical independence is the non-interactivity.
If the joint possibility of two variables Δ = Δ1 × Δ2, δ = (δ1, δ2) ∈ Δ can be expressed as the product of two univariate ones:
then variables δ1 and δ2 are said to be non-interactive. In that way, given an event A1 ⊂ Δ1 and an event A2 ⊂ Δ2, it is straightforward to prove that:
which can be read as "the possibility of event A1 and event A2 is the product of the individual possibilities when the events relate non-interactive variables", interpreting, as usual in literature, set intersection as a linguistic conjunction.
Under non-interactivity assumption, if the possibility is defined as the logarithm of a cost index (3), the product (6) gets transformed into a sum:
On the following, given individual cost indices J1(δ1), J2(δ2), etc. relating to different constraints, the above expression (8) will be the one used in most cases to define a possibility distribution in the product space. In this way, we are interpreting the possibilistic conjunction operator in [29] as an algebraic product of possibilities, i.e., stating an underlying non-interactivity assumption between different constraints. Note, however, that the interactivity assumption is not always intuitively needed. In the other extreme (total interactivity: variables δ1 and δ2 fully "correlated", for instance equal), we would have: π(A1 ∩ A2) ≤ max(π(A1), π(A2)), which would suggest the maximum possibility as the conjunction operator when two events affect exactly the same decision variables. In between those two extremes, other choices may be also possible (T -norm operators [30]).
Conditional possibility
The possibilistic analog to conditional probability is conditional possibility.
Consider an event B with nonzero possibility. A quotient definition for conditional possibility of an event A given event B will be used in this paper:
In this way, given a (multivariate) possibility distribution π(δ), the conditional possibility can be computed as:
so, if the possibility distribution is actually the exponential of a cost index, we get:
that is, computing the possibility by subtracting the cost associated to event B from the cost of any of its subsets.
In order to get a conditional possibility distribution of a variable δ, we assume event A being an individual point δ*, getting:
That is, the conditional distribution can be obtained by dividing the possibility distribution function for all points in a set by the maximum possibility of them, i.e., normalising the possibility distribution on a restricted conditioning domain B to a maximum equal to one.
The conditional definitions allow for an analogy to Bayesian inference: if we assume that B is actually certain (whatever the a priori possibility π(B) was), then conditional possibility may be understood as an a posteriori possibility.
Possibility versus probability
Both possibility theory and probability theory are frameworks for handling uncertainty in constraint satisfaction problems. Basically, a subjective interpretation would assign high possibility to events with high probability. Hence, in a first approximation, user-defined probabilities and possibilities should be related by an implicit monotonically increasing function. Possibility-necessity measures have also been linked to imprecise probabilities [31]. However, once aggregation takes place (via sums and integration in probability, via maximisations in possibility), although the subjective interpretation might be considered similar, there is no longer an implicit function relating probability and possibility. For further discussion on possibility, probability, and other uncertain reasoning frameworks, and their interrelations, the reader is referred to [31–33].
Ideally, probabilistic results would be preferable (to confidently assert that, e.g., 95% of cases a flux estimate will lie in a particular interval). However, there are some drawbacks: (i) exact probabilistic inference under equality and inequality modeling constraints is computationally hard (multivariate integration on irregular sets) (ii) some of the a priori Bayesian probabilities are in practice rough user-given estimates, (iii) some of the assumptions (linearity of transformation, Gaussian distributions) may not hold in practice, and (iv) there may be some uncertainty in the model parameters or in the model probabilities. Thus, as practical use of probability does not fully adhere to the theoretical assumptions, its results should be interpreted with some flexibility. As this work will discuss, the proposed possibilistic framework is much less demanding computationally (using optimization instead of integrals, so large-scale cases become tractable) and gives similar results to the probabilistic approach in realistic cases.
The objective of the next sections is to set up a possibilistic framework for efficient computations in metabolic flux analysis.
Preliminaries for metabolic flux analysis
In biology, the metabolism of living cells is usually represented by a metabolic network encoding the elementary biochemical reactions taking place within the cell [3]. These metabolic networks can be translated to a matrix N, where rows are the m internal metabolites and columns the n reactions. Assuming that these metabolites are at steady state, mass balances can be formulated as follows [11]:
where v = (v1, v2,...,v n )Tis the n-dimensional vector of metabolic fluxes.
Hence, a (steady-state) flux vector v represents the metabolic state of the cells at a given time, without any information on the kinetics of the reactions; it shows the contribution of each reaction to the overall metabolic processes of substrate utilization and product formation. Notice that as typically n is larger than m, the system (13) is underdetermined, i.e. there is a wide range of stoichiometrically-feasible flux vectors. Assuming now that some fluxes in v have been measured (denoted as v m ), while the rest remain unknown (denoted as v u ), equation (13) can be rearranged as follows:
As measurements are imprecise in practice, such measurement imprecision can be incorporated as constraints:
where e m represents measurements errors and represents the actually measured flux value. In our approach, the measurement uncertainty is translated into an a-priori possibility distribution for e m from sensor characteristics. Other approaches consider different choices, as discussed below.
At this point, Traditional metabolic flux analysis (TMFA) can be defined as the estimation of the flux vector satisfying (14) and compatible with the measurements (15). In particular, TMFA can be formulated as a two step procedure [34, 35]: (I) analyze measurements consistency (and detect gross errors) using chi-square tests, and (II) solve a least squares problem to estimate the actual flux vector v:
where it is assumed that e m are distributed normally with a mean value of zero and a variance-covariance matrix F.
Since all the constraints are linear equalities, the analytic solution of this minimization problem can be obtained, resulting in the expressions to estimate v u and v m that are typically seen in literature (e.g. [11, 36]). However, with this formulation TMFA has some important limitations: (i) irreversibility constraints, or any other inequality constraints, cannot be considered, (ii) measurement errors are assumed to be normally distributed, (iii) it only provides unique-valued flux estimation, and (iv) it needs a high number of measurable fluxes to be of use – system (14) has to be determined and redundant [37].
Several alternatives have been suggested to face those limitations (Table 1). Quadratic programming solves the least squares problem (16) allowing to include irreversibility constraints (LS-MFA), but inherits the rest of drawbacks (and the chi-square tests may lose validity). The flux spectrum approach (FS-MFA) follows an interval approach to overcome the limitations mentioned before, but its estimations tend to be conservative because only lower and upper bounds are used to represent measurements uncertainty [38, 39]. Monte Carlo has been also used in the context of 13C-MFA (e.g. [40–42]), but rarely in absence of isotopic data. Moreover, sometimes it has been used incorrectly: Monte Carlo cannot be performed just solving a quadratic programming problem for each simulated set of measurements, because this introduce a bias on the results. Anyway, the major drawback of Monte Carlo is its high computational cost, which restricts its use for large metabolic networks as an impractical number of samples is required to assess probabilities within a reasonable accuracy.
In the following sections we introduce a possibilistic framework for MFA that brings several interesting features: (i) it overcomes all the mentioned limitations of TMFA, (ii) has the ability to detect, and handle, inconsistencies between measurements and model, and furthermore (iii) with high computational efficiency.
Results and discussion
Possibilistic MFA: problem statement
In this section the possibilistic framework for MFA flux estimations is discussed. First, we define a set of time-invariant constraints derived from the metabolism being modelled. Then we incorporate the constraints imposed by the measured fluxes, representing its uncertainty, by means of auxiliar slack decision variables and a cost index. In this way, the notion of "degree of possibility" is incorporated. Finally, it will be shown how (linear) optimisation problems will be able to settle queries about the most possible fluxes, the possibility distributions, etc.
Model-based constraints
First, let us define a set of invariant constraints that every steady-state flux vector must satisfy; they do not depend on environmental conditions, do not change through evolution, etc. [3]. In this work this model-based constraints, denoted as , will be the stoichiometric relationships (13) and irreversibility constraints, described by means of inequalities:
where D is a diagonal nxn-matrix with D i, i = 1 if the flux i is irreversible (otherwise 0).
Other model-based constraints can be defined in an analogous way. For instance, elementary balances or degree of reduction balances might be incorporated into (17) as additional constraints [11]. It may be also possible to add constraints based on standard Gibbs free energy changes [43, 44] or extracellular metabolites concentrations [45].
Incorporating the measurements
Estimating the non-measured fluxes would amount for solving the above equations (17), where some of the elements in vector v are measured (v m ). However, this simple approach will be impractical in two very common situations:
-
The measurements are very few, so the system has many -possibly infinite- solutions.
-
real measurements do not exactly satisfy the constraints due to measurements (and modelling) errors. Therefore, no solution will be found. (For instance, an unfeasible set results with the constraint v1 = v2 and the measurements {v1 = 0.5, v2 = 0.5001}.)
Hence, the approach needs refinements to deal with a lack of measurements and to introduce the "possibility" of sensor errors and imperfect models. As shown below, such difficulties can be overcame by the introduction of slack variables and a cost index, enabling a grading of the different candidate flux vectors as more or less "possible".
Possibilistic description of measurements
Each experimental measurement can be described by a constraint as follows:
where e m is a decision variable that represents the intrinsic uncertainty of the experimental measurements, i.e. the discrepancy between the actual flux v m , and the measured value . For convenience (see remark below), e m is substituted by two non-negative decision variables, ε1 and μ1:
These decision variables δ = {ε1, μ1} relax the basic assertion = v m , conforming a possibility distribution in (, v m ) associated to some cost index J m . Among different possible choices, a simple -yet sensible- one is the linear cost index:
with α ≥ 0 and β = 0 (usually, if sensor error is "symmetric", α and β should be defined to be equal).
Recalling the concepts introduced t v m he preliminaries section, the interpretation of (19) and (20) may be: "v m = is fully possible; the more v m differs from , the less possible such situation is". Indeed, the event A = {v m = } ≡ {ε1 - μ1 = 0} will be fully possible – as , achieved at ε1 = μ1 = 0, and then π(A) = e-0 = 1. On the other hand, the possibility of the event A corresponding to v m being different from – say, for instance, A = {v m = + ρ} ≡ {ε1 - μ1 = ρ} – will be given by . For instance, with a cost index J(δ) = 5ε1 + 5μ1, and a measurement = 0.1, the possibility of the actual flux v m being v m = 0.2 is e-5*0.1 = 0.6065 ("quite" possible), and the possibility of v m = 1.1 is e-5*1 = 0.0063 ("almost" impossible).
Remark
As explained in a subsequent section, the weights α and β should be defined related to each measurement's "a priori accuracy".
A global cost index
Consider now a set of measurements with its associated slack variables δ1 = (ε1, μ1),... δ m = (ε m , μ m ), and individual cost indices J1(δ1),... J m (δ m ). The corresponding constraints will be called measurement-based constraints, :
In order to have a possibility distribution, under the non-interactivity assumption (6), the cost index is defined as:
where α and β are row vectors of sensor accuracy coefficients and ε 1 , ·μ 1 correspond to stacking in vectors the artificial variables from individual constraints.
The Possibilistic MFA problem
At this point, we can define the possibilistic MFA (PMFA) problem by means of the cost index J (22) and the set of constraints :
where the decision variables δ are the actual fluxes v = (v u , v m ), and the slack variables ε 1 and μ 1 .
The cost index J reflects the log-possibility of a particular combination of the decision variables, that is, the log-possibility of a particular flux vector v.
Remark
The PMFA will be cast as a linear programming problem; this is the reason why the non-negative decision variables ε1 and μ1 were introduced in substitution of e m . However, it can be formulated using any other optimization framework. For instance, PMFA can be easily cast under a quadratic programming framework. Throughout the paper linear programming will be assumed due to its great computational performance (solvable in polynomial time). This supposes a great advantage when dealing with large-scale metabolic networks. Nevertheless, an example using quadratic programming will be described in a subsequent section to point out the flexibility of the PMFA.
Example 1 – Problem statement
Consider the toy metabolic network depicted at the top of Figure 1, and the corresponding constraints, and . Let us consider that the measurement of v2 is "very accurate", that of v5 is moderately accurate and those of v3 and v4 are quite unreliable. The weights α and β associated to the slack variables ε 1 , and μ 1 , can be defined in accordance with this information: if we take α2 = β2 = 2, α5 = β5 = 0.5, and α3 = β3 = α4 = β4 = 0.15, the measurements are depicted on the bottom in Figure 1, for supposed measurements = 9, = 31, = 30, = 10.
Point-wise flux estimations
The simplest outcome of a PMFA problem is a point-wise flux estimation: the minimum-cost (maximum possibility) flux vector. This problem can be conveniently cast as the optimisation of a linear functional subject to linear constraints.
According to (4), the maximum possibility (minimum-cost) flux vector v mp corresponding to a given set of measurements is obtained as the solution to the linear programming (LP) optimization problem:
being its degree of possibility π(v mp ) = exp(J min ).
The obtained flux vector v mp contains the most possible flux values compatible (consistent) with the model and the measurements. A possibility equal to one must be interpreted as the flux vector being in complete agreement with the model and the original measurements. Lower values of possibility imply that v mp corresponds to fluxes v m deviated from the measurements .
Notice that as π(v mp ) = π(), it can be interpreted as the "a priori" possibility of encountering the measurements ; so if it is low, this implies that either (a) there is a gross error in the measurements, (b) there is an error in the model, or both. Therefore, the maximum possibility can be used to evaluate consistency and detect errors (inconsistencies between data and models). We will come back to this point in a subsequent section.
Example 1, continued
Consider again the model and the measurements given in Figure 1. The maximum possibility flux vector resulting from (24) is v mp = (0.75, 9, 30.25, 8.25, 31, 39.3), with a possibility of e-0.3 = 0.74. The most possible flux vector being not fully possible (peak value not equal to 1) indicates that the measurements and the model are not in complete agreement. Indeed, as a matter of fact, the model says that v2 - v4 = v5 - v3, but = -1 and = 1. Should the measurements had been fully compatible with the constraints imposed by the metabolic network – i.e. = 10, = 30, = 30 and = 10 – the maximum possibility flux vector would have been v mp = (0, 10, 30,10, 30, 40), with a possibility of π(v mp c) = 1.
Notice also that the possibility depends on the reliability associated to each measurement. For instance, if all the measurements were supposed to be more reliable – say α' = 10·α and β' = 10·β – the possibility distribution functions would be narrower. The interpretation of the new coefficients would, therefore, be that the same deviation from the fluxes of maximum possibility will be now be considered as a less possible fact.
Possibility distributions as flux estimations
Clearly, a point-wise flux estimation is limited in a situation where multiple flux values might be reasonably possible. To face these situation, marginal and conditional possibility distributions (and intervals) can be obtained, again, by solving linear optimisation problems. These provide a much more informative flux estimation than a point-wise one, such as the maximum possibility flux vector, or the interval of minimum-maximum possible values in [38].
These flux estimations, which are illustrated in Figure 2, wil be presented along this subsection.
Marginal possibility distributions
Marginal possibility distributions (2) can be easily plotted and give a valuable information for the end user: they show, and rank, the possible values for each flux in the network.
The possibility of v i being equal to a given value f, π(v i = f ∩ ), is computed by simply adding a constraint to (24):
Hence, plotting the marginal possibility for a range of fixed given values f (taken within a pre-specified range) will provide the marginal possibility distributions that be interpreted as the "distribution of the possible values for each flux in the network, given the measurements" (see Figure 2, left).
Notice that "cuts" of a possibility distribution, containing those values of v i with a marginal possibility higher than γ can be obtained solving two LP problems:
This provides a highly efficient procedure to compute a possibility distribution: compute "cuts" of possibilities between 0 and 1, say, 0.1, 0.2, etc. (computing the marginal possibility of all the fluxes in the network by means of a grid of points is linear in the number of grid points and polynomial in the number of fluxes). This approach is better than defining a range of values f and computing its possibility with (25) because it avoids the problem of determining the most convenient step size and bounds of the flux (which, usually, are not known a priori).
Conditional possibility distributions
Using the definition given in the preliminaries (12), the conditional possibility distribution of a flux v i can be computed as follows:
Remember that conditional distribution can be obtained normalising the marginal possibility distribution to a maximum equal to one (see Figure 2).
Conditional possibility may be understood as an a posteriori possibility: π(v i = f| ) is the possibility of v i having the value f, if we assume that is actually certain, i.e. that the model and the measurements are correct.
(A posteriori) Possibilistic intervals
In analogy to (26), the interval of flux values with a degree of conditional (a posteriori) possibility higher than can be obtained solving two LP problems:
The upper bound would be obtained by replacing minimum by maximum.
These possibilistic intervals have a similar interpretation to "confidence intervals" ("credible intervals") in Bayesian statistics, providing a concise flux estimation that can be represented by means of a box-plot chart (see Figure 2, right).
Example 1, continued
With the measurements in Figure 1, the resulting marginal possibility distributions for each flux are plotted in Figure 3A. They show that, for instance, the most possible value of v1 is 0.75 (possibility of 0.74), that v1 being 2.25 is quite possible, but that v1 bigger than 10 is almost impossible (possibility lower than 0.05). The possibility distributions also reflect the reliability of the estimation of each flux: the estimation of v6 is less reliable than the one of v1 or v2, since it has a wide range of highly possible values.
Notice too that the uncertainty on the measurements is often strikingly reduced through the flux estimation. For instance, the estimation of v4 – whose measurement was quite unreliable a priori – has been significantly improved, once model constraints and other measurements have been incorporated. This reflects the already noticed fact that the metabolic network structure greatly constrains the possible values of fluxes for a given, typically small, set of measured flux values. The plots of marginal possibility can also detect multiple flux vectors with maximum possibility (possibility distribution functions with flat top). Figure 3B depicts the maximum possibility flux estimation and three possibilistic intervals by means of a box-plot chart. The intervals point out that, for instance, the highly possible a posteriori values of v5 are those in [30.75, 31] (possibility greater than 0.9) and that those in [29.5, 32] are also quite possible (possibility greater than 0.5), while those outside [27, 34.5] are almost impossible as their a posteriori possibility is lower than 0.1.
Possibilistic MFA: Refinements
Now that the basics of the PMFA framework have been introduced, some refinements will be discussed.
A better description of measurement's uncertainty
The formulation used above to describe the uncertainty of the experimental measurements might be considered somehow limited in some applications.
Fortunately, it is very easy to add new slack variables, and modify the (23) and the cost index (22), allowing to work with possibility distribution functions of different characteristics.
As an example, the constraints (29) and cost (30) below describe an interval measurement plus some possibility of outlier measurements:
and
The possibility of is one and the possibility of the actual flux being v m being out of the referred interval depends on the cost index weights (α and β).
For instance, a band with possibility equal to one can be used to account for systemic errors in measuring a particular flux, and a couple of additional slack variables may be defined to account for the decreasing possibility of random errors. These kind of representation of measurement uncertainty will be illustrated in subsequent examples.
Remark
Notice that more slack variables can be added to achieve a more complex representation of the measured flux uncertainty. In fact, any convex representation of the log-possibility uncertainty can be approximated if a sufficient number of slack variables are incorporated (ε1, μ 2, ε2, μ2,...). Details are omitted for brevity.
Considering uncertainty in the model structure
Until now, the model-based constraints (23) – the stoichiometric relationships, reaction's irreversibility, or any other – have been considered as hard constraints; only those flux vectors v that exactly satisfy them could be feasible solutions. However, these constraints can be "softened" via suitable slack variables to consider uncertain knowledge. Then, these additional slack variables may be used in a cost index to generate a possibility distribution.
Consider, as an example, an equality restriction a = b. A relaxed ("softened") version of such restriction may be written as:
with ϵ and ν being slack variables penalised in an optimisation index J = f(ϵ, ν), typically with linear cost index terms, γ ϵ + τυ, in an analogous way to the discussion on uncertain measurements.
Notice also that a "softened" inequality restriction is nothing but an equality one with no penalisation on one of the slack variables above. For instance a ≤ b + ε can be expressed as a = b + ε - μ with free μ.
Such softened model constraints may be used to roughly incorporate imprecision in the model arising, for instance, from non-compliance with the pseudo-steady-state assumption, partial unbalance of some metabolites or uncertain yields. Although these issues require further research, let us outline some preliminary ideas below.
Relaxing the pseudo-steady state assumption
Equation (13) derives from the dynamic mass balance around the internal metabolites c, where it is assumed that ≈ 0 and that the term μ·c is negligible (μ denotes the growth rate). Adding slack decision variables to (13), as it has been explained, makes it possible to relax this assumption.
Partial unbalance of metabolites
Sometimes, a metabolite cannot be assumed to be balanced because there are reactions producing or consuming this metabolite that have not been taken into account in the network; for instance, this is often the case for the cofactors, ATP, NADP, etc. This unknown consumption/production can be represented by means of slack variables (e.g. ϵ and υ) if some interval limits (e.g. ϵ max and υ max ) are provided.
Uncertainty in stoichiometric yields
In some cases, the value of a yield coefficient is not exactly known. This is usually the case of the yield coefficients of lump reactions used to represent biomass synthesis. Let v r be the flux through a reaction with an uncertain yield Y i, r for the metabolite i. The row corresponding to this metabolite in (13) can be rewritten as:
If it is known that and v r is irreversible, equation (32) can be substituted by two constraints:
However, if the flux v r is reversible, inequalities in (33) cannot be set up, and the approach is no longer applicable. Integrating modal interval arithmetic [46] in the proposed framework might be a possible option, under research at this moment.
Illustrative examples of other features of PMFA
Other features of Possibilistic MFA (PMFA) will be briefly illustrated using the simple metabolic network in Figure 1.
Example 2: Errors detection in measurements and model
As earlier mentioned, the value of the peak possibility in the resulting flux distribution provides an indication of the agreement between the model () and the measurements (). A low degree of possibility means that the model and the measurements are inconsistent. That is, that there is not any flux vector "near" the measured values satisfying the model-based constraints. Therefore, if the the maximum possibility flux vector has a low value, one must assume that either (a) there is an important error in one or more measurements, (b) there is a relevant error in the model (e.g. a mass balance is not closed, or a metabolite is not at steady state), or both.
If a high inconsistency (low possibility) is detected, it is possible to investigate what is causing it, and thus correct the measurements or improve the model. Following a straight approach, we can remove one measured flux at a time and perform the flux estimation to determine if the removed measurement was causing the low possibility. If this is the case, we may consider the following alternatives: (a) consider that is a totally unreliable measurement and, thus, accept the flux estimation inferred from the others measurements, (b) obtaining either again, or a different measurable flux which could provide additional information, (c) consider a reliable piece of data and, hence, conclude that there is an error in the model or its assumptions. In case (c), a similar approach can be used to investigate which particular model-based constraint is causing the low possibility – by "softening" the suspicious constraints one at a time.
A simple example of the procedure just described is shown in Figure 4. Initially, a PMFA estimation using all the measured fluxes was performed, obtaining a maximum possibility flux vector with low possibility, π(v) = π() = 0.15. If the estimation is repeated removing the flux , the maximum possibility does not increase; however, when the estimation is performed removing , the maximum possibility is significantly higher (0.7). This suggests that there was a large error in the value , or an error in the model related to the balance around metabolite C which involved fluxes v2, v3 and v6.
Example 3: Scenario lack of data
One of the features of PMFA is that it can be used even if there is a lack of measurements; i.e. even if (14) is underdetermined or not redundant [37].
To point this out, let us continue with our example assuming now that only two fluxes are measured (an under-determined case). PMFA flux estimations, the marginal possibility distributions and the a posteriori intervals, are shown in Figure 5. Notice that crisp flux estimations will only be obtained if the irreversibility constraints – or other inequalities – are able to "bound" the underdeterminacy of (14). Interestingly, our experience show that this is often the case for medium size networks. Moreover, if this is not the case, the possibilistic flux estimation will be less precise – large intervals and "wide" distributions – but still reliable, i.e. the estimation will always be as precise as allowed by the available measurements and knowledge.
Example 4: Using quadratic programming
To show how PMFA can be cast within other optimisation frameworks, an example using quadratic programming will be discussed. We define as = v m + e m and J = ·W·e m , where W is a diagonal matrix of weights. Hence, we have for each measurement, i.e. measurements uncertainty is represented as a quadratic possibility distribution.
Let us continue with our example using the measurements of Figure 1, but representing them with the quadratic formulation just introduced. The original possibility distribution of single measurements (dashed lines) and the possibility distributions computed with PMFA (solid lines) are depicted in the Figure 5. Notice that results are similar to those obtained in the previous example (Figure 1), where the standard linear programming framework was used (even if additional auxiliar variables ϵ2, μ2, etc. were not used). However, the qualitative similarity between the results makes the author think that, in most cases, the linear programming setup is expressive enough and much more efficient than quadratic or other more complex optimization cases.
Example 5: Comparison with other methods
This example compares PMFA with traditional MFA and some of its extensions. We continue using the network depicted in Figure 1, and perform the estimations with PMFA, Traditional MFA (TMFA), MFA as a constraint, least-squares problem (LS-MFA) and the flux spectrum (FS-MFA).
To show that PMFA is able to represent the measurements in a flexible way, we assume that errors in v2 and v3 are non-symmetric, and we add a band of uncertainty to account for systemic errors (Figure 6).
Inconveniently, errors have to be approximated with a normal distribution so that TMFA and LS-MFA can be used (see preliminaries). For the estimations with FS-MFA we represent the measurements with the interval of 95%, or 2σ (see [38]). All the results are depicted in Figure 6. Notice that TMFA assigns a negative value to an irreversible flux, v1, since it is not taking these constraints into account – this was predictable, but it must be highlighted because TMFA is still being widely used in the literature. The results also point out that the possibilistic distribution (and intervals) are much more informative than the point-wise estimations of TMFA and LS-MFA, or the intervals of FS-MFA. Basically, point-wise estimations fail when several flux values reasonably possible, whereas the flux spectrum interval tend to be conservative. Furthermore, remember that TMFA and LS-MFA cannot be used in scenarios lacking data, such as example 3, where PMFA was shown to be valuable.
To complete this perspective, the next section will discuss a comparison between PMFA and Monte Carlo approaches.
Example 6: Comparison with Monte Carlo
Continuing with our example, now measurements are represented (a) in possibilistic terms (linear case) and (b) with a "similar" probabilistic formulation assuming that errors are normally distributed. Both representations are depicted in Figure 7 (dashed lines). Then, we performed two flux estimations using (a) PMFA and (b) Monte Carlo simulations (1.7 millions of combinations of values of measured fluxes were generated, taken into account their normal distribution). The conditional possibility distributions and the histograms resulting from PMFA and Monte Carlo, respectively, are depicted in Figure 7. Even if probability and possibility are not truly equivalent, a reasonable similarity between the results from both approaches exists.
Notice also that this is a simple case where Monte Carlo can be applied. Nonetheless, its worst performance is clear: the cost of computing the possibility distributions is polynomial in the number of fluxes (as shown above), whereas the cost of a Monte Carlo approach grows exponentially with the number of independent decision variables.
Larger-scale Example: C. glutamicum
In this section we apply Possibilistic MFA (PMFA) to a medium-size example. For illustrative purposes, we have chosen a very well-know metabolic model of Corynebacterium glutamicum.
Metabolic network model
The metabolic network of C. glutamicum has been taken from [47] and is a slight variation of the one originally constructed in [48, 49]. The reactions considered in describing the biochemistry of the primary metabolism of C. glutamicum necessary to support lysine and biomass synthesis from glucose are given in the additional file 1. A reaction of ATP dissipation is included in the network, so that the ATP balance could be maintained, without actually constraining the flux space. On the contrary, the co-factors NADP, NAD and FAD are supposed to be balanced. The reaction for biomass formation is an approximation using as reactants those amino acids that explicitly appear in the network and the precursors of the other amino acids synthesized by C. glutamicum.
PMFA setting
The stoichiometric relationships, embedded in a 36 × 40 stoichiometric matrix, and the irreversibility of certain reactions, embedded in a 40 × 40 diagonal matrix, define our model-based constraints () according to (17). Both matrices are given in the additional file 1.
Experimental measurements
Experimental data of a batch fermentation of C. glutamicum cultured on minimal glucose medium has been taken from [48]. There, the growth rate and the fluxes (production/consumption rates) of the external metabolites – lactate, acetate, glucose, O2, CO2, NH3, lysine and trehalose – were experimentally measured. Since the accumulation of lactate and acetate was negligible, their flux is always zero in this case study. The measured fluxes v GLC (1), vO 2(34), vNH 3(35), v LY (37), v Thre (38) and vCO 2(39) and the growth rate v Bio (36), and also their standard deviations, are given in Figure 8.
PMFA setting
Using the data in Figure 8, we have built a possibilistic representation of single measurements defining convenient auxiliar variables and weights. The criterion to choose the weights was: full possibility for v m ∈ ± σ/2 and possibility 0.5 for those in ± σ. The values in ± 2·σ have possibility 0.1 (σ denotes standard deviation. If errors are assumed to be normally distributed, these levels correspond to the probabilistic confidence intervals of 38%, 68% and 95%, respectively). The resultant decision variables and weights define our measurement-based constraints () according to (21). These possibilistic representations are depicted in Figure 8.
Possibilistic flux estimation of C. glutamicum
We used all the available measurements – v GLC (1), vO 2(34), vNH 3(35), v LY (37), v Thre (38), vCO 2(39) and v Bio (36) – to obtain the maximum possibility flux vector (results given in the additional file 1). The flux vector had a degree of possibility 0.38, which could be considered "low" if one considers that a significant degree of uncertainty was already being taken into account (table 1). We then obtained the marginal possibility distributions for each flux, which inspection indicated that the low possibility was almost completely caused by only one measured flux, vNH 3(35). This suggests that this measurement was inaccurate, or that its standard deviation was underestimated. Interestingly, this flux was indeed the most uncertain one in the original dataset (its standard deviation was a huge 44 mM/h for a nominal value of 64.8 nM/h).
As a results of this analysis – which is a rough example of the procedure mentioned in a previous section – we decided to remove the measurement and repeat our calculations. As expected, this time we obtained a maximum possibility flux vector with a similar shape, but higher possibility (0.88). The marginal possibility distributions are depicted in Figure 9A, and the maximum possibility flux estimation and the flux intervals are depicted in Figure 9B. Numerical data are given in the additional file 1, where they can be compared with those obtained if the measurement of vNH 3is used.
Possibilistic flux estimation lacking measurements
We performed a flux estimation using only data of three extracellular fluxes that can be measured with standard equipment: v GLC (1), vCO 2(39) and biomass v Bio (36). In this case, the obtained maximum possibility flux vector is fully possible. This flux vector and the flux intervals are depicted in Figure 10. Remarkably, even if only three fluxes were measured, there was a small range of flux vectors with an a posteriori possibility higher than 0.8. Numerical results are given in the additional file 1.
Possibilistic flux estimation with uncertain model
As explained above, we can "soften" the model-based constraints to relax the pseudo-steady state assumption. As example, we assumed a degree of uncertainty around all the mass balances introducing decision variables ϵ 1 and υ 1 and weights γ1 = τ1 = 2 (see Figure 8). Hence, flux vectors which imply small accumulations of some metabolites will be accepted, yet considered less possible.
It could be also stated that the metabolic network used herein, the one introduced by Vallino et al., relies on an unrealistic assumption: that co-factors NADP, NAD and FAD are balanced [50, 51]. To avoid this, we can remove these metabolites from our stoichiometric matrix or, as an alternative, use the expressivity of the possibilistic framework to allow a certain degree of unbalance for these metabolites. Just as example, herein we assumed that cofactors may be unbalanced with some limits (say, 30 mM/h for NADP/NADPH and 15 mM/h for FAD/FADH and NAD/NADH). This "knowledge" can be easily incorporated into our model defining the convenient auxiliar variables and weights (as explained above).
At this point, PMFA was performed in three scenarios: (a) the model-based constraints are not relaxed (reference case) (b) the pseudo-steady state assumption is relaxed and NADP/NADPH is allowed to be unbalanced, and (c) the pseudo-steady state assumption is relaxed and the three cofactors – NADP/NADPH, FAD/FADH and NAD/NADH – are allowed to be unbalanced. The marginal possibility distributions obtained in each case are compared in Figure 11, where it can be observe how the model uncertainty is translated into the flux estimations; consider this uncertainty results in less precise flux estimations, given the less reliable model equations.
Conclusion
In this paper we have discussed a possibilistic framework for the estimation of the metabolic fluxes shown by cells at given conditions given.
Considering ordinary constraint-satisfaction problems, metabolic fluxes fulfilling a set of model-based constraints and compatible some experimental measurements are "possible", otherwise "impossible". In this paper, this idea is refined to cope with uncertain knowledge – in the form of measurements errors or imperfect models – by introducing the notion of "degree of possibility", which enables grading the candidate flux values as more or less possible. Then, possibilistic MFA is able to query the flux vector of maximum possibility. Moreover, when multiple flux vectors might be reasonably possible, the marginal and conditional possibility distributions for each flux can be computed.
Possibilistic MFA overcomes several limitations of traditional MFA and some of its extensions. It considers measurements uncertainty in a flexible way (e.g. non-symmetric error or a band of uncertainty due to systemic error) and also model imprecision, and it is reliable even if only a few fluxes are measurable (a common scenario). Possibilistic MFA also computes possibility distributions (and intervals) which are more informative than point-wise estimations when multiple flux values might be reasonably possible. These distributions are also better than the intervals provided by the flux spectrum, or other methods giving upper and lower bounds for the fluxes. In addition, Possibilistic MFA has the ability to detect, and handle, inconsistencies between measurements and model. Finally, it must be remarked that Possibilistic MFA estimations have been cast as linear optimisation problems, for which widely-known and efficient tools exist (a MATLAB script solving example 1 is given in the additional file 2 to illustrate this point). This great computational performance makes the methodology capable of dealing with large-scale or even genome-scale metabolic networks.
It must be noticed that there is a challenge when estimating the fluxes in large-scale networks because there may be diffierent flux vectors compatible with the few available measurements [52]. Interestingly, the proposed methodology is still of use in this situation: possibilistic MFA will detect all these flux vectors that are equally possible (or even similarly possible) and depict them by means of possibilistic distributions or intervals (e.g. example 3). Unfortunately, if there is a wide range of candidates, the estimation may be sometimes little informative (but at least we can be sure that it is reliable, because all the flux vectors compatible with model and measurements are captured). One strategy to face this difficulty consists of using a rational hypothesis to promote certain flux vector among those that are equally possible. For instance, it can be assumed that cell behaviour has evolved to be optimal in some sense, so that the fluxes are optimally regulated depending on the given environmental conditions, and then invoke this principle to choose particular flux vectors [3, 7, 9]. There might be still alternate optima, but the approach will reduce the range of possible flux vectors. Notice that this optimality principle, or any other hypothesis, might be incorporated into the possibilistic framework as far as they are encoded in in the form of a cost index (but sometimes it will not be the case or, in other cases, its optimization will not be computationally simple). This point requires further work.
Further extension may also address the adaptation of the ideas introduced herein to metabolic flux analysis with data from labelling experiments (13C-MFA) [12–14]. Extracellular dynamics could be also taken into account incorporating measurements in different time instants [38]. Finally, we are currently developing a software that implements the Possibilistic MFA methods and its future extensions, which will be freely available for academia.
In summary, this papers introduces a unifying framework for flux estimation and (possibilistic) evaluation of consistency that is flexible, usable in scenarios lacking data, highly informative, and computationally efficient. In our opinion, the combination of computational efficiency and flexibility of the assumptions is a distinctive advantage with respect to other approaches which either may rely on stronger assumptions (chi-squared distributions, interval-only descriptions, absence of irreversibility), or be only data-based (so they do not incorporate, say, stoichiometric model balances), or provide only point-wise estimates (of flux or consistency), or be computationally intensive (multi-variate integration in a general Bayesian estimation problem).
References
Palsson B: The challenges of in silico biology. Nature biotechnology. 2000, 18 (11): 1147-50. 10.1038/81125
Kitano H: Computational systems biology. Nature. 2002, 420 (6912): 206-10. 10.1038/nature01254
Palsson B: Systems Biology: Properties of Reconstructed Networks. 2006, Cambridge University Press New York, NY, USA
Llaneras F, Picó J: Stoichiometric modelling of cell metabolism. J Biosci Bioeng. 2008, 105: 1-11. 10.1263/jbb.105.1
Price ND, Papin JA, Schilling CH, Palsson BO: Genome-scale microbial in silico models: the constraints-based approach. Trends in Biotechnology. 2003, 21 (4): 162-9. 10.1016/S0167-7799(03)00030-1
Kauffman KJ, Prakash P, Edwards JS: Advances in flux balance analysis. Current Opinion in Biotechnology. 2003, 14 (5): 491-6. 10.1016/j.copbio.2003.08.001
Schuetz R, Kuepfer L, Sauer U: Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol. 2007, 3: 119- 10.1038/msb4100162
Edwards JS, Ibarra RU, Palsson BO: In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nature biotechnology. 2001, 19 (2): 125-30. 10.1038/84379
Schilling CH, Covert MW, Famili I, Church GM, Edwards JS, Palsson BO: Genome-scale metabolic model of Helicobacter pylori 26695. J Bacteriol. 2002, 184 (16): 4582-93. 10.1128/JB.184.16.4582-4593.2002
Schuster S, Pfeiffer T, Fell DA: Is maximization of molar yield in metabolic networks favoured by evolution?. J Theor Biol. 2008, 252 (3): 497-504. 10.1016/j.jtbi.2007.12.008
G Stephanopoulos AA, Aristidou A: Metabolic Engineering: Principles and Methodologies. 1998, Academic Press, San Diego, USA
Sauer U: Metabolic networks in motion: 13C-based flux analysis. Mol Syst Biol. 2006, 2: 62- 10.1038/msb4100109
Szyperski T: 13C-NMR, MS and metabolic flux balancing in biotechnology research. Q Rev Biophys. 1998, 31: 41-106. 10.1017/S0033583598003412
Wiechert W: 13C metabolic flux analysis. Metabolic Engineering. 2001, 3 (3): 195-206. 10.1006/mben.2001.0187
Nookaew I, Meechai A, Thammarongtham C, Laoteng K, Ruanglek V, Cheevadhanarak S, Nielsen J, Bhumiratana S: Identification of flux regulation coefficients from elementary flux modes: A systems biology tool for analysis of metabolic networks. Biotechnol Bioeng. 2007, 97 (6): 1535-49. 10.1002/bit.21339
Poolman MG, Venkatesh KV, Pidcock MK, Fell DA: A method for the determination of flux in elementary modes, and its application to Lactobacillus rhamnosus. Biotechnol Bioeng. 2004, 88 (5): 601-12. 10.1002/bit.20273
Schwartz JM, Kanehisa M: Quantitative elementary mode analysis of metabolic pathways: the example of yeast glycolysis. BMC Bioinformatics. 2006, 7: 186- 10.1186/1471-2105-7-186
Dubois D, Prade H: Fuzzy relation equations and causal reasoning. Fuzzy Sets and Systems. 1995, 45 (2): 119-134. 10.1016/0165-0114(95)00105-T.
Sala A, Albertos P: Fuzzy systems evaluation: The inference error approach. Systems, Man and Cybernetics, Part B, IEEE Transactions on. 1998, 28 (2): 268-275. 10.1109/3477.662768.
Sala A, Albertos P: Inference error minimisation: fuzzy modelling of ambiguous functions. Fuzzy Sets and Systems. 2001, 121: 95-111. 10.1016/S0165-0114(99)00174-8.
Sala A: Encoding Fuzzy Possibilistic Diagnostics As A Constrained Optimisation Problem. Information Sciences. 2008, 178: 4246-4263. 10.1016/j.ins.2008.07.017.
Kumar V, et al.: Algorithms for constraint-satisfaction problems: A survey. AI magazine. 1992, 13: 32-44.
Russell S, Norvig P: Artificial Intelligence: a modern approach. 2003, Prentice-Hall, 2
Jensen F: Introduction to Bayesian networks. 1996, Springer-Verlag New York, Inc. Secaucus, NJ, USA
Hand D: Statistical reasoning with imprecise probabilities. Applied Statistics. 1993, 42: 237-238. 10.2307/2347427.
Yager R: An introduction to applications of possibility theory. Human Systems Management. 1983, 3: 246-269.
Dubois D, Prade H: Possibility theory – an approach to computerized processing of uncertainty. New York, USA. 1988
Zadeh L: Possibility theory and soft data analysis. Mathematical Frontiers of Social and Policy Sciences. Edited by: Cobb L, Thrall R. 1981, 69-129. Boulder: Westview Press
Dubois D, Fargier H, Prade H: Possibility theory in constraint satisfaction problems: handling priority, preference and uncertainty. Applied Inteligence. 1996, 6 (4): 287-309. 10.1007/BF00132735.
Benferhat S, Dubois D, Prade H: Syntactic Combination of Uncertain Information: A Possibilistic Approach. Lecture notes in computer science. 1997, 30-42. full_text.
Dubois D, Prade H: Interval-valued fuzzy sets, possibility theory and imprecise probability. Proceedings of International Conference in Fuzzy Logic and Technology. 2005
Klir G, Parviz B: Probability-Possibility Transformations: a Comparison. International Journal of General Systems. 1992, 21 (3): 291-310. 10.1080/03081079208945083.
Dubois D, Prade H: Possibility Theory, Probability Theory and Multiple-Valued Logics: A Clarification. Annals of Mathematics and Artificial Intelligence. 2001, 32: 35-66. 10.1023/A:1016740830286.
Heijden T, Luyben K: Linear Constraint Relations in Biochemical Reaction Systems: I. Classification of the Calculability... Biotechnol Bioeng. 1994, 43: 1-10. 10.1002/bit.260430102
Heijden T, Luyben M: Linear Constraint Relations in Biochemical Reaction Systems: II. Diagnosis and Estimation of Gross errors. Biotechnol Bioeng. 1994, 43: 11-20. 10.1002/bit.260430104
Gambhir A, Korke R, Lee J, Fu PC, Europa A, Hu WS: Analysis of cellular metabolism of hybridoma cells at distinct physiological states. J Biosci Bioeng. 2003, 95 (4): 317-27.
Klamt S, Schuster S, Gilles ED: Calculability analysis in underdetermined metabolic networks illustrated by a model of the central metabolism in purple nonsulfur bacteria. Biotechnol Bioeng. 2002, 77 (7): 734-51. 10.1002/bit.10153
Llaneras F, Picó J: A procedure for the estimation over time of metabolic fluxes in scenarios where measurements are uncertain and/or insufficient. BMC Bioinformatics. 2007, 8: 421- 10.1186/1471-2105-8-421
Llaneras F, Picó J: An interval approach for dealing with flux distributions and elementary modes activity patterns. J Theor Biol. 2007, 246 (2): 290-308. 10.1016/j.jtbi.2006.12.029
Wiechert W, Möllney M, Petersen S, de Graaf AA: A universal framework for 13C metabolic flux analysis. Metabolic Engineering. 2001, 3 (3): 265-83. 10.1006/mben.2001.0188
Kadirkamanathan V, Yang J, Billings SA, Wright PC: Markov Chain Monte Carlo Algorithm based metabolic flux distribution analysis on Corynebacterium glutamicum. Bioinformatics. 2006, 22 (21): 2681-2687. 10.1093/bioinformatics/btl445
Schmidt K, Nørregaard LC, Pedersen B, Meissner A, Duus JO, Nielsen JO, Villadsen J: Quantification of intracellular metabolic fluxes from fractional enrichment and 13C-13C coupling constraints on the isotopomer distribution in labeled biomass components. Metabolic Engineering. 1999, 1 (2): 166-79. 10.1006/mben.1999.0114
Henry CS, Broadbelt LJ, Hatzimanikatis V: Thermodynamics-based metabolic flux analysis. Biophys J. 2007, 92 (5): 1792-805. 10.1529/biophysj.106.093138
Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BØ: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121- 10.1038/msb4100155
Mo ML, Palsson BØ, Herrgård MJ: Connecting extracellular metabolomic measurements to intracellular flux states in yeast. BMC Syst Biol. 2009, 3: 37- 10.1186/1752-0509-3-37
Gardeñes E, Sainz M, Jorba L, Calm R, Estela R, Mielgo H, Trepat A: Modal Intervals. Reliable Computing. 2001, 7 (2): 77-111. 10.1023/A:1011465930178.
Gayen K, Venkatesh KV: Analysis of optimal phenotypic space using elementary modes as applied to Corynebacterium glutamicum. BMC Bioinformatics. 2006, 7: 445- 10.1186/1471-2105-7-445
Vallino J: Identification of branch-point restrictions in microbial metabolism through metabolic flux analysis and local network perturbations. PhD thesis. 1991, Massachusetts Institute of Technology, Dept. of Chemical Engineering
Vallino JJ, Stephanopoulos G: Metabolic flux distributions in Corynebacterium glutamicum during growth and lysine overproduction. Reprinted from Biotechnology and Bioengineering, Vol. 41, Pp 633–646 (1993). Biotechnol Bioeng. 2000, 67 (6): 872-85. 10.1002/(SICI)1097-0290(20000320)67:6<872::AID-BIT21>3.0.CO;2-X
Yang TH, Wittmann C, Heinzle E: Respirometric 13C flux analysis-Part II: in vivo flux estimation of lysine-producing Corynebacterium glutamicum. Metabolic Engineering. 2006, 8 (5): 432-46. 10.1016/j.ymben.2006.03.001
Marx A, de Graaf AA, Wiechert W, Eggeling L, Sahm H: Determination of the fluxes in the central metabolism of Corynebacterium glutamicum by nuclear magnetic resonance spectroscopy combined with metabolite balancing. Biotechnol Bioeng. 1996, 49 (2): 111-29. 10.1002/(SICI)1097-0290(19960120)49:2<111::AID-BIT1>3.0.CO;2-T
Bonarius H, Schmid G, Tramper J: Flux analysis of underdetermined metabolic networks: the quest for the missing constraints. Trends in Biotechnology. 1997
Acknowledgements
This research has been partially supported by the Spanish Government (1st and 3rd authors are grateful to grant DPI2008-06880-C03-01, 2nd author is grateful to grants DPI2008-06731-C02-01 and PROMETEO/2008/088). FLL is recipient of a fellowship from the Spanish Ministry of Science and Innovation (FPU AP2005-1442).
Author information
Authors and Affiliations
Corresponding author
Additional information
Authors' contributions
FLL, AS and JP designed the study, analyze the results and conceptualized the manuscript. FLL performed the estimations of the case studies. All authors read and approved the final manuscript.
Electronic supplementary material
12918_2009_347_MOESM1_ESM.xls
Additional file 1: Realistic example. Additional material related with the experimental case study with C. glutamicum. This includes a list of reactions and metabolites, the stoichiometric matrix and measurements data. Numerical results of the four examples described in the article are also included. (XLS 612 KB)
12918_2009_347_MOESM2_ESM.m
Additional file 2: Code for solving a simple example. Simple MATLAB script solving example 1; it computes the maximum possibility flux vector and the marginal possibility distribution for v6. Format: Matlab Script (m). Further details are given in http://science.ensilicio.com/PMFA/PMFA.html (M 1021 bytes)
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Llaneras, F., Sala, A. & Picó, J. A possibilistic framework for constraint-based metabolic flux analysis. BMC Syst Biol 3, 79 (2009). https://doi.org/10.1186/1752-0509-3-79
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/1752-0509-3-79