### Ensemble of CCR models

The models in the ensemble can be grouped into four categories, characterized by different types of reaction mechanisms and control structures. As a thought experiment, consider the basic kinetic model structure of Eqs. (1)–(7), discussed in the “Methods” section, when there is a perfect symmetry between the availability, uptake, and metabolization of *S*_{1} and *S*_{2}, that is, *S*_{1}(0)=*S*_{2}(0), *r*_{s1}=*r*_{s2}, *r*_{d1}=*r*_{d2}, *r*_{e1}=*r*_{e2}, and *E*_{1}(0)=*E*_{2}(0). Obviously, in this case the time-courses of the variables will not exhibit diauxic growth. The same holds for the basic flux balance model structure of Eqs. (11)–(12).

The CCR models introduced below are all based on different single changes in the structure of the equations and/or the parameter values to break the symmetry between the pathways of the two substrates. For chosen parameter values, all model variants are capable of reproducing diauxic growth and growing on both substrates separately. The equations of the individual model variants, in particular the changes with respect to the basic model structures, are detailed in Additional file 1: Section A.2, together with a plot of the diauxic growth behaviour generated by the model.

Group 1 consists of so-called constraint-based models and are variants of the flux balance analysis model structure of Eqs. (11)–(12). A first way to break the symmetry of the uptake pathways in this model structure is to slighty modify the stoichiometric matrix (model **C1**). Motivated by the diauxic growth behaviour observed with glucose and acetate [16], we modified reaction *r*_{b} in Fig. 4. In contrast to the reaction scheme of Fig. 4, in model C1 metabolite *M* is a central metabolite for substrate *S*_{1} only. From *M*, macromolecules are produced, but at the same time *X*_{2} as a side-product. Metabolite *X*_{2} can be excreted or converted directly into macromolecules (but inefficiently, at a lower maximal rate than *X*_{1}). In the first growth phase, substrate *S*_{1} is consumed and, assuming the uptake reaction is reversible and the inefficient biomass reaction is inactive, substrate *S*_{2} is produced. In the second growth phase, only *S*_{2} is taken up and converted into biomass. The modified scheme is shown in Fig. 5.

A regulatory component in the original model set-up is also sufficient to break the symmetry. Regulatory interactions can be easily incorporated into the framework of FBA (**C2**) by means of a set of rules in Boolean logic (regulatory FBA or rFBA, [17]). These logic rules relate an environmental variable (e.g., the availability of a substrate) to the presence or absence of enzymes (and thus their corresponding reaction rates) in the model. The rules are executed in a sequential manner resulting in a modified stoichiometric matrix. In model C2, diauxic growth is obtained by a single rule inactivating enzyme *E*_{2}, and thus the uptake of *S*_{2}, in case substrate *S*_{1} is available (Fig. 5).

An interesting extension of FBA takes into account the limited space available for transporters in the membrane, and for enzymes in the cytoplasm more generally, an effect called molecular crowding [18, 19]. Imposing a membrane space constraint results in an additional linear inequality in the constraint-based model (**C3** in Fig. 5). Assuming that *E*_{1} and *E*_{2} occupy different amounts of membrane space is sufficient for breaking the symmetry between the two substrates and for diauxic growth behaviour to occur.

A number of recent studies emphasized the characterization of enzyme production in terms of costs (ATP and substrate usage) and benefits (generating energy and precursors for biomass) [20, 21]. These ideas are considered in a further model variant based on FBA: for both pathways, from the uptake of *S*_{1} and *S*_{2} to central metabolism, the trade-off between the cost of making the enzymes and the benefits gained by their use in cellular metabolism leads to an additional inequality in the model (**C4** in Fig. 5). The inequality expresses that the costs of enzyme synthesis should not exceed the benefits by a specified maximum amount.

Group 2 consists of mathematical models that include kinetic terms for enzymatic reactions and enzyme synthesis. The dynamics of the underlying biochemical reaction systems have a crucial property: for state variables there is usually a balance between reactions producing and consuming a component. In addition, what is overlooked in many cases, a dilution term appears in ODEs for concentration variables describing the dilution of enzymes and metabolites due to growth of the population [22]. The dilution term arises from the choice of biomass as the reference volume of the intracellular state variables. As a consequence, the concentration of enzymes and metabolites decays if the rate of growth dilution exceeds the rate of synthesis.

Dilution effects may give rise to diauxic growth. In order to see this, consider kinetic models extended with an additional regulatory interaction, the activation of enzyme synthesis by the metabolites *X*_{1} and *X*_{2} (Fig. 6). The symmetry between the two carbon sources can be broken by changing the values for (i) the maximum rate of enzyme synthesis (**N1**), (ii) the initial enzyme concentration (**N2**), (iii) the affinity of the inducer for its transcription factor and thus the effect on enzyme synthesis (**N3**), or (iv) the maximum rate of substrate uptake (**N4**). For the chosen parameter values the enzymes of the preferred substrate will accumulate, whereas those of the non-preferred substrate will dilute out.

Models in group 3 extend the kinetic expressions used so far by including additional regulatory mechanisms. The most prominent regulatory effects described in the literature on *E. coli* are inducer exclusion and activation by a global transcription factor. Inducer exclusion means that a transport enzyme (e.g., for lactose uptake) or an enzyme involved in metabolism of the substrate (e.g., glycerol metabolism) is subject to control [1]. In the case of lactose and glycerol, a component of the glucose transport system (PEP-dependent phosphotransferase system, PTS), the protein EIIA, acts as an inhibitor of the enzymes. Metabolic regulation of the activity level of the enzymes is very fast in comparison to gene expression regulation. In the case of the glucose-lactose diauxie, a second mechanism based on transcription regulation has been described to control the enzyme concentration: cAMP, a small molecule that is synthesized when the glycolytic flux is low, acts as an activator of the global transcription factor Crp. Crp is involved in a number of cellular processes and most prominently in the control of nearly all carbohydrate uptake systems.

Within our simplified scheme, the two mechanisms are represented as follows (Fig. 7). In the case of inducer exclusion (**R1**), metabolite *X*_{1} acts as a regulatory metabolite and inhibits enzyme *E*_{2} responsible for uptake of substrate *S*_{2}. Activation by a transcription factor is a two-step process: a lower flux through central metabolic pathways will lead to lower concentrations of the metabolites involved, in particular the central metabolite *M*. The lower concentration of *M* leads to the synthesis of a second messenger like cAMP and subsequent activation of the transcription factor. Since the second messenger and the transcription factor are not explicitly represented in the model, metabolite *M* is assumed to directly act as an inhibitor of the synthesis of *E*_{2} (**R2**). Choosing numerical values for strong inhibition by inducer inclusion alone, or for strong activation by a global transcription factor alone, is sufficient to obtain diauxic growth behaviour. Although in both model variants induction of the transport system by *X*_{1} and *X*_{2} is assumed as well, like in the models N1-N4, this addition is not necessary for diauxic growth.

The central metabolite *M* offers further possibilities for control schemes. As described for *E. coli*, the metabolite fructose 1,6-bisphosphate is directly involved in the activation and/or repression of transcription factor Cra (FruR) [1]. Cra is involved in the control of gluconeogenesis. This regulatory mechanism is adapted to our modelling framework in two different ways. First, *M* can act as an activator of the consumption of metabolite *X*_{2} (**R3**). As a consequence, for a higher concentration of *M*, the concentration of *X*_{2} will be lower, which decreases its ability to induce the uptake system of *S*_{2}. Alternatively, *M* can act as an inhibitor of the consumption of metabolite *X*_{1} (**R4**). In this way, for a higher concentration of *M*, the concentration of *X*_{1} will be higher as well, which increases its ability to induce the uptake system of the preferred substrate *S*_{1}. For both schemes, induction of the transport systems by *X*_{1} and *X*_{2} is necessary.

An additional control scheme can be derived from the following line of reasoning (which we could not relate to any known regulatory mechanism in bacterial cells). Assume again that metabolite *M* controls the synthesis of the two enzymes, but in a different manner: high concentrations of *M* lead to synthesis of *E*_{1}, whereas low concentrations lead to synthesis of *E*_{2}. If the total incoming flux to *M* and the concentration of this metabolite are positively correlated, only the first substrate will be consumed because only the first enzyme is synthesized. However, if more complex kinetic expressions for the reaction producing *M* are used, such as product activation, multiple steady states may appear (**R5**). This is illustrated by the bifurcation diagram in Fig. 8: the kinetic rate law leads to two stable steady states for *M* depending on the incoming flux. Assuming that the starting condition is such that only enzyme *E*_{1} is synthesized, the first substrate is taken up (blue circle). As the concentration of *S*_{1} decreases, the incoming flux decreases as well and the system moves along the upper stable branch to the left until the bifurcation point is reached, where it drops to the lower stable branch. This leads to an increase in the amount of enzyme *E*_{2} and the consequent uptake of *S*_{2}. However, the flux to *M* is lower in the second growth phase (green circle). Note, that for bringing the system back to the first steady state, a disturbance directly increasing *M* is necessary (it is not sufficient to feed substrate *S*_{1} again).

Group 4 comprises models that distribute cellular resources over the network components. Resource allocation can be integrated into the models in two different ways, by the solution of an optimization problem or by the definition of specific regulatory interactions influencing the kinetics of the reactions. Models in both subgroups can be seen as variants of the basic kinetic model structure, but models in the first subgroup integrate elements of the flux balance model structure as well, in particular the determination of the value of some of the variables by maximizing an objective function.

For the first subgroup of resource allocation models an objective function is defined, corresponding to either the maximization of the incoming fluxes or the maximization of biomass over a time-interval running from 0 to *t*_{end}. In both cases, the resource to be allocated consists of the intracellular compartment *B*^{′} (Fig. 9). In the first model variant (**A1**), a fraction of *B*^{′} is available for the two transport systems; that is, the (weighted) sum of the concentrations of both enzymes is limited by an upper bound smaller than *B*^{′}. Like for the other model variants in the ensemble, the symmetry can be broken (and substrate *S*_{1} taken up preferentially) by a suitable choice of the weights and other parameters.

The optimization problem for model A1 is static, like in dynamic FBA, in the sense that it is formulated at a specific time-point, for specific concentrations of substrates, biomass, *etc.* The result of the optimization problem (the enzyme concentrations) is fed into the ODEs governing the dynamics of the other variables. The second model variant (**A2**) involves a fully dynamic optimization problem. In this model the rates of enzyme synthesis *r*_{ei}, rather than the concentrations *E*_{i}, depend on *B*^{′}, and the drain of resources towards enzyme synthesis is explicitly accounted for in the ODE for *B*^{′}, by including a reaction with rate *u*_{max} *B*^{′}, where *u*_{max} is the fraction of the biomass component utilized for protein synthesis. The resource is dynamically distributed over the enzymes so as to maximize the biomass produced at *t*_{end}. In addition, we require the size of the internal pool of *M* to be smaller than a certain maximum value. Setting a limit for *M* has a strong influence on the system dynamics in that, if the parameter values for the two uptake pathways are different, it breaks the symmetry between the two pathways and causes one substrate to be preferred.

Cybernetic modelling was introduced some decades ago to describe the behaviour of a population exhibiting diauxic growth [23, 24]. The different variants of cybernetic modelling have notably been capable of accounting for a variety of scenarios for simultaneous or preferential uptake of carbon sources in *E. coli* [23, 25]. Cybernetic models provide a coarse-grained description of microbial kinetics and allocate resources to the synthesis or the activity of specific enzymes in proportion to their return, that is, the growth rate or the quantity of substrate metabolized by the enzyme. They provide two handles for resource allocation: control of enzyme activity and control of enzyme synthesis [23, 24]. Our adaptation of cybernetic modelling, shown in Fig. 9, is restricted to the latter option and accordingly modifies the rate expressions for enzyme synthesis (**A3**). In order to break the symmetry in this case, the maximum uptake rate for the two enzymes was set to different values.

Finally, we consider model variants that distribute energy for transport over the two transport systems. This can be realized in two ways, as schematically shown in Fig. 10. In the first variant, metabolite *M* is needed directly for transport and its concentration is included in the transport kinetics. Metabolite *M* here represents an energy carrier like ATP. The ODE for *M* is updated accordingly to take into account this additional mass flow, and the stoichiometric coefficient for the production of *M* from *X*_{i} is changed to 2 (**A4**). The second variant mimics the observation that group translocation processes transfer phosphoryl groups from one protein to another, as for the above-mentioned PTS (**A5**). In particular, metabolite *M* can be seen as a proxy for PEP, that is, the energy source for glucose transport in *E. coli* and other bacteria, whereas the PTS is represented by the protein *E*_{1}. The enzymes occur in a free (unphosphorylated) and a phosphorylated form to mimick energy consumption by the transport reactions.

A long time ago already it was suggested that cellular systems tend to extend their lifespan for as long as possible ([26], analysed also in [27]). In case a number of substrates are available, this principle would automatically result in the sequential uptake of the substrates, because it would maximize the lifespan *t*_{end} of the microorganism. In the model variant **A6** we therefore define an additional lifespan variable and maximize this variable by means of the so-called free time optimal control approach [27]. In order to guarantee that the cells remain alive, we add a minimal value for *B*^{′} as a constraint for the optimization process. Again, the resources for synthesizing the enzymes are provided by compartment *B*^{′} (see Fig. 10 for the reaction scheme). Interestingly, no further structural modifications or changes in kinetic parameter were necessary to obtain diauxic growth behaviour for this model variant.

### Diauxic growth index

Every model variant in the ensemble presented above is capable of generating diauxic growth behaviour for chosen parameter values. The different (rescaled) models share parameters such as yield coefficients for substrate uptake (Additional file 1: Section A.1). However, some of the parameters are unique for a model variant and determine its specificity (on average, there are 2 to 3 kinetic parameters to choose for each model). One could ask whether this model-specific set of parameters can be tuned in such a way that - possibly - a nearly “perfect” diauxic growth behaviour is obtained. In order to answer this question, we turn the intuitive notion of “perfect” diauxic growth into a quantitative measure, by defining the diauxic growth index, *d*. The index varies between 0 (no diauxic growth, that is, parallel uptake of both substrates) and 1 (perfect sequential uptake). More precisely, *d* is defined as the (absolute) difference of both uptake rates multiplied by the biomass, integrated over the time-interval [0,*t*_{end}]:

$$ d \,\, = \,\, \frac{1}{2} \,\,\, {\int \limits_{t=0}^{t_{end}}} \,\, |r_{1}(t) \,\, - \,\, r_{2}(t) | \,\, b\,\, dt \, {.} $$

(13)

To motivate the definition of *d*, consider the initial value *s*_{0}=1 unit for a single substrate. With a yield coefficient of 1, the integral for this substrate is \({\int } r \,\,b \, dt \,= \, s_{0} \, = \, 1\) (*r* is the uptake rate) when the substrate has been completely consumed by biomass *b*. For two substrates that are consumed one after the other, the value of *d* obtained from Eq. (13) must therefore be 1 and in the case of parallel consumption of the two substrates *d*=0.

For each model variant in the ensemble, we fine-tuned the model-specific parameters with a stochastic search algorithm (while keeping the same value for the shared parameters), so as to maximize the diauxic growth index. The number of model-specific parameters for each model is low since most parameters in the scaled model are given as ratios of two original parameters characterizing the pathways for the two substrates (for example, maximal uptake rates or yield coefficients). We set the scaled parameters to 1, so that the pathways for the two substrates have the same kinetic properties. This ensures that if diauxic growth occurs, it arises from the additional structural assumptions in the model variants that break the symmetry between the pathways. In model R1 (inducer exclusion), for example, only the inhibition constant *K*_{I} remains to be chosen. Figure 11 shows time course data for model variants R1 (nearly perfect diauxic growth) and A5 (partial co-consumption).

The sorted indices thus obtained are also shown in Fig. 11. As expected, most kinetic models with regulation have a high index score, but it can be seen in Fig. 11 that models belonging to other categories may also show nearly perfect diauxic growth, such as models N2 and A2.

### Comparison with experimental data

As already shown in [2] (Fig. 3 therein), the predictions from the models in the ensemble can be directly compared with experimental data. We took data from a standard diauxic growth experiment with glucose and lactose as substrates for *E. coli* (Fig. 1) and adjusted the parameters so as to obtain a good fit. Figure 12 shows a selection of fitted models against experimental data (see also Additional file 1: Section A.4).

Since a large number of model variants show a satisfactory fit, additional experiments are needed for model discrimination. These experiments should stimulate the system in a different way than the reference batch experiment of Fig. 1. We chose two dynamic experiments from a large collection of experimental data [14]: first, a dynamic pulse experiment where the preferred substrate (glucose) is pulsed during exponential growth on a non-preferred second substrate (lactose); and second, an experiment in which an enzyme necessary for the assimilation of the non-preferred substrate (LacZ) has a high initial concentration at the beginning of the experiment (Fig. 13).

In order to discriminate between the model variants, we first simulated the reaction systems with the parameters used in Fig. 11 in the conditions of the new experiments. For the pulse experiment, the timing of the pulse was adapted for each model according to the growth rate; that is, for a model with a slower growth rate, the pulse was applied later. The models N2 and A2 were omitted from the analysis, since N2 imposes an initial condition conflicting with the conditions of the experiment with a high initial enzyme concentration, and the dynamic programming tool used for A2 [13] does not allow the occurrence of externally-determined pulses.

Second, we clustered the predicted time-courses by means of a hierarchical clustering algorithm with a Euclidian distance measure and a predetermined number of clusters. For each cluster, a representative time-course was computed by taking the mean value of the model predictions at every time-point. The resulting clusters were then qualitatively compared with the experimental data. In particular, the following qualitative characteristics were retained (Fig. 13). For the experiment with the high initial LacZ concentration, lactose is not taken up during growth on glucose and LacZ is diluted out. However, the expression of *lacZ* resumes after glucose depletion. In the glucose pulse experiment, lactose uptake slows down (slightly) after the pulse and LacZ synthesis stops, but both resume shortly afterwards. Data were rescaled for the timing of the pulse to allow a fair comparison between the models.

All simulated time-courses for the individual models together with the mean time-courses of the clusters, as well as the dendrograms returned by the clustering procedure, are shown in Additional file 1: Section A.3. Since more than one measured state variable is available, the clustering procedure can be repeated for each measured state variable and the intersections between clusters give an indication of which model variants perform best. We illustrate the outcome of this (repeated) clustering procedure for the two experiments reported in Fig. 13. Inspection of the trend of the mean time-courses in a cluster allows one to determine if the models in the cluster are able to qualitatively reproduce the above-mentioned characteristics of the experiments. For each experiment, this results in a list of models whose predictions match the time-course of the substrate, the time-course of the enzyme, or both (Fig. 14).

For the experiment with the high initial enzyme concentration, most models fail to reproduce the characteristic observation that lactose is not taken up although the necessary enzyme is highly expressed. The only model succeeding in accounting for both the substrate and enzyme time-courses is R1, the inducer exclusion model (Fig. 14). For the pulse experiment, the results are somewhat ambiguous. A number of models reproduce the substrate time-courses well. What poses a problem, however, is that lactose uptake slows down after the glucose pulse for a short time only, simultaneously with an equally-short decrease of the LacZ concentration. Since the LacZ concentration and lactose uptake both resume after this short period, a number of models are - in principle - able to account for the experiment. As indicated in Fig. 14, models with inducer exclusion (R1) and with global transcription factor activation (R2) are in the intersection, but also models that take into account control by central metabolite M (R4) or a different rate constant for gene expression (N1) are consistent with this experiment.