OptORF: Optimal metabolic and regulatory perturbations for metabolic engineering of microbial strains
© Kim and Reed; licensee BioMed Central Ltd. 2010
Received: 11 December 2009
Accepted: 28 April 2010
Published: 28 April 2010
Computational modeling and analysis of metabolic networks has been successful in metabolic engineering of microbial strains for valuable biochemical production. Limitations of currently available computational methods for metabolic engineering are that they are often based on reaction deletions rather than gene deletions and do not consider the regulatory networks that control metabolism. Due to the presence of multi-functional enzymes and isozymes, computational designs based on reaction deletions can sometimes result in strategies that are genetically complicated or infeasible. Additionally, strains might not be able to grow initially due to regulatory restrictions. To overcome these limitations, we have developed a new approach (OptORF) for identifying metabolic engineering strategies based on gene deletion and overexpression.
Here we propose an effective method to systematically integrate transcriptional regulatory networks and metabolic networks. This allows for the formulation of linear optimization problems that search for metabolic and/or regulatory perturbations that couple biomass and biochemical production, thus proposing adaptive evolutionary strain designs. Using genome-scale models of Escherichia coli, we have implemented the OptORF algorithm (which considers gene deletions and transcriptional regulation) and compared its metabolic engineering strategies for ethanol production to those found using OptKnock (which considers reaction deletions). Our results found that the reaction-based strategies often require more gene deletions to remove the identified reactions (2 more genes than reactions), and result in lethal growth phenotypes when transcriptional regulation is considered (162 out of 200 cases). Finally, we present metabolic engineering strategies for producing ethanol and higher alcohols (e.g. isobutanol) in E. coli using our OptORF approach. We have found common genetic modifications such as deletion of pgi and overexpression of edd, as well as chemical specific strategies for producing different alcohols.
By taking regulatory effects into account, OptORF can propose changes such as the overexpression of metabolic genes or deletion of transcriptional factors, in addition to the deletion of metabolic genes, that may lead to faster evolutionary trajectories. While biofuel production in E. coli is evaluated here, the developed OptORF approach is general and can be applied to optimize the production of different compounds in other biological systems.
Metabolic engineering has emerged as an important field aimed to improve cellular production of valuable biochemicals and biofuels. Conventional approaches in metabolic engineering for identifying targets for manipulation focus on metabolic branch points, where undesired reactions are eliminated from competing branches to enhance flux through desired reactions using genetic modifications. However, these metabolic network modifications will not only affect fluxes through local metabolic pathways, but also have system-level effects on metabolic behavior due to changes in carbon, energy, and electron flow. Correspondingly, such conventional approaches may fail to identify modifications in distant pathways that can potentially improve cellular production.
Computational models of metabolism have been successful in predicting the consequences of gene deletions at a systems level [1–4]. In Escherichia coli, genome-scale models of metabolic networks have been used to identify metabolic engineering strategies such as gene deletions or additions to maximize production of primary or secondary metabolites [5–7]. Some computational methods, such as OptKnock, identify knockout strains that would have improved biochemical production capabilities after undergoing adaptive evolution. Knockout mutants that force the coupling between biomass and biochemical production allow one to use growth rate as a selective pressure and find adaptively evolved strains with improved growth rates and production capabilities. Such methods have been used to generate lactate and succinate producing strains [9, 10]. A number of variations on OptKnock have appeared recently which use alternative search algorithms, add non-native pathways, and consider deviations from wildtype flux levels [5, 11, 12].
Computational strain design methods evaluate the effects of gene or reaction deletions to search for the mutants with improved production capabilities. A gene deletion is simulated by removing the reactions associated with the target gene from the metabolic network; however, most current methods are based on reaction deletions rather than gene deletions. Genes and reactions do not always have a one-to-one relationship due to the presence of multi-functional enzymes, enzyme subunits, orphan reactions, and isozymes. Thus, knockout mutants based on reaction deletions can sometimes be genetically impossible or difficult to construct. Also, existing methods do not take into consideration the transcriptional regulatory networks that control metabolism. As a result, predicted strains with high production capabilities may not be able to grow initially or evolve to the desired final state due to regulatory restrictions.
In this study, we present a new optimization approach, OptORF, to identify metabolic engineering strategies based on a minimal number of metabolic and transcription factor gene deletions and metabolic gene overexpression, which couple biomass and biochemical production. Here, gene to protein to reaction (GPR) associations are modeled directly using a Boolean approach and reactions are removed when the associated genes are deleted. Interactions between the regulatory and metabolic networks are also modeled using Boolean approaches by turning on or off metabolic gene expression in response to transcriptional factor (TF) status. These Boolean relationships can be effectively formulated as linear constraints using binary variables and matrices, which are more systematic and/or computationally efficient than previously suggested formulations for modeling GPR associations and integrating metabolic and regulatory models [13–17].
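To illustrate how Boolean relationships can be written as linear constraints on binary variables, the sketch below uses the standard textbook linearization of AND and OR. It is an assumption for illustration only: the function names are hypothetical, and the matrix-based formulation used in OptORF itself may differ. The code checks, by exhaustive enumeration, that the linear inequalities admit exactly the Boolean value.

```python
from itertools import product

def and_feasible(y, xs):
    """Linear constraints encoding y = AND(xs) for binary variables:
    y <= x_i for every i, and y >= sum(x) - (n - 1)."""
    n = len(xs)
    return all(y <= x for x in xs) and y >= sum(xs) - (n - 1)

def or_feasible(y, xs):
    """Linear constraints encoding y = OR(xs) for binary variables:
    y >= x_i for every i, and y <= sum(x)."""
    return all(y >= x for x in xs) and y <= sum(xs)

def feasible_outputs(check, xs):
    """Return the binary values of y that satisfy the linear constraints."""
    return [y for y in (0, 1) if check(y, xs)]

def verify(n_inputs=3):
    """Confirm that each encoding admits exactly the Boolean value
    for every combination of binary inputs."""
    for xs in product((0, 1), repeat=n_inputs):
        assert feasible_outputs(and_feasible, xs) == [int(all(xs))]
        assert feasible_outputs(or_feasible, xs) == [int(any(xs))]
    return True
```

In a GPR setting, an enzyme complex (all subunits required) would correspond to the AND encoding and a set of isozymes to the OR encoding; a NOT is simply y = 1 - x.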
The integrated model of metabolism and regulation can predict the steady-state metabolic flux distributions and regulatory states simultaneously. Consequently, the OptORF framework allows for the identification of optimal metabolic gene knockouts as well as transcription factor knockouts. In addition, overexpression of genes that are unexpressed under a given condition can be found in order to improve the production of a target biochemical. Using genome-scale metabolic and regulatory models of E. coli [18, 19], we have identified metabolic engineering strategies for ethanol production using OptKnock (which considers reaction deletions) and compared these strategies to those found using our new approach OptORF (which considers gene deletions) with and without transcriptional regulatory constraints. Our analysis showed that the strategies based on reaction deletions often require a larger number of gene deletions, and also many of them result in lethal growth phenotypes when transcriptional regulation is considered. In addition, we have identified metabolic engineering strategies for overproduction of higher alcohols such as isobutanol via non-fermentative pathways based on a recent study. While ethanol and higher alcohol production in E. coli is evaluated here, the OptORF approach can be easily applied to other biochemicals and microorganisms.
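[OptORF formulation not recovered from the source. The surviving annotations of its constraint groups are: steady-state mass balance; gene deletions and overexpressions; limited number of gene deletions; limited number of gene overexpressions.]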
Since the cellular objective is maximizing biomass production (B) from substrate (S), pathways involving R5 (producing P2) would be normally preferred to ones involving R2 (producing P1). Given an engineering objective of producing P1, close inspection of the reaction network indicates that removal of reactions R3 and R4 or reaction R5 would couple maximum biomass production to production of P1 instead of P2. OptORF will identify genetic modification strategies involving gene deletions that are associated with these reactions (G3 and G4, or G5 and G6, respectively). However, G1A expression is inhibited by TF1, and TF1 is active in the presence of S, and thus reaction R1 cannot happen. Therefore, OptORF will also identify the overexpression of G1A along with the gene deletions mentioned above. An alternate strategy to the overexpression of G1A would be the deletion of TF1 which inhibits expression of G1A. In fact, when TF1 is deleted, the genes activated by TF1 (G3 and G5) would be no longer expressed, which reduces the number of genes that need to be deleted. Therefore, OptORF will first identify double knockout strategies including the TF1 deletion, and then find the alternate strategies with the G1A overexpression (these strategies are shown in Figure 1, see Additional file 1 for the implementation).
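The regulatory logic of this example can be traced with a short Boolean sketch. It is not the implementation from Additional file 1. Several details are assumptions made for illustration because the figure is not reproduced here: R1 is taken to depend only on G1A, G4 and G6 are taken to be unregulated, G5 and G6 are treated as isozymes for R5, and the function names are hypothetical.

```python
def expressed_genes(substrate_present, deleted=frozenset(), overexpressed=frozenset()):
    """Boolean expression state of each gene in the toy network."""
    # TF1 is active in the presence of substrate S unless it has been deleted.
    tf1_active = substrate_present and "TF1" not in deleted
    regulated_on = {
        "G1A": not tf1_active,  # inhibited by TF1
        "G3": tf1_active,       # activated by TF1
        "G4": True,             # assumed unregulated
        "G5": tf1_active,       # activated by TF1
        "G6": True,             # assumed unregulated
    }
    # Overexpression overrides regulation; deletion overrides everything.
    return {
        gene: (on or gene in overexpressed) and gene not in deleted
        for gene, on in regulated_on.items()
    }

def active_reactions(genes):
    """Assumed GPR associations mapping expressed genes to available reactions."""
    return {
        "R1": genes["G1A"],
        "R3": genes["G3"],
        "R4": genes["G4"],
        "R5": genes["G5"] or genes["G6"],  # isozymes: either gene suffices
    }

def design(deleted=(), overexpressed=()):
    """Reactions available when growing on S for a given genetic design."""
    genes = expressed_genes(True, frozenset(deleted), frozenset(overexpressed))
    return active_reactions(genes)
```

Under these assumptions, `design(deleted=["TF1", "G4"])` enables R1 and removes both R3 and R4 with only two deletions, whereas the overexpression route needs `design(deleted=["G3", "G4"], overexpressed=["G1A"])` to reach the same set of available reactions.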
Reactions without known GPR associations are not constrained by these GPR rules.
Gene deletion and overexpression
In this study, we used biomass formation as the objective function of the inner problem (p_j = 1 for j = biomass formation). The constraints including 'if' indicators are implemented directly using the GAMS/CPLEX indicator constraint facility. We also constrained the dual variables for reaction removal (h_j) to be within a small range (-1 to 1) in order to reduce the solution time (J. Kim, J.L. Reed, and C.T. Maravelias, in preparation).
The objective function in the outer problem of the OptORF formulation is a linear combination of fluxes with penalty terms for the total number of gene deletions or overexpressions. The first term defines biochemical production of interest, the second term applies a weighted penalty (α) to an additional gene deletion, and the third applies a penalty (β) to an additional overexpression. In other words, the biochemical production rate should increase at least by α or β if an additional gene is deleted or overexpressed, respectively. These penalty terms can be very useful for eliminating strains needing more genetic modifications if the improvement in production is small. When α or β is a very small value (≈ 10⁻⁶), it effectively eliminates unnecessary modifications from the solution without affecting the optimal biochemical production. For example, if deleting gene A results in the same product yield as deleting gene A and B (i.e. deletion of gene B does not improve the yield), then the gene B deletion would not appear in the optimal solution.
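The effect of the penalty terms can be sketched by scoring candidate designs directly. This is only an illustration of the penalized objective described above, not the bi-level optimization itself: the candidate designs and their production values are made up, and the function and field names are hypothetical.

```python
ALPHA = 1e-6  # penalty per gene deletion (small value, as described in the text)
BETA = 1e-6   # penalty per gene overexpression

def penalized_score(production, n_deletions, n_overexpressions,
                    alpha=ALPHA, beta=BETA):
    """Outer objective: production flux minus penalties on modifications."""
    return production - alpha * n_deletions - beta * n_overexpressions

def best_design(designs, alpha=ALPHA, beta=BETA):
    """Pick the design with the highest penalized score."""
    return max(
        designs,
        key=lambda d: penalized_score(
            d["production"], len(d["deletions"]), len(d["overexpressions"]),
            alpha, beta,
        ),
    )

# Hypothetical candidates: deleting B adds nothing on top of deleting A.
candidates = [
    {"deletions": ["A"], "overexpressions": [], "production": 10.0},
    {"deletions": ["A", "B"], "overexpressions": [], "production": 10.0},
    {"deletions": ["A", "C"], "overexpressions": [], "production": 12.0},
]
```

With a small α, `best_design(candidates)` still returns the ΔAΔC design because its production gain far exceeds the penalty, while the redundant ΔAΔB design always scores below ΔA alone. Raising α above the production gain (for example `alpha=5.0`) makes the single deletion preferable.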
Models and simulation conditions
In this study, we have implemented an integrated model of metabolism and regulation for E. coli, iMC1010v2, which consists of 906 metabolic genes and 104 TFs. There was one transcription factor, GlnL, that was included in the original model but was missing regulatory targets. GlnL should affect GlnG activity, but instead GlnG activity is independent of GlnL (the correct rule for GlnG should be (GlnL AND Not (nh4(e)>2))). However, this missing regulatory interaction would not affect the results of this study as GlnG is not active under these conditions and was not identified as a strategy for improving production of the alcohols examined here. In the OptKnock simulations, we excluded transport reactions for acetate, carbon dioxide, formate, phosphate, and water from consideration as eliminating transport may be challenging. In addition, ATP synthase deletion was excluded from consideration since the deletion resulted in a high variability in ethanol production at the predicted optimal growth condition. Equivalently, deletions of focA, focB, and the atp operon were excluded from the OptORF simulations. The OptORF approach was applied to identify metabolic engineering strategies for overproduction of ethanol or higher alcohols (i.e. c_j = 1 for j = desired alcohol secretion) in glucose minimal media. Maximum glucose uptake rate (GUR) and oxygen uptake rate (OUR) are specified in order to simulate anaerobic growth conditions (GUR = 18.5 mmol/gDW/hr, OUR = 0 mmol/gDW/hr). A minimal growth rate was set to 0.1 hr⁻¹ for all simulations. The optimization problems were solved using CPLEX 11.2 accessed via the General Algebraic Modeling System (GAMS).
Results and Discussion
We identified metabolic engineering strategies for ethanol production in E. coli using the OptORF formulation with an integrated model of metabolism and regulation, and compared the resulting strategies to ones using a previous approach based on reaction deletions (OptKnock). First, a set of reaction deletion strategies was obtained using OptKnock, and then a corresponding set of gene deletions needed to remove the reactions in each OptKnock strategy was identified. These OptKnock gene deletion designs were then compared to the gene deletion strategies found by OptORF without considering transcriptional regulation to examine the differences between the reaction-based strategies and gene-based strategies. To investigate how transcriptional regulation affects adaptive evolution of microbial strains, we analyzed available data for adaptively evolved E. coli mutant strains using the integrated metabolic and regulatory model. OptKnock strategies were then re-analyzed using an integrated metabolic and regulatory model and compared to the OptORF strategies identified when transcriptional regulatory constraints were considered. Finally, we present metabolic engineering strategies for overproducing ethanol or higher alcohols in E. coli that include both metabolic gene deletions and overexpressions, as well as transcription factor deletions, using our developed approach.
Reaction deletion vs. gene deletion
The number of genetic manipulations needed is an important factor when evaluating metabolic engineering strategies. When isozymes are present, a strategy with the minimum number of reaction deletions does not necessarily correspond to a strategy with the minimum number of gene deletions. For example, there are four gene products in E. coli known to function as serine deaminases. In order to completely remove this particular metabolic reaction from the system, one would have to knock out all four genes. If removal of an alternative reaction would serve the same purpose, but require fewer gene deletions, then OptORF would identify the simpler genetic strategy while reaction-based frameworks would not be able to distinguish between them.
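The gap between reaction deletions and gene deletions can be made concrete with a small sketch that counts the fewest genes that must be removed to disable one reaction. It assumes the GPR rule is given in disjunctive form (a list of alternative enzyme complexes, each a set of required genes). The gene names are placeholders rather than real E. coli loci, and the brute-force search is only meant for small rules.

```python
from itertools import combinations

def min_gene_deletions(gpr):
    """Smallest set of genes whose deletion disables every enzyme option.

    gpr: list of sets; each set holds the genes required by one isozyme
    or complex, and the reaction is active if any one set is fully present.
    Returns None for a reaction with no associated genes (orphan reaction),
    which cannot be removed by gene deletion.
    """
    if not gpr:
        return None
    genes = sorted(set().union(*gpr))
    for k in range(1, len(genes) + 1):
        for subset in combinations(genes, k):
            chosen = set(subset)
            # Every alternative must lose at least one of its genes.
            if all(chosen & option for option in gpr):
                return chosen
    return None

# Four isozymes (placeholder names): all four genes must be deleted.
four_isozymes = [{"iso1"}, {"iso2"}, {"iso3"}, {"iso4"}]

# One two-subunit complex: deleting either subunit is enough.
one_complex = [{"subA", "subB"}]
```

Here `min_gene_deletions(four_isozymes)` needs four genes for a single reaction deletion, while `min_gene_deletions(one_complex)` needs only one, which is the kind of difference a reaction-based design cannot see.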
From a computational point of view, gene deletions can be more advantageous than reaction deletions due to the nature of combinatorial optimization. The difficulty of solving such an optimization problem increases exponentially with the total number of decision variables, i.e., reactions or genes to choose from. Generally, the total number of reactions is larger than the total number of genes in available genome-scale models. For example, the most recent metabolic reconstruction of E. coli K-12 MG1655 includes 2,381 reactions, but only 1,260 ORFs are accounted for. Although OptORF requires a number of binary variables for genes, proteins, and regulatory rules, these are very tightly constrained by the GPR association and transcriptional regulatory constraints. As a result, the computation time to solve an OptORF problem is comparable to the time to solve an OptKnock problem.
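A rough sense of the search-space difference comes from counting candidate deletion sets for the reaction and ORF totals quoted above. This is a naive enumeration count used only for illustration; it is an assumption that it reflects solver effort, since mixed-integer solvers do not enumerate designs exhaustively. The function name is hypothetical.

```python
from math import comb

N_REACTIONS = 2381  # reactions in the reconstruction cited in the text
N_GENES = 1260      # ORFs accounted for in the same reconstruction

def n_designs(n_targets, max_deletions):
    """Number of distinct deletion sets with 1..max_deletions targets."""
    return sum(comb(n_targets, k) for k in range(1, max_deletions + 1))

def search_space_ratio(max_deletions):
    """How many times larger the reaction-based design space is."""
    return n_designs(N_REACTIONS, max_deletions) / n_designs(N_GENES, max_deletions)
```

Calling `search_space_ratio(k)` for increasing `k` shows the ratio growing with the number of allowed deletions, since the count scales roughly with the k-th power of the number of candidate targets.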
Adaptive evolution and transcriptional regulation
Transcriptional regulation plays a significant role in controlling the expression of metabolic genes thereby affecting flux through metabolic reactions. These regulatory effects have not been directly considered in previous strain design approaches. Transcription factors not only affect metabolic flux distributions by controlling gene expression, but they also sense and respond to metabolic or environmental changes. Integrating transcriptional regulatory networks with metabolic networks requires the connection between genes and reactions. We have effectively formulated these transcriptional regulatory and gene to protein to reaction (GPR) logical relationships, which enables us to predict the effects of transcription factor deletions as well as metabolic gene deletions on transcription regulation and metabolism, simultaneously.
Metabolic engineering strategies described in this work are based on the assumption that microbial cells would evolve to have higher growth rates, and that biochemical production would increase along with cellular growth rate, the latter being the selective pressure during adaptive evolutionary experiments. An important question that one might ask is how malleable the transcriptional regulatory network is during adaptive evolution. If cells can easily rewire their transcriptional networks to gain higher fitness, it is possible that knockout strains could lose the coupling of biochemical production and growth, if expressing unexpressed genes leads to a higher growth rate without a higher biochemical production. To address this issue, we have analyzed available data for adaptively evolved strains of E. coli, and compared the data to predictions using the integrated model of metabolism and regulation.
Metabolic gene deletion strains have also been evolved on different carbon sources. We have analyzed growth phenotypes for these strains using the metabolic model and integrated metabolic and regulatory models, and found that only the strains grown on malate showed a significant difference in predicted growth rates between the regulated and un-regulated models. Figure 4C shows the experimentally observed growth rates relative to the predictions for mutant strains grown on malate at the end of adaptive evolution (day 40). Mutant strains seem to evolve and increase their growth rates to the optimal values predicted by the integrated model, but do not reach the values predicted by the metabolic model alone. The only strain that did exceed the integrated metabolic and regulatory model predictions, Δzwf, also had large experimental standard deviations in the observed growth rates. Based on these results, it is possible that cells undergoing adaptive evolution do not significantly rewire their transcriptional regulatory networks, and therefore regulation should be considered in the design of production strains.
Another advantage of using an integrated model of metabolism and regulation emerges when it comes to predicting essential genes. An integrated model is better at predicting essential genes under a given condition, and is hence more likely to keep lethal gene deletions out of the proposed strategies. It was previously shown that an integrated model of E. coli correctly predicts the growth phenotypes for 10,833 (78.8%) of the total 13,750 cases (mutant grown in a single environmental condition), while a metabolic model alone predicts 8,968 (65.2%) cases correctly. An integrated model is also capable of predicting essential transcription factors (e.g. cysB and metR) as well as metabolic genes in E. coli [19, 24, 25]. Accordingly, strains that are designed with regulatory considerations should grow better initially and may achieve the desired phenotype faster.
Metabolic model vs. integrated model
For each of the three approaches, we generated a heat map based on the correlation coefficients between the genes that appear in at least 10% of their corresponding 200 strategies (Figure 6B-D). If a pair of gene deletions always appears in strategies together, the corresponding cell in the heat map is colored in yellow. A cell is colored in black when a pair of gene deletions are anti-correlated. For example, pta and eutD appear together since the deletion of both is required to eliminate the phosphate acetyltransferase activity, while either fnr or arcA appears since the deletion of either transcription factor results in a similar phenotype. The pattern of correlation becomes clearer (strategies have less variation) as the structure of the model gets simpler from a gene-based deletion with transcriptional regulation (Figure 6D) to reaction-based deletion without transcriptional regulation (Figure 6B). This indicates that as models account for the complex structure and interactions of networks, more diverse metabolic engineering strategies can be identified.
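The correlation underlying such a heat map can be sketched for binary occurrence data. The exact correlation measure used for Figure 6 is not stated here, so the Pearson coefficient on 0/1 occurrence vectors is an assumption, as are the function names. The four example strategies are made up and only mimic the pta/eutD and fnr/arcA patterns described above.

```python
from math import sqrt

def occurrence(strategies, gene):
    """0/1 vector marking which strategies contain the gene deletion."""
    return [1 if gene in s else 0 for s in strategies]

def frequent_genes(strategies, min_fraction=0.1):
    """Genes appearing in at least min_fraction of the strategies."""
    genes = sorted(set().union(*strategies))
    cutoff = min_fraction * len(strategies)
    return [g for g in genes if sum(occurrence(strategies, g)) >= cutoff]

def correlation(a, b):
    """Pearson correlation of two equal-length vectors (None if undefined)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # gene present in all or none of the strategies
    return cov / sqrt(var_a * var_b)

# Hypothetical strategies, each a set of deleted genes.
example = [
    {"pta", "eutD", "fnr"},
    {"pta", "eutD", "arcA"},
    {"fnr", "pgi"},
    {"arcA", "pgi"},
]
```

In this toy set, pta and eutD always co-occur and are perfectly correlated, while fnr and arcA never co-occur and are perfectly anti-correlated, matching the yellow and black cells described in the text.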
Strain designs for ethanol production by OptORF
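[Table: Gene deletion strategies for ethanol production in E. coli. Columns include Growth Rate (hr⁻¹) and Ethanol Yield (%). Table body not recovered.]
[Table: Gene deletion and overexpression strategies for ethanol production in E. coli. Columns include Growth Rate (hr⁻¹) and Ethanol Yield (%). Table body not recovered.]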
Deletion of fnr or arcA is found in most strain designs, where some enzymes involved in aerobic metabolism (that are repressed by Fnr and/or ArcA) can be advantageous for ethanol production. Aerobic genes in central metabolism that are repressed by these anaerobic regulators include aceAB, aceEF, lpd, mdh, sucAB, and sdhABCD. The de-repression of malate dehydrogenase (mdh) was predicted to be especially important based on comparisons between flux distributions with and without mdh. If necessary, such repressed genes may be overexpressed, as an alternative to deleting fnr or arcA to ensure that metabolic activity is high enough to achieve the desired level of ethanol production.
Genes involved in the electron transfer chain were also identified as needing to be deleted to limit the amount of NADH oxidized by this pathway. NADH:ubiquinone oxidoreductase (NDH) I and II catalyze the transfer of electrons from NADH to the quinone pool, and the electrons are passed to fumarate by fumarate reductase (FRD), an essential enzyme for anaerobic growth. OptORF identified the deletion of NDH-1 (nuo), the predominant NDH under anaerobic conditions, to block electron transfer from NADH to fumarate. As a result, the model predicts a decrease in FRD flux and reduced succinate production in NDH-1 deficient strains, while flux through fumarase and malic enzyme is increased.
Deletion of pgi was also found in many of the strain designs for ethanol production, suggesting re-direction of flux through glycolysis to the pentose phosphate (PP) pathway or Entner-Doudoroff (ED) pathway. This increases generation of NADPH whose electrons are passed to NAD via NADH transhydrogenase, and the additional NADH is used to reduce acetyl-CoA to ethanol by alcohol dehydrogenase (AdhE). While increasing the amount of NADH available to produce ethanol, the pgi deletion also lowers the net ATP production and thus decreases growth rate as compared to the wild-type strain. The ED pathway consists of two enzymes, Edd and Eda, and the expression of edd is repressed by the transcription factor GntR. Deletion of gntR would de-repress the expression of edd, which allows for the conversion of glucose to pyruvate and glyceraldehyde-3-phosphate. Equivalently, overexpression of edd was identified as an alternative to deletion of gntR. The activation of the ED pathway in a pgi mutant also leads to a significant increase in growth rate, which would be favorable for industrial-scale ethanol production.
There are three enzymes, PflAB, PflCD and TdcE, which possibly function as pyruvate formate-lyase (PFL). The regulatory model indicates that expression of pflD requires either ArcA or Fnr as activators, and a previous study showed that PFL activity was still detected in pflA or pflB mutants. Another study revealed that a fnr deletion alone is sufficient to decrease PFL activity down to the level of ΔfnrΔarcA strain, while an arcA deletion alone did not decrease PFL activity. Thus, deletion of fnr, pflB, and tdcE would abolish PFL activity and require cells to use pyruvate dehydrogenase (PDH) [26, 30], whose expression is repressed by Fnr and ArcA in anaerobic conditions. Deletion of fnr would lower PFL activity and attenuate the repression of PDH, the result being the production of NADH instead of formate when pyruvate is converted to acetyl-CoA. In the absence of oxygen, some of the acetyl-CoA would be reduced to ethanol consuming two NADH molecules to maintain a redox balance.
Deletion of pta and eutD (both catalyze the conversion of acetyl-CoA to acetylphosphate) would reduce acetate production, and hence increase formation of other by-products such as ethanol, lactate, or succinate. However, multiple studies have shown that the mutations in the ack-pta pathway cause accumulation of pyruvate [31–33], and the integrated metabolic and regulatory model predicts the secretion of pyruvate (64% mol pyruvate/mol glucose) in a ΔptaΔeutD mutant. Pyruvate can be either secreted or reduced to form other fermentation by-products, but there is not enough NADH available to ferment all the pyruvate generated by glycolysis. In order to convert pyruvate to ethanol, arcA and gntR deletions are needed to derepress PDH and the ED pathway, along with a pgi or tpiA deletion to re-direct flux from glycolysis to the ED pathway. A ΔtpiA mutant alone could cause methylglyoxal accumulation and inhibit anaerobic growth, but re-directing flux to the ED pathway should prevent methylglyoxal accumulation.
In the strategies that include both gene deletion and gene overexpression, we found that overexpression of edd replaced the gntR deletion in most strains to activate the ED pathway. In addition, overexpression of fructose-1,6-bisphosphatase (fbp) was predicted to increase the amount of fructose-6-phosphate, and reverse the direction of the non-oxidative branch of the PP pathway in the strains utilizing the ED pathway. The reversed PP pathway results in a decreased flux in the TCA cycle and an increased flux in the ED pathway and PDH, leading to improved ethanol production. The model predicts that ptsH deletion (in addition to other modifications) increases flux through the lower half of glycolysis and decreases succinate production. Switching glucose transport from the phosphoenolpyruvate:sugar phosphotransferase system (PTS) to proton symport has been shown to improve overall performance and production yield for ethanol as well as other compounds.
Glutamate can be synthesized via multiple pathways depending on the availability of nitrogen sources. When ammonia is abundant, an ATP-independent pathway functions to save energy by converting α-ketoglutarate to glutamate using NADPH. This pathway is catalyzed by glutamate dehydrogenase (encoded by gdhA), the deletion of which would require cells to use the ATP-dependent pathway that normally operates when the concentration of ammonia is low. This ATP-dependent pathway would decrease growth rate, but increase the flux through the ED pathway and PDH, and improve the ethanol production.
Strain designs for higher alcohol production
In addition to ethanol, we have also identified metabolic engineering strategies using OptORF for over-production of higher alcohols such as isobutanol and 2-phenylethanol from glucose. Since E. coli does not naturally produce these higher alcohols, we have augmented the iMC1010v2 network with non-fermentative reactions and corresponding GPR associations for synthesis of these alcohols based on a recent study. In summary, 2-keto acid decarboxylase (KDC) and alcohol dehydrogenase (ADH) were added to the network to allow for production of 1-propanol, 1-butanol, isobutanol, 2-methyl-1-butanol, 3-methyl-1-butanol, and 2-phenylethanol from intermediates in isoleucine, leucine, and valine biosynthesis. We have assumed these enzymes, KDC and ADH, have no substrate specificity so that the production of any higher alcohol is equally preferred.
[Table: Gene deletion and overexpression strategies for isobutanol production in E. coli. Columns include Growth Rate (hr⁻¹) and Isobutanol Yield (%). Table body not recovered.]
In addition to isobutanol, we found that the production of 1-propanol and 2-phenylethanol can also be coupled to growth, but their yields were much lower than the isobutanol yield (~38% and ~5.7%, respectively, see Additional file 3). However, the production of other branched-alcohols such as 2-methyl-1-butanol or 3-methyl-1-butanol can be accompanied by the production of other alcohols including ethanol and isobutanol. In other words, cells could either produce 2-methyl-1-butanol (3-methyl-1-butanol) along with the other alcohols or produce only the other alcohols. In such cases, changes in the substrate specificity of KDC or ADH enzymes would be needed to generate specific alcohols. Interestingly, the identified metabolic engineering strategies for 2-phenylethanol production were very distinct from the strategies for other alcohol production strains (see Additional file 3). While strategies for producing other alcohols involved increasing fluxes in the oxidative branch of the PP pathway and the ED pathway, the strategies for 2-phenylethanol include deletion of genes in both the oxidative (zwf or gnd) and non-oxidative (talAB) branches of the PP pathway. The model predicts that these gene deletions would increase the fluxes in the aromatic amino acid biosynthesis pathways, which leads to the increased availability of phenylpyruvate, the precursor for 2-phenylethanol. Analysis of these higher alcohols illustrates how OptORF can be used to couple biomass and production of metabolites which are not part of central metabolism.
We have systematically integrated metabolic and regulatory models, and developed a new computational framework (OptORF) for designing microbial strains for metabolite production. We compared our new approach to OptKnock, and found four primary differences between the strains that are identified using the two approaches. First, OptKnock may propose removing reactions that do not have any genes associated with them, making the construction of such strains experimentally impossible. Second, OptORF can find metabolic engineering strategies requiring the smallest number of gene deletions while still achieving high production yields. Since OptKnock strategies are based on reaction deletions they often require more gene deletions than those found using OptORF. Third, OptKnock may suggest reaction deletions that result in a different solution space when the necessary genes are deleted or transcriptional regulatory effects are accounted for. In this case the adaptive evolutionary outcome would be different than what is predicted when only reaction deletions are considered, sometimes resulting in reduced production yields or lethal phenotypes. Lastly, OptORF can propose changes such as the overexpression of metabolic genes or deletion of transcriptional factors that may lead to faster evolutionary trajectories.
Based on our analysis of experimental data using integrated metabolic and regulatory model it is unclear to what extent, if any, cells re-wire their transcriptional regulatory network during adaptive evolution. Given that a finite number of mutations are found in adaptively evolved strains , it seems likely that cells could get stuck in a local maxima in the fitness landscape, where they would need to change the regulation of multiple gene products to improve fitness. This idea is supported by the fact that the same starting strain can evolve to different end points, and in some cases achieve only sub-optimal behaviors [9, 39, 40]. By taking regulatory effects into account when designing strains it may be possible to start with strains that are already expressing the necessary enzymes needed to achieve the desired production and growth rates. Some evolved strains may stay within the solution space defined by metabolic and regulatory constraints, while others may alter their regulatory networks if it results in a significant growth advantage, thus altering the solution space in which they evolve. Thus, it will be particularly important to conduct parallel evolutionary experiments to find evolved strains that lead to higher production without violating regulatory constraints.
In its current implementation, OptORF uses Boolean approximations to describe how transcriptional regulation affects metabolic fluxes. Although the use of Boolean variables does not exactly represent the dynamic nature of metabolism and regulation, it has been previously shown that constraint-based models using these approximations successfully predict cellular behavior in continuous and batch culture [1, 19, 21, 24]. The approach could be extended to include other types of regulatory models that can account for varying levels of gene expression or enzyme activity. A previous study has shown that the behavior of a transcriptional regulatory network can be well approximated by a system of linear equations near a steady state, where gene expression does not substantially change. The OptORF approach could be improved by applying these linear approximations in the regulatory part of the model, in order to describe varying gene expression levels, and by using approaches that constrain metabolic fluxes based on predicted gene expression levels [42–44].
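A minimal sketch of the Boolean idea follows, assuming a toy two-layer network in which the environment sets transcription factor states, transcription factor states set gene expression, and an unexpressed gene forces the flux bound of its reaction to zero. The names (`tfX`, `gP`, `gQ`, `R_P`, `R_Q`), the rules, and the `V_MAX` value are hypothetical and are not taken from the E. coli models used in this work.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical toy network; names and rules are illustrative only.
# Transcription factor activity is a Boolean function of the environment.
TF_RULES: Dict[str, Callable[[Dict[str, bool]], bool]] = {
    "tfX": lambda env: env["oxygen"],      # tfX is active when oxygen is present
}
# Gene expression is a Boolean function of transcription factor activity.
GENE_RULES: Dict[str, Callable[[Dict[str, bool]], bool]] = {
    "gP": lambda tf: not tf["tfX"],        # gP is repressed by tfX
    "gQ": lambda tf: True,                 # gQ is always expressed
}
RXN_GENE = {"R_P": "gP", "R_Q": "gQ"}      # one gene per reaction, for brevity
V_MAX = 10.0                               # arbitrary upper flux bound


def flux_upper_bounds(
    env: Dict[str, bool],
    deleted: FrozenSet[str] = frozenset(),
    overexpressed: FrozenSet[str] = frozenset(),
) -> Dict[str, float]:
    """Map environment and genetic perturbations to upper flux bounds."""
    # A deleted transcription factor is never active.
    tf = {name: name not in deleted and rule(env)
          for name, rule in TF_RULES.items()}
    bounds = {}
    for rxn, gene in RXN_GENE.items():
        # Overexpression forces expression regardless of regulation;
        # deletion of the gene itself always removes it.
        expressed = gene in overexpressed or GENE_RULES[gene](tf)
        if gene in deleted:
            expressed = False
        bounds[rxn] = V_MAX if expressed else 0.0
    return bounds
```

Under aerobic conditions in this sketch, `R_P` is blocked by regulation. Either deleting `tfX` or overexpressing `gP` re-opens it, which mirrors the point that a transcription factor deletion can serve as an alternative to overexpressing a metabolic gene.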
The OptORF approach is currently applied to produce metabolites that can be coupled to biomass production. A recent study has used a genetic algorithm to design strains with un-coupled metabolite and biomass production, where a bi-level problem is used and the inner problem uses an objective function to predict un-evolved cellular phenotypes. OptORF could also be extended to find metabolic engineering strategies that do not require coupling of cellular growth and product formation; such an extension would evaluate gene deletions, gene overexpression, and regulatory effects simultaneously to identify these strategies.
The novelty of the method developed here is that it accounts for transcriptional regulatory networks in addition to metabolism in the design of strains for metabolic engineering. However, if desired, the approach can be used with or without transcriptional regulatory constraints to consider the interdependence of reactions through their GPR associations. It should be noted that the integrated model of metabolism and regulation allows for simulating the effects of both gene overexpression (where un-expressed genes are expressed) and gene deletion. The OptORF approach can also suggest transcription factor deletion as an alternative to metabolic gene deletion or overexpression, which provides greater flexibility in metabolic engineering strategies. By further incorporating flux modulation approaches such as those proposed in OptReg, additional engineering strategies can be designed that consider adjustment of flux values and not just the complete removal or addition of reactions via gene deletion or gene overexpression.
The approach we have developed here is general and can be used to engineer production of a variety of products in different microorganisms for which constraint-based models exist. The number of microbial transcriptional regulatory network models continues to grow, enabled by high-throughput datasets and computational analysis [46–52]. Regulatory networks reconstructed from analysis of high-throughput datasets can be integrated with metabolic networks using Boolean or other types of regulatory modeling formalisms, and our approach can be applied to new integrated models of metabolism and regulation. As such, it will have impacts on the biological production of a wide variety of products, ranging from biofuels and other commodity chemicals to specialty chemicals [53–55].
This work was funded by the DOE Great Lakes Bioenergy Research Center (DOE BER Office of Science DE-FC02-07ER64494). The authors also wish to acknowledge Bob Landick, Tricia Kiley, Brian Pfleger, and Christos Maravelias for useful discussions and Chris Tervo for help editing the manuscript.
- Fong SS, Palsson BØ: Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat Genet. 2004, 36: 1056-1058. 10.1038/ng1432
- Segrè D, Vitkup D, Church GM: Analysis of optimality in natural and perturbed metabolic networks. Proc Natl Acad Sci USA. 2002, 99: 15112-15117. 10.1073/pnas.232349399
- Shlomi T, Berkman O, Ruppin E: Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc Natl Acad Sci USA. 2005, 102: 7695-7700. 10.1073/pnas.0406346102
- Schuetz R, Kuepfer L, Sauer U: Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol. 2007, 3: 119. 10.1038/msb4100162
- Pharkya P, Burgard AP, Maranas CD: OptStrain: A computational framework for redesign of microbial production systems. Genome Res. 2004, 14: 2367-2376. 10.1101/gr.2872004
- Park JH, Lee KH, Kim TY, Lee SY: Metabolic engineering of Escherichia coli for the production of l-valine based on transcriptome analysis and in silico gene knockout simulation. Proc Natl Acad Sci USA. 2007, 104: 7797-7802. 10.1073/pnas.0702609104
- Alper H, Miyaoku K, Stephanopoulos G: Construction of lycopene-overproducing E. coli strains by combining systematic and combinatorial gene knockout targets. Nat Biotechnol. 2005, 23: 612-616. 10.1038/nbt1083
- Burgard AP, Pharkya P, Maranas CD: Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol Bioeng. 2003, 84: 647-657. 10.1002/bit.10803
- Fong SS, Burgard AP, Herring CD, Knight EM, Blattner FR, Maranas CD, Palsson BØ: In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol Bioeng. 2005, 91: 643-648. 10.1002/bit.20542
- Burgard AP, Van Dien SJ: Methods and organisms for the growth-coupled production of succinate. Patent. 2007, WO/2007/030830.
- Patil K, Rocha I, Forster J, Nielsen J: Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics. 2005, 6: 308. 10.1186/1471-2105-6-308
- Pharkya P, Maranas CD: An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab Eng. 2006, 8: 1-13. 10.1016/j.ymben.2005.08.003
- Lun DS, Rockwell G, Guido NJ, Baym M, Kelner JA, Berger B, Galagan JE, Church GM: Large-scale identification of genetic design strategies using local search. Mol Syst Biol. 2009, 5: 296. 10.1038/msb.2009.57
- Covert MW, Palsson BØ: Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J Biol Chem. 2002, 277: 28058-28064. 10.1074/jbc.M201691200
- Shlomi T, Eisenberg Y, Sharan R, Ruppin E: A genome-scale computational study of the interplay between transcriptional regulation and metabolism. Mol Syst Biol. 2007, 3: 101. 10.1038/msb4100141
- Gianchandani EP, Joyce AR, Palsson BØ, Papin JA: Functional states of the genome-scale Escherichia coli transcriptional regulatory system. PLoS Comput Biol. 2009, 5: e1000403. 10.1371/journal.pcbi.1000403
- Suthers PF, Zomorrodi A, Maranas CD: Genome-scale gene/reaction essentiality and synthetic lethality analysis. Mol Syst Biol. 2009, 5: 301. 10.1038/msb.2009.56
- Reed JL, Vo TD, Schilling CH, Palsson BØ: An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 2003, 4: R54. 10.1186/gb-2003-4-9-r54
- Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BØ: Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004, 429: 92-96. 10.1038/nature02456
- Atsumi S, Hanai T, Liao JC: Non-fermentative pathways for synthesis of branched-chain higher alcohols as biofuels. Nature. 2008, 451: 86-89. 10.1038/nature06450
- Varma A, Palsson BØ: Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl Environ Microbiol. 1994, 60: 3724-3731.
- Zhao G, Winkler ME: An Escherichia coli K-12 tktA tktB mutant deficient in transketolase activity requires pyridoxine (vitamin B6) as well as the aromatic amino acids and vitamins for growth. J Bacteriol. 1994, 176: 6134-6138.
- Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BØ: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121. 10.1038/msb4100155
- Joyce AR, Reed JL, White A, Edwards R, Osterman A, Baba T, Mori H, Lesley SA, Palsson BØ, Agarwalla S: Experimental and computational assessment of conditionally essential genes in Escherichia coli. J Bacteriol. 2006, 188: 8259-8271. 10.1128/JB.00740-06
- Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H: Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006, 2: 2006.0008. 10.1038/msb4100050
- Kim Y, Ingram LO, Shanmugam KT: Construction of an Escherichia coli K-12 mutant for homoethanologenic fermentation of glucose or xylose without foreign genes. Appl Environ Microbiol. 2007, 73: 1766-1771. 10.1128/AEM.02456-06
- Hespell RB, Wyckoff H, Dien BS, Bothast RJ: Stabilization of pet operon plasmids and ethanol production in Escherichia coli strains lacking lactate dehydrogenase and pyruvate formate-lyase activities. Appl Environ Microbiol. 1996, 62: 4594-4597.
- Zhu J, Shimizu K: The effect of pfl gene knockout on the metabolism for optically pure D-lactate production by Escherichia coli. Appl Microbiol Biotechnol. 2004, 64: 367-375. 10.1007/s00253-003-1499-9
- Levanon SS, San K-Y, Bennett GN: Effect of oxygen on the Escherichia coli ArcA and FNR regulation systems and metabolic responses. Biotechnol Bioeng. 2005, 89: 556-564. 10.1002/bit.20381
- Kim Y, Ingram LO, Shanmugam KT: Dihydrolipoamide dehydrogenase mutation alters the NADH sensitivity of pyruvate dehydrogenase complex of Escherichia coli K-12. J Bacteriol. 2008, 190: 3851-3858. 10.1128/JB.00104-08
- Tomar A, Eiteman MA, Altman E: The effect of acetate pathway mutations on the production of pyruvate in Escherichia coli. Appl Microbiol Biotechnol. 2003, 62: 76-82. 10.1007/s00253-003-1234-6
- Causey TB, Shanmugam KT, Yomano LP, Ingram LO: Engineering Escherichia coli for efficient conversion of glucose to pyruvate. Proc Natl Acad Sci USA. 2004, 101: 2235-2240. 10.1073/pnas.0308171100
- Dittrich CR, Vadali RV, Bennett GN, San K-Y: Redistribution of metabolic fluxes in the central aerobic metabolic pathway of E. coli mutant strains with deletion of the ackA-pta and poxB pathways for the synthesis of isoamyl acetate. Biotechnol Progr. 2005, 21: 627-631. 10.1021/bp049730r
- Ferguson GP, Tötemeyer S, MacLean MJ, Booth IR: Methylglyoxal production in bacteria: suicide or survival?. Arch Microbiol. 1998, 170: 209-218. 10.1007/s002030050635
- Gosset G: Improvement of Escherichia coli production strains by modification of the phosphoenolpyruvate:sugar phosphotransferase system. Microb Cell Fact. 2005, 4: 14. 10.1186/1475-2859-4-14
- Helling RB: Why does Escherichia coli have two primary pathways for synthesis of glutamate?. J Bacteriol. 1994, 176: 4664-4668.
- Trinh CT, Unrean P, Srienc F: Minimal Escherichia coli Cell for the Most Efficient Production of Ethanol from Hexoses and Pentoses. Appl Environ Microbiol. 2008, 74: 3634-3643. 10.1128/AEM.02708-07
- Herring CD, Raghunathan A, Honisch C, Patel T, Applebee MK, Joyce AR, Albert TJ, Blattner FR, van den Boom D, Cantor CR, Palsson BO: Comparative genome sequencing of Escherichia coli allows observation of bacterial evolution on a laboratory timescale. Nat Genet. 2006, 38: 1406-1412. 10.1038/ng1906
- Fong SS, Marciniak JY, Palsson BØ: Description and interpretation of adaptive evolution of Escherichia coli K-12 MG1655 by using a genome-scale in silico metabolic model. J Bacteriol. 2003, 185: 6400-6408. 10.1128/JB.185.21.6400-6408.2003
- Fong SS, Joyce AR, Palsson BØ: Parallel adaptive evolution cultures of Escherichia coli lead to convergent growth phenotypes with different gene expression states. Genome Res. 2005, 15: 1365-1372. 10.1101/gr.3832305
- Gardner TS, di Bernardo D, Lorenz D, Collins JJ: Inferring genetic networks and identifying compound mode of action via expression profiling. Science. 2003, 301: 102-105. 10.1126/science.1081900
- Colijn C, Brandes A, Zucker J, Lun DS, Weiner B, Farhat MR, Cheng TY, Moody DB, Murray M, Galagan JE: Interpreting expression data with metabolic flux models: predicting Mycobacterium tuberculosis mycolic acid production. PLoS Comput Biol. 2009, 5: e1000489. 10.1371/journal.pcbi.1000489
- Shlomi T, Cabili MN, Herrgard MJ, Palsson BO, Ruppin E: Network-based prediction of human tissue-specific metabolism. Nat Biotechnol. 2008, 26: 1003-1010. 10.1038/nbt.1487
- Moxley JF, Jewett MC, Antoniewicz MR, Villas-Boas SG, Alper H, Wheeler RT, Tong L, Hinnebusch AG, Ideker T, Nielsen J, Stephanopoulos G: Linking high-resolution metabolic flux phenotypes and transcriptional regulation in yeast modulated by the global regulator Gcn4p. Proc Natl Acad Sci USA. 2009, 106: 6477-6482. 10.1073/pnas.0811091106
- Asadollahi MA, Maury J, Patil KR, Schalk M, Clark A, Nielsen J: Enhancing sesquiterpene production in Saccharomyces cerevisiae through in silico driven metabolic engineering. Metab Eng. 2009, 11: 328-334. 10.1016/j.ymben.2009.07.001
- Rodionov DA: Comparative genomic reconstruction of transcriptional regulatory networks in bacteria. Chem Rev. 2007, 107: 3467-3497. 10.1021/cr068309+
- Barrett CL, Palsson BO: Iterative reconstruction of transcriptional regulatory networks: an algorithmic approach. PLoS Comput Biol. 2006, 2: e52. 10.1371/journal.pcbi.0020052
- Rodriguez-Penagos C, Salgado H, Martinez-Flores I, Collado-Vides J: Automatic reconstruction of a bacterial regulatory network using Natural Language Processing. BMC Bioinformatics. 2007, 8: 293. 10.1186/1471-2105-8-293
- Baumbach J, Rahmann S, Tauch A: Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms. BMC Syst Biol. 2009, 3: 8. 10.1186/1752-0509-3-8
- Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS: Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 2007, 5: e8. 10.1371/journal.pbio.0050008
- Zhang S, Xu M, Li S, Su Z: Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes. Nucleic Acids Res. 2009, 37: e72. 10.1093/nar/gkp248
- Wang T, Stormo GD: Identifying the conserved network of cis-regulatory sites of a eukaryotic genome. Proc Natl Acad Sci USA. 2005, 102: 17400-17405. 10.1073/pnas.0505147102
- Khosla C, Keasling JD: Metabolic engineering for drug discovery and development. Nat Rev Drug Discov. 2003, 2: 1019-1025. 10.1038/nrd1256
- Feist AM, Palsson BØ: The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol. 2008, 26: 659-667. 10.1038/nbt1401
- Alper H, Stephanopoulos G: Engineering for biofuels: exploiting innate microbial capacity or importing biosynthetic potential?. Nat Rev Microbiol. 2009, 7: 715-723. 10.1038/nrmicro2186