Investigating the metabolic capabilities of Mycobacterium tuberculosis H37Rv using the in silico strain iNJ 661 and proposing alternative drug targets
BMC Systems Biology volume 1, Article number: 26 (2007)
Mycobacterium tuberculosis continues to be a major pathogen in the third world, killing almost 2 million people a year by the most recent estimates. Even in industrialized countries, the emergence of multi-drug resistant (MDR) strains of tuberculosis hails the need to develop additional medications for treatment. Many of the drugs used for treatment of tuberculosis target metabolic enzymes. Genome-scale models can be used for analysis, discovery, and as hypothesis generating tools, which will hopefully assist the rational drug development process. These models need to be able to assimilate data from large datasets and analyze them.
We completed a bottom up reconstruction of the metabolic network of Mycobacterium tuberculosis H37Rv. This functional in silico bacterium, iNJ 661, contains 661 genes and 939 reactions and can produce many of the complex compounds characteristic to tuberculosis, such as mycolic acids and mycocerosates. We grew this bacterium in silico on various media, analyzed the model in the context of multiple high-throughput data sets, and finally we analyzed the network in an 'unbiased' manner by calculating the Hard Coupled Reaction (HCR) sets, groups of reactions that are forced to operate in unison due to mass conservation and connectivity constraints.
Although we observed growth rates comparable to experimental observations (doubling times ranging from about 12 to 24 hours) in different media, comparisons of gene essentiality with experimental data were less encouraging (generally about 55%). The reasons for the often conflicting results were multi-fold, including gene expression variability under different conditions and lack of complete biological knowledge. Some of the inconsistencies between in vitro and in silico or in vivo and in silico results highlight specific loci that are worth further experimental investigations. Finally, by considering the HCR sets in the context of known drug targets for tuberculosis treatment we proposed new alternative, but equivalent drug targets.
Tuberculosis continues to be a devastating pathogen throughout the world, particularly in developing nations. In 2001, the World Health Organization (WHO) estimated 8.5 million new cases of tuberculosis (based on 3.8 million new reported cases) and an estimated 1.8 million deaths from tuberculosis in 2000 . Within the United States, the number of reported cases of tuberculosis has been decreasing with the exception of a period when the trend reversed in 1986 and peaked in 1992 [2, 3]. This reversal has been attributed principally to HIV/AIDs, immigration from countries with high prevalence of tuberculosis, poverty, homelessness, and multi-drug resistant (MDR) tuberculosis [1, 2]. MDR tuberculosis is generally defined as strains that are resistant to treatment with isoniazid and rifampin , two of the key first line antituberculosis drugs . MDR strains of tuberculosis emerged in the early 1990s and have now been found all over the world .
Many of the unique properties of tuberculosis are attributable to its metabolism, particularly the complex fatty acids characteristics of the organism. These mycolic acids, phenolic glycolipids, and mycoceric acids confer many of the properties such as its acid-fastness and are believed to contribute to the resilience of the organism. Mycobacterium tuberculosis can survive in a wide range of environments (many different tissues) and fairly extreme pHs . One of the most confounding factors with these bacteria is their ability to survive for long periods of time in a dormant stage. The slow doubling time of tuberculosis has further limited the amount of experimental data that can be generated. Many of the first and second line drugs used to treat tuberculosis have metabolic targets, so developing systems level models of metabolism are anticipated to be of great use in the future.
DNA sequencing of the ~4.4 Mbp genome of Mycobacterium tuberculosis H37Rv (M. tb) in 1998  enabled the ability the pursuit of genome-scale analyses of this microorganism. The remarkable relevance to world health and disease control and the need to understand the metabolic function of the organism all evoke the need of a genome-scale metabolic model. Long-term anticipated goals and applications of such models are to understand the growth of mycobacteria under different conditions, identifying strategies to improve growth in vitro (for experimental and diagnostic purposes), and identifying new drug targets for treatment.
In order to gain understanding about the unique characteristics of this important pathogen, we manually reconstructed the metabolic network of M. tb in silico (iNJ 661), from which we developed a model to compute perform computational analyses and interpret experimental data. These bottom-up reconstructions have been described in the past as, biochemically, genetically and genomically structured (BiGG) 'databases'. We employ constraint-based reconstruction and analysis (COBRA) of this BiGG reconstruction to learn about its normal metabolic function and to infer new potential targets for drugs.
Results and discussion
The reconstruction process has been described previously [7, 8], Figure 1 summarizes this process in brief. The network statistics for iNJ 661, which has 661 genes and 939 intra-system reactions, are summarized in Table 1. A biomass objective function was defined using available measurements of M. tb H37Rv and other mycobacteria strains if information was lacking. The biomass objective function was defined using the literature for chemical composition studies of M. tb [9–13]. When such information was not found for the specific strain, Mycobacterium bovis was used (for example to approximate the biomass composition of nucleotides and peptidoglycans ). The simple fatty acids compositions were based on studies of the mycobacterial cell wall . The biomass function includes 90 metabolites in addition growth associated maintenance ATP costs (see Table 2). [See Additional file 1] for more detailed information.
M. tb growth in silico
Flux Balance Analysis (FBA) was used to grow iNJ 661 in silico maximizing the biomass function defined in Table 2 (see Materials and Methods). iNJ 661 was grown in silico on three different types of media, Middlebrook 7H9 (supplemented with glucose and glycerol) , Youmans , and the chemically defined rich culture media (CAMR) . Uptake fluxes which define the different media conditions are shown in Table 3 and the computed results under the different conditions are summarized in Table 4. The doubling times are within the range described in the literature [18, 19], and as expected, the growth rates are higher with the richer media. The minimum doubling time of 14.7/hr described by  are within the capabilities of maximum growth of iNJ 661.
As noted by Primm et al , M. tb growth on Middlebrook 7H9 media requires supplementation with glucose and glycerol. Glycerol is a component of the biomass function (Table 2) according to the chemical composition measurements performed by  on Mycobacterium bovis. So in order to grow in silico, glycerol will either need to be in the medium or the cell must have some way of producing it from other precursors. iNJ 661 was able to grow when glucose uptake was abrogated, however, it always required glycerol supplementation in the media which could be due to the biomass requirement or due to the need for glycerol for the production of a different essential metabolite. The growth dependence on glycerol and glucose uptake can be seen in the phase plane diagram depicted in Figure 2a. In this figure, the biomass objective function is optimized while the glucose and glycerol uptake fluxes are varied in different combinations. The resulting plot shows how the growth capability of iNJ 661 varies as a function of glucose and glycerol uptake. The solid black lines depict isoclines, along which growth is constant . Further investigations were carried out in order to understand the essentiality of glycerol and non-essentiality of glucose in iNJ 661. Since flux distributions calculated by FBA are non-unique, flux variability analysis (see Methods) can be used to calculate the span of each flux while still achieving the maximum value for the objective function. Reactions with non-zero fluxes, in which the maximum flux equals the minimum flux (and is non-zero), are essential in achieving the particular objective (no alternative pathways exist). Flux variability  at maximum growth, with Middlebrook media with glucose and glycerol supplementation (Table 2) was performed in order to identify the constraining flux(es) that prevents growth on glucose but not glycerol. The only non-zero fluxes with no flexibility, directly involving the metabolism of glycerol, were the glycerol transporter (GLYCt) and glycerol kinase (Rv3696c, GlpK, GLYK). The production of glycerol-3-phosphate via glycerol kinase is essential for membrane and fatty acid metabolism involving fatty-acyl glycerol phosphates and interconversions between CDP-diacyl-glycerol and phosphatidyl-glycerol phosphates. These results are consistent and expected given the importance of membrane metabolism in the role of biomass production.
The non-zero fluxes with no flexibility directly involving glucose were the glucose transporter (GLCabc, SugABC, Rv1236+Rv1237+Rv1238) and glucokinase (PPGKr, PpgK, Rv2702). Glucose-6-phosphate is subsequently essential for a number of other reactions involving various aspects of membrane metabolism, including the synthesis of trehalose phosophate (TRE6PS, OtsA, Rv3490) and inositol-phosphate (MI1PS, Ino1, Rv0046c), in addition to phosphoglucomutase (PGMT, PgmA, Rv3068c). These three reactions are required not only for optimal growth, but they are in fact essential for any growth at all in silico. Only one of these, however, has been suggested to be essential experimentally (Rv3490) . Growing iNJ 661 in Middlebrook media (with glycerol but without glucose supplementation) severely retards the growth rate, however it is still viable in silico in contrast to M. tb growth in vitro. The discrepancy between in vitro essentiality of glucose, but non-essentiality of reactions requiring glucose derivatives and in silico non-essentiality of glucose suggest that there may be non-metabolic functions related to the need for glucose, in addition to alternative metabolic pathways for the production and/or use of glucose-6-phosphate.
Figure 2b characterizes the growth dependence on the uptake rates of inorganic phosphate and sulfate. Limiting uptake of either of these molecules will inhibit growth in silico. Although the uptake fluxes are small compared to the uptake of the carbon sources and oxygen, both are required for growth of M. tb. Consequently, the sulfate and phosphate transporters may serve as bacteriostatic and possibly bacteriocidal drug targets. A robustness analysis (see Methods) was carried for biomass production by iNJ 661 for phosphate and sulfate [see Additional file 7]. This was compared to a robustness analysis performed for the in silico E. coli strain iJR 904  on aerobic glucose conditions [see Additional file 7]. The sensitivity of the respective biomass production rates to the uptake fluxes are given by the slopes of the plots for each organism. The sensitivity to the phosphate uptake rate for iNJ 661 and iJR 904 are both approximately 1. However, the slope for sulfate uptake in iNJ 661 is approximately four times that of iJR 904 (about 16 for iNJ 661 compared to about 4 for iJR 904). This difference reflects the relatively larger amount of sulfur as a percent of total biomass in M. tb compared to E. coli, which is consistent with knowledge of M. tb's composition . Collectively, the phase planes and robustness plots suggest that M. tb may be much more sensitive to sulfate depletion than E. coli.
A remarkable property of M. tb is its ability to grow in culture with carbon monoxide (CO) serving as the sole carbon source [26, 27]. Unlike other mycobacteria that can grow with CO as the sole carbon source, M. tb lacks ribulose-bisphosphate carboxylase (RuBisCO), an enzyme found in plants enabling fixation of carbon dioxide. Although it was argued that a reductive citric acid cycle may enable the fixation of carbon dioxide into by M. tb [27, 28], experimentally it was observed that citrate did not affect M. tb growth on CO . We did not expect iNJ 661 to grow on CO, since the only experimentally characterized CO associated metabolism in M. tb was CO dehydrogenase (CODH). The uptake of glycerol versus the uptake of CO while optimizing for growth can be seen in Figure 3. CO uptake affects the biomass optimization only during minimal uptake of glycerol, over a slight range. The contribution being made in this region is the donation of protons (through the oxidation of CO to CO2). Looking further into this issue, we tested the hypothesis that RuBisCO would enable growth on CO (even though M. tb is not known to have it). We added RuBisCO in addition ribose 1-phosphokinase and ribose 1,5-bisphosphate isomerase (to complete the pathway) and optimized for growth on a modified Middlebrook 7H9 media (glycerol and glucose were removed and CO was added). iNJ 661 was unable to grow under these conditions, despite conferring the ability to fix CO2 with ribose sugars. This observation in conjunction with Figure 2a suggests that one possibility for how M. tb is able to grow using CO as the carbon source is that it somehow is able to fix CO into a glycolytic intermediate that does not appear to occur through a RuBisCO pathway or a reductive citrate cycle.
A context for content
This BiGG reconstruction can be used as a 'context for content' in analyzing large biological datasets. Multiple investigators have employed high-throughput data analysis methods to characterize M. tb in different environmental conditions, Sassetti et al  used transposon site hybridization (TraSH) to identify the genes needed for optimal growth of M. tb in vitro (this will be referred to as the OptGro dataset throughout the rest of the text), Gao et al  identified a set of genes that are consistently expressed under different growth conditions in liquid culture (this will be referred to as the ConstExp dataset throughout the rest of the text), and Sassetti in 2003 investigated the set of genes needed for M. tb survival in vivo during infection in mice  (will be referred to as the Infect dataset). The degree of overlap between the three datasets can be seen in Figure 4A (only 2 genes are shared by all three sets) highlight the variability of gene expression across the different experimental datasets. Although significant differences between in vivo and in vitro patterns of gene expression may be expected, the OptGro and ConstExp datasets were both done in vitro. The heterogeneity among the datasets (all three experimental datasets share only two loci) suggests that defining an objective function based on in vitro data, may not necessarily translate to useful results in vivo (it may constrain the solution space from the wrong directions).
We evaluated these three datasets for M. tb and compared them to gene deletion analyses for iNJ 661, in which each gene was individually deleted and growth was optimized on Middlebrook media. Any growth rate that fell short of the maximum wild type growth was considered 'sub-optimal'. Table 5a summarizes the 114 false negative (iNJ 661 gene non-essentiality, but in vitro gene essentiality; 50%) results between iNJ 661 and the OptGro dataset (for the full results [see Additional file 4]). The inconsistencies were considered individually and classified into four categories, Not in Objective Function (NOF), Alternate Route (AR), Alternative Locus (AL), and Not Essential (NE); these classifications are not mutually exclusive and were defined in order to navigate and interpret with greater ease. A gene was placed in the NOF category if it was identified to be in a biosynthetic pathway (based on network connectivity) for a particular metabolite, such as a vitamin or co-factor. Genes would be categorized as AR or AL if there was an alternative biochemical route or an alternative gene annotation that could carry out that reaction in iNJ 661, respectively, but not in M. tb. Potential AR or AL classifications reflect one of the potential errors due to sequence based annotations, so further biochemical characterization of the gene products would help resolve these cases. Genes that did not fall in these categories were classified as NE. Many of these loci were associated with fluxes in which the flux variability was uniformly zero (maximum and minimum value).
Since quantitative data was not available for vitamins and cofactors, they were not included in the original definition of the objective function (Table 2). However, after the deletion analysis, the majority of the false negative predictions in the NE category by iNJ 661 were reactions associated with the biosynthesis of vitamins and cofactors, such as dihydrofolate and tetrahydrofolate. 16 such metabolites were added [see Additional file 1] to the biomass function with coefficients of 0.000001 (small enough to have negligible effects on quantitative growth, but still required for growth). The results for the genes required for optimal growth using the initial biomass function and the expanded compared to OptGrow are displayed in the first Venn diagram of Figure 4B. Although only 16 metabolites were added, the deletion study resulted in 31 more gene loci that matched the OptGro data. This only slightly increased the deletion prediction from 54% to 56%. The reason for the modest increase is that there were also 18 additional false positive (iNJ 661 gene essentiality, but in vitro gene non-essentiality) results (from 46% to 44% for the original and expanded objective functions, respectively).
Genes categorized as AL or AR suggest that there was an error in the sequence based annotation, there are regulatory control loops that we are not aware of, or there was a false positive result from the experimental data (TraSH is often used as a screening tool). Sassetti et al  dicussed one of these cases involving the Rv0505c and Rv3042c loci (serB and serB2 respectively). Both have been assigned the same function based on annotation, however serB2 was found to be essential while serB was not. There may be many possibilities why serB cannot rescue mutations/deletions of serB2, however the detection of such cases imply that having duplicate genes many not confer the degree or robustness that is often assumed to accompany an organism. The cases with alternative loci highlight areas where further experimental investigation may provide insight into the growth capabilities of M. tb. The AL cases are enumerated in Table 5b. These loci are worth further experimental investigation to confirm the annotation and to determine the different conditions which may induce or inhibit expression. The false negative prediction for the sulfate transporter (Rv2397c, Rv2398c, Rv2399c, and Rv2400c) is an interesting case to consider, in part because it validates the observations in the phase plane discussed in Figure 2b. Additionally, it identifies an area in which a partially characterized annotation causes erroneous results. The four aforementioned loci form a protein complex in the model in order to catalyze the transport reaction. However, since there was an additional annotation identifying Rv1739c as a sulfate transporter , an association was made between Rv1739c in addition to the existing protein complex (Rv2397c + Rv2398c + Rv2399c + Rv2400c). So, although the sulfate transporter is essential for in silico growth of iNJ 661, the lack of a biochemically characterized gene protein relationship can result in false predictions.
The 103 false positive cases likely reflect incomplete knowledge about the network (alternative pathways exist) or an alternative objective function [see Additional file 4]. Approximately 20 cases involve amino acid pathways, since amino acids would presumably be required for any kind of growth in lab conditions, these cases likely suggest alternative pathways or enzymes are present in M. tb that are not in iNJ661. Almost half of these cases (46) involve fatty acid, membrane, or peptidoglycan metabolism, since the fatty acid composition in M. tb can change significantly over time in vitro , many of these false predictions are likely to be due to changes in the biomass composition (in silico simulations have fixed biomass compositions). These cases highlight areas in which experimental studies documenting the changes in composition and growth rates may yield interesting results. Five of the cases involve loci associated with the transport of phosphate, ammonium, and a sugar transporter. These cases likely involve as of yet unidentified alternative transporters in M. tb. Another case involves glycerol kinase (Rv3696c, GlpK, GLYK) which was one of the fluxes with no flexibility (see M. tb growth in silico), since M. tb in vitro requires glycerol supplementation for growth . However, since glycerol kinase does not appear to be required even for sub-optimal growth as implied by the OptGro dataset, there must be other reactions involving the direct metabolism of glycerol that have not yet been identified for M. tb. The ability to grow using CO as the sole carbon source and maintain optimal growth even without glycerol kinase activity imply unique and unidentified aspects to central metabolism in the M. tb network. Further experimental investigations may provide clues towards controlling the growth of M. tb (increasing the growth rate in culture and inhibiting growth in vivo).
Out of the 194 genes identified in the Infect dataset, only 17 were shared with the OptGro dataset and 37 were shared with the ConstExp dataset (Figure 4A). From these three sets of experimental datasets, OptGro and ConstExp had the greatest number of genes in common, 162. However this is still only about 27% and 29% of the genes in each dataset, respectively. The large variability between all of these datasets highlights the point that further development of Flux Balance Analysis based approaches will require the definition alternative objective functions to biomass optimization.
Functional assessment independent of objective functions
Biomass functions can help improve predictions under well-defined conditions by constraining the steady state solution space. However, different growth conditions can appreciably alter the composition and the 'objectives' of bacteria, as reflect by Figure 4A and the discussion above. Significant changes in fatty acid composition have been observed even while growing M. tb in culture over the span of weeks . These changes would be reflected as changes in the composition and the coefficients of the biomass function, consequently altering the predictions made by the model. Not all COBRA methods require the definition of objective functions  and identifying groups of reactions that operate together can help simplify a network and provide insight into its functionality . Correlated reaction sets have been calculated using sampling  and flux coupling  and implications for classifying diseases and identifying pathway specific drug targets have been discussed . Applying the same systems view of metabolic networks to pathogens such as M. tb causes one to adopt the view that single enzyme drug targets actually knock out complete pathways. As a result, terminating the activity of any other enzyme in that pathway should have the same effect. Similar to the correlated reaction sets calculated by sampling or flux coupling, HCR sets are defined by mass balance constraints. The definition of HCRs is stricter in the sense that it is based on metabolites with one-to-one connectivity. However in contrast to the other approaches, it is independent of the exchange reactions with the environment and any requirements for demand functions. Consequently, the sets can be calculated directly from the stoichiometric matrix. There is a significant degree of overlap between HCRs and the Enzyme Subsets (ES) described by Pfeiffer et al . Since ESs are not constrained by 1:1 connectivity, in principle they may include additional reactions. On the same token, when the same reaction is carried out by multiple co-factors and the reactions are reversible, which is not uncommon in metabolic networks, ESs may add reactions which form loops (similar to Type III Extreme Pathways ). Additionally, intermediate steps in the calculation of HCRs allow the consideration of inclusion or removal of potentially reversible reactions.
147 HCR sets (1:1 only, additional reversible reactions, 0:2/2:0, were not included, see Methods) were calculated for the network, the average size of each set was 2.93, with median and modes both equal to 2. The largest set, HCR 40, consisted of 13 reactions involved in Cofactor Metabolism and Porphyrin Metabolism. The summary of how the HCRs are split among the 35 subsystems in the network are outlined in Table 6. Using the Gene-Protein-Reaction relationships which form the foundation of the reconstruction, we mapped the HCRs back to the gene loci (Figure 5). The complete HCR set mapped to 124 loci and 61 HCRs in the OptGro dataset, to 46 loci and 34 HCRs in the ConstExp dataset, and 21 loci and 18 HCRs in the Infect dataset [see Additional file 2]. The OptGro and ConstExp shared 8 HCR sets (HCR 2, 3, 11, 12, 43, 57, 69, and 122). The intersection of these datasets identifies a subset of reactions that are consistently expressed under different conditions and also required for optimal growth in vitro. These HCR sets may reflect underlying core sets of reactions vital to growth and survival in vitro.
Many of the tuberculosis treatment drugs have metabolic targets  and a use of this metabolic based reconstruction can be to assist in identifying new and alternative chemotherapy targets for tuberculosis. Mdluli and Spigelman recently discussed the drug targets in M. tb reasonably comprehensively . We used the list (absent the DNA synthesis and Regulatory Protein targets) and mapped them to the 147 HCR sets as seen in Table 7 [see Additional file 3 for the full set]. The final column lists the number of different reactions in the HCR. Other reactions that are shared within a particular HCR which contains a known drug target, have the potential to be alternative but metabolically equivalent targets. The drug targets mapped to 25 of the HCR sets. These sets are depicted on a reduced metabolic map of iNJ 661 (Figure 6). An image of the HCR sets mapped to the entire metabolic network will be available on our website . These 25 HCR sets contain all of the 8 HCR sets identified by the overlap between the OptGro and ConstExp data.
Last year Raman et al  constructed and analyzed a model of the mycolic acid synthesis pathways and proposed additional targets in fatty acid metabolism. Accounting for the entire metabolic network further builds upon this by enabling a more rigorous evaluation of the global growth capabilities of the network and the identification of potential drug targets. The genome-scale model presented here includes the extensive fatty acid metabolism pathways of M. tb, in addition to rest of the metabolic network. With this full network, we then sought to extend the identification of potential drug targets by taking advantage of the underlying principles of reconstructing metabolic networks to calculate HCR sets.
Interpreting large, complex networks in a functionally meaningful manner is enabled by adopting a hierarchical view of the network, where one identifies groups of reactions that respond in unison to changes in media conditions. Groups of reactions strictly bound together based on mass conservation and stoichiometry constraints can lead to the definition of hierarchical sets of reactions [33, 34]. This principle was applied to causal SNP associated diseases in the human mitochondria, and it was found that reactions in the same co-set had similar disease phenotypes . We applied this same concept to iNJ 661, but rather than looking for disease phenotypes, we sought to identify alternative drug targets.
Over time and with iterative improvements, metabolic network reconstructions have achieved the stage of hypothesis generation and model-driven biological discovery in systems biology . We present an in silico strain of M. tb, iNJ 661, with the anticipation that it will be received by the community in a similar vein and to assist in the discovery process of this remarkable yet devastating pathogen.
A large amount of material has been presented in this manuscript, beginning with the presentation of a genome-scale reconstruction and model of M. tb, iNJ 661, followed by experimental validation and use of the model for integration and analysis of multiple experimental datasets. We summarize some of the main points that highlight areas in which further experimental research may be fruitful.
iNJ 661 can grow at rates consistent with experimental data in varying media conditions. Further measurements of the biomass composition of M. tb in well-defined situations are likely to improve the in silico predictions under those conditions.
Analysis of the capabilities under maximal growth conditions can help identify critical nutrient targets for killing M. tb (such as the sulfate transporter) or for optimizing growth (in the lab).
The pathway enabling growth of M. tb using CO as the carbon source is still unclear. The computational studies suggest that there must be an alternative pathway (to reductive citrate cycle and RuBisCO) which enables entry of CO2 into central metabolism.
The seemingly paradoxical need for glucose supplementation in Middlebrook media but non-essentiality of directly related reactions involving the metabolism of glucose-6-phosphate for in vitro growth and the almost diametrically opposed results for in silico growth, suggest that there may be additional pathways involving glucose metabolism and potentially non-metabolic functions related to the glucose requirement.
Comparison with and analysis of large datasets
The definition of the biomass function will largely dictate the results of optimal growth/gene deletion experiments, consequently, positive results (in agreement with the experiments) are not as interesting as the false negative (or false positive results). The false negative results may be due to an error in the annotation of the gene or an underlying physiological mechanism that we are not aware of. In either case, investigating the particular genes experimentally may help clarify the issue.
The model can be used as a Context for Biological Content and provides a consistent, coherent framework to analyze various datasets focusing on metabolism.
Unbiased analysis methods, methods that calculate properties of the network independent of objective functions, can provide interesting and valuable insights into network capabilities . The definition of HCR sets, in a similar vein to correlated reaction sets from sampling and flux-coupling, can confer a hierarchical structure to metabolic networks, and may assist in the identification of alternative drug targets. Using these models in conjunction with experimental data can assist in screening and identifying new and alternative drug targets.
Taken together, this study adds another high resolution reconstruction of microbial metabolism. There now exist a growing number of such reconstructions that have been particularly useful for basic studies of network properties and for bioprocessing applications [42–44]. As the number of reconstructions of human pathogens grows, we should be able to expand their uses towards improved understanding pathogenic mechamisms, and design of interventions and treatment. Large-scale rigorous experimental validation studies should now be performed to further these goals.
Reconstruction and model development of the metabolic network has been described previously [7, 8]. The reconstruction contents will be made available on our website  in Excel [see Additional file 5] and SBML formats.
Figure 1 provides a brief summary of the reconstruction and modeling building process. The sequence based genome annotation of Mycobacterium tuberculosis H37Rv was downloaded from TIGR  and served as the framework of the model. Charge and elementally balanced reactions were added individually based on this annotation, legacy data when available, and updates made in the Tuberculist database . KEGG  and the SEED  were used as ancillary tools on occasion. Following the initial reconstruction, the gaps were evaluated individually by searching for direct evidence in the literature for their metabolism. Due to the relative sparsity of literature on aspects of central and cofactor metabolism in M. tuberculosis, many of these gaps could not be filled. Legacy data in the form of primary articles, review articles, and textbooks were employed in addition to the database resources [see Additional file 6] during the reconstruction and model building phases. A comprehensive map of iNJ 661's metabolic network was visualized by creating a map of the network organized by lumped subsystems of metabolism [see Additional files 8, 9, 10, 11, 12, 13, 14, 15]. After the debugging process, when the cell could grow (i.e. produce biomass) on different media, the remaining intracellular gaps were evaluated by searching the literature for evidence of metabolic reactions involving that particular metabolite. If no evidence of transport or biochemical transformations of the metabolite in M. tb was found, no additional reactions or transporters were added.
The stoichiometric matrix, S, was constructed based on the reactions described in the reconstruction. The biomass function was defined using available legacy data [see Additional file 1]. Flux Balance Analysis (FBA) simulations were employed during the debugging process towards developing a functional model.
Flux Balance Analysis (FBA)
The metabolic network is represented mathematically by the stoichiometric matrix, m rows by n columns, where there are m metabolites and n reactions in the network. Mass conservation dictates
which simplifies to
S·v = 0
at the steady state. Further constraints can be placed on the system based on thermodynamics (reaction directionality) and limits on substrate uptake rates
α ≤ v ≤ β
The oxygen uptake rate measured from the Mycobacteria Bovis BCG strain of 25 μL/hr/mg dry weight (0.98 mmol O2/hr/g dry weight)  was used. The small molecule carbon sources (glucose, glycerol, etc.) were limited to an uptake rate of 1 mmol/hr/g dry weight and all other allowable substrates were unconstrained. The different in silico culture media are listed in Table 3. Deviations from the standard media include: the addition of minute amounts of ferric iron in Youmans media. The oxygen uptake rate was the growth constraining flux in all of the different media.
Flux Variability Analysis (FVA)
Constraining the biomass objective function, c, to the optimal value calculated in FBA,
every flux in the network is then minimized and maximized. The different between the maximum and minimum for each flux defines the flux variability for that reaction.
Robustness plots are performed by varying a particular flux through a pre-defined range and recalculating the objective function. The slope of the curve describes the sensitivity of the objective function on that particular flux (over the specified range of values).
Hard-Coupled Reaction (HCR) Sets
Denoting the binary form of the stoichiometric matrix, , the input/reactant part of the stoichiometric matrix, S-, and the output/product part of the stoichiometric matrix, S+, then it is transparent that,
The input and output connectivities of each metabolite can be calculated for - and +, respectively. The input:output ratio for each metabolite can be determined and all metabolites with 1:1 connectivity will be hard-coupled by the network. Metabolites with 0:2 or 2:0 connectivity reflect metabolites that are hard-coupled in the event that the corresponding reaction is reversible. Metabolites with 0:1 or 1:0 connectivity are either source/sink transporters or blocked reactions.
A general algorithm for calculating hard coupled reaction sets.
Identify 1:1, 0:2, and 2:0 metabolites
Identify the sets of reactions corresponding to the metabolites, denote the entire set as HCR0
Join any HCR0 subset that share 1 or more reactions, denote the entire set as HCR1
i ← 1
while (HCRi-1 ≠ HCRi)
Join any HCRi subset that share 1 or more reactions
i ← i+1
Since 0:2 or 2:0 reactions may be irreversible reactions, an intermediate processing step may be needed to verify that there is biological evidence supporting the catalysis of the reaction in the reverse direction. If one does not wish to consider these potential effects and assume that all of the reactions are reversible then step 1 can be simplified by finding all of the 2 metabolites for . Biomass functions were removed from the stoichiometric matrix before calculating the HCR sets. The HCRs calculated for iNJ 661 had 94 0:2/2:0 reactions. Of these, 24 overlapped with the core set of 147 1:1 HCRs. Since the set of 0:2/2:0 reactions did not have additional gene associations (that weren't already in the 1:1 set) and since they had many irreversible reactions throughout, the 1:1 set and the 0:2/2:0 set were not combined (step 1 in the above algorithm outline would involve only identifying the 1:1 metabolites).
The reconstruction process, the metabolic maps, and the majority of the calculations were performed using Simpheny™ v1.11 software (Genomatica, Inc.). HCR Sets were calculated with Mathematica© v4.2 (Wolfram Research, Inc.).
Kasper D, Braunwald E, Fauci A, Hauser S, Longo D, Jameson J: Harrison's Principles of Internal Medicine. 2004, McGraw-Hill Professional, 16
Schneider E, Moore M, Castro KG: Epidemiology of tuberculosis in the United States. Clinics in chest medicine. 2005, 26: 183-195. v. 10.1016/j.ccm.2005.02.007
Small PM, Fujiwara PI: Management of tuberculosis in the United States. The New England journal of medicine. 2001, 345 (3): 189-200. 10.1056/NEJM200107193450307
Sharma SK, Mohan A: Multidrug-resistant tuberculosis: a menace that threatens to destabilize tuberculosis control. Chest. 2006, 130 (1): 261-272. 10.1378/chest.130.1.261
Wayne LG, Sohaskey CD: Nonreplicating persistence of mycobacterium tuberculosis. Annual review of microbiology. 2001, 55: 139-163. 10.1146/annurev.micro.55.1.139
Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K, Gas S, Barry CE, et al.: Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998, 393 (6685): 537-544. 10.1038/31159
Becker SA, Palsson BO: Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC microbiology [electronic resource]. 2005, 5 (1): 8-10.1186/1471-2180-5-8.
Feist AM, Scholten JC, Palsson BO, Brockman FJ, Ideker T: Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Molecular systems biology [electronic resource]. 2006, 2 (2006): 0004-2006 0004
Khuller GK, Taneja R, Kaur S, Verma JN: Lipid composition and virulence of Mycobacterium tuberculosis H37Rv. The Australian journal of experimental biology and medical science. 1982, 60 (Pt 5): 541-547.
Nandedkar AK: Comparative study of the lipid composition of particular pathogenic and nonpathogenic species of Mycobacterium. Journal of the National Medical Association. 1983, 75 (1): 69-74.
Watanabe M, Aoyagi Y, Ridell M, Minnikin DE: Separation and characterization of individual mycolic acids in representative mycobacteria. Microbiology (Reading, England). 2001, 147 (Pt 7): 1825-1837.
Youmans AS, Youmans GP: Ribonucleic acid, deoxyribonucleic acid, and protein content of cells of different ages of Mycobacterium tuberculosis and the ralationship to immunogenicity. J Bacteriol. 1968, 95 (2): 272-279.
Garcia-Vallve S, Guzman E, Montero MA, Romeu A: HGT-DB: a database of putative horizontally transferred genes in prokaryotic complete genomes. Nucleic acids research. 2003, 31 (1): 187-189. 10.1093/nar/gkg004
Beste DJ, Peters J, Hooper T, Avignone-Rossa C, Bushell ME, McFadden J: Compiling a molecular inventory for Mycobacterium bovis BCG at two growth rates: evidence for growth rate-mediated regulation of ribosome biosynthesis and lipid metabolism. J Bacteriol. 2005, 187 (5): 1677-1684. 10.1128/JB.187.5.1677-1684.2005
Acharya PV, Goldman DS: Chemical composition of the cell wall of the H37Ra strain of Mycobacterium tuberculosis. J Bacteriol. 1970, 102 (3): 733-739.
Middlebrook G, Cohn ML: Bacteriology of tuberculosis: laboratory methods. American journal of public health. 1958, 48 (7): 844-853.
Youmans , Karlson : 1947
James BW, Williams A, Marsh PD: The physiology and pathogenicity of Mycobacterium tuberculosis grown under controlled conditions in a defined medium. Journal of applied microbiology. 2000, 88 (4): 669-677. 10.1046/j.1365-2672.2000.01020.x
Cox RA: Quantitative relationships for specific growth rates and macromolecular compositions of Mycobacterium tuberculosis, Streptomyces coelicolor A3(2) and Escherichia coli B/r: an integrative theoretical approach. Microbiology (Reading, England). 2004, 150 (Pt 5): 1413-1426.
Primm TP, Andersen SJ, Mizrahi V, Avarbock D, Rubin H, Barry CE: The stringent response of Mycobacterium tuberculosis is required for long-term survival. J Bacteriol. 2000, 182 (17): 4889-4898. 10.1128/JB.182.17.4889-4898.2000
Edwards J, Ramakrishna R, Palsson B: Characterizing the metabolic phenotype: a phenotype phase plane analysis. Biotechnol Bioeng. 2002, 77 (1): 27-36. 10.1002/bit.10047
Reed J, Palsson B: Genome-scale in silico models of E. coli have multiple equivalent phenotypic states: assessment of correlated reaction subsets that comprise network states. Genome Res. 2004, 14 (9): 1797-1805. 10.1101/gr.2546004
Sassetti CM, Boyd DH, Rubin EJ: Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol. 2003, 48 (1): 77-84. 10.1046/j.1365-2958.2003.03425.x
Reed JL, Vo TD, Schilling CH, Palsson BO: An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome biology. 2003, 4 (9): R54- 10.1186/gb-2003-4-9-r54
Cole ST: Tuberculosis and the tubercle bacillus. 2005, Washington, DC: ASM Press
King GM: Uptake of carbon monoxide and hydrogen at environmentally relevant concentrations by mycobacteria. Appl Environ Microbiol. 2003, 69 (12): 7266-7272. 10.1128/AEM.69.12.7266-7272.2003
Park SW, Hwang EH, Park H, Kim JA, Heo J, Lee KH, Song T, Kim E, Ro YT, Kim SW, et al.: Growth of mycobacteria on carbonmonoxide and methanol. J Bacteriol. 2003, 185 (1): 142-147. 10.1128/JB.185.1.142-147.2003
Srinivasan V, Morowitz HJ: Ancient genes in contemporary persistent microbial pathogens. The Biological bulletin. 2006, 210 (1): 1-9.
Gao Q, Kripke KE, Saldanha AJ, Yan W, Holmes S, Small PM: Gene expression diversity among Mycobacterium tuberculosis clinical isolates. Microbiology (Reading, England). 2005, 151 (Pt 1): 5-14.
Sassetti CM, Rubin EJ: Genetic requirements for mycobacterial survival during infection. Proc Natl Acad Sci USA. 2003, 100 (22): 12989-12994. 10.1073/pnas.2134250100
Tuberculist Web Server, Pasteur Institute. http://genolist.pasteur.fr/TubercuList/
Price ND, Reed JL, Palsson BO: Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat Rev Microbiol. 2004, 2 (11): 886-897. 10.1038/nrmicro1023
Papin JA, Reed JL, Palsson BO: Hierarchical thinking in network biology: the unbiased modularization of biochemical networks. Trends Biochem Sci. 2004, 29 (12): 641-647. 10.1016/j.tibs.2004.10.001
Thiele I, Price ND, Vo TD, Palsson BO: Candidate metabolic network states in human mitochondria. Impact of diabetes, ischemia, and diet. J Biol Chem. 2005, 280 (12): 11683-11695. 10.1074/jbc.M409072200
Burgard AP, Nikolaev EV, Schilling CH, Maranas CD: Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 2004, 14 (2): 301-312. 10.1101/gr.1926504
Jamshidi N, Palsson BO: Systems biology of SNPs. Molecular systems biology [electronic resource]. 2006, 2: 38-
Pfeiffer T, Sanchez-Valdenebro I, Nuno JC, Montero F, Schuster S: METATOOL: for studying metabolic networks. Bioinformatics. 1999, 15 (3): 251-257. 10.1093/bioinformatics/15.3.251
Palsson BO: Systems Biology: Determining the Capabilities of Reconstructed Networks. 2006, Cambridge Univ Pr
Mdluli K, Spigelman M: Novel targets for tuberculosis drug discovery. Current opinion in pharmacology. 2006, 6 (5): 459-467. 10.1016/j.coph.2006.06.004
Systems Biology Research Group, UCSD. http://systemsbiology.ucsd.edu/
Raman K, Rajagopalan P, Chandra N: Flux balance analysis of mycolic Acid pathway: targets for anti-tubercular drugs. PLoS computational biology. 2005, 1 (5): e46- 10.1371/journal.pcbi.0010046
Reed JL, Famili I, Thiele I, Palsson BO: Towards multidimensional genome annotation. Nat Rev Genet. 2006, 7 (2): 130-141. 10.1038/nrg1769
Hjersted JL, Henson MA, Mahadevan R: Genome-scale analysis of Saccharomyces cerevisiae metabolism and ethanol production in fed-batch culture. Biotechnol Bioeng. 2007
Van Dien SJ, Lidstrom ME: Stoichiometric model for evaluating the metabolic capabilities of the facultative methylotroph Methylobacterium extorquens AM1, with application to reconstruction of C(3) and C(4) metabolism. Biotechnol Bioeng. 2002, 78 (3): 296-312. 10.1002/bit.10200
The Institute for Genomic Research. http://www.tigr.org/
Kyoto Encyclopedia of Genes and Genomes. http://www.genome.jp/kegg/
van Hemert PA, Tiesjema RH: Possible use of the oxygen uptake rate in the evaluation of BCG vaccines. Journal of biological standardization. 1977, 5 (2): 121-129. 10.1016/0092-1157(77)90006-3
This work was supported in part by an NIH Training Grant. We also thank Mr. Scott Becker and Mr. Adam Feist for helpful comments and feedback. The authors do not declare any competing interests.
NJ carried out the reconstruction, performed the analyses, and drafted the manuscript. BØP conceived the study and helped revise the manuscript. Both authors approved the content of the final manuscript.
Electronic supplementary material
Additional File 1: Molecular composition of biomass functions. A detailed list of the biomass constituents and their respective coefficients in the biomass function in addition to the references from which this information was found. (XLS 39 KB)
Additional File 2: HCRs mapped to experimental data sets. The list of HCRs that mapped to the three experimental data, based on gene loci. (XLS 29 KB)
Additional File 3: HCRs mapped to reaction and metabolic subsystems. The complete list of HCRs, highlighting those known to be drug targets in M. tb. (XLS 46 KB)
Additional File 4: Comprehensive list of false negatives between OptGro data set and iNJ 661 lethal gene deletions. A catalog of the false negative results produced by iNJ 661 when compared to the OptGro data set. Each of false negative results is categorized into at least one of four different categories. Specific loci are listed in cases with suggested alternative loci or alternative paths. (XLS 26 KB)
Additional File 5: Model content for iNJ 661. A complete list of the reactions (abbreviations and full names), the associated mass and charge balanced reactions, confidence scores, reaction subsystems, and the list of the Gene-Protein-Reaction relationships, including logical AND/OR relationships, for all of the reactions in the model. A complete list of metabolites appearing in iNJ 661, including abbreviation, full name, formula, and charge at pH 7.2 can be found in the second tab in the file. (XLS 304 KB)
Additional File 6: Explicit list of references used in the reconstruction process. The list of references used to define particular reactions and GPR relationships. (PDF 168 KB)
Additional File 7: Robustness plots for phosphate and sulfate for iNJ 661 and iJR 904. Plotting the biomass objective function versus phosphate or sulfate uptake rates for in silico strains of M. tb and E. coli. (PDF 121 KB)
Additional File 12: Nucleotide metabolism. A map of the nucleotide synthesis and degradation pathways in iNJ 661. (SVG 2 MB)
Authors’ original submitted files for images
About this article
Cite this article
Jamshidi, N., Palsson, B.Ø. Investigating the metabolic capabilities of Mycobacterium tuberculosis H37Rv using the in silico strain iNJ 661 and proposing alternative drug targets. BMC Syst Biol 1, 26 (2007). https://doi.org/10.1186/1752-0509-1-26
- Metabolic Network
- Flux Balance Analysis
- Sulfate Transporter
- Mycobacterium Bovis
- Stoichiometric Matrix