Three factors underlying incorrect in silico predictions of essential metabolic genes
© Becker and Palsson; licensee BioMed Central Ltd. 2008
Received: 15 October 2007
Accepted: 04 February 2008
Published: 04 February 2008
The indispensability of certain genes in an organism is important for studies of microorganism physiology, antibiotic targeting, and the engineering of minimal genomes. Time- and resource-intensive genome-wide experimental screens can be conducted to determine which genes are likely essential. For metabolic genes, a reconstructed metabolic network can be used to predict which genes are likely essential. The success rate of these predictions is less than desirable, especially with regard to comprehensively locating essential genes.
We show that genes that are falsely predicted to be non-essential (for growth) share three characteristics across multiple organisms and growth media. First, these genes are on average connected to fewer reactions in the network than correctly predicted essential genes, suggesting incomplete knowledge of the functions of these genes. Second, they are more likely to be blocked (their associated reactions are prohibited from carrying flux in the given condition) than other genes, implying incomplete knowledge of metabolism surrounding these genes. Third, they are connected to less overcoupled metabolites.
The results presented herein indicate that genes that cannot be correctly predicted as essential have commonalities in different organisms. These elucidated failure modes can be used to better understand the biology of individual organisms and to improve future predictions.
The dispensability and essentiality of genes in single-celled organisms is an extensively studied field with multiple applications. Knowledge of which genes are indispensable is needed for the construction of minimal organisms, which have been suggested as platforms for novel bacteria with beneficial characteristics. For pathogenic organisms, lists of essential genes can be taken as lists of potential targets for new antibiotics. In the field of metabolic engineering, non-essential gene deletions are used to create bacterial strains with better production characteristics.
Sizeable screens for essential genes have been undertaken in a number of organisms [1, 5], necessitating significant time and resources. Alternatively, at least as far as metabolic genes are concerned, in silico methods can be used to predict gene essentiality. Such in silico studies have been undertaken for a variety of organisms, including Escherichia coli, Saccharomyces cerevisiae [5, 7], Helicobacter pylori, Staphylococcus aureus, Bacillus subtilis, and Mycobacterium tuberculosis. These methods are fast and require few resources. The rate-limiting step is the mandatory reconstruction of the metabolic network, which is itself a valuable resource for a variety of other applications. These reconstructions are currently available for a relatively small, but growing, number of microorganisms. For organisms without a reconstruction, methods to elucidate the context in which essential genes occur across many organisms have been described.
Multiple, simultaneous in silico gene deletion experiments have also been described; see for example [8, 13]. In most organisms, any given individual metabolic gene is likely dispensable under most conditions, due to robustness properties that appear to be inherent to many biological networks [14, 15]. Experiments in which multiple genes are removed from the organism are necessary to dig deeper into its capabilities. Of course, not all individually dispensable genes can be removed at once from an organism, meaning that a collection of single knock-out experiments cannot itself provide instructions for constructing a minimal organism. Double and higher simultaneous knock-out experiments can be technically challenging in the lab, and complete coverage of the genome is virtually impossible due to the combinatorial explosion. As cited above, computational methods can easily predict the results of such higher knock-outs. While the computer time required for anything more than a comprehensive double-deletion study may be prohibitive, many more knock-outs can be simulated in silico than can be performed in vivo. Computational studies can be used as screens to identify potentially interesting multiple knock-outs to pursue in the lab, as has been demonstrated for metabolic engineering applications [16, 17].
Unfortunately, in silico methods for predicting gene essentiality are not perfect. There are four possible outcomes when comparing the results from in silico methods with experiments: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). True positives occur when both the model and experiment indicate that a gene is essential, and true negatives occur when the model and experiment agree that a gene is nonessential. False positives occur when the model says a gene is essential, but experiments suggest otherwise. False negatives occur when the model says a gene is nonessential, but experiments indicate that it is essential. The overall success rate is the fraction of all predictions that are correct, (TP + TN)/(TP + TN + FP + FN). The best large-scale studies cite overall success rates in the vicinity of 90% [5, 6, 10], but nearly all cited success rates are inflated by the large number of non-essential genes that are correctly predicted. While these success rates are not inaccurate, the correct prediction of nonessential genes is less important than the correct prediction of essential genes. In false positive cases, one experiment, the deletion of that gene in the lab, can verify that a prediction is wrong. However, in false negative cases, only a comprehensive set of experiments (one attempted deletion per gene) can locate errors. When in silico studies are considered as screens for essential genes, perhaps for antibiotic target discovery, false-negative errors limit the usefulness of such screens. As detailed herein, when only experimentally determined essential genes are considered for statistical purposes, the resulting success rates (essential success rates, TP/(TP + FN)) are lower.
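The difference between the overall and the essential success rate can be illustrated with a minimal sketch. The function name and the gene counts below are hypothetical and chosen only to show how a large number of true negatives can inflate the overall rate; they are not taken from any of the data sets used here.

```python
def success_rates(tp, tn, fp, fn):
    """Return (overall, essential) success rates from the four outcome counts."""
    total = tp + tn + fp + fn
    # Overall success rate: fraction of all genes predicted correctly.
    overall = (tp + tn) / total
    # Essential success rate: fraction of experimentally essential genes
    # (TP + FN) that the model also predicts to be essential.
    essential = tp / (tp + fn)
    return overall, essential


# Hypothetical counts (illustration only): most genes are nonessential.
overall, essential = success_rates(tp=60, tn=840, fp=60, fn=40)
print(f"overall = {overall:.2f}, essential = {essential:.2f}")
```

With these made-up counts the overall rate is high even though a large share of the essential genes is missed, which is the effect described in the paragraph above.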
Incorrect essentiality predictions for a single organism are frequently studied and described in the publications that report those predictions, and they are generally attributed to a small set of causes. False negative errors can be caused by incomplete definition of the biomass function, uncertainty in the growth medium used for experiments, and toxic-intermediate buildup. False positive errors can be caused by overly stringent definition of the biomass function, uncertainty in the growth medium, and the presence of unknown isozymes for a given reaction. The biomass function is central to the simulation of gene deletions, because a gene is predicted to be essential if its deletion results in the complete impairment of flux through this special reaction. The growth medium used for experiments is also very important because gene essentiality depends on which substrates are available for use. The buildup of toxic intermediates is difficult to simulate accurately with constraint-based methods because, in the absence of knowledge that the cell will produce a metabolite even if it cannot be broken down, there is no way to predict the production of toxic metabolites. The presence of unknown isozymes suggests that the organism is not understood as well as it could be.
While organism and gene specific explanations for incorrect predictions can be informative and lead to new discoveries, we have elected to study and classify incorrect predictions across organisms without trying to justify each inaccuracy by itself. Herein we report that genes that are incorrectly predicted as dispensable share common characteristics in multiple organisms. In terms of computational predictions, these genes are less connected in the network, more likely to be predicted inoperative, and connect to less overcoupled metabolites. Taken together, these characteristics suggest that incorrectly predicted genes are connected beyond the boundaries of known metabolism, both through limited knowledge of the reactions they catalyze directly and through the limited understanding of metabolism surrounding those reactions.
Results and Discussion
In silico vs. experimental gene deletions
We used six genome-scale metabolic networks [3, 6–8, 10, 11] and a combined total of 13 experimental gene essentiality data sets [5, 18–25]. These networks are all elemental and charge balanced, and they have been manually curated. In terms of included genes, these are the most complete networks for each organism that have been put together by hand and carefully validated.
The microorganisms and media conditions used in the study, together with the predicted gene essentiality results. The percentage of essential genes predicted correctly is a measure of how effective a screen the computational gene essentiality prediction really is.
[Table columns: total number of genes; number of TP genes; number of FN genes; % of essential genes predicted correctly. Surviving row labels: rich (Forsyth et al.); rich (Ji et al.).]
Topological summary statistics (number of genes, number of reactions, number of gene associated reactions, number of metabolites) were noted for each metabolic network studied. These statistics were tightly correlated with each other; for example, a network with a larger number of genes is likely to have a larger number of reactions and metabolites (results not shown here). However, these statistics showed no significant correlation with the ability of a network to correctly predict the essentiality of genes. Model performance, at least in terms of predicting essential genes, does not appear to be related to model size. This lack of correlation suggests that the number of components (genes, reactions, etc.) in a network does not impact our ability to reconstruct an accurate network.
A particular metabolic gene, either alone or in conjunction with other genes, encodes one or more enzymes responsible for one or more biochemical reactions. The associations between genes, enzymes, and reactions for each metabolic network we analyzed are publicly available and are termed gene-protein-reaction associations (GPRs). Herein, we define the connectivity of a gene as the number of reactions it affects, as characterized by the GPRs. Depending on the organism, the mean connectivity for a gene is between one and three. The connectivity of a gene is a reflection of its understood prominence in the metabolic network, as measured by the number of discrete metabolic transformations it enables. Due to imperfect knowledge of the functions of genes, the connectivity of a gene is an estimate, and probably a low estimate. Because metabolic networks are reconstructed by only assigning functions to genes when they are relatively certain, the actual connectivity of a gene could be higher than the numbers given here.
The connectivity of genes in the organisms studied. The overarching trend is that the mean connectivity of FN genes is less than the mean connectivity of TP genes for nearly all organisms and data sets.
[Table columns: overall mean connectivity; mean connectivity of TP genes; mean connectivity of FN genes. Surviving row labels: rich (Forsyth et al.); rich (Ji et al.).]
The outwardly obvious reason for this trend is that we do not have a comprehensive understanding of the function of FN genes. The lesser connectivity of FN genes suggests that they may be essential for reasons that are yet to be discovered or fully understood. The connectivity of essential genes may vary widely. However, we do not expect it to fall into two groups corresponding to TP and FN unless the connectivity for FN genes is an artifact of an incomplete network. E. coli, arguably the best understood microorganism, does not show this trend, supporting the notion that incomplete knowledge of gene function leads to the connectivity differences. We expect that as more is learned about the FN genes in other organisms, their connectivity will increase and they will concurrently become TP genes as the reasons for their essentiality are understood.
Flux variability and blocked genes
Given a metabolic network and an objective function, the allowable variability of the flux through each reaction can be computed with a series of linear programming problems. In general, some fluxes can take a wide range of values (they have a wide flux span), some a smaller range, and some have no variability at all. Reactions that cannot operate at steady state are termed blocked reactions; they have no variability at all and are constrained by stoichiometry to carry zero flux. From a modeling standpoint, a reaction can be blocked for two reasons. First, the inputs and outputs determined by given environmental conditions (i.e. growth media) may not allow a reaction to operate, although it would not be blocked under some different set of input and output constraints. This is called a condition-dependent gap. Second, the reaction may have one or more metabolites that are unavailable for production or consumption due to a network gap, which is essentially a dead end; this is called a condition-independent gap. Such a gap may be a modeling artifact due to incomplete knowledge of an organism, or a remnant of a pathway that was functional in an ancestor of the organism. When gaps and blocked reactions occur in metabolic models, they are often viewed as an opportunity to discover something previously unknown about the organism.
To identify a relationship between gene essentiality and flux variability, we computed the maximum and minimum allowable flux through each reaction in each metabolic network, constraining the network to produce biomass at no less than 90% of the optimal rate. Because biomass production is permitted to take a range of values, as would be the case amongst any experimental population of cells, any reaction that has no flux span (meaning that its flux can only take a single value) must also be a blocked reaction. We found a widely variable number of blocked reactions in the networks, ranging from 75 in H. pylori to 888 in E. coli on glycerol minimal medium. We then mapped these reactions to genes, defining a gene as blocked if it is associated with at least one blocked reaction, and completely blocked if all reactions with which it is associated are blocked. Thus, a blocked gene may have some functionality in the network, but a completely blocked gene cannot.
The fraction of blocked and completely blocked genes, both FN and non-FN.
[Table columns: fraction of non-FN genes blocked; fraction of FN genes blocked; fraction of non-FN genes completely blocked; fraction of FN genes completely blocked. Surviving row labels: rich (Forsyth et al.); rich (Ji et al.).]
Whereas the simplest explanation for the gene connectivity results above was incomplete knowledge about FN genes themselves, a better rationale for the blocked reactions here is incomplete knowledge of areas of metabolism closely associated with these genes. The network neighborhood of these genes is not completely understood. E. coli is again a very well-studied organism and it is not surprising that FN genes cannot be explained by incomplete knowledge of the surrounding network. H. pylori has a very compact metabolic network, with 45% fewer genes than the next smallest network. It also has one environment in which it is specialized, the human stomach. Thus, it is reasonable to conclude that this organism may have a reasonably comprehensively known metabolism. On the other hand, S. cerevisiae has a variety of factors complicating its metabolism, including the compartmentalization that is an essential feature of eukaryotic organisms. With metabolic processes spanning various organelles and intracellular transport mechanisms incompletely understood, it is logical that FN genes would result from a lack of knowledge of the surrounding metabolism.
Overcoupled metabolite pairs
In genome-scale metabolic networks, certain pairs of metabolites occur in reactions together many times; for example, ATP and ADP. Some of these metabolite pairs can be classified as overcoupled based on statistical calculations that consider the individual connectivity of each metabolite and the network structure. These overcoupled metabolite pairs are often associated with important cellular features such as energy transfer and charge balancing. Their functionality together is speculated to be important enough to have evolved beyond the point at which random connectivity would explain their co-occurrence. Even without knowing that these pairs of metabolites are overcoupled in a statistically significant manner, a casual observer would note that many of the pairs are highly important for cellular function.
We calculated overcoupled metabolites by the previously published method, using p < 0.01. We define a gene as associated with an overcoupled metabolite pair if it catalyzes at least one reaction in which at least one member of the overcoupled pair participates. The gene does not have to be associated with both members of the pair explicitly, but of course it is associated with both metabolites through the actions of whichever metabolite it directly influences. On average, 95% of genes in all models are associated with an overcoupled metabolite.
The overcoupling count for the ith gene is calculated as
$$\mathrm{count}_i = p \cdot \hat{S} \cdot G_i$$
where
$\hat{S}$ (written Ŝ in the text) is the binary form of the stoichiometric matrix;
$G$ is the gene-reaction association matrix (each row represents a reaction, each column a gene, and each binary entry indicates whether that gene is associated with that reaction), with $G_i$ its ith column; and
$p$ is the overcoupled metabolite vector, with each entry specifying the number of overcoupling interactions with which a metabolite is associated.
This works out to the sum of the number of overcoupling interactions in which the compounds that are associated with a particular gene are involved, allowing compounds to be counted multiple times if they participate in multiple reactions. A simple example is presented in the methods section for clarity.
The mean overcoupling counts for each organism and media condition.
[Table columns: TP mean overcoupling count; FN mean overcoupling count; corrected FN mean overcoupling count. Surviving row labels: rich (Forsyth et al.); rich (Ji et al.).]
Because FN genes, on average, interact less with overcoupled metabolites, they are less likely to be tied into important, evolutionarily conserved metabolic processes, at least in silico. It is possible that the FN genes are responsible for reactions beyond what is currently known, similar to the proposed reason that FN genes have lower connectivity. It is also possible that the reactions with which FN genes are associated are not completely correct. For example, some of these reactions may have alternative substrate/product pairs that are highly important for the network.
Herein we have demonstrated that incorrectly predicted essential metabolic genes have network level differences that are largely conserved across organisms. These differences are (1) a smaller mean number of reactions per gene, (2) a larger percentage of blocked genes, and (3) a smaller overcoupling count.
These three differences all rely on the interactions between network components. Fundamentally, gene essentiality is a network-level property, so it is to be expected that explanations will rely on the network as a whole. We did not find any explanation for incorrect gene essentiality predictions based on simple statistics such as rudimentary network size metrics.
The results suggest that incomplete knowledge of the metabolic processes associated with essential genes and the immediately surrounding metabolic processes are driving forces in incorrect gene essentiality predictions. These factors in most cases cannot with statistical significance explain incorrect gene essentiality predictions in E. coli, the best characterized microorganism considered here. One might expect, based on the numbers for E. coli shown in Table 1, that roughly a third of FN genes cannot be described with these explanations. Thus, further study of this topic is warranted.
One potentially fruitful area may be a comparative analysis of more precise network roles of FN genes vs. those of TP genes. One could, for example, computationally predict the necessity of each gene in the network for a variety of functions other than growth, such as redox balance or energy production. This may allow the determination of imperfectly understood areas of metabolism, even in well studied organisms. We foresee increased comparative analysis of microbial metabolism as more networks become available, akin to the growth of genome sequence comparisons from a curiosity to the essential tool that is BLAST today.
Metabolic network setup and in silico gene deletions
Metabolic networks for all six organisms were obtained as SimPheny (Genomatica, San Diego, CA) output files and imported into the COBRA Toolbox in Matlab (The Mathworks, Inc., Natick, MA) using the readCbModel command with the SimPhenyPlus format. Media conditions were set by using exchange fluxes to allow inputs to the model that are consistent with each published experimental gene deletion study.
Gene deletions were simulated using the singleGeneDeletion command in the COBRA Toolbox. The set of zero or more reactions that cannot occur without the presence of each gene were removed from the model, and we attempted to simulate growth. If no growth was possible, the gene was predicted to be essential. The results from the in silico experiments were compared with previously published experimental results to distinguish TP genes from FN genes (and both from genes that are not essential experimentally). Each gene in each organism under each media condition was identified as TP, FN, or not essential.
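The first step of a simulated deletion, finding the reactions that cannot occur without a given gene, can be sketched as follows. This is only an illustration of how Boolean gene-protein-reaction rules behave under deletion, not the COBRA Toolbox implementation; the nested-tuple rule encoding, the function names, and the toy genes and reactions are all assumptions made for this example. The subsequent growth simulation (a linear programming problem) is not shown.

```python
def gpr_active(rule, present):
    """Evaluate a Boolean GPR rule given the set of genes still present.

    A rule is None (no gene association), a gene name, or a tuple
    ("and" | "or", subrule, subrule, ...). This encoding is hypothetical.
    """
    if rule is None:
        return True
    if isinstance(rule, str):
        return rule in present
    op, *terms = rule
    results = [gpr_active(term, present) for term in terms]
    return all(results) if op == "and" else any(results)


def reactions_lost(gprs, all_genes, deleted):
    """Return the reactions that can no longer occur when one gene is deleted."""
    present = set(all_genes) - {deleted}
    return sorted(r for r, rule in gprs.items() if not gpr_active(rule, present))


# Toy network: R2 has isozymes (g1 or g2); R3 needs a complex (g2 and g3).
gprs = {
    "R1": "g1",
    "R2": ("or", "g1", "g2"),
    "R3": ("and", "g2", "g3"),
    "R4": None,
}
genes = ["g1", "g2", "g3"]
lost = {g: reactions_lost(gprs, genes, g) for g in genes}
print(lost)
```

In this toy case deleting g1 removes only R1, because R2 is rescued by the isozyme g2, whereas deleting either g2 or g3 removes R3.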
Given the Boolean gene-protein-reaction associations, the COBRA Toolbox automatically produces a binary matrix G describing the associations between genes and reactions. The number of non-zero entries in each column describes the connectivity of a single gene. The mean connectivity of TP and FN genes was determined with simple arithmetic. All graphs were made in Excel (Microsoft, Redmond WA).
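The connectivity calculation amounts to column sums of G followed by group means. A minimal sketch, assuming G is held as a list of rows (reactions) by columns (genes); the toy matrix and the TP/FN labels are invented for illustration:

```python
def gene_connectivity(G):
    """Count the non-zero entries in each column of the reaction-by-gene matrix G."""
    n_genes = len(G[0])
    return [sum(1 for row in G if row[j] != 0) for j in range(n_genes)]


def mean(values):
    return sum(values) / len(values)


# Toy G: 4 reactions (rows) x 3 genes (columns); values are illustrative only.
G = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
labels = ["TP", "FN", "FN"]  # hypothetical classification of the three genes

connectivity = gene_connectivity(G)
tp_mean = mean([c for c, lab in zip(connectivity, labels) if lab == "TP"])
fn_mean = mean([c for c, lab in zip(connectivity, labels) if lab == "FN"])
print(connectivity, tp_mean, fn_mean)
```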
Flux variability and blocked reactions
The flux variability of each reaction in each network under each set of media conditions was determined using the fluxVariability command in the COBRA Toolbox, constraining biomass production to be no less than 90% of maximum. Reactions that cannot take any flux are found this way and termed "blocked." These blocked reactions are mapped back to genes through G, and genes associated with only blocked reactions are termed completely blocked; those associated with one or more blocked reactions are termed blocked.
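The mapping from flux ranges to blocked and completely blocked genes can be sketched as below. The flux ranges, the zero tolerance, and the choice to treat a gene with no associated reactions as not blocked are assumptions of this example; the ranges themselves would come from a flux variability calculation, which is not reproduced here.

```python
def blocked_reactions(flux_ranges, tol=1e-9):
    """A reaction is blocked if both its minimum and maximum flux are zero."""
    return [abs(lo) < tol and abs(hi) < tol for lo, hi in flux_ranges]


def classify_genes(G, blocked):
    """Return one (is_blocked, is_completely_blocked) pair per gene (column of G)."""
    result = []
    for j in range(len(G[0])):
        flags = [blocked[i] for i, row in enumerate(G) if row[j]]
        is_blocked = any(flags)
        # Assumption: a gene with no reactions is not counted as completely blocked.
        is_completely_blocked = bool(flags) and all(flags)
        result.append((is_blocked, is_completely_blocked))
    return result


# Hypothetical (min, max) flux per reaction, for illustration only.
flux_ranges = [(0.0, 0.0), (0.0, 5.2), (0.0, 0.0), (-1.0, 1.0)]
# Toy G: 4 reactions (rows) x 3 genes (columns).
G = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
blocked = blocked_reactions(flux_ranges)
gene_status = classify_genes(G, blocked)
print(gene_status)
```

Here the first gene is completely blocked, the second is blocked but retains one functional reaction, and the third is not blocked.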
Overcoupled metabolites and overcoupled count
Overcoupled metabolites are computed with the same procedure as has been previously published. The metabolite coupling matrix M is calculated as
$$M = \hat{S} \cdot \hat{S}^{T}$$
where $\hat{S}$ (written Ŝ in the text) is the binary form of S.
M is a symmetric matrix with off-diagonal elements indicating the number of reactions in which two metabolites (rows and columns of M) co-participate. The diagonal elements give the total number of reactions in which each metabolite appears.
Overcoupled metabolites are determined by redistributing the elements of Ŝ such that the diagonal elements of M remain the same but the off-diagonal elements vary, in effect simulating the effects of random co-occurrence of metabolites but maintaining the connectivity structure of the network. After many redistributions, p values can be determined by comparing the actual value of Mij to the random distribution of values. We used only metabolites that are overcoupled with p < 0.01.
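A rough sketch of the coupling matrix and of a randomization that keeps its diagonal fixed is given below. The exact redistribution scheme is defined in the cited publication and may differ from what is shown; this example assumes, purely for illustration, that each metabolite's row of the binary matrix is shuffled independently, which preserves the number of reactions per metabolite. The function names, the toy matrix, and the number of randomizations are likewise assumptions.

```python
import random


def coupling_matrix(S_hat):
    """M = S_hat * S_hat^T for a binary metabolite-by-reaction matrix."""
    n = len(S_hat)
    return [[sum(a * b for a, b in zip(S_hat[i], S_hat[j])) for j in range(n)]
            for i in range(n)]


def shuffle_rows(S_hat, rng):
    """Shuffle each row independently; row sums (the diagonal of M) are unchanged."""
    return [rng.sample(row, len(row)) for row in S_hat]


def overcoupling_p(S_hat, i, j, n_rand=10000, seed=0):
    """Fraction of randomizations whose M[i][j] is at least the observed value."""
    rng = random.Random(seed)
    observed = coupling_matrix(S_hat)[i][j]
    hits = 0
    for _ in range(n_rand):
        R = shuffle_rows(S_hat, rng)
        if sum(a * b for a, b in zip(R[i], R[j])) >= observed:
            hits += 1
    return hits / n_rand


# Toy binary matrix: metabolites A and B always co-occur; C never joins them.
S_hat = [
    [1, 1, 1, 0, 0, 0],  # A
    [1, 1, 1, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
]
M = coupling_matrix(S_hat)
p_ab = overcoupling_p(S_hat, 0, 1, n_rand=2000)
p_ac = overcoupling_p(S_hat, 0, 2, n_rand=2000)
print(M, p_ab, p_ac)
```

In a network this small no pair can reach a stringent threshold such as p < 0.01; the example only shows the mechanics of comparing an observed off-diagonal element with its randomized distribution.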
The overcoupled count for each gene was calculated as described above. As an example, consider a gene that catalyzes two isomerization reactions:
A -> B
B -> C
A is a member of one overcoupled metabolite pair, B is a member of 3 overcoupled metabolite pairs, and C is not overcoupled with any other metabolite. The count is 1 + 3 + 3 = 7 (1 for A, 3 for B in the first reaction, and 3 for B in the second reaction).
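The same worked example can be reproduced with the matrix expression for the overcoupling count. This is a minimal sketch using plain lists; the function name is hypothetical, and the matrices encode only the two isomerization reactions and the single gene described above.

```python
def overcoupling_counts(p, S_hat, G):
    """Compute p * S_hat * G, giving one overcoupling count per gene.

    p:     overcoupling interactions per metabolite
    S_hat: binary metabolite-by-reaction matrix
    G:     binary reaction-by-gene matrix
    """
    n_met, n_rxn, n_genes = len(S_hat), len(S_hat[0]), len(G[0])
    # Per-reaction weight: sum of p over the metabolites in that reaction.
    weights = [sum(p[m] * S_hat[m][r] for m in range(n_met)) for r in range(n_rxn)]
    return [sum(weights[r] * G[r][g] for r in range(n_rxn)) for g in range(n_genes)]


p = [1, 3, 0]        # A is in 1 overcoupled pair, B in 3, C in none
S_hat = [
    [1, 0],          # A participates in A -> B
    [1, 1],          # B participates in both reactions
    [0, 1],          # C participates in B -> C
]
G = [[1], [1]]       # one gene catalyzes both reactions

counts = overcoupling_counts(p, S_hat, G)
print(counts)
```

B contributes twice because it takes part in both reactions, which is why the count is 7 rather than 4.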
Except for determining which metabolite pairs are overcoupled, the statistics of which are summarized above and fully covered in the previously published method, two statistical procedures were used to find p values. Comparisons across multiple organisms, for example, whether gene connectivity is less for FN genes, were analyzed with permutation tests. The data points were randomly assigned to two groups many times and the number of times the actual difference of means was greater (or less) than the random difference of means was noted. This number was divided by the number of randomizations to get a p value. In no case was the number of randomizations less than 10,000.
Comparisons within a dataset, for example, whether FN genes in S. cerevisiae are less connected than TP genes, were assigned a confidence score by randomly picking the same number of genes from each group and comparing their means. The number of times that the sampled mean for FN genes is less than the sampled mean for TP genes divided by the number of random samplings gives a confidence score or p value. No fewer than 10,000 randomizations were used.
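The first of the two procedures, the permutation test on a difference of means, can be sketched as follows. This is a generic one-sided permutation test written under stated assumptions rather than the exact code used for the analysis: the p value is taken here as the fraction of random regroupings whose difference of means is at least as low as the observed one, and the function name, seed, and connectivity values are invented for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """One-sided permutation test for mean(group_a) being lower than mean(group_b)."""
    rng = random.Random(seed)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled data points to two groups of the original sizes.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff <= observed:
            hits += 1
    return hits / n_perm


# Hypothetical connectivities (illustration only).
fn_connectivity = [1, 1, 1, 2, 1, 1, 1, 1]
tp_connectivity = [3, 2, 4, 3, 5, 2, 3, 4]

p_value = permutation_p_value(fn_connectivity, tp_connectivity)
print(p_value)
```

A small p value indicates that a difference as large as the observed one rarely arises when group labels are assigned at random.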
We would like to thank, first and foremost, the scientists who reconstructed the six metabolic networks utilized herein; without their hard work, this project could never have been undertaken. We thank Neema Jamshidi for critically reading the manuscript. We thank Shankar Subramaniam for suggesting statistical tests for use in this manuscript. We thank Andrew Joyce and Adam Feist for helping with experimentally essential gene data gathering. We thank You-Kwan Oh and Sharon Wiback for technical assistance with the B. subtilis metabolic network.
1. Gerdes S, Edwards R, Kubal M, Fonstein M, Stevens R, Osterman A: Essential genes on metabolic maps. Curr Opin Biotechnol. 2006, 17: 448-456. 10.1016/j.copbio.2006.08.006
2. Posfai G, Plunkett G, Feher T, Frisch D, Keil GM, Umenhoffer K, Kolisnychenko V, Stahl B, Sharma SS, de Arruda M, Burland V, Harcum SW, Blattner FR: Emergent properties of reduced-genome Escherichia coli. Science. 2006, 312 (5776): 1044-1046.
3. Becker SA, Palsson BO: Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol. 2005, 5: 8. 10.1186/1471-2180-5-8
4. Fong SS, Burgard AP, Herring CD, Knight EM, Blattner FR, Maranas CD, Palsson BO: In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol Bioeng. 2005, 91: 643-8. 10.1002/bit.20542
5. Kuepfer L, Sauer U, Blank LM: Metabolic functions of duplicate genes in Saccharomyces cerevisiae. Genome Res. 2005, 15: 1421-30. 10.1101/gr.3992505
6. Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121. 10.1038/msb4100155
7. Duarte NC, Herrgard MJ, Palsson BO: Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res. 2004, 14: 1298-309. 10.1101/gr.2250904
8. Thiele I, Vo TD, Price ND, Palsson BO: Expanded metabolic reconstruction of Helicobacter pylori (iIT341 GSM/GPR): an in silico genome-scale characterization of single- and double-deletion mutants. J Bacteriol. 2005, 187: 5818-30. 10.1128/JB.187.16.5818-5830.2005
9. Heinemann M, Kummel A, Ruinatscha R, Panke S: In silico genome-scale reconstruction and validation of the Staphylococcus aureus metabolic network. Biotechnol Bioeng. 2005, 92: 850-64. 10.1002/bit.20663
10. Oh YK, Palsson BO, Park SM, Schilling CH, Mahadevan R: Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J Biol Chem. 2007, 282 (39): 28791-28799. 10.1074/jbc.M703759200
11. Jamshidi N, Palsson BO: Investigating the metabolic capabilities of Mycobacterium tuberculosis H37Rv using the in silico strain iNJ661 and proposing alternative drug targets. BMC Syst Biol. 2007, 1: 26. 10.1186/1752-0509-1-26
12. Reed JL, Famili I, Thiele I, Palsson BO: Towards multidimensional genome annotation. Nat Rev Genet. 2006, 7: 130-41. 10.1038/nrg1769
13. Harrison R, Papp B, Pal C, Oliver SG, Delneri D: Plasticity of genetic interactions in metabolic networks of yeast. Proc Natl Acad Sci USA. 2007, 104: 2307-2312. 10.1073/pnas.0607153104
14. Deutscher D, Meilijson I, Kupiec M, Ruppin E: Multiple knockout analysis of genetic robustness in the yeast metabolic network. Nat Genet. 2006, 38: 993-998. 10.1038/ng1856
15. Becker D, Selbach M, Rollenhagen C, Ballmaier M, Meyer TF, Mann M, Bumann D: Robust Salmonella metabolism limits possibilities for new antimicrobials. Nature. 2006, 440: 303-307. 10.1038/nature04616
16. Burgard AP, Pharkya P, Maranas CD: Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol Bioeng. 2003, 84: 647-57. 10.1002/bit.10803
17. Pharkya P, Burgard AP, Maranas CD: Exploring the overproduction of amino acids using the bilevel optimization framework OptKnock. Biotechnol Bioeng. 2003, 84: 887-99. 10.1002/bit.10857
18. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H: Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006, 2: 0008.
19. Joyce AR, Reed JL, White A, Edwards R, Osterman A, Baba T, Mori H, Lesely SA, Palsson BØ, Agarwalla S: Experimental and computational assessment of conditionally essential genes in Escherichia coli. J Bacteriol. 2006, 188: 8259-8271. 10.1128/JB.00740-06
20. Giaever G, Chu AM, Ni L, Connelly C, Riles L, Véronneau S, Dow S, Lucau-Danila A, Anderson K, André B, Arkin AP, Astromoff A, El-Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian KD, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Güldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kötter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang CY, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418: 387-91. 10.1038/nature00935
21. Chalker AF, Minehart HW, Hughes NJ, Koretke KK, Lonetto MA, Brinkman KK, Warren PV, Lupas A, Stanhope MJ, Brown JR, Hoffman PS: Systematic identification of selective essential genes in Helicobacter pylori by genome prioritization and allelic replacement mutagenesis. J Bacteriol. 2001, 183: 1259-1268. 10.1128/JB.183.4.1259-1268.2001
22. Ji Y, Zhang B, Van Horn SF, Warren P, Woodnutt G, Burnham MK, Rosenberg M: Identification of critical staphylococcal genes using conditional phenotypes generated by antisense RNA. Science. 2001, 293: 2266-9. 10.1126/science.1063566
23. Forsyth RA, Haselbeck RJ, Ohlsen KL, Yamamoto RT, Xu H, Trawick JD, Wall D, Wang L, Brown-Driver V, Froelich JM, C KG, King P, McCarthy M, Malone C, Misiner B, Robbins D, Tan Z, Zhu ZY, Carr G, Mosca DA, Zamudio C, Foulkes JG, Zyskind JW: A genome-wide strategy for the identification of essential genes in Staphylococcus aureus. Mol Microbiol. 2002, 43: 1387-400. 10.1046/j.1365-2958.2002.02832.x
24. Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen KK, Arnaud M, Asai K, Ashikaga S, Aymerich S, Bessieres P, Boland F, Brignell SC, Bron S, Bunai K, Chapuis J, Christiansen LC, Danchin A, Débarbouille M, Dervyn E, Deuerling E, Devine K, Devine SK, Dreesen O, Errington J, Fillinger S, Foster SJ, Fujita Y, Galizzi A, Gardan R, Eschevins C, Fukushima T, Haga K, Harwood CR, Hecker M, Hosoya D, Hullo MF, Kakeshita H, Karamata D, Kasahara Y, Kawamura F, Koga K, Koski P, Kuwana R, Imamura D, Ishimaru M, Ishikawa S, Ishio I, Le Coq D, Masson A, Mauël C, Meima R, Mellado RP, Moir A, Moriya S, Nagakawa E, Nanamiya H, Nakai S, Nygaard P, Ogura M, Ohanan T, O'Reilly M, O'Rourke M, Pragai Z, Pooley HM, Rapoport G, Rawlins JP, Rivas LA, Rivolta C, Sadaie A, Sadaie Y, Sarvas M, Sato T, Saxild HH, Scanlan E, Schumann W, Seegers JF, Sekiguchi J, Sekowska A, Séror SJ, Simon M, Stragier P, Studer R, Takamatsu H, Tanaka T, Takeuchi M, Thomaides HB, Vagner V, van Dijl JM, Watabe K, Wipat A, Yamamoto H, Yamamoto M, Yamamoto Y, Yamane K, Yata K, Yoshida K, Yoshikawa H, Zuber U, Ogasawara Nl: Essential Bacillus subtilis genes. Proc Natl Acad Sci USA. 2003, 100: 4678-83. 10.1073/pnas.0730515100
25. Sassetti CM, Boyd DH, Rubin EJ: Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol. 2003, 48: 77-84. 10.1046/j.1365-2958.2003.03425.x
26. Mahadevan R, Schilling CH: The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003, 5: 264-76. 10.1016/j.ymben.2003.09.002
27. Reed JL, Patel TR, Chen KH, Joyce AR, Applebee MK, Herring CD, Bui OT, Knight EM, Fong SS, Palsson BO: Systems approach to refining genome annotation. Proc Natl Acad Sci USA. 2006, 103: 17480-17484. 10.1073/pnas.0603364103
28. Becker SA, Price ND, Palsson BO: Metabolite coupling in genome-scale metabolic networks. BMC Bioinformatics. 2006, 7: 111. 10.1186/1471-2105-7-111
29. Becker SA, Feist AM, Mo ML, Hannum G, Palsson BO, Herrgard MJ: Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat Protoc. 2007, 2: 727-738. 10.1038/nprot.2007.99
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.