Consensus and conflict cards for metabolic pathway databases
© Stobbe et al.; licensee BioMed Central Ltd. 2013
Received: 23 October 2012
Accepted: 20 June 2013
Published: 26 June 2013
The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial.
We introduce the concept of Consensus and Conflict Cards (C2Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C2CardsHuman as a web application (http://www.molgenis.org/c2cards) covering five human pathway databases.
C2Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C2Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be the subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for broad community involvement.
Keywords: Metabolic network, Consensus, Community support, Human, Pathway database
Metabolic pathway databases have proven very valuable for a wide range of applications, varying from the analysis of high-throughput data to in silico phenotype prediction. In the past decade the number of pathway databases has grown markedly, providing extensive descriptions of the metabolic network for an increasing number of organisms [1, 2]. The metabolic networks of several key organisms, for example, S. cerevisiae and H. sapiens, are even described in multiple databases. A comparison of two yeast networks showed, however, that the two agreed on only 36% of their reactions. Similarly, five pathway databases describing the human metabolic network agreed on only 3% of the 6968 reactions they jointly contain. Given that these databases aim to represent the metabolic capabilities of the same organism, the level of agreement is much lower than one might expect and hope for. There are several explanations for the observed lack of consensus. These include the different ways in which the networks have been built, their manner of curation, and a different interpretation of literature. The comparison of Stobbe et al. also revealed large differences in the breadth and depth of coverage of the five human metabolic networks.
The advantage of having several descriptions of the metabolic network for the same organism is that they offer different views on the same biological system and thus can reveal controversial biological knowledge. In addition, each database has a particular focus and its curators have specific fields of expertise. Therefore, each database may provide complementary pieces of the puzzle of the complete metabolic network. These observations have motivated ongoing efforts to consolidate the different networks for the same organism and to build consensus metabolic networks using a largely manual approach [3, 6, 7].
Combining all the knowledge on the metabolic network contained in the various pathway databases and identifying conflicting information is, however, far from trivial. Retrieving all required information from multiple databases is in itself already a cumbersome task. Firstly, it is challenging to identify instances where pathway databases do not agree on the underlying biology of a metabolic process, because each database makes different decisions on how to represent knowledge [4, 8]. For example, a particular difference may be simply explained by the different levels of granularity with which metabolic processes are described by each database, instead of a fundamentally different biological insight. Secondly, it remains a challenge to determine whether databases refer to the same gene or the same metabolite. Thirdly, the definition of a pathway also differs per database, which makes it nearly impossible to compare the networks on a smaller scale, i.e., per pathway. Fourthly, the larger the number of pathway databases considered, the more difficult it is to identify the consensus and the conflicts. Recently, algorithms have been proposed to semi-automatically merge two descriptions of the metabolic network of the same organism [9, 10]. These approaches mainly address the challenge of matching metabolites, partly via interactions with the user. The core of their resulting merged description consists of reactions that can be found in both networks. Integrating more than two descriptions will, however, significantly reduce the size of the core and limit its utility. The merged description also contains reactions that could not be (exactly) matched and are therefore unique to one of the descriptions. Such an approach will, however, neither resolve the conflicting information between databases nor filter out erroneous information. Furthermore, the semi-automatic approaches do not explicitly address all issues mentioned above. For example, conflicts due to differences in granularity are not taken into account. While semi-automatic approaches generate a useful scaffold for a consensus network, the resulting description still requires extensive manual curation.
Altogether, the issues described above make the construction of a single, more accurate, and more complete network based on the pathway databases available a laborious and largely manual process. Moreover, it is an ongoing process, as new knowledge continues to become available both in the scientific literature and in pathway databases.
To more easily visualize the opinion of multiple pathway databases, we introduce the concept of Consensus and Conflict Cards (C2Cards). C2Cards combine the knowledge from multiple pathway databases for a specific target organism. A C2Card can be centered at a single gene, Enzyme Commission (EC) number or reaction of interest and gives a concise overview of what the databases do or do not agree on with respect to the entity the C2Card is centered at. These three perspectives offer complementary views on the knowledge contained in the pathway databases. Importantly, using these perspectives, disagreements caused by a different decision on how and in how much detail to represent knowledge can be identified. C2Cards can be used to assist reconciliation efforts and make users of pathway databases more aware of the exact differences that currently exist between databases.
As a proof-of-concept, we implemented C2CardsHuman (http://www.molgenis.org/c2cards), which combines the knowledge of the following five frequently used human pathway databases: the Biochemically, Genetically and Genomically structured (BiGG) knowledgebase (H. sapiens Recon 1), the Edinburgh Human Metabolic Network (EHMN), HumanCyc, and the metabolic subsets of the Kyoto Encyclopedia of Genes and Genomes database (KEGG) and Reactome. Below, we first give an overview of the various features of the C2Cards, the combined strength of the three perspectives, and how C2Cards can aid in the curation of gene and metabolite identifiers. Next, we describe several case studies illustrating the potential of the C2Cards in identifying conflicts between pathway databases. Finally, we discuss the next steps to be taken in curating metabolic networks.
Three complementary perspectives
C2Cards offer three complementary perspectives (gene, EC number, reaction) on the knowledge contained in the pathway databases. Each perspective can answer various types of questions, accommodating the different interests one may have. Importantly, the three perspectives can be used to identify and complement information missing in one (or more) of the pathway databases using the knowledge from the other pathway databases.
The ‘gene perspective’ shows, for each of the pathway databases, which metabolic functions the product of a gene has, as indicated by the reaction(s) and EC number(s) linked to it. This perspective may also answer the question whether other genes, either encoding isozymes or components of the same complex, are linked to the same reaction.
EC number perspective
The ‘EC number perspective’ shows on which elements linked to the EC number the pathway databases (dis)agree for a specific type of conversion. It may also reveal possible alternative substrates, which is one of the sources of conflict between metabolic pathway databases. The C2Card centered at the EC number 1.1.1.35 (3-hydroxyacyl-CoA dehydrogenase) provides an example of this scenario (Additional file 1). Specifically, EHMN has 62 unique reactions linked to this EC number, while both HumanCyc and Recon 1 have only two unique reactions. The EC number perspective can also be used to answer the question which genes encode an enzyme with the specified enzymatic function, according to each database.
The ‘reaction perspective’ provides a compact overview of which gene(s) and EC number(s) are linked to a reaction of interest in each pathway database. This perspective can assist in resolving a commonly occurring gap in reconstructions of the metabolic network, namely cases in which the gene product catalyzing a known metabolic reaction is missing. The reaction perspective (and also the EC number perspective) can be used to find possible candidates for a missing gene in a particular database or reveal that the gene is missing in all pathway databases.
Definition of EC numbers in NC-IUBMB
Reaction as defined by NC-IUBMB
ATP + nucleoside phosphate = ADP + nucleoside diphosphate
(1) ATP + (d)CMP = ADP + (d)CDP; (2) ATP + UMP = ADP + UDP
ATP + UMP = ADP + UDP
Dealing with conceptual differences
Gene and metabolite identity
Next to exploring the genes, EC numbers, and reactions contained in the pathway databases, as described above, C2Cards can also be of direct use in curating the identifiers (IDs) assigned to the genes and metabolites by the pathway databases. Identifiers are essential for the unambiguous identification of genes and metabolites across multiple resources and enable linking experimental data to the metabolic network. For each gene and metabolite a C2Card provides the identifiers assigned to them by the pathway databases (see Figure 1, and Materials and methods). Obsolete or transferred identifiers are explicitly indicated. For genes the HUGO Gene Nomenclature Committee (HGNC) symbol is provided and for metabolites their name and synonyms. If available in a pathway database, two structural IDs (InChI and SMILES) and the chemical formula are also shown for a metabolite. The information on the identifiers helps to reveal cases where the assignment of identifiers to a metabolite or gene can be improved. Firstly, it can uncover metabolites that completely lack an ID in one or more pathway databases. Secondly, ID information can also help to identify cases where pathway databases assigned IDs from different gene and metabolite databases to the same entity. This can be used to propose additional identifiers for that particular gene or metabolite, which may also facilitate matching between databases. Thirdly, it can reveal genes and metabolites to which a pathway database assigned multiple identifiers from the same genome or metabolite database, respectively. In summary, C2Cards can assist the considerable amount of manual curation required to correctly link each component of the metabolic network to external databases.
Excerpt of the C2Card centered at the reaction ‘l-arginine + H2O → ornithine + urea’
H. sapiens Recon 1
l-arginine[c] + H2O[c]
Urea cycle / amino group metabolism
ornithine[c] + urea[c]
l-arginine[m] + H2O[m]
Urea cycle / amino group metabolism
ornithine[m] + urea[m]
l-arginine[c] + H2O[c]
l-ornithine[c] + urea[c]
l-arginine[m] + H2O[m]
l-ornithine[m] + urea[m]
Next to the web interface, programming interfaces to R, SOAP (Simple Object Access Protocol), and REST (Representational State Transfer) are provided to enable programmatic querying of the collection of C2Cards. One possible application would be to perform computational analyses on each of the pathway databases. A typical example is an enrichment test to prioritize pathways most likely to be affected in a given high-throughput experiment. The differences between pathway databases can be quite large with respect to both content and conceptual choices. For example, the number of pathways in the five selected human pathway databases ranges from 69 in EHMN to 257 in HumanCyc (see Materials and methods). Consequently, it is to be expected that the choice of a particular pathway database affects the outcome of pathway enrichment analyses. It would, therefore, be advisable to apply analyses to multiple pathway databases to verify the robustness of the results. Specifically, to accommodate pathway enrichment analyses, we provide two additional tables, accessible via the programmatic interfaces only. In these tables the metabolites and genes of each pathway database are linked to the corresponding pathways. The results of our reaction comparison could be used to zoom into the outcomes of an enrichment analysis to see if the differences found can perhaps be attributed to the different pathway definitions used by the databases.
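The effect of the pathway definitions on an enrichment analysis can be sketched as follows. This is a minimal illustration, assuming a one-sided hypergeometric test, which is only one common choice of enrichment statistic. The database names, pathway names, gene identifiers, and background size are invented and are not taken from C2CardsHuman; the sketch also assumes the same gene background for every database, which need not hold in practice.

```python
from math import comb


def hypergeom_pvalue(hits, pathway_size, selected, background):
    """One-sided P(X >= hits) when `selected` genes are drawn from
    `background` genes, of which `pathway_size` belong to the pathway."""
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, min(pathway_size, selected) + 1)
    )
    return tail / total


def enrichment_per_database(databases, gene_list, background):
    """Run the same test against the pathway definitions of every database."""
    genes = set(gene_list)
    results = {}
    for db_name, pathways in databases.items():
        results[db_name] = {
            pathway: hypergeom_pvalue(
                len(genes & members), len(members), len(genes), background
            )
            for pathway, members in pathways.items()
        }
    return results


# Invented toy data: database_B splits what database_A treats as one pathway.
databases = {
    "database_A": {"pyrimidine metabolism": {"G1", "G2", "G3", "G4"}},
    "database_B": {
        "pyrimidine salvage": {"G1", "G2"},
        "pyrimidine catabolism": {"G3", "G4", "G5", "G6"},
    },
}
gene_list = ["G1", "G2", "G7"]  # hypothetical genes of interest
results = enrichment_per_database(databases, gene_list, background=100)

for db_name, pathways in results.items():
    for pathway, p_value in sorted(pathways.items(), key=lambda item: item[1]):
        print(f"{db_name}\t{pathway}\t{p_value:.4g}")
```

In this toy setting the same gene list yields a smaller p-value for the narrowly defined pathway of one database than for the broader pathway of the other, which is the kind of difference that comparing results across databases would bring to light.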
Another additional feature offered is the possibility to look up the fate of a metabolite, contained in any of the five databases, by retrieving the list of reactions in which the metabolite of interest participates. Furthermore, databases in which the metabolite is a ‘dead-end’, i.e., it is either only produced or only consumed, are explicitly indicated. The list of reactions provided allows the user to find candidate reactions to resolve these dead-ends in the network of a particular database using information from other databases. All reactions in this list are linked to their corresponding C2Card.
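The dead-end lookup described above can be sketched as follows. This is an illustrative sketch, not the implementation used in C2CardsHuman: the reaction representation (substrates, products, reversibility flag), the treatment of reversible reactions, and the toy networks are all assumptions made for the example.

```python
def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed.

    Each reaction is a tuple (substrates, products, reversible).
    Assumption: in a reversible reaction every participant counts as
    both produced and consumed.
    """
    produced, consumed = set(), set()
    for substrates, products, reversible in reactions:
        consumed |= set(substrates)
        produced |= set(products)
        if reversible:
            consumed |= set(products)
            produced |= set(substrates)
    # Symmetric difference: in exactly one of the two sets.
    return produced ^ consumed


def candidate_reactions(metabolite, other_databases):
    """List reactions from other databases in which the metabolite participates."""
    return [
        (db_name, reaction)
        for db_name, reactions in other_databases.items()
        for reaction in reactions
        if metabolite in reaction[0] or metabolite in reaction[1]
    ]


# Hypothetical toy networks with placeholder metabolite names.
network_a = [({"A"}, {"B"}, False), ({"B"}, {"C"}, False)]
network_b = [({"C"}, {"D"}, False), ({"D"}, {"A"}, False)]

dead_ends = dead_end_metabolites(network_a)
print(sorted(dead_ends))
for metabolite in sorted(dead_ends):
    print(metabolite, candidate_reactions(metabolite, {"database_B": network_b}))
```

Note that in a real network, metabolites exchanged with the environment would also be reported by this simple check, so the output is a list of candidates for inspection rather than a list of errors.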
C2Cards case studies
For each of the three perspectives we provide a concrete example derived from C2CardsHuman of consensus and conflicts between the five human pathway databases below. The examples have all been chosen from primary metabolic processes, highlighting that conflicts still occur even in well-studied parts of the metabolic network. Moreover, we focused on examples of differences between databases that are not easily resolved and could point either to conflicting information or to complementary information. The case studies illustrate why manual curation remains crucial to resolve contradicting information and to determine in which cases further biochemical experiments are even required to verify what is correct and what is not.
Case study I: gene perspective
Reactome and HumanCyc only link the gene to the glutamine dependent reaction and Recon 1 only to the ammonium dependent reaction. The C2Card focused on the glutamine dependent reaction of Reactome (Figure 1) shows that Recon 1 does contain this reaction, but links it only to the CTPS2 gene and not to CTPS. The same observation can be made when starting from the EC number perspective, as both genes are linked to the same EC number (not shown).
and in the comments field it is stated that “Glutamine can replace NH3”. This might explain the inconsistencies at the reaction level to some extent.
Case study II: EC number perspective
Case study III: reaction perspective
Excerpt of the C2Card centered at the reaction ‘deoxyuridine + phosphate <==> 2-deoxy-d-ribose 1-phosphate + uracil’
PNP*, TYMP*, UPP1
H. sapiens Recon 1
PNP* or UPP2
salvage pathways of pyrimidine deoxyribonucleotides
UPP1 or UPP2
Pyrimidine catabolism and Pyrimidine salvage reactions
We proposed the concept of Consensus and Conflict Cards to provide concise overviews of the knowledge contained in metabolic pathway databases for an organism of interest. In a single step one can find, for example, a gene of interest and see if the databases agree on the role of its product in the metabolic network. The C2Cards will increase the awareness of the differences that exist between the various pathway databases. Other initiatives also provide a web-based interface to browse and search multiple pathway databases [24, 25]. However, they are focused on the union of various (pathway) databases instead of explicitly pointing out the differences between pathway databases. Furthermore, they do not provide a clear and compact overview of the content of each of the five selected databases as a C2Card does. Also, the C2Cards application enables users to find reactions that are similar to the reaction of interest, but that are not exactly the same. The three perspectives offered by the C2Cards application provide complementary views on the knowledge contained in the pathway databases. This makes it possible to distinguish differences that reflect a disagreement on the underlying biology (case studies I-III) from differences that may be explained by, for example, different decisions taken on how to represent knowledge (Table 2).
Ultimately, manual curation is required to reconcile differences and to integrate the networks. While a C2Card can highlight differences between databases, it cannot distinguish between errors in one (or more) of the databases and cases where databases do not agree due to lack of consensus in the scientific literature. Moreover, for any given organism metabolic pathway databases are still being refined, expanded, and corrected. This makes it challenging to distinguish complementary information from cases in which the database curators purposely excluded, for example, a reaction or gene. Even the parts the pathway databases agree on may need to be reviewed as the databases share information sources and may copy data from each other, thereby possibly propagating incorrect information. Manual curation is also needed to unambiguously assign identifiers to genes and metabolites.
In summary, C2Cards offer an elegant solution to bring cases that deserve further inspection to the attention of pathway database curators. The overviews may also point out controversial biological knowledge that should be the subject of further research.
A biologically accurate and complete description of the metabolic network for human and other organisms is of utmost importance to, e.g., increase our knowledge about pathways perturbed by a disease, find new drug targets, and interpret the deluge of high-throughput data. A crucial step towards a more complete description is to combine the knowledge captured by each of the available pathway databases for a specific organism. Much time and effort has already been put into pathway databases and we should profit from this to the fullest extent. However, it requires the commitment and the support of a broad community to construct an initial consensus network and to extend it with new knowledge from domain experts, the scientific literature, and as captured by the various pathway databases. C2Cards can contribute to such an endeavor in several ways. As illustrated by the three case studies, the C2Cards are a perfect starting point for further manual curation of the human metabolic network in future reconstruction jamborees. Our application could be extended in several ways. For example, to support reconstruction efforts, we could indicate whether a reaction is balanced or not, in addition to the already available tool to look up dead-end metabolites. Another possible extension is to further expand the set of five pathway databases currently contained in C2CardsHuman with additional pathway databases. Importantly, the C2Cards application can be set up for other organisms as well (see http://www.molgenis.org/c2cards for a description). Extending each of the three perspectives offered by the C2CardsHuman to multiple organisms could enable using knowledge about metabolism in model organisms to resolve conflicts between the human pathway databases. Note that this does require the use of an ortholog mapping such as InParanoid.
As a guide for integrating pathway databases, we provide overviews of which genes, EC numbers, and reactions can be found in which database. The entries in these overviews are linked to the corresponding C2Card. One could start by curating the reactions contained in all or the majority of the databases. In fact, for more than half of the reactions found in all five human metabolic pathway databases, there is no agreement on the EC numbers and genes linked to a reaction, and additional curation is needed. C2Cards can also be of use if a consensus network for a given organism has already been established. We envision that the C2Cards application could serve as a central platform in which the consensus network can be further refined and extended with knowledge available in pathway databases not used for its construction. We are planning to expand C2CardsHuman with the community-driven consensus human metabolic network Recon 2, which was published while this article was under review. By including Recon 2 as a point of reference, we can compare this state-of-the-art consensus network with other pathway databases. The overview of all reactions in C2CardsHuman, for example, could be a source of candidates for expanding Recon 2. Bringing the differences between the consensus network and other descriptions to the attention of experts would enable further refinement of Recon 2. As a first step towards such a platform, users can already add comments to a C2Card, preferably substantiated by references to the literature. They can subscribe to C2Cards of their interest and receive an e-mail when new comments are added. Different or even contradictory views possibly held by contributors can be clearly exposed in this forum set-up. Based on these contributions a team of curators could then decide to incorporate the necessary changes in the consensus network, if enough evidence supports them.
In the future we could extend the forum by allowing people to rank the contributions to bring to the foreground the forum entries deemed most important and thereby aiding the curators. Notably, as illustrated by case study III, it may lead to the conclusion that further biochemical characterization experiments are required. Since pathway databases are continuously being refined and new information is being added, we could also include the possibility to automatically alert the curators by mailing them updated or additional C2Cards.
It is important to actively involve domain experts in this continuous curation process, even though they may only indirectly benefit from contributing to such an effort. To make the barrier to contribute as low as possible, the web interface of the C2Cards was designed to be easy to use and suitable for users with different backgrounds. The application can be accessed via smartphones and tablets as well, allowing C2Cards to be viewed and discussed nearly anywhere. Furthermore, a C2Card can be downloaded for off-line use. The curation of a C2Card is done at the level of a single reaction or the metabolic functions of a single gene product. This may lower the threshold for experts to contribute as well and also allows (very) detailed knowledge of just a single step in the metabolic network to be added. One way to stimulate expert contributions would be to make the contribution traceable and citable in the form of ‘nanopublications’. A nanopublication consists of three parts: a statement, e.g., protein X (subject) catalyzes (predicate) reaction Y (object); the conditions under which the statement holds, e.g., a specific compartment; and the provenance of the statement, e.g., author and literature. Besides providing an incentive for experts to share their knowledge, this is also a way to ensure that contributions of curators are substantiated by references to the literature. We also plan to include in C2CardsHuman the human metabolic pathways of WikiPathways, an open platform in which anyone can contribute a pathway. By incorporating the knowledge from this database we indirectly have a second way in which experts can contribute their knowledge. Ultimately, to reconstruct a biochemical network that closely resembles the metabolism of a target organism, extensive literature research and additional biochemical experiments will be needed to resolve all conflicts revealed and to fill in the gaps that remain.
The continuous support, time and effort of a large and diverse community are therefore essential. C2Cards can contribute to this endeavor by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute.
Materials and methods
Overview of metabolic pathway databases used
Export formats used
H. sapiens Recon 1
Flat file, SBML
Flat file, KGML
Pathway database content statistics
H. sapiens Recon 1
Although not used for comparing metabolites, we also retrieved the InChI and SMILES of metabolites, when provided by the pathway database, as additional information. For the genes we retrieved the Entrez Gene and Ensembl Gene ID, if available. For display and comparison purposes we mapped the Entrez Gene and Ensembl Gene IDs to their corresponding HGNC symbol as provided by the Entrez Gene and Ensembl database, respectively. For 396 genes in HumanCyc, neither the Entrez Gene ID nor the Ensembl Gene ID was available. For 106 of these genes the UniProt ID was used to retrieve the Entrez Gene ID and/or Ensembl Gene ID. All out-of-date identifiers and EC numbers were transferred to the current ID/EC number (Additional file 2). If that was not possible, the ID or EC number was flagged as being obsolete. All data is made available under the original license terms of the primary databases.
Data retrieval and storage
We used dedicated in-house scripts to retrieve the data needed for C2CardsHuman from the five pathway databases and stored these data in a local MySQL database. The database was designed for easy comparison of the genes, EC numbers, and reactions. A second database, optimized for the queries needed for generating the C2CardsHuman (Additional file 3), was derived from this database. To avoid heavy computations in the web application the second database contains all pairwise matches on gene and metabolite level and the percentage of overlap between every possible pair of reactions. Note that the C2Cards themselves are composed on the fly for a given user query.
In C2CardsHuman, genes, EC numbers, metabolites and reactions were matched as follows:
Genes: Two genes were considered to match if they agreed based on the Entrez Gene ID and/or Ensembl Gene ID. In addition, both types of gene identifiers were mapped to the corresponding HGNC symbols. This provides a basis for matching genes that are not linked to the same genome database, i.e., Entrez Gene or Ensembl, via their HGNC symbol. Moreover, we computed the transitive closure of the gene matches. This means that if for a particular gene there was a match between database A and B, e.g., on Entrez Gene ID, and between database B and C on, e.g., Ensembl Gene ID, then the gene was considered to match between database A and C as well.
EC numbers: Matching of EC numbers is straightforward except for the 71 incomplete EC numbers that the five databases contain in total. Up to three numbers of the four that make up a complete EC number may be missing. This is indicated by ‘-’, e.g., EC 1.-.-.-. Incomplete EC numbers have an ambiguous meaning. They may indicate that further specification of the enzyme activity is not possible, but also that a complete EC number for the specific enzyme activity is not yet included by NC-IUBMB. To reduce the number of spurious matches, incomplete EC numbers were matched literally, i.e., the ‘-’ was not treated as a wildcard.
Metabolites: Metabolites were matched based on the KEGG Compound ID, when available. If the KEGG Compound ID was not provided, the metabolites had to match on any of four other identifiers (KEGG Glycan, ChEBI, PubChem Compound or CAS ID) or on name. In the latter case we also required the chemical formula to match. A difference in the number of H atoms when comparing chemical formulae was ignored. Furthermore, matching on names was case-insensitive, and spaces and punctuation were ignored. For the metabolite matching we also computed the transitive closure (see above).
where R1 and R2 denote the two reactions being compared. Furthermore, we computed the transitive closure for the reaction matches as well (see above).
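The transitive closure of pairwise matches used for genes, metabolites, and reactions can be computed in several ways. The following minimal sketch uses a union-find structure, which is our choice for illustration and not necessarily how C2CardsHuman computes it. The entries are hypothetical (database, identifier) pairs, following the A, B, C example given for genes.

```python
def transitive_closure_groups(pairwise_matches):
    """Group entries so that all directly or indirectly matched entries
    end up in the same set (union-find with path halving)."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for left, right in pairwise_matches:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


# Hypothetical matches: A-B matched on one identifier type, B-C on another.
matches = [
    (("A", "gene_1"), ("B", "gene_1")),  # e.g. matched on Entrez Gene ID
    (("B", "gene_1"), ("C", "gene_1")),  # e.g. matched on Ensembl Gene ID
    (("A", "gene_2"), ("C", "gene_2")),
]
for group in transitive_closure_groups(matches):
    print(sorted(group))
```

Here the entries from databases A and C for gene_1 end up in the same group even though they were never matched directly.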
It depends on the organism and the specific pathway databases included in the C2Cards database which IDs can best be used for comparing genes and metabolites. Only a few changes to the code and the original C2Cards database scheme are required to use other IDs for matching. A more detailed description of the changes to make is available on our website (http://www.molgenis.org/c2cards).
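As an illustration of the metabolite matching rules described above, the following sketch implements one possible reading of them. The `Metabolite` class, its field names, and the example values are hypothetical. Two points are assumptions on our part: the KEGG Compound ID is treated as decisive only when both metabolites have one, and a name match is rejected when either chemical formula is missing.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metabolite:
    name: str
    formula: Optional[str] = None
    kegg_compound: Optional[str] = None
    # Other identifiers, e.g. {"chebi": "...", "cas": "..."} (hypothetical keys).
    other_ids: dict = field(default_factory=dict)


def normalize_name(name):
    """Lower-case the name and drop spaces and punctuation."""
    return re.sub(r"[\W_]+", "", name.lower())


def formula_without_h(formula):
    """Element counts of a simple formula such as 'C6H14N4O2', ignoring H."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    counts.pop("H", None)
    return counts


def metabolites_match(m1, m2):
    # Assumption: KEGG Compound IDs decide only if both sides provide one.
    if m1.kegg_compound and m2.kegg_compound:
        return m1.kegg_compound == m2.kegg_compound
    shared_types = set(m1.other_ids) & set(m2.other_ids)
    if any(m1.other_ids[t] == m2.other_ids[t] for t in shared_types):
        return True
    if normalize_name(m1.name) != normalize_name(m2.name):
        return False
    # Assumption: a name match without both formulae is not accepted.
    if m1.formula is None or m2.formula is None:
        return False
    return formula_without_h(m1.formula) == formula_without_h(m2.formula)


first = Metabolite(name="L-Arginine", formula="C6H14N4O2")
second = Metabolite(name="l arginine", formula="C6H15N4O2")
print(metabolites_match(first, second))
```

The two example entries differ in capitalization, punctuation, and the number of H atoms, all of which the rules above ignore, so they are considered a match.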
Construction web application
Each row in a C2Card contains a reaction, the EC number(s), gene(s), and the pathway linked to the reaction, and the name of the source database. If a reaction was assigned to multiple pathways, a separate row is used for each pathway. The metabolites of a reaction are represented by their primary name as indicated by the pathway database. Although not taken into account when matching reactions, the direction of a reaction and the compartment(s) as indicated by the source database are shown in a C2Card. If the direction was not provided this is indicated with ‘|==|’. Multiple EC numbers are connected by a comma. Following the convention used in Recon 1, genes of which the products are isozymes are connected by the Boolean operator ‘or’. If the gene products form a complex ‘and’ is used. EHMN and KEGG, however, do not have a syntactic mechanism for describing isozymes nor complexes. Therefore, if multiple genes were linked to a reaction by EHMN and KEGG, they are connected by a comma. Genes are represented by the HGNC symbol retrieved from Entrez Gene. The Entrez Gene ID was, however, not always available for every gene, and the HGNC symbol could not always be retrieved when the Entrez Gene ID was available. In these cases we used, when available, the Ensembl Gene ID to retrieve the HGNC symbol. For 358 genes the HGNC symbol was not available via either gene identifier type. In this case the gene is represented by its Entrez Gene or Ensembl Gene ID, depending on which of these two was available. For 274 genes in HumanCyc these two gene identifiers were also not available and for these cases the internal gene identifier of HumanCyc is used for representation. If multiple HGNC symbols were linked to a gene they are separated by two underscores. Note also that HumanCyc and Reactome may link multiple Entrez Gene IDs to a single gene, which in most cases will also result in multiple HGNC symbols. 
Similarly, KEGG and Reactome contain genes linked to multiple Ensembl Gene IDs.
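The conventions for displaying genes in a C2Card row can be sketched as follows. The function names, the nested-list input format, and the gene symbols are hypothetical, and the parentheses placed around a complex that appears next to isozymes are our own addition for readability rather than something stated above.

```python
# Databases without a syntactic mechanism for isozymes or complexes.
COMMA_ONLY_DATABASES = {"EHMN", "KEGG"}


def gene_label(symbols):
    """Multiple HGNC symbols for one gene are separated by two underscores."""
    return "__".join(symbols)


def format_genes(database, enzymes):
    """Render the genes linked to a reaction.

    `enzymes` is a list of alternative enzymes (isozymes); each enzyme is a
    list of genes (subunits of a complex); each gene is a list of symbols.
    """
    if database in COMMA_ONLY_DATABASES:
        return ", ".join(gene_label(gene) for enzyme in enzymes for gene in enzyme)
    parts = []
    for enzyme in enzymes:
        text = " and ".join(gene_label(gene) for gene in enzyme)
        # Assumption: bracket a complex when it is combined with isozymes.
        if len(enzyme) > 1 and len(enzymes) > 1:
            text = f"({text})"
        parts.append(text)
    return " or ".join(parts)


isozymes = [[["GENEA"]], [["GENEB"]]]
complex_plus_isozyme = [[["GENEA"], ["GENEB"]], [["GENEC", "GENEC2"]]]

print(format_genes("Recon 1", isozymes))
print(format_genes("KEGG", isozymes))
print(format_genes("Recon 1", complex_plus_isozyme))
```

Keeping the comma-only databases in a separate set makes explicit that a comma carries no information about whether the gene products are isozymes or form a complex.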
BiGG: Biochemically, Genetically, and Genomically structured
C2Cards: Consensus and Conflict Cards
EHMN: Edinburgh Human Metabolic Network
HGNC: HUGO Gene Nomenclature Committee
KEGG: Kyoto Encyclopedia of Genes and Genomes
KGML: KEGG Markup Language
MOLGENIS: Molecular Genetics Information Systems
NC-IUBMB: Nomenclature Committee of the International Union of Biochemistry and Molecular Biology
REST: Representational State Transfer
SBML: Systems Biology Markup Language
SOAP: Simple Object Access Protocol
We would like to thank Erik Roos for his contributions to the web application in the initial phase of this project, Joeri van der Velde for his contributions in the final phase, and Dave Speijer for helpful discussions. We also thank the anonymous reviewers for their helpful comments and suggestions for improving the presentation and comprehensibility of the paper. This research was carried out within the BioRange (project SP1.2.4) and BioAssist (project 4.1 ‘Molgenis’) programmes of The Netherlands Bioinformatics Centre (NBIC; http://www.nbic.nl), supported by BSIK; Netherlands Proteomics Center grants through The Netherlands Genomics Initiative (NGI); the research programme of the Netherlands Consortium for Systems Biology (NCSB), which is part of the Netherlands Genomics Initiative / Netherlands Organization for Scientific Research. IT was supported by a European Research Council grant (N° 232816) and by a Marie Curie International Reintegration grant (N° 249261) within the 7th European Community Framework Program.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.