Discovery of metabolite biomarkers: flux analysis and reaction-reaction network approach
© Li et al.; licensee BioMed Central Ltd. 2013
Published: 17 December 2013
Metabolism is a vital cellular process, and its malfunction can be a major contributor to many human diseases. Metabolites can serve as biomarkers of metabolic disease. The detection of such biomarkers plays a significant role in the study of biochemical reaction and signaling networks. Early research focused mainly on the analysis of metabolic networks. Integrating metabolic networks with other available biological data to reveal the mechanisms of disease-metabolite associations remains an important and interesting challenge.
In this article, we propose two new approaches for identifying metabolic biomarkers by incorporating disease-specific gene expression data into the genome-scale human metabolic network. The first approach compares the flux intervals of normal and disease samples to identify reaction biomarkers. The second is based on the Reaction-Reaction Network (RRN) and reveals the significant reactions. Both approaches use reaction fluxes obtained by a Linear Programming (LP) based method, which can contribute to the discovery of potential novel biomarkers.
Biomarker identification is an important issue in studying biochemical reactions and signaling networks. Two efficient and effective computational methods are proposed for the identification of biomarkers in this article. Furthermore, the biomarkers found by our proposed methods are shown to be significant determinants for diabetes.
Deficiency in essential metabolites can directly cause metabolic diseases. Metabolic disease profiling is a promising way to uncover the mechanisms of disease-metabolite associations. Existing research has mainly emphasized the analysis of metabolic networks [1–3]. Models for investigating large-scale metabolic networks outperform other quantitative approaches [4, 5]. The widespread availability of gene expression data suggests integrating it with metabolic network data to reveal significant biomarkers.
Flux Balance Analysis (FBA) is a traditional constraint-based approach for predicting flux distributions. It has been employed to identify a number of important metabolic reactions. Drug targets that reduce abnormal metabolites have been sought by formulating an optimal combinatorial problem on metabolic networks [8, 9]. Drug target prediction can also be formulated as an integer linear programming model, and a quantitative method based on two-stage FBA has been proposed for drug target identification. By profiling human metabolic reactions, a drug-reaction network has been established for predicting enzyme targets. Here we develop a computational approach to identify metabolic biomarkers using human metabolic reactions incorporating disease-specific gene expression data. Metabolic biomarkers are metabolites that show consistent variation in concentration in the disease state; they can be very useful for diagnostic purposes. As an efficient diagnostic tool and a safe evaluator of drug candidates, metabolomics will play an important role.
Furthermore, we construct an RRN, which provides a platform for ranking the significance of reactions using the PageRank algorithm. The reaction network is constructed so that the nodes represent reactions and an edge is placed between two reactions whenever they share a common metabolite. Note that a single reaction can have thousands of edges to related reactions, which makes the network extremely complicated. This issue can be addressed by identifying densely connected subgraphs, so a clustering toolbox is employed to find the subnetworks. Graph clustering rests on the assumption that a group of functionally related nodes are likely to interact strongly with each other while being more separate from the rest of the network. Clustering network graphs is known to be challenging: the results of most methods are highly sensitive to their parameters, and the predicted clusters can vary from one method to another. Here we focus on analyzing the cluster with the largest number of entities. The selected metabolic reactions provide a platform for drawing the RRN, which depicts the interactions among the reactions. For ease of interpretation and visualization, we use Cytoscape to construct the network. PageRank is a well-known method for evaluating the importance of a webpage. We integrate the PageRank algorithm with the FBA method to evaluate the significance of the reactions. We also propose simple statistical criteria for selecting significant reactions, which enable us to identify the corresponding metabolites.
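The edge rule described above (two reactions are linked whenever they share a metabolite) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_rrn`, the reaction identifiers `R1` to `R4`, and the metabolite sets of `R3` and `R4` are assumptions made for the example and are not taken from the reconstruction.

```python
from collections import defaultdict
from itertools import combinations


def build_rrn(reactions):
    """Build an undirected reaction-reaction network.

    reactions: dict mapping a reaction id to the set of metabolite ids
    that take part in it. Returns a dict mapping each reaction id to the
    set of reactions that share at least one metabolite with it.
    """
    # Index the reactions by the metabolites they contain.
    by_metabolite = defaultdict(set)
    for rid, metabolites in reactions.items():
        for met in metabolites:
            by_metabolite[met].add(rid)

    # Link every pair of reactions that appears under the same metabolite.
    neighbours = {rid: set() for rid in reactions}
    for rids in by_metabolite.values():
        for a, b in combinations(sorted(rids), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


# Toy input. R1 and R2 follow the two reactions quoted in the Results;
# R3 and R4 are hypothetical reactions added only to create shared metabolites.
toy_reactions = {
    "R1": {"pi[c]", "uri[c]", "r1p[c]", "ura[c]"},
    "R2": {"fad[m]", "succ[m]", "fadh2[m]", "fum[m]"},
    "R3": {"pi[c]", "atp[c]", "adp[c]"},
    "R4": {"fum[m]", "h2o[m]", "mal_L[m]"},
}
rrn = build_rrn(toy_reactions)
```

In this toy network `R1` is linked to `R3` through `pi[c]`, and `R2` to `R4` through `fum[m]`. On a genome-scale network, widely used metabolites would link a reaction to very many others, which is the complexity that the clustering step is meant to reduce.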
Diabetes mellitus is a group of metabolic diseases that are amongst the major human malnutrition diseases. Risk assessment is one possible way to prevent the disease. Metabolic profiling, an unbiased technique, can potentially help identify high-risk candidates and thereby reduce the related costs.
We integrate the human metabolic network with disease-specific gene expression data to analyze the flux profiles within the network. In the following section, we introduce two methods for metabolite biomarker discovery. The validity of the two approaches is then discussed. The identified metabolite biomarkers may have potential applications in disease diagnosis.
Materials and methods
The genome-scale human metabolic network reconstructed by Duarte et al. consists of 3742 reactions, 2766 metabolites and 1905 genes. Three types of information are used to describe a metabolic network. The first is stoichiometry, which describes the quantitative relationships among reactants and products in all the reactions involved. The second consists of the enzymes corresponding to each reaction in the network. The third is the flux capacity of each reaction. We employ Human Recon 1, one of the two independently developed human metabolic networks [18, 19], in our study. The data are available from the BiGG database (http://bigg.ucsd.edu/). One can retrieve the reactions and the genes involved using MATLAB. The RRN can be built in the Cytoscape software package, which is available at http://www.cytoscape.org.
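For orientation, the stoichiometry and flux capacities described above enter the textbook constraint-based (FBA) linear program as sketched below. This is the generic form, not necessarily the exact LP used in this work, which additionally incorporates gene expression data. The symbols $S$, $v$, $c$, $v^{\min}$ and $v^{\max}$ are notation assumed here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j}, \qquad j = 1, \dots, n .
\end{aligned}
```

Here $S$ is the $m \times n$ stoichiometric matrix ($m$ metabolites, $n$ reactions), $v$ is the vector of reaction fluxes, $S\,v = 0$ expresses the steady-state mass balance of every metabolite, the bounds $v^{\min}_{j}$ and $v^{\max}_{j}$ encode the flux capacity of reaction $j$, and $c$ selects the objective to be optimized.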
We introduce two novel methods for integrating gene expression data and the human metabolic network in biomarker discovery.
Flux profile comparison (FPC) method 
Expression levels in reactions
The LP model
Flux profiles in disease/normal samples
Identification of significant reactions
Significant metabolite discovery
Compared with the well-known model that uses the human metabolic network to predict metabolic biomarkers of human inborn errors of metabolism, our model takes more realistic constraints into consideration. First, the genome-scale human metabolic network we utilize here consists of 1905 boundary metabolites and 3742 reactions in total. Second, we integrate gene expression data from both the normal and the disease state to mark highly and lowly expressed reactions. Rather than forcing the reactions to be active in the normal state or inactive in the disease state, we adopt a probability measure for a reaction being active or inactive. We use two pairs of gene expression data sets, each covering healthy and disease states, and consider the overlap of the discovered metabolic biomarkers. To solve the LP problems, we use a large-scale optimization method based on the Linear Interior Point SOLver (LIPSOL) in MATLAB on a Windows Vista machine. These characteristics help our approach discover metabolic biomarkers of greater significance.
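The comparison step of the FPC method, together with the overlap taken across the two pairs of expression data sets, can be illustrated with the following Python sketch. It assumes that a reaction is flagged when its flux intervals in the normal and disease states do not overlap; that criterion, the function name `changed_reactions`, the reaction identifiers and all numbers are assumptions made for illustration and are not the paper's exact selection rule or data.

```python
def changed_reactions(normal, disease, eps=1e-9):
    """Return the ids of reactions whose flux intervals do not overlap.

    normal, disease: dicts mapping a reaction id to a (min_flux, max_flux)
    tuple, for example obtained by minimizing and maximizing each flux in
    an LP. The non-overlap test is an assumed criterion.
    """
    changed = set()
    for rid in normal.keys() & disease.keys():
        n_lo, n_hi = normal[rid]
        d_lo, d_hi = disease[rid]
        # The disease interval lies entirely above or entirely below
        # the normal interval.
        if d_lo > n_hi + eps or d_hi < n_lo - eps:
            changed.add(rid)
    return changed


# Made-up flux intervals for two normal/disease pairs of samples.
pair_a_normal = {"R1": (0.0, 1.0), "R2": (2.0, 3.0), "R3": (-1.0, 1.0)}
pair_a_disease = {"R1": (1.5, 2.0), "R2": (2.5, 3.5), "R3": (2.0, 4.0)}
pair_b_normal = {"R1": (0.0, 0.8), "R2": (2.0, 3.0), "R3": (-1.0, 2.5)}
pair_b_disease = {"R1": (1.2, 2.0), "R2": (0.0, 1.0), "R3": (2.0, 4.0)}

hits_a = changed_reactions(pair_a_normal, pair_a_disease)
hits_b = changed_reactions(pair_b_normal, pair_b_disease)

# Keep only the reactions flagged in both pairs of samples.
consensus = hits_a & hits_b
```

With these toy numbers only `R1` is flagged in both pairs, so it is the only reaction kept in `consensus`. Requiring agreement across both pairs is a simple way to reduce sample-specific findings.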
RRN construction method
In this section, we propose the second novel approach to identify the metabolite biomarkers based on RRN.
• Clustering toolbox for identifying the subnetworks.
• The PageRank algorithm for evaluating the reactions in the RRN.
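As a rough illustration of the second ingredient, the sketch below runs a standard PageRank power iteration on an undirected RRN stored as an adjacency dictionary. The damping factor of 0.85, the function name `pagerank` and the toy graph are assumptions for the example. How the resulting ranks are combined with the flux values is not specified here, so that step is left out.

```python
def pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """Standard PageRank by power iteration.

    adj: dict mapping each node to the set of its neighbours. Every
    neighbour must also be a key of adj. For an undirected RRN each edge
    is treated as a link in both directions.
    """
    nodes = sorted(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not adj[u])
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        # Each node passes its damped rank equally to its neighbours.
        for u in nodes:
            if adj[u]:
                share = damping * rank[u] / len(adj[u])
                for v in adj[u]:
                    new[v] += share
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy star-shaped network: reaction "A" shares metabolites with B, C and D.
toy_rrn = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
ranks = pagerank(toy_rrn)
top_reaction = max(ranks, key=ranks.get)
```

In the star-shaped toy network the hub reaction `A` receives the highest rank, while the three peripheral reactions receive equal ranks.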
Results and discussions
In this section, we discuss some findings obtained with our two proposed approaches, FPC and RRN. For FPC, we have singled out 5 reactions and all the genes participating in these reactions. In terms of genes, we have identified 11 significant genes involved in diabetes: "ALDH" and its variants (9 in total), "HSD3B2" and "KHK". "ALDH" activity has been experimentally shown to be related to an increased risk of large vessel disease in diabetes. Direct intra-pancreatic delivery of ALDH activity could be a feasible strategy for diabetes. Furthermore, a clinical study on diabetes found that the ALDH2.487Lys allele was related to decreased prevalence odds of type II diabetes. "HSD3B2" has been found to be highly expressed under regulation by FXR (farnesoid X receptor), and FXR agonists are an emerging therapeutic treatment for diabetes. The value of "KHK" as a pharmacological target needs further verification, but it may be a possible biomarker in diabetes treatment.
Considering the metabolites related to diabetes, "ac[e]", acting as an inhibitor, is very helpful for patients in clinical trials. Both "nadph" and "nadp" are valuable metabolites in connection with l-xylulose (l-xylulose is obtained by "nadph" and "nad" reduction with "d-xylulose"), which is used intensively in diabetes diagnosis. Reaction 1951 is glycolaldehyde dehydrogenase, and glycolaldehyde has been shown to play a significant role in diabetic cardiomyopathy. In addition, "pi" is a critical component in the disturbance of diabetes.
Significant reactions selected from RRN
"[c]:pi+uri ⇆ r1p+ura"
"[m]:fad+succ ⇆ fadh2+fum"
Significant genes for diabetes in RRN
"SDHD" "SDHC" "SDHB" "SDHA"
We remark that the symbol "⇆" means the reaction is reversible and "→" means it is irreversible. The number inside the parentheses (.) is the quantity of the metabolite. For example, in Reaction 3511 of Table 1, "coa[m]:tetpent6crn[m] = 1:1" is needed to produce "crn[m]:tetpent6coa[m] = 1:1". In the associated genes of Table 2, "CPT2", "SDHD", etc. are gene symbols.
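To make the notation concrete, the following Python sketch parses a reaction string of the kind shown above into substrates, products and a reversibility flag. The function name `parse_reaction` is hypothetical, and the parser assumes that terms are separated by "+", that an optional leading "[c]:" gives a shared compartment, and that a number in parentheses before a metabolite is its quantity (1 when omitted). The second example string is invented to show a coefficient.

```python
import re


def parse_reaction(text):
    """Parse a reaction string into (substrates, products, reversible).

    substrates and products map metabolite ids to their quantities.
    """
    # Optional shared compartment prefix such as "[c]:".
    compartment = None
    head = re.match(r"\[(\w+)\]\s*:\s*", text)
    if head:
        compartment = head.group(1)
        text = text[head.end():]

    # The arrow decides reversibility.
    if "⇆" in text:
        left, right = text.split("⇆")
        reversible = True
    else:
        left, right = text.split("→")
        reversible = False

    def parse_side(expr):
        side = {}
        for term in expr.split("+"):
            term = term.strip()
            if not term:
                continue
            # A leading "(number)" is the quantity of the metabolite.
            m = re.match(r"\((\d+(?:\.\d+)?)\)\s*(.+)", term)
            coeff, name = (float(m.group(1)), m.group(2)) if m else (1.0, term)
            # Attach the shared compartment when the id has none of its own.
            if compartment and not name.endswith("]"):
                name = f"{name}[{compartment}]"
            side[name] = coeff
        return side

    return parse_side(left), parse_side(right), reversible


# Reaction string quoted in the Results.
subs, prods, rev = parse_reaction("[c]:pi+uri ⇆ r1p+ura")

# Hypothetical string with an explicit quantity, for illustration only.
subs2, prods2, rev2 = parse_reaction("(2) a[c] + b[c] → c[c]")
```

Reaction strings parsed this way yield the per-reaction metabolite sets needed to decide whether two reactions share a metabolite.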
Considering the genes involved in the reactions, Table 2 lists the 9 important genes that we have identified as participating in diabetes. Experiments have shown that the gene "EHHADH" is involved in mitochondrial fatty acid β-oxidation and that variants in "EHHADH" are associated with type 2 diabetes. It has also been demonstrated that obese diabetes impairs the rhythmic expression of various genes, including "UPP2". Diabetes appears to be associated with increased levels of oxidative stress, and the expression of "TXNRD1" has been reported to increase under oxidative stress. "SDHB", "SDHA", "SDHC" and "SDHD" are the four protein subunits that form succinate dehydrogenase. The role of "SDH" needs further investigation, but it is suspected that malfunction of the SDH complex can cause a hypoxic response in the cell that leads to tumor formation. Pharmacological inhibition of the CPT system by the glycidic acid derivative etomoxir, an irreversible and nonisoform-specific active site inhibitor of CPTs, has been demonstrated to reduce fasting blood glucose in an animal model of type 2 diabetes mellitus.
From the perspective of the metabolites related to the disease, reaction 3115 is an active reaction involved in fatty acid cholesterol metabolism during prostate cancer progression. Here "pi" is a determining factor in the regulation of metabolism in diabetes, and "nadh" is involved in lactate formation (3-4-hydroxyphenyl lactate formation). Both "fad" and "fadh2" play an important role in fatty acid metabolism. The role of malonyl CoA as a key glucose-derived metabolite and an allosteric inhibitor of fatty acid oxidation has attracted considerable attention. Recent studies have investigated the effects of manipulating this metabolite in various tissues. For diabetes, one study demonstrated that high levels of malonyl CoA and reduced fat oxidation enhance glucose disposal in primary human skeletal myocytes.
One can see that these two approaches are very different, yet some of the metabolite biomarkers they identify are the same. It has been shown that "nadh" and "pi" are important factors for detecting diabetes. We conclude that both of our proposed methods are effective for identifying metabolite biomarkers for diseases.
In this paper, we first develop a computational method to identify significant genes and metabolites for metabolic diseases. An LP-based strategy is utilized to obtain flux profiles in disease and normal samples. Gene expression data from two pairs of samples in disease and normal states contribute to discovering genes and metabolites that can be potential biomarkers. We then present a second novel approach to identify the significant metabolites for the metabolic disease, which also employs the constraint-based flux distribution to analyze the metabolic network. The clustering method makes it possible to identify subgraphs with common properties, which is a key step in constructing the RRN with Cytoscape. To evaluate the reactions in the network, we apply the PageRank algorithm to rank the nodes. We integrate the flux values and the ranking results to select the significant reactions, from which the related metabolite biomarkers can be identified. The integration of genome-scale human metabolic network data with gene expression levels offers a new way of systematically identifying potential biomarkers.
The authors would like to thank the three anonymous referees for their helpful comments and constructive suggestions. A preliminary version of the paper was presented at ISB 2012 and published in the corresponding conference proceedings.
The publication of this article has been funded by the GRF Grant, HKU CERG Grants, Hung Hing Ying Physical Research Grant and National Natural Science Foundation of China Grant Nos. 10971075 and S201201009985.
This article has been published as part of BMC Systems Biology Volume 7 Supplement 2, 2013: Selected articles from The 6th International Conference of Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/7/S2.
1. Reich JG, Selkov EE: Energy Metabolism of the Cell: A Theoretical Treatise. 1981, New York: Academic Press
2. Fell DA: Understanding the Control of Metabolism. 1996, London: Portland Press
3. Varner J, Ramkrishna D: Metabolic Engineering from a Cybernetic Perspective. 1. Theoretical Preliminaries. Biotechnology Progress. 1999, 15 (3): 407-425. 10.1021/bp990017p.
4. Bailey JE: Complex Biology with No Parameters. Nat Biotechnol. 2001, 19: 503-504. 10.1038/89204.
5. Palsson B: The Challenges of in Silico Biology. Nat Biotechnol. 2000, 18: 1147-1150. 10.1038/81125.
6. Kauffman KJ, Prakash P, Edwards JS: Advances in Flux Balance Analysis. Current Opinion in Biotechnology. 2003, 14: 491-496. 10.1016/j.copbio.2003.08.001.
7. Almaas E, Oltvai ZN, Barabasi AL: The Activity Reaction Core and Plasticity of Metabolic Networks. PLoS Computational Biology. 2005, 1: e68. 10.1371/journal.pcbi.0010068.
8. Sridhar P, Kahveci T, Ranka S: An Iterative Algorithm for Metabolic Network-based Drug Target Identification. Pacific Symposium on Biocomputing. 2007, 12: 88-99.
9. Sridhar P, Song B, Kahveci T, Ranka S: Mining Metabolic Networks for Optimal Drug Targets. Pacific Symposium on Biocomputing. 2008, 13: 291-302.
10. Li Z, Wang RS, Zhang XS, Zhang L: Detecting drug targets with minimum side effects in metabolic networks. IET Syst Biol. 2009, 3 (6): 523-33. 10.1049/iet-syb.2008.0166.
11. Li Z, Wang RS, Zhang XS: Two-stage flux balance analysis of metabolic networks for drug target identification. BMC Syst Biol. 2011, 5 (Suppl 1): S11. 10.1186/1752-0509-5-S1-S11.
12. Li L, Zhou X, Ching W, Wang P: Predicting Enzyme Targets for Cancer Drugs by profiling Human metabolic Reactions in NCI-60 Cell Lines. BMC Bioinformatics. 2010, 11: 501.
13. Li L, Jiang H, Ching W, Vassiliadis V: Metabolite Biomarker Discovery For Metabolic Diseases By Flux Analysis. 2012, Proceedings of the 6th IEEE International Conference on Systems Biology (ISB 2012), Xian China, 1-5. IEEE
14. Shlomi T, Cabili MN, Ruppin E: Predicting Metabolic Biomarkers of Human Inborn Errors of Metabolism. Molecular Systems Biology. 2009, 5: 263.
15. Aittokallio T, Schwikowski B: Graph-based Methods for Analysing Networks in Cell Biology. Briefings in Bioinformatics. 2006, 7 (3): 243-255.
16. D'haeseleer P: How Does Gene Expression Clustering Work?. Nat Biotechnol. 2005, 23: 1499-1501. 10.1038/nbt1205-1499.
17. Harrigan GG, Goodacre R: Metabolic Profiling: Its Role in Biomarker Discovery and Gene Function Analysis. 2003, London: Kluwer Academic Publisher
18. Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, Vo TD, Srivas R, Palsson BO: Global Reconstruction of the Human Metabolic Network Based on Genomic and Bibliomic Data. Proc Natl Acad Sci USA. 2007, 104 (6): 1777-1782. 10.1073/pnas.0610772104.
19. Ma H, Goryanin I: Human Metabolic Network Reconstruction and Its Impact on Drug Discovery and Development. Drug Discovery Today. 2008, 13: 402-408. 10.1016/j.drudis.2008.02.002.
20. Zhang Y: Solving Large-Scale Linear Programs by Interior-Point Methods Under the MATLAB Environment. 1995, Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD, Technical Report TR96-01
21. Jain AK, Murty MN, Flynn PJ: Data Clustering: A Review. ACM Computing Surveys (CSUR). 1999, 31 (3): 264-323. 10.1145/331499.331504.
22. Duda RO, Hart PE, Stork DG: Pattern Classification. ch.10: Unsupervised Learning and Clustering. John Wiley & Sons, New York. 2001
23. Saitou N, Nei M: The Neighbor-joining Method: A New Method for Reconstructing Phylogenetic Trees. Mol Biol Evol. 1987, 4 (4): 406-425.
24. Borate BR, Chesler EJ, Langston MA, Saxton AM, Voy BH: Comparison of Threshold Selection Methods for Microarray Gene Co-expression Matrices. BMC Res Notes. 2009, 2: 240. 10.1186/1756-0500-2-240.
25. Perkins AD, Langston MA: Threshold Selection in Gene Co-expression Networks Using Spectral Graph Theory Techniques. BMC Bioinformatics. 2009, 10: S4.
26. MacQueen JB: Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967, Berkeley, University of California Press, 1: 281-297.
27. Lu Y, Lu S, Fotouhi F, Deng Y, Brown SJ: Incremental Genetic k-means Algorithm and Its Application in Gene Expression Data Analysis. BMC Bioinformatics. 2004, 5: 172. 10.1186/1471-2105-5-172.
28. Brin S, Page L: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks and ISDN Systems. 1998, 30: 107-117. 10.1016/S0169-7552(98)00110-X.
29. Ding CHQ, Zha H, He X, Husbands P, Simon HD: Link Analysis: Hubs and Authorities on the World Wide Web. SIAM Review. 2004, 46 (2): 256-268.
30. Jerntorp P, Ohlin H, Almér LO: Aldehyde dehydrogenase Activity and Large Vessel Disease in Diabetes Mellitus: A Preliminary Study. Diabetes. 1986, 35: 291-294. 10.2337/diab.35.3.291.
31. Bell GI, Putman DM, Hughes-Large JM, Hess DA: Intrapancreatic Delivery of Human Umbilical Cord Blood Aldehyde Dehydrogenase-producing Cells Promotes Islet Regeneration. Diabetologia. 2012, 55: 1755-1760. 10.1007/s00125-012-2520-6.
32. Guang Y, Ohnaka K, Morita M, Tabata S, Tajima O, Kono S: Genetic Polymorphisms of Alcohol Dehydrogenase and Aldehyde Dehydrogenase: Alcohol Use and Type 2 Diabetes in Japanese Men. Epidemiology Research International. 2011, 2011 (2011): 583682.
33. Xing Y, Saner-Amigh K, Nakamura Y, Hinshelwood MM, Carr BR, Mason JI, Rainey WE: The Farnesoid X Receptor Regulates Transcription of 3beta-hydroxysteroid Dehydrogenase Type II in Human Adrenal Cells. Mol Cell Endocrinol. 2009, 2: 153-162.
34. Diggle CP, Shires M, McRae C, Crellin D, Fisher J, Carr IM, Markham AF, Hayward BE, Asipu A, Bonthron DT: Both Isoforms of Ketohexokinase are Dispensable for Normal Growth and Development. Physiol Genomics. 2010, 4: 235-43.
35. Codario RA: Type 2 Diabetes, Pre-Diabetes, and the Metabolic Syndrome. 2011, New York: Humana Press, 2
36. Bhagavan NV: Carbohydrate Metabolism II: Gluconeogenesis, Glycogen Synthesis and Breakdown, and Alternative Pathways. Medical Biochemistry. 2001, London: Academic Press, 4
37. Lorenzi R, Andrades ME, Bortolin RC, Nagai R, Dal-Pizzol F, Moreira JC: Glycolaldehyde Induces Oxidative Stress in the Heart: A Clue to Diabetic Cardiomyopathy?. Cardiovasc Toxicol. 2010, 4: 244-249.
38. Ditzel J, Lervang HH, Nagai R: Disturbance of Inorganic Phosphate Metabolism in Diabetes Mellitus: Its Impact on the Development of Diabetic Late Complications. Curr Diabetes Rev. 2010, 5: 323-333.
39. Banasik K, Justesen JM, Hornbak M: Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease. PLoS One. 2011, 6: e16542. 10.1371/journal.pone.0016542.
40. Ando H, Oshima Y, Yanagihara H, Hayashi Y, Takamura T, Kaneko S, Fujimura A: Profile of Rhythmic Gene Expression in the Livers of Obese Diabetic KK-A Mice. Biochemical and Biophysical Research Communications. 2006, 346 (4): 1297-1302. 10.1016/j.bbrc.2006.06.044.
41. Merrill CL, Ni H, Yoon LW: Etomoxir-Induced Oxidative Stress in HepG2 Cells Detected by Differential Gene Expression Is Confirmed Biochemically. Toxicological Sciences. 2002, 68: 93-101. 10.1093/toxsci/68.1.93.
42. Barnett M, Collier GR, O'Dea K: The Longitudinal Effect of Inhibiting Fatty Acid Oxidation in Diabetic Rats Fed a High Fat Diet. Horm Metab Res. 1992, 24 (8): 360-362. 10.1055/s-2007-1003335.
43. Balbin OA, Shellman ER: The metabolism of Human Prostate Cancer Progression: A Systems Approach. Bioinformatics. 2008, 527.
44. Maassen JA, Romijn JA, Heine RJ: Fatty acid-induced mitochondrial uncoupling in adipocytes as a key protective factor against insulin resistance and beta cell dysfunction: a new concept in the pathogenesis of obesity-associated type 2 diabetes mellitus. Diabetologia. 2007, 50 (10): 2036-2041.
45. Muoio DM, Newgard CB: Fatty Acid Oxidation and Insulin Action When Less Is More. Diabetes. 2008, 57 (6): 1455-1456. 10.2337/db08-0281.
46. Bouzakri K, Austin R, Rune A, Lassman ME, Garcia-Roves PM, Berger JP, Krook A, Chibalin AV, Zhang BB, Zierath JR: Malonyl Coenzyme-A Decarboxylase Regulates Lipid and Glucose metabolism in Human Skeletal Muscle. Diabetes. 2008, 57: 1508-1516. 10.2337/db07-0583.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.