Discovery of metabolite biomarkers: flux analysis and reaction-reaction network approach
BMC Systems Biology volume 7, Article number: S13 (2013)
Metabolism is a vital cellular process, and its malfunction can be a major contributor to many human diseases. Metabolites can serve as biomarkers of metabolic disease, and the detection of such biomarkers plays a significant role in the study of biochemical reaction and signaling networks. Early research mainly focused on the analysis of metabolic networks. Integrating metabolic networks with other available biological data to reveal the mechanisms of disease-metabolite associations remains an important and interesting challenge.
In this article, we propose two new approaches for the identification of metabolic biomarkers with the incorporation of disease specific gene expression data and the genome-scale human metabolic network. The first approach is to compare the flux interval between the normal and disease sample so as to identify reaction biomarkers. The second one is based on the Reaction-Reaction Network (RRN) to reveal the significant reactions. These two approaches utilize reaction flux obtained by a Linear Programming (LP) based method that can contribute to the discovery of potential novel biomarkers.
Biomarker identification is an important issue in studying biochemical reactions and signaling networks. Two efficient and effective computational methods are proposed for the identification of biomarkers in this article. Furthermore, the biomarkers found by our proposed methods are shown to be significant determinants for diabetes.
Deficiency in essential metabolites can directly cause metabolic diseases. Metabolic disease profiling is promising for uncovering the mechanisms of disease-metabolite associations. Existing research has mainly emphasized the analysis of metabolic networks [1–3]. Models for investigating large-scale metabolic networks outperform other quantitative approaches [4, 5]. The widespread availability of gene expression data provides an opportunity to integrate it with metabolic network data to reveal significant biomarkers.
Flux Balance Analysis (FBA) is a traditional constraint-based approach for predicting flux distributions, and it has been employed to identify a number of important metabolic reactions. Drug targets that reduce abnormal metabolites have been sought by formulating an optimal combinatorial problem on metabolic networks [8, 9]. Drug target prediction has also been formulated as an integer linear programming model, and a quantitative method based on two-stage FBA has been proposed for drug target identification. By profiling human metabolic reactions, a drug-reaction network has been established for predicting enzyme targets. Here we develop a computational approach to identify metabolic biomarkers using human metabolic reactions together with disease-specific gene expression data. Metabolic biomarkers are metabolites that show consistent variation in concentration in the disease state; they can be very useful for diagnostic purposes. As an efficient diagnostic tool and a safe evaluator of drug candidates, metabolomics will play an important role.
Furthermore, we also construct an RRN, which provides a platform for ranking the significance of reactions using the PageRank algorithm. The network is constructed so that nodes represent reactions and an edge is placed between two reactions whenever they share a common metabolite. Note that one reaction can have thousands of edges to related reactions, which makes the network extremely complicated. This issue can be addressed by identifying densely connected subgraphs, so a clustering toolbox is employed to find the subnetworks. Graph clustering is based on the assumption that a group of functionally related nodes are likely to interact strongly with each other while being more separate from the rest of the network. Clustering network graphs is nevertheless challenging: the results of most methods are highly sensitive to their parameters, and the predicted clusters can vary from one method to another. Here we focus on analyzing the cluster with the largest number of entities. The selected metabolic reactions provide a platform for drawing the RRN, which depicts the interactions among the reactions. For ease of interpretation and visualization, we use Cytoscape to construct the network. The PageRank algorithm was designed to evaluate the importance of a webpage. We integrate the PageRank algorithm with the FBA method to evaluate the significance of the reactions. We also propose simple statistical criteria for selecting significant reactions, which enable us to identify the corresponding metabolites.
Diabetes mellitus is a group of metabolic diseases that are among the major human malnutrition diseases. Risk assessment is one possible way to prevent the disease. Metabolic profiling, an unbiased technique, can potentially help identify high-risk candidates and thereby reduce the related costs.
We integrate the human metabolic network with disease-specific gene expression data to analyze the flux profiles within the network. In the following section, we introduce two methods for metabolite biomarker discovery. The validity of the two approaches is then discussed. The identified metabolite biomarkers may have potential applications in disease diagnosis.
Materials and methods
The genome-scale human metabolic network reconstructed by Duarte et al. consists of 3742 reactions, 2766 metabolites and 1905 genes. Three types of information are used to describe a metabolic network. The first is stoichiometry, which describes the quantitative relationships among reactants and products in all the reactions involved. The second is the set of enzymes corresponding to each reaction in the network. The third is the flux capacity of each reaction. We employ Human Recon 1, one of the two independently developed human metabolic networks [18, 19], in our study. The data are available from the BiGG database (http://bigg.ucsd.edu/). One can retrieve the reactions and the genes involved using MATLAB. The RRN can be built and visualized in the Cytoscape software package, which is available at http://www.cytoscape.org.
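For orientation, the stoichiometry and flux capacities described above are the ingredients of a standard flux balance problem. The generic textbook form is sketched below. The objective weights c and the symbols S, v, v^min and v^max are notation assumed here for illustration, and this is not the specific LP model used in this article.

```latex
\begin{aligned}
\max_{v} \quad & c^{T} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

Here S is the metabolite-by-reaction stoichiometric matrix, v is the vector of reaction fluxes, the equality expresses steady state for every metabolite, and the bounds encode the flux capacity of each reaction.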
We introduce two novel methods for integrating gene expression data and the human metabolic network in biomarker discovery.
Flux profile comparison (FPC) method 
Several major steps must be carried out to construct the LP model and, eventually, to detect biomarkers:
• Expression levels in reactions
• The LP model
• Flux profiles in disease/normal samples
• Identification of significant reactions
• Significant metabolite discovery
Compared with the well-known model that uses the human metabolic network to predict metabolic biomarkers of human inborn errors of metabolism, our model takes more realistic constraints into consideration. Firstly, the genome-scale human metabolic network we utilize consists of 1905 boundary metabolites and 3742 reactions in total. Secondly, we integrate gene expression data in both the normal and the disease state to mark highly and lowly expressed reactions. Rather than forcing reactions to be active in the normal state or inactive in the disease state, we adopt a probability measure for a reaction being active or inactive. We use two pairs of gene expression data sets in healthy and disease states and consider the overlap of the discovered metabolic biomarkers. To solve the LP problems, we use a large-scale optimization method based on the Linear Interior Point SOLver (LIPSOL) in MATLAB on a Windows Vista machine. These characteristics of our approach contribute to the discovery of metabolic biomarkers.
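The idea of comparing flux intervals between normal and disease samples can be illustrated on a toy network. The sketch below is not the LP model of this article: the four-reaction network, the bounds, the non-overlap criterion and the names flux_intervals, normal_bounds and disease_bounds are all assumptions made for illustration, and SciPy's HiGHS solver stands in for LIPSOL.

```python
import numpy as np
from scipy.optimize import linprog


def flux_intervals(S, bounds):
    """Return the (min, max) steady-state flux of every reaction."""
    n_met, n_rxn = S.shape
    intervals = []
    for j in range(n_rxn):
        c = np.zeros(n_rxn)
        c[j] = 1.0
        ends = []
        for sign in (1.0, -1.0):  # minimise v_j, then maximise it
            res = linprog(sign * c, A_eq=S, b_eq=np.zeros(n_met),
                          bounds=bounds, method="highs")
            if not res.success:
                raise RuntimeError(res.message)
            ends.append(sign * res.fun)
        intervals.append(tuple(ends))
    return intervals


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: A ->, R4: B ->
names = ["R1", "R2", "R3", "R4"]
S = np.array([[1.0, -1.0, -1.0, 0.0],   # metabolite A
              [0.0, 1.0, 0.0, -1.0]])   # metabolite B

# Assumed bounds: R2 is forced high in the normal state, capped in disease.
normal_bounds = [(10, 10), (5, 10), (0, 10), (0, 10)]
disease_bounds = [(10, 10), (0, 2), (0, 10), (0, 10)]

normal_iv = flux_intervals(S, normal_bounds)
disease_iv = flux_intervals(S, disease_bounds)

# Assumed criterion: flag a reaction when its two intervals do not overlap.
tol = 1e-6
changed = [
    name
    for name, (lo_n, hi_n), (lo_d, hi_d) in zip(names, normal_iv, disease_iv)
    if hi_n < lo_d - tol or hi_d < lo_n - tol
]
print(changed)
```

In this toy case the uptake reaction R1 keeps the same interval in both states, while R2, R3 and R4 move to disjoint intervals and would be the candidate reaction biomarkers under the assumed criterion.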
RRN construction method
In this section, we propose the second novel approach to identify the metabolite biomarkers based on RRN.
• Clustering toolbox for identifying the subnetworks.
The availability of the human metabolic network from Duarte et al. enables us to retrieve the reactions and the genes. It includes 3742 reactions with 2766 metabolites and 1905 genes, which suggests a natural way to build a network: link reactions that share common metabolites. The nodes of the network represent the reactions, and two reactions are linked if they have a metabolite in common. Under this construction, some reactions have thousands of edges. The RRN can therefore be extremely complex, which makes it difficult to draw the network using Cytoscape. We address this problem by using a clustering toolbox to identify subnetworks with similar properties. Clustering analysis aims to classify a set of observations into two or more mutually exclusive, previously unknown groups based on combinations of variables. Thus, cluster analysis is usually adopted in the context of unsupervised classification. It can be applied to a wide range of biological problems, such as microarray, sequence and phylogenetic analysis. The purpose of clustering is to group objects together by observing common properties of the elements in a system. In a biological network, this can help identify similar biological entities, such as proteins that are homologous in different organisms or that belong to the same complex, and genes that are co-expressed [24, 25]. Among the clustering algorithms, the k-means algorithm aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. The k-means method and its modifications are widely used for gene expression data analysis. Here we utilize this method to classify the reactions into several groups, so that reactions in the same cluster have similar behavior. We focus on analyzing the cluster with the largest number of elements, and we choose k to be 50. A different k consequently gives a different classification.
It is interesting to note that the elements of the largest cluster remain quite similar, so the parameter k does not have much influence on this cluster. We finally choose the largest cluster, with 134 reactions, for our analysis. The resulting network is shown in Figure 1.
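The construction rule described above (reactions as nodes, an edge whenever two reactions share a metabolite) can be sketched as follows. The reaction and metabolite identifiers are made up for illustration, and build_rrn is a hypothetical helper rather than part of any toolbox used in this study.

```python
from collections import defaultdict
from itertools import combinations


def build_rrn(reactions):
    """Map each reaction id to the set of reactions sharing a metabolite with it.

    reactions: dict of reaction id -> set of metabolite ids it involves.
    """
    # Index reactions by the metabolites they involve.
    by_metabolite = defaultdict(set)
    for rxn, metabolites in reactions.items():
        for met in metabolites:
            by_metabolite[met].add(rxn)

    # Every pair of reactions that shares a metabolite gets an undirected edge.
    graph = {rxn: set() for rxn in reactions}
    for rxns in by_metabolite.values():
        for a, b in combinations(sorted(rxns), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical toy input, not taken from Human Recon 1.
toy_reactions = {
    "R1": {"glc", "atp", "g6p", "adp"},
    "R2": {"g6p", "f6p"},
    "R3": {"f6p", "atp", "fdp", "adp"},
    "R4": {"nadh", "nad"},
}
rrn = build_rrn(toy_reactions)
print(rrn)
```

Even in this small example, a widely used metabolite such as "atp" links otherwise distant reactions (R1 and R3), which hints at why the full network becomes so dense.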
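A minimal sketch of the k-means step and of picking the largest cluster is given below. How each reaction is encoded as a feature vector is not specified here, so the two-dimensional points, the value k = 2 and the plain Lloyd iteration are assumptions for illustration only; the analysis above uses k = 50 and a clustering toolbox.

```python
import random
from collections import Counter


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd iteration; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feature vectors: five similar items and two outlying ones.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
          (10.0, 10.0), (10.0, 11.0)]
labels = kmeans(points, k=2)

# Keep only the members of the most populated cluster.
biggest = Counter(labels).most_common(1)[0][0]
largest_cluster = [i for i, lab in enumerate(labels) if lab == biggest]
print(largest_cluster)
```

The indices in largest_cluster play the role of the reactions that would be carried forward to build the RRN.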
• The PageRank algorithm for evaluating the reactions in the RRN.
With the obtained reaction network, one can evaluate the significance of each node in the network. The page rank of a webpage is a number representing the relative importance of the webpage based on the number of inbound and outbound links. Inbound links are links from other webpages pointing to the webpage. Outbound links are links from the webpage to any other webpages. The page rank of a webpage can be obtained from the following formula:
$$P_i = \frac{1-d}{N} + d \sum_{j \in M(i)} \frac{P_j}{L(j)}$$
where $P_i$ is the page rank of webpage i, M(i) is the set of webpages that link to webpage i, L(j) is the number of outbound links of webpage j, and d is a residual probability (damping factor), which is usually set to 0.85. Here we remark that the numerical results are similar in our experiments for other d ≥ 0.85. The PageRank values of the webpages are the entries of the dominant eigenvector of the modified adjacency matrix. We denote the eigenvector by
$$r = [P_1, P_2, \ldots, P_N]^T$$
where N is the total number of pages and r is the solution of the recursive formula:
$$r = \frac{1-d}{N}\,\mathbf{1} + d\,[\ell(i,j)]\,r$$
Here $\mathbf{1}$ is the all-ones vector and the adjacency matrix entry ℓ(i, j) is 0 if webpage j does not link to webpage i, and we have the normalization condition that, for each j,
$$\sum_{i=1}^{N} \ell(i,j) = 1$$
i.e., the sum of each column is 1. The value of the ranking indicates the importance of a particular page. One can apply the PageRank algorithm in the same way to rank the importance of the reactions in the RRN. Here we apply this approach to two subgraphs of Figure 1, which are shown in Figures 2 and 3. Figure 2 is the upper left corner of Figure 1 and Figure 3 is the upper right side of Figure 1. We can then evaluate the nodes (reactions) in these two subnetworks. Furthermore, each reaction also has a flux value which represents its significance. Here we propose a simple approach that integrates the flux value and the rank value to yield a final significance score for each reaction. Let z be the vector containing the final values used for ranking the reactions. Each entry z_k is defined from the corresponding entries of p and v, where p is the ranking result obtained from the PageRank algorithm, v is the flux vector for all the reactions generated by the flux analysis, and k indexes the kth entry (reaction) of each vector. We then select several reactions with comparatively high values from the RRN using this criterion.
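The PageRank recursion described above can be sketched as a simple power iteration. This is an illustration under stated assumptions, not the implementation used in this study: the star-shaped toy graph is hypothetical, each undirected RRN edge is treated as a link in both directions, and nodes without neighbours spread their rank uniformly (a common convention that the text does not discuss).

```python
def pagerank(graph, d=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank.

    graph: dict of node -> set of nodes it links to.
    Returns a dict of node -> rank, with ranks summing to 1.
    """
    nodes = sorted(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outbound links is shared by all nodes.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {}
        for i in nodes:
            incoming = sum(rank[j] / len(graph[j])
                           for j in nodes if i in graph[j])
            new_rank[i] = (1.0 - d) / n + d * (incoming + dangling / n)
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy RRN: hub reaction "A" shares metabolites with B, C and D.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A"},
}
ranks = pagerank(toy_graph)
top_reaction = max(ranks, key=ranks.get)
print(top_reaction, ranks)
```

The resulting rank vector corresponds to p in the text; how it is combined with the flux vector v to obtain z is left to the definition given above.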
Results and discussions
In this section, we discuss some of the findings obtained with our two proposed approaches, FPC and RRN. For FPC, we have selected 5 reactions together with all the genes participating in these reactions. In terms of genes, we have identified 11 significant genes involved in diabetes: "ALDH" and its variants (9 in total), "HSD3B2" and "KHK". "ALDH" activity has been experimentally shown to be related to an increased risk of large vessel disease in diabetes. Direct intra-pancreatic delivery of ALDH activity could be a feasible strategy for diabetes. Furthermore, in a clinical study on diabetes, researchers have found that the ALDH2 487Lys allele was related to decreased prevalence odds of type II diabetes. "HSD3B2" has been found to be highly expressed under regulation by FXR (farnesoid X receptor), while FXR agonists are an emerging therapeutic treatment for diabetes. The value of "KHK" as a pharmacological target needs further verification, but it can be a possible biomarker in diabetes treatment.
Considering the metabolites related to diabetes, "ac[e]", acting as an inhibitor, is very helpful for patients in clinical trials. Both "nadph" and "nadp" are valuable metabolites for l-xylulose (l-xylulose is obtained by "nadph" and "nad" reduction with "d-xylulose"), which is intensively used in diabetes diagnosis. Reaction 1951 is glycolaldehyde dehydrogenase, and glycolaldehyde has been shown to play a significant role in diabetic cardiomyopathy. Finally, "pi" is a critical component in the disturbance of diabetes.
Furthermore, we analyze the significant reactions selected by integrating the flux analysis and the PageRank algorithm based on the subnetworks of the RRN. We filter out five reactions which are reported in Table 1. All the participating genes in these five reactions are listed in Table 2.
We remark that the symbol "⇆" means the reaction is reversible and "→" means the reaction is irreversible. The number inside the parentheses (.) is the quantity of the metabolite. For example, in the Reaction 3511 of Table 1, we need "coa[m]:tetpent6crn[m] = 1:1" to produce "crn[m]:tetpent6coa[m] = 1:1". In the associated genes of Table 2, "CPT2" and "SDHD" etc are the gene symbols.
Considering the genes involved in the reactions, we have identified 9 important genes participating in diabetes (Table 2). Experiments have shown that the gene "EHHADH" is involved in mitochondrial fatty acid β-oxidation and that variants in "EHHADH" are associated with type 2 diabetes. It has also been demonstrated that obese diabetes impairs the rhythmic expression of various genes, including "UPP2". Diabetes appears to be associated with increased levels of oxidative stress, and the gene "TXNRD1" exhibited increases under oxidative stress. "SDHB", "SDHA", "SDHC" and "SDHD" are the four protein subunits that form succinate dehydrogenase (SDH). The role of SDH needs further investigation, but it is suspected that malfunction of the SDH complex can cause a hypoxic response in the cell that leads to tumor formation. Pharmacological inhibition of the CPT system by the glycidic acid derivative etomoxir, an irreversible and non-isoform-specific active-site inhibitor of CPTs, has been demonstrated to reduce fasting blood glucose in an animal model of type 2 diabetes mellitus.
From the perspective of the metabolites related to the disease, reaction 3115 is an active reaction involved in fatty acid cholesterol metabolism during prostate cancer progression. Here "pi" is a determining factor in the regulation of metabolism in diabetes, and "nadh" is involved in lactate formation (3-4-hydroxyphenyl lactate formation). Both "fad" and "fadh2" play an important role in fatty acid metabolism. The role of malonyl CoA as a key glucose-derived metabolite and an allosteric inhibitor of fatty acid oxidation has attracted much attention. Recent studies have investigated the effects of manipulating this metabolite in various tissues. For diabetes, one study demonstrated that high levels of malonyl CoA and reduced fat oxidation enhance glucose disposal in primary human skeletal myocytes.
The two approaches are very different, yet some of the metabolite biomarkers they identify are the same. It has been shown that "nadh" and "pi" are important factors for detecting diabetes. This suggests that both of our proposed methods are effective for identifying metabolite biomarkers for diseases.
In this paper, we first develop a computational method to identify significant genes and metabolites for metabolic diseases. An LP-based strategy is utilized to obtain flux profiles in disease and normal samples. Gene expression data from two pairs of samples in disease and normal states contribute to discovering genes and metabolites that can be potential biomarkers. We then present a second novel approach to identify the significant metabolites for a metabolic disease, again employing the constraint-based flux distribution to analyze the metabolic network. The clustering method makes it possible to identify subgraphs with common properties, which is a key step in constructing the RRN with Cytoscape. To evaluate the reactions in the network, we apply the PageRank algorithm to each node. We integrate the flux value and the rank result to select the significant reactions, from which the related metabolite biomarkers can be identified. The integration of genome-scale human metabolic network data with gene expression levels offers a new way of systematically identifying potential biomarkers.
Reich JG, Selkov EE: Energy Metabolism of the Cell: A Theoretical Treatise. 1981, New York: Academic Press
Fell DA: Understanding the Control of Metabolism. 1996, London: Portland Press
Varner J, Ramkrishna D: Metabolic Engineering from a Cybernetic Perspective. 1. Theoretical Preliminaries. Biotechnology Progress. 1999, 15 (3): 407-425. 10.1021/bp990017p.
Bailey JE: Complex Biology with No Parameters. Nat Biotechnol. 2001, 19: 503-504. 10.1038/89204.
Palsson B: The Challenges of in Silico Biology. Nat Biotechnol. 2000, 18: 1147-1150. 10.1038/81125.
Kauffman KJ, Prakash P, Edwards JS: Advances in Flux Balance Analysis. Current Opinion in Biotechnology. 2003, 14: 491-496. 10.1016/j.copbio.2003.08.001.
Almaas E, Oltvai ZN, Barabasi AL: The Activity Reaction Core and Plasticity of Metabolic Networks. Plos Computational Biology. 2005, 1: e68-10.1371/journal.pcbi.0010068.
Sridhar P, Kahveci T, Ranka S: An Iterative Algorithm for Metabolic Network-based Drug Target Identification. Pacific Symposium on Biocomputing. 2007, 12: 88-99.
Sridhar P, Song B, Kahveci T, Ranka S: Mining Metabolic Networks for Optimal Drug Targets. Pacific Symposium on Biocomputing. 2008, 13: 291-302.
Li Z, Wang RS, Zhang XS, Zhang L: Detecting drug targets with minimum side effects in metabolic networks. IET Syst Biol. 2009, 3 (6): 523-33. 10.1049/iet-syb.2008.0166.
Li Z, Wang RS, Zhang XS: Two-stage flux balance analysis of metabolic networks for drug target identification. BMC Syst Biol. 2011, 5 (Suppl 1): S11-10.1186/1752-0509-5-S1-S11.
Li L, Zhou X, Ching W, Wang P: Predicting Enzyme Targets for Cancer Drugs by profiling Human metabolic Reactions in NCI-60 Cell Lines. BMC Bioinformatics. 2010, 11: 501-
Li L, Jiang H, Ching W, Vassiliadis V: Metabolite Biomarker Discovery For Metabolic Diseases By Flux Analysis. 2012, Proceedings of the 6th IEEE International Conference on Systems Biology (ISB 2012), Xian China, 1-5. IEEE
Shlomi T, Cabili MN, Ruppin E: Predicting Metabolic Biomarkers of Human Inborn Errors of Metabolism. Molecular Systems Biology. 2009, 5: 263-
Aittokallio T, Schwikowski B: Graph-based Methods for Analysing Networks in Cell Biology. Briefings in Bioinformatics. 2006, 7 (3): 243-255.
D'haeseleer P: How Does Gene Expression Clustering Work?. Nat Biotechnol. 2005, 23: 1499-1501. 10.1038/nbt1205-1499.
Harrigan GG, Goodacre R: Metabolic Profiling: Its Role in Biomarker Discovery and Gene Function Analysis. 2003, London: Kluwer Academic Publisher
Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, Vo TD, Srivas R, Palsson BO: Global Reconstruction of the Human Metabolic Network Based on Genomic and Bibliomic Data. Proc Natl Acad Sci USA. 2007, 104 (6): 1777-1782. 10.1073/pnas.0610772104.
Ma H, Goryanin I: Human Metabolic Network Reconstruction and Its Impact on Drug Discovery and Development. Drug Discovery Today. 2008, 13: 402-408. 10.1016/j.drudis.2008.02.002.
Zhang Y: Solving Large-Scale Linear Programs by Interior-Point Methods Under the MATLAB Environment. 1995, Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD, Technical Report TR96-01
Jain AK, Murty MN, Flynn PJ: Data Clustering: A Review. ACM Computing Surveys (CSUR). 1999, 31 (3): 264-323. 10.1145/331499.331504.
Duda RO, Hart PE, Stork DG: Pattern Classification. ch.10: Unsupervised Learning and Clustering. John Wiley & Sons, New York. 2001
Saitou N, Nei M: The Neighbor-joining Method: A New Method for Reconstructing Phylogenetic Trees. Mol Biol Evol. 1987, 4 (4): 406-425.
Borate BR, Chesler EJ, Langston MA, Saxton AM, Voy BH: Comparison of Threshold Selection Methods for Microarray Gene Co-expression Matrices. BMC Res Notes. 2009, 2: 240-10.1186/1756-0500-2-240.
Perkins AD, Langston MA: Threshold Selection in Gene Co-expression Networks Using Spectral Graph Theory Techniques. BMC Bioinformatics. 2009, 10: S4-
Macqueen B: Some methods for classification and Analysis of Multivariate Observations. In proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability. 1967, Berkeley, University of California press, 1: 281-297.
Lu Y, Lu S, Fotouhi F, Deng Y, Brown SJ: Incremental Genetic k-means Algorithm and Its Application in Gene Expression Data Analysis. BMC Bioinformatics. 2004, 5: 172-10.1186/1471-2105-5-172.
Brin S, Page L: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks and ISDN Systems. 1998, 30: 107-117. 10.1016/S0169-7552(98)00110-X.
Ding CHQ, Zha H, He X, Husbands P, Simon HD: Link Analysis: Hubs and Authorities on the World Wide Web. Society for Industrial and Applied Mathematics. 2004, 46 (2): 256-268.
Jerntorp P, Ohlin H, Almér LO: Aldehyde dehydrogenase Activity and Large Vessel Disease in Diabetes Mellitus: A Preliminary Study. Diabetes. 1986, 35: 291-294. 10.2337/diab.35.3.291.
Bell GI, Putman DM, Hughes-Large JM, Hess DA: Intrapancreatic Delivery of Human Umbilical Cord Blood Aldehyde Dehydrogenase-producing Cells Promotes Islet Regeneration. Diabetologia. 2012, 55: 1755-1760. 10.1007/s00125-012-2520-6.
Guang Y, Ohnaka K, Morita M, Tabata S, Tajima O, Kono S: Genetic Polymorphisms of Alcohol Dehydrogenase and Aldehyde Dehydrogenase: Alcohol Use and Type 2 Diabetes in Japanese Men. Epidemiology Research International. 2011, 2011 (2011): 583682-
Xing Y, Saner-Amigh K, Nakamura Y, Hinshelwood MM, Carr BR, Mason JI, Rainey WE: The Farnesoid X Receptor Regulates Transcription of 3beta-hydroxysteroid Dehydrogenase Type II in Human Adrenal Cells. Mol Cell Endocrinol. 2009, 2: 153-162.
Diggle CP, Shires M, McRae C, Crellin D, Fisher J, Carr IM, Markham AF, Hayward BE, Asipu A, Bonthron DT: Both Isoforms of Ketohexokinase are Dispensable for Normal Growth and Development. Physiol Genomics. 2010, 4: 235-43.
Codario RA: Type 2 Diabetes, Pre-Diabetes, and the Metabolic Syndrome. 2011, New York: Humana Press, 2
Bhagavan NV: Carbohydrate Metabolism II: Gluconeogenesis, Glycogen Synthesis and Breakdown, and Alternative Pathways. Medical Biochemistry. 2001, London: Academic Press, 4
Lorenzi R, Andrades ME, Bortolin RC, Nagai R, Dal-Pizzol F, Moreira JC: Glycolaldehyde Induces Oxidative Stress in the Heart: A Clue to Diabetic Cardiomyopathy?. Cardiovasc Toxicol. 2010, 4: 244-249.
Ditzel J, Lervang HH, Nagai R: Disturbance of Inorganic Phosphate Metabolism in Diabetes Mellitus: Its Impact on the Development of Diabetic Late Complications. Curr Diabetes Rev. 2010, 5: 323-333.
Banasik K, Justesen JM, Hornbak M: Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease. PloS One. 2011, 6: e16542-10.1371/journal.pone.0016542.
Ando H, Oshima Y, Yanagihara H, Hayashi Y, Takamura T, Kaneko S, Fujimura A: Profile of Rhythmic Gene Expression in the Livers of Obese Diabetic KK-Ay Mice. Biochemical and Biophysical Research Communications. 2006, 346 (4): 1297-1302. 10.1016/j.bbrc.2006.06.044.
Merrill CL, Ni H, Yoon LW: Etomoxir-Induced Oxidative Stress in HepG2 Cells Detected by Differential Gene Expression Is Confirmed Biochemically. Toxicological Sciences. 2002, 68: 93-101. 10.1093/toxsci/68.1.93.
Barnett M, Collier GR, O'Dea K: The Longitudinal Effect of Inhibiting Fatty Acid Oxidation in Diabetic Rats Fed a High Fat Diet. Horm Metab Res. 1992, 24 (8): 360-362. 10.1055/s-2007-1003335.
Balbin OA, Shellman ER: The metabolism of Human Prostate Cancer Progression: A Systems Approach. Bioinformatics. 2008, 527-
Maassen JA, Romijn JA, Heine RJ: Fatty acid-induced mitochondrial uncoupling in adipocytes as a key protective factor against insulin resistance and beta cell dysfunction: a new concept in the pathogenesis of obesity-associated type 2 diabetes mellitus. Diabetologia. 2007, 50 (10): 2036-2041.
Muoio DM, Newgard CB: Fatty Acid Oxidation and Insulin Action When Less Is More. Diabetes. 2008, 57 (6): 1455-1456. 10.2337/db08-0281.
Bouzraki K, Austin R, Rune A, Lassman ME, Garcia-Roves PM, Berger JP, Krook A, Chibalin AV, Zhang BB, Zierath JR: Malonyl Coenzyme-A Decarboxylase Regulates Lipid and Glucose metabolism in Human Skeletal Muscle. Diabetes. 2008, 57: 1508-1516. 10.2337/db07-0583.
The authors would like to thank the three anonymous referees for their helpful comments and constructive suggestions. A preliminary version of this paper was presented at ISB 2012 and published in the corresponding conference proceedings.
The publication of this article has been funded by the GRF Grant, HKU CERG Grants, Hung Hing Ying Physical Research Grant and National Natural Science Foundation of China Grant Nos. 10971075 and S201201009985.
This article has been published as part of BMC Systems Biology Volume 7 Supplement 2, 2013: Selected articles from The 6th International Conference of Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/7/S2.
The authors declare that they have no competing interests.
LL and HJ came up with the idea. WKC, LL, HJ and VSV designed the research. HJ and YQ performed the research and analyzed the results. WKC, HJ, LL, YQ and VSV wrote the paper. All authors read and approved the final manuscript.