TimeXNet: Identifying active gene sub-networks using time-course gene expression profiles
© Patil and Nakai; licensee BioMed Central Ltd. 2014
Published: 8 December 2014
Time-course gene expression profiles are frequently used to provide insight into the changes in cellular state over time and to infer the molecular pathways involved. When combined with large-scale molecular interaction networks, such data can provide information about the dynamics of cellular response to stimulus. However, few tools are currently available to predict a single active gene sub-network from time-course gene expression profiles.
We introduce a tool, TimeXNet, which identifies active gene sub-networks with temporal paths using time-course gene expression profiles in the context of a weighted gene regulatory and protein-protein interaction network. TimeXNet uses a specialized form of the network flow optimization approach to identify the most probable paths connecting the genes with significant changes in expression at consecutive time intervals. TimeXNet has been extensively evaluated for its ability to predict novel regulators and their associated pathways within active gene sub-networks in the mouse innate immune response and the yeast osmotic stress response. Compared to other similar methods, TimeXNet identified up to 50% more novel regulators from independent experimental datasets. It predicted paths within a greater number of known pathways with longer overlaps (up to 7 consecutive edges) within these pathways. TimeXNet was also shown to be robust in the presence of varying amounts of noise in the molecular interaction network.
TimeXNet is a reliable tool that can be used to study cellular response to stimuli through the identification of time-dependent active gene sub-networks in diverse biological systems. In our evaluations, it identified more novel regulators and longer overlaps with known pathways than other similar tools. TimeXNet is implemented in Java as a stand-alone application and supported on Linux, MS Windows and Macintosh. The output of TimeXNet can be directly viewed in Cytoscape. TimeXNet is freely available for non-commercial users.
Condition-specific gene expression profiles are used to study the response of a cell to external stimulus. Time-course gene expression profiles are especially useful in such studies since they capture the changes occurring in the cell over time. These data are often combined with protein-protein and protein-DNA interaction networks to identify sub-networks of activated genes. However, many of the currently available methods that predict response networks using gene expression profiles do not incorporate the analysis of time-course data [2, 3]. Instead, they give a single static response network. Some use time-based gene expression patterns to identify transcription factors activated at specific time points [4, 5] to help predict the response network. Others produce static networks of genes for each time point [6, 7]. Thus, most of the available methods fail to detect relationships between genes expressed at consecutive stages of the cellular response. Many also fail to identify the transient regulators that play an important role in the response but show no change in expression at the sampled time points.
We introduce a tool, TimeXNet (http://timexnet.hgc.jp/), which identifies active gene sub-networks with temporal paths using time-course gene expression profiles in the context of a weighted gene regulatory and protein-protein interaction network. TimeXNet implements an algorithm that identifies the most likely paths in the network connecting genes with significant changes in expression at consecutive time intervals. We show that TimeXNet is faster and more accurate at predicting active gene sub-networks than other existing tools in the study of cellular systems as diverse as the innate immune response in mouse and the yeast osmotic stress response.
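To make the idea of a "most likely path" in a weighted network concrete, the following minimal sketch finds the single most reliable path between two genes by running Dijkstra's algorithm on negative log reliability scores. This is a deliberately simplified stand-in, not the network flow optimization that TimeXNet solves. The function name, the edge tuples and the gene labels are hypothetical, edges are treated as bidirectional, and scores are assumed to lie in (0, 1].

```python
import heapq
import math

def most_reliable_path(edges, source, target):
    """Return (path, reliability) maximising the product of edge scores.

    Illustrative only: plain Dijkstra on -log(score), not TimeXNet's
    flow-based formulation. Scores are assumed to be in (0, 1].
    """
    graph = {}
    for gene_a, gene_b, score in edges:
        if score <= 0:
            continue  # an edge with zero reliability can never be used
        cost = -math.log(score)
        graph.setdefault(gene_a, []).append((gene_b, cost))
        graph.setdefault(gene_b, []).append((gene_a, cost))

    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, math.inf):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))

    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])

# Hypothetical toy network: the two-step route is more reliable than the direct edge.
toy_edges = [("A", "B", 0.9), ("B", "C", 0.9), ("A", "C", 0.5)]
toy_path, toy_reliability = most_reliable_path(toy_edges, "A", "C")
print(toy_path, round(toy_reliability, 2))
```

Taking the negative logarithm turns a product of reliabilities into a sum of non-negative costs, which is what allows a shortest-path search to be used here.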
Results and discussion
Thus, TimeXNet produces a single response network that incorporates the temporal information of gene expression. The network identifies paths that show the temporal relationships between genes expressed at consecutive time points. These paths also include genes that do not show change in expression levels at the sampled time points. This allows TimeXNet to identify previously unknown, transiently expressed regulators.
Innate immune response in mouse
TimeXNet was evaluated for the identification of active gene sub-networks in the innate immune response in a weighted molecular network of 103218 interactions using gene expression data from mouse dendritic cells at 8 time points after stimulation by lipopolysaccharide (LPS). The weighted molecular interaction network was defined as a combination of mouse protein-protein and protein-DNA interactions from multiple sources including HitPredict, InnateDB, TRANSFAC and KEGG. Homologs of human interactions were also included. Interactions were scored using the scheme described by HitPredict. The genes with more than 2-fold change in expression were assigned a score based on their relative change in expression on LPS stimulation. The immune response was classified into three consecutive time-dependent stages: early, intermediate and late. TimeXNet was used to identify the most probable paths in the molecular network between genes expressed in the early and the late phases of the immune response, incorporating genes expressed in the intervening time. The resultant network contained several new and known regulators of the innate immune response, as well as those transiently expressed between sampled time points. The predicted temporal network suggested a role for the protein phosphatase 2a catalytic subunit α in the regulation of the immunoproteasome during the late phase of the response. An analysis of time-course gene expression profiles from Myd88-knockout and TRIF-knockout dendritic cells helped clarify the differences between the Myd88-dependent and TRIF-dependent pathways in the innate immune response.
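The fold-change filter described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring used in the study: the function name is hypothetical, the score is taken to be the absolute log2 fold change, and the gene names and ratios in the example are invented for demonstration.

```python
import math

def score_genes(fold_changes, threshold=2.0):
    """Keep genes changing more than `threshold`-fold (up or down).

    `fold_changes` maps gene name -> expression ratio (stimulated / control).
    The score is the absolute log2 ratio, which is an assumption made here.
    """
    scores = {}
    for gene, ratio in fold_changes.items():
        if ratio <= 0:
            continue  # a ratio must be positive to take its logarithm
        magnitude = max(ratio, 1.0 / ratio)  # treat up- and down-regulation alike
        if magnitude > threshold:
            scores[gene] = abs(math.log2(ratio))
    return scores

# Invented ratios, for illustration only.
example_ratios = {"Tnf": 8.0, "Il6": 0.25, "Actb": 1.1}
example_scores = score_genes(example_ratios)
print(example_scores)
```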
TimeXNet evaluation for the mouse innate immune response.
Criteria compared: experimentally confirmed regulators (3 datasets); KEGG pathways with predicted paths (maximum length#); execution time (4 CPUs, 2.4 GHz, 12 GB RAM); prior knowledge required; analysis of time-course data.
KEGG pathways with predicted paths (values as extracted): 13 (7 edges); 0 (3 edges); 2 (4 edges).
Prior knowledge required (value as extracted): Initial regulatory genes.
Effect of noise in the interaction network
We evaluated the effect of noise in the interaction network on the predictions made by TimeXNet by randomly adding up to 10,000 synthetic edges (in steps of 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000) to the mouse interaction network and then predicting the response network. The predicted response network was evaluated for the number of known regulators identified as well as the number and extent of pathway overlap found in KEGG, as described in the previous section. We repeated this test 5 times. We found that adding random edges does not significantly affect the response network predicted by TimeXNet (Additional File 1). Similarly, randomly removing up to 10,000 edges from the mouse interaction network over 5 repetitions does not affect the predicted response network (Additional File 2). Thus, we conclude that the active gene sub-networks predicted by TimeXNet are robust to the effect of noise in the interaction network.
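The perturbation step of this robustness test can be sketched as below. It is a minimal illustration, assuming edges are held as (gene, gene, score) tuples and that synthetic edges receive a uniformly random score. Both choices, and the function names, are assumptions rather than details taken from the study.

```python
import random

def add_random_edges(edges, n_new, seed=0):
    """Return a copy of `edges` with up to `n_new` synthetic edges added.

    New edges join two distinct existing genes that are not yet connected.
    """
    rng = random.Random(seed)
    nodes = sorted({gene for a, b, _ in edges for gene in (a, b)})
    existing = {frozenset((a, b)) for a, b, _ in edges}
    # Never ask for more new edges than the network has free gene pairs.
    free_pairs = len(nodes) * (len(nodes) - 1) // 2 - len(existing)
    remaining = max(0, min(n_new, free_pairs))
    noisy = list(edges)
    while remaining > 0:
        a, b = rng.sample(nodes, 2)
        pair = frozenset((a, b))
        if pair in existing:
            continue
        existing.add(pair)
        noisy.append((a, b, rng.random()))  # assumed: uniform random score
        remaining -= 1
    return noisy

def remove_random_edges(edges, n_remove, seed=0):
    """Return a random subset of `edges` with `n_remove` edges dropped."""
    rng = random.Random(seed)
    keep = len(edges) - min(n_remove, len(edges))
    return rng.sample(edges, keep)

# Hypothetical toy network.
network = [("A", "B", 0.9), ("B", "C", 0.5), ("C", "D", 0.7)]
noisy_network = add_random_edges(network, 2)
pruned_network = remove_random_edges(network, 1)
print(len(noisy_network), len(pruned_network))
```

Repeating the call with different `seed` values would give independent perturbations, one per repetition of the test.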
Yeast osmotic stress response
TimeXNet evaluation for the yeast osmotic stress response
Gold Standard genes*#
Based on the evaluation data, we conclude that TimeXNet is a significantly better tool than those currently available for the research community to analyze large amounts of time-course gene expression profiles.
TimeXNet can be installed on the user machine by extracting the contents of a downloadable zip file. Sample data files and help files are also provided.
The input to TimeXNet consists of the following:
1. Three gene lists representing time-course gene expression profiles: These gene lists represent genes that show significant changes in expression during three consecutive time intervals (initial, intermediate and late stage) on exposing the cell to external stimulus. Each gene list is given in the form of a tab-delimited file containing the gene name and the gene score. The gene score represents the value assigned to the gene based on its change in expression, usually the log fold change. The three gene groups are mutually exclusive i.e. genes in one group cannot be present in another.
2. Weighted interaction network: The interaction network is also given as a tab-delimited file in which each edge is denoted by two genes, the type of interaction (unidirectional/bidirectional) and its reliability score. The edges may denote either a physical or functional association between two genes/proteins. The score indicates how reliable the edge is based on experimental or genomic annotation information and should be between 0 and 1. TimeXNet provides a comprehensive network of weighted protein-protein, protein-DNA interactions and post-translational modifications in mouse.
3. Algorithm parameters: These include two real positive constants, γ1 (gamma1) and γ2 (gamma2), which are used to decide the number of initial response genes and intermediate regulators to be included in the predicted response network. TimeXNet requires the GNU Linear Programming Kit (GLPK) in order to solve the optimization problem. It tries to automatically detect installed GLPK. If it fails to find a local copy of the GLPK, it requests the user to install it and provide the location to TimeXNet.
4. Output location: TimeXNet generates several output files and requires the user to specify the output directory where these files will be stored.
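The constraints on the input files listed above can be checked before a run with a short script such as the following sketch. The column order (gene, gene, interaction type, score), the function names and the sample rows are assumptions made for illustration; the exact file layout expected by TimeXNet is documented on the project home page.

```python
import csv
import io

def read_gene_list(handle):
    """Parse tab-delimited 'gene<TAB>score' rows into a dict."""
    return {row[0]: float(row[1])
            for row in csv.reader(handle, delimiter="\t") if row}

def read_network(handle):
    """Parse rows of gene, gene, interaction type, score (assumed order)."""
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        gene_a, gene_b, kind, score = row[0], row[1], row[2], float(row[3])
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {gene_a}-{gene_b}: {score}")
        edges.append((gene_a, gene_b, kind, score))
    return edges

def check_disjoint(*gene_lists):
    """Raise ValueError if any gene appears in more than one list."""
    seen = set()
    for genes in gene_lists:
        overlap = seen & set(genes)
        if overlap:
            raise ValueError(f"genes in more than one group: {sorted(overlap)}")
        seen |= set(genes)

# Invented sample content standing in for real input files.
initial = read_gene_list(io.StringIO("Tnf\t3.0\nIl6\t2.5\n"))
intermediate = read_gene_list(io.StringIO("Irf7\t1.8\n"))
late = read_gene_list(io.StringIO("Psmb8\t2.1\n"))
check_disjoint(initial, intermediate, late)
sample_edges = read_network(io.StringIO("Tnf\tIrf7\tunidirectional\t0.82\n"))
print(len(initial), len(intermediate), len(late), sample_edges)
```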
After extracting all the files from the downloaded zip file, TimeXNet can be run in three modes:
2. Command line: TimeXNet can also be run from command line with the same input parameters as those of the user interface. This version is particularly useful for running TimeXNet on a supercomputer. The predicted response network is saved in the form of tab-delimited files at the specified location.
3. Iterative command line: The command line version of TimeXNet can be run in an iterative manner for a range of γ1 and γ2 values in order to identify the combination that results in an optimal network i.e. one with the highest number of genes from the three groups and the fewest number of low reliability edges from the starting network. To be run in this mode, TimeXNet requires a range of real positive values for γ1 and γ2. The output provides a statistics file containing the number of original genes and low reliability edges in the predicted network for all combinations of γ1 and γ2 run by TimeXNet. A separate directory is created at the user-specified output location for each combination of γ1 and γ2 values. The predicted response network for each γ1 and γ2 combination is saved in the form of tab-delimited files similar to the single command line and user interface version.
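Choosing among the γ1 and γ2 combinations can be sketched as follows. The article names two criteria (most input genes retained, fewest low-reliability edges) without stating how they are traded off, so this sketch assumes a simple lexicographic rule: maximise the gene count first, then minimise the edge count. The function names and all numbers are hypothetical and do not reflect the layout of the statistics file written by TimeXNet.

```python
from itertools import product

def gamma_grid(gamma1_values, gamma2_values):
    """Enumerate every (gamma1, gamma2) combination to be run."""
    return list(product(gamma1_values, gamma2_values))

def pick_best(stats):
    """Select the combination with the most input genes, breaking ties
    by the fewest low-reliability edges.

    `stats` maps (gamma1, gamma2) -> (n_input_genes, n_low_reliability_edges).
    """
    return max(stats, key=lambda combo: (stats[combo][0], -stats[combo][1]))

# Invented statistics for three runs.
run_stats = {
    (1.0, 1.0): (40, 12),
    (1.0, 2.0): (55, 20),
    (2.0, 1.0): (55, 9),
}
best_combo = pick_best(run_stats)
print(gamma_grid([1.0, 2.0], [1.0, 2.0]), best_combo)
```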
All versions of TimeXNet create and store tab-delimited files containing the genes and interactions of the predicted sub-network along with their flows in a specified location. These can be directly uploaded into Cytoscape. Each gene in the predicted response network is assigned one of four types: SRC, INT, SNK or NOD. Genes of type SRC are a subset of the initial response genes given to TimeXNet as input. INT genes are a subset of the intermediate regulators, while SNK genes are part of the final effectors showing large changes in expression at the final time points. The NOD genes are those that do not show change in expression at the sampled time points but are predicted by TimeXNet to be a part of the response network. Additionally, the formulation of the optimization problem given as input to the GLPK to predict the response network and the final edge list used to generate the optimization problem are also stored, along with the unformatted solution. Finally, a log file showing the detailed progress of the TimeXNet run, including a list of duplicate edges ignored, edges and nodes with erroneous scores, and the detailed output of the GLPK, is stored in the output directory.
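The relationship between the four node types and the three input gene lists can be expressed as a small lookup, sketched below. The function name and the gene names are hypothetical; the sketch only restates the definitions above and does not read the actual output files.

```python
def node_type(gene, initial, intermediate, late):
    """Classify a gene of the predicted network by the input list it came from."""
    if gene in initial:
        return "SRC"  # initial response gene
    if gene in intermediate:
        return "INT"  # intermediate regulator
    if gene in late:
        return "SNK"  # final effector
    return "NOD"      # in no input list, yet part of the predicted network

# Invented gene sets, for illustration only.
initial_genes = {"Tnf"}
intermediate_genes = {"Irf7"}
late_genes = {"Psmb8"}
predicted = ["Tnf", "Irf7", "Psmb8", "Ppp2ca"]
predicted_types = {g: node_type(g, initial_genes, intermediate_genes, late_genes)
                   for g in predicted}
print(predicted_types)
```

Counting the NOD entries gives the number of candidate regulators that show no expression change at the sampled time points.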
Additional details about installation, input-output files and formats, and usage of TimeXNet can be found at http://timexnet.hgc.jp/.
TimeXNet is a fast and accurate method to identify active gene sub-networks using time-course gene expression profiles. It produces a single response network of genes showing differential expression at consecutive time points with each gene/node and interaction/edge scored for its potential importance in the predicted response network. TimeXNet does not require any starting knowledge of the response pathway being studied. It is able to identify transiently expressed regulators or those showing no change in expression using the time-course gene expression profiles. This allows the user to identify previously unknown regulators. Thus, TimeXNet helps towards a greater understanding of the temporal relationships between regulators of cellular events. The current version of TimeXNet can only find relationships between three groups of genes. Future versions will be capable of working with a larger number of gene groups as well as incorporating other forms of information such as levels of protein phosphorylation.
TimeXNet is implemented in Java as a stand-alone application in a format compatible with Linux, Windows and Macintosh. It requires the Java Runtime Environment 1.7 and the GNU Linear Programming Kit (GLPK) to be installed on the user's machine. Both are freely available and easy to install. TimeXNet looks for an existing copy of GLPK. If it is not found, TimeXNet requests the user to install GLPK from an installable included in the zip file. The predicted response network can be viewed as a table or a network in Cytoscape, which is bundled with TimeXNet. The network is formatted such that the temporal expression patterns of the genes are clearly visible.
Availability and requirements
Project name: TimeXNet
Project home page: http://timexnet.hgc.jp/
Operating system(s): Platform independent.
Programming languages: Java.
Other requirements: GNU Linear Programming Kit
License: Free to non-commercial users
We would like to thank Dr. Anthony Gitter for providing the yeast interaction network, and Ajay Patil for useful discussions on the Java implementation. The supercomputing resource was provided by the Human Genome Center, The Institute of Medical Science, The University of Tokyo.
This work was partially supported by the Cabinet Office, Government of Japan and JSPS through the Funding Program for World-Leading Innovative R&D on Science and Technology (FIRST Program). The publication charges for this article were funded by the Grant-in-Aid for Young Scientists by the Japan Society for the Promotion of Science to AP.
This article has been published as part of BMC Systems Biology Volume 8 Supplement 4, 2014: Thirteenth International Conference on Bioinformatics (InCoB2014): Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/8/S4.
- Bar-Joseph Z, Gitter A, Simon I: Studying and modelling dynamic biological processes using time-series gene expression data. Nat Rev Genet. 2012, 13: 552-564. 10.1038/nrg3244.
- Yeger-Lotem E, Riva L, Su LJ, Gitler AD, Cashikar AG, King OD, Auluck PK, Geddie ML, Valastyan JS, Karger DR, et al: Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet. 2009, 41: 316-323. 10.1038/ng.337.
- Huang SS, Fraenkel E: Integrating proteomic, transcriptional, and interactome data reveals hidden components of signaling and regulatory networks. Sci Signal. 2009, 2: ra40.
- Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP: Network component analysis: Reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci USA. 2003, 100: 15522-15527. 10.1073/pnas.2136632100.
- Gitter A, Carmi M, Barkai N, Bar-Joseph Z: Linking the signaling cascades and dynamic regulatory networks controlling stress responses. Genome Res. 2013, 23: 365-376. 10.1101/gr.138628.112.
- Park Y, Bader JS: How networks change with time. Bioinformatics. 2012, 28: i40-48. 10.1093/bioinformatics/bts211.
- Chen Y, Gu J, Li D, Li S: Time-course network analysis reveals TNF-alpha can promote G1/S transition of cell cycle in vascular endothelial cells. Bioinformatics. 2012, 28: 1-4. 10.1093/bioinformatics/btr619.
- Patil A, Kumagai Y, Liang KC, Suzuki Y, Nakai K: Linking transcriptional changes over time in stimulated dendritic cells to identify gene networks activated during the innate immune response. PLoS Comput Biol. 2013, 9: e1003323. 10.1371/journal.pcbi.1003323.
- Patil A, Nakai K, Nakamura H: HitPredict: a database of quality assessed protein-protein interactions in nine species. Nucleic Acids Res. 2011, 39: D744-749. 10.1093/nar/gkq897.
- Breuer K, Foroushani AK, Laird MR, Chen C, Sribnaia A, Lo R, Winsor GL, Hancock REW, Brinkman FSL, Lynn DJ: InnateDB: systems biology of innate immunity and beyond--recent updates and continuing curation. Nucleic Acids Res. 2013, 41: D1228-D1233. 10.1093/nar/gks1147.
- Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K, et al: TRANSFAC® and its module TRANSCompel®: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 2006, 34: D108-D110. 10.1093/nar/gkj143.
- Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M: KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012, 40: D109-D114. 10.1093/nar/gkr988.
- Patil A, Nakamura H: Filtering high-throughput protein-protein interaction data using a combination of genomic features. BMC Bioinformatics. 2005, 6: 100. 10.1186/1471-2105-6-100.
- Basha O, Tirman S, Eluk A, Yeger-Lotem E: ResponseNet2.0: Revealing signaling and regulatory pathways connecting your proteins and genes--now with human data. Nucleic Acids Res. 2013, 41: W198-203. 10.1093/nar/gkt532.
- Amit I, Garber M, Chevrier N, Leite AP, Donner Y, Eisenhaure T, Guttman M, Grenier JK, Li W, Zuk O, et al: Unbiased reconstruction of a mammalian transcriptional network mediating pathogen responses. Science. 2009, 326: 257-263. 10.1126/science.1179050.
- Chevrier N, Mertins P, Artyomov MN, Shalek AK, Iannacone M, Ciaccio MF, Gat-Viks I, Tonti E, DeGrace MM, Clauser KR, et al: Systematic discovery of TLR signaling components delineates viral-sensing circuits. Cell. 2011, 147: 853-867. 10.1016/j.cell.2011.10.022.
- Romero-Santacreu L, Moreno J, Perez-Ortin JE, Alepuz P: Specific and global regulation of mRNA stability during osmotic stress in Saccharomyces cerevisiae. RNA. 2009, 15: 1110-1120. 10.1261/rna.1435709.
- Saito R, Smoot ME, Ono K, Ruscheinski J, Wang P-L, Lotia S, Pico AR, Bader GD, Ideker T: A travel guide to Cytoscape plugins. Nat Meth. 2012, 9: 1069-1076. 10.1038/nmeth.2212.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.