- Research article
- Open Access
Large-scale analysis of expression signatures reveals hidden links among diverse cellular processes
© Ge; licensee BioMed Central Ltd. 2011
- Received: 26 November 2010
- Accepted: 29 May 2011
- Published: 29 May 2011
Cells must respond to various perturbations using their limited available gene repertoires. In order to study how cells coordinate various responses, we conducted a comprehensive comparison of 1,186 gene expression signatures (gene lists) associated with various genetic and chemical perturbations.
We identified 7,419 statistically significant overlaps between various published gene lists. Most (80%) of the overlaps can be represented by a highly connected network, a "molecular signature map," that highlights the correlation of various expression signatures. By dissecting this network, we identified sub-networks that define clusters of gene sets related to common biological processes (cell cycle, immune response, etc). Examination of these sub-networks has confirmed relationships among various pathways and also generated new hypotheses. For example, our result suggests that glutamine deficiency might suppress cellular growth by inhibiting the MYC pathway. Interestingly, we also observed 1,369 significant overlaps between a set of genes upregulated by factor X and a set of genes downregulated by factor Y, suggesting a repressive interaction between X and Y factors.
Our results suggest that molecular-level responses to diverse chemical and genetic perturbations are heavily interconnected in a modular fashion. Also, shared molecular pathways can be identified by comparing newly defined gene expression signatures with databases of previously published gene expression signatures.
- Gene Ontology
- False Discovery Rate
- Gene List
- Glutamine Starvation
- Repressive Interaction
With a limited number of genes, cells have to effectively coordinate their responses to diverse perturbations. Different stimuli could activate the same molecular pathways and thus induce overlapping sets of genes. A classic example is the response to cold, drought and salt stress in plants. Evoking an opposite response might be beneficial in other circumstances. The MYC pathway, for example, induces proliferative growth under favourable conditions, but is suppressed by many stresses such as inflammation. Studying correlations between these diverse responses complements in-depth investigations focused on cellular responses to individual stimuli and will enhance understanding of complex regulatory mechanisms.
There are many examples of the co-regulation of the same set of genes in different biological processes. For example, Chang et al. observed that the gene expression signature of serum response in fibroblast predicts cancer progression . Similarly, diverse signaling pathways activated by growth factors induce broadly overlapping sets of genes . Ben-Porath et al. found that genes over-expressed in histologically poorly differentiated tumors are enriched with genes highly expressed in embryonic stem cells . On a larger scale, the Connectivity Map  provides a database of expression profiles of cultured cells treated with various compounds for the detection of associations of small molecules with similar mechanism of action. These studies are all based on the analyses of gene expression data and provide important insight into the relationship between different molecular pathways.
The objective of this study is to systematically compare published gene sets and create a "molecular signature map" that highlights correlations between diverse cellular perturbations. Published gene lists, however, are not readily available in a single source; they currently exist in scattered journal articles in diverse formats. The painstaking task of extracting this information manually has been attempted [7–10]. The L2L database, which currently includes about 958 mammalian gene sets, represents the first systematic effort to collect lists of differentially expressed genes from microarray studies. Oncomine is a web-based database system that focuses on cancer related genomics data and includes both raw microarray data and gene sets (referred to as "molecular concepts"). The Molecular Signatures Database (MSigDB) was constructed as a knowledgebase for the popular pathway analysis program known as Gene Set Enrichment Analysis (GSEA). Most of the L2L information is included in MSigDB, which is by far the most comprehensive source of published human gene sets.
Furthermore, several tools to analyze gene list data have been developed. Both the L2L and MSigDB web sites provide user interfaces to detect significant overlaps between user-supplied gene lists and their databases. A similar approach, known as molecular concept analysis, is available at the Oncomine web site. In addition to using published gene sets, users can also compare their lists against functional gene sets, such as those derived from Gene Ontology (GO), KEGG, etc. Such analyses will broaden understanding of gene sets and their relationships with various pathways and functional categories.
This work is an effort to study the whole picture of overlapping gene lists. This comprehensive analysis of MSigDB gene sets related to chemical and genetic perturbations will provide insights on the relationship of diverse cellular processes. By representing overlaps between gene sets as networks, we focus on the interpretation of the connections among diverse gene sets by taking advantage of the methods for visualizing and analyzing complex biological networks.
Thousands of significant overlaps are identified
Version 2.5 of MSigDB contains 1,186 gene sets in the "C2: chemical and genetic perturbations" category, manually compiled from over 300 publications. It represents an important source of accumulated knowledge of the molecular signatures of various genetic and chemical perturbations. Except for about 99 gene sets that are based on mouse studies, most of the sets are derived from studies using human tissues or cells. The total number of distinct genes across gene sets in all publications is 14,553. Each gene set has a name like "COLLER_MYC_DN," where Coller is the first author of the publication, followed by a brief description of the set, such as "Genes down-regulated by MYC in 293T (transformed fetal renal cell)."
Table 1. Top 20 most frequently appearing genes in 1,186 published gene sets in MSigDB
We carried out a comprehensive all-vs.-all comparison of the 1,186 published gene sets using a Perl script (available as Additional File 2). Based on the hypergeometric distribution, we then calculated, for each pair, the likelihood of observing the number of overlapping genes if the two gene sets were randomly drawn without replacement from a collection of 14,553 genes.
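The overlap test described above can be sketched in a few lines. This is a minimal illustration, not the author's Perl/R implementation: the function names are hypothetical, and it assumes the reported P value is the upper tail of the hypergeometric distribution (the probability of an overlap at least as large as the one observed), which the text does not state explicitly.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_p_value(size_a, size_b, overlap, universe=14553):
    """Upper-tail hypergeometric probability P(X >= overlap).

    Assumption: set B (size_b genes) is drawn without replacement from
    `universe` genes, of which size_a belong to set A.
    """
    max_overlap = min(size_a, size_b)
    min_overlap = max(0, size_a + size_b - universe)
    if overlap <= min_overlap:
        return 1.0
    log_denominator = log_comb(universe, size_b)
    total = 0.0
    for k in range(overlap, max_overlap + 1):
        log_term = (log_comb(size_a, k)
                    + log_comb(universe - size_a, size_b - k)
                    - log_denominator)
        # Terms far below the double-precision range underflow to 0.0.
        total += exp(log_term)
    return min(total, 1.0)
```

For example, `overlap_p_value(149, 205, 120)` evaluates an overlap of 120 genes between sets of 149 and 205 genes drawn from 14,553. Working in log space avoids overflow in the binomial coefficients, although extremely small probabilities may still underflow to zero.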
Using the Bonferroni correction for multiple testing, we multiplied P values by the total number of comparisons. After correction, the number of significant overlaps is 2,441. Some extremely significant (P < 1 × 10^-200) overlaps are apparently justified by the biology. For example, 120 out of the 149 genes in the gene set "CHANG_SERUM_RESPONSE_UP" are shared with "SERUM_FIBROBLAST_CORE_UP", which only has 205 genes. Therefore, even with the most conservative correction, thousands of significant overlaps can be identified.
Since the Bonferroni correction could be too conservative, we used the false discovery rate (FDR) procedure in further analysis. Although the tests are not statistically independent due to the overlaps between sets, the dependency should be considered a positive correlation, and the FDR procedure is applicable. The raw P-values were translated into FDR to correct for multiple testing. Overlaps between gene sets from the same study were considered trivial and were removed. With FDR < 0.001 as a cut-off, we identified 7,419 significant overlaps between 958 gene sets.
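The two corrections can be contrasted with a short sketch. It assumes the FDR values were obtained with the Benjamini-Hochberg step-up procedure; the function names are hypothetical and the code is an illustration rather than the script used in the study.

```python
def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (FDR), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Because the Benjamini-Hochberg adjustment divides by the rank of each P value, it is never larger than the Bonferroni value, which is consistent with more overlaps passing the FDR cut-off than the Bonferroni one.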
To further validate the significance of these overlaps, we used the same criteria to detect overlaps from data generated under the null hypothesis. We generated 1,186 gene sets of the same sizes as those in MSigDB but with genes drawn randomly from a pool of 14,553 distinct genes. With FDR < 0.001 as the cut-off, no significant overlap was identified. The same results hold in five repeated simulations. This simulation demonstrated the significance of the 7,419 overlaps in MSigDB.
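A null simulation of this kind could look like the sketch below. It is a simplified illustration: the names are hypothetical, the set sizes in the usage example are invented, and for brevity it applies a raw P value cut-off to each pair instead of the FDR conversion used in the study.

```python
import itertools
import random
from math import comb


def tail_p(n_a, n_b, k, universe):
    """Exact upper-tail hypergeometric probability P(overlap >= k)."""
    upper = min(n_a, n_b)
    numerator = sum(comb(n_a, i) * comb(universe - n_a, n_b - i)
                    for i in range(k, upper + 1))
    return numerator / comb(universe, n_b)


def simulate_null(set_sizes, universe, p_cutoff, seed=0):
    """Count 'significant' overlaps among size-matched random gene sets."""
    rng = random.Random(seed)
    genes = range(universe)
    random_sets = [set(rng.sample(genes, n)) for n in set_sizes]
    hits = 0
    for a, b in itertools.combinations(random_sets, 2):
        shared = len(a & b)
        if shared and tail_p(len(a), len(b), shared, universe) < p_cutoff:
            hits += 1
    return hits


# Illustrative call with made-up set sizes and a small gene pool.
n_hits = simulate_null([20, 30, 40, 50], universe=1000, p_cutoff=1e-9, seed=1)
```

With randomly drawn sets, overlaps stay close to their expected size, so essentially no pair should pass a stringent cut-off; real gene sets that do pass are therefore unlikely to be chance findings.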
Modular organization of the gene set overlapping network
Our results can be conveniently represented by an undirected network, where nodes correspond to gene sets and edges indicate significant overlaps (Figure 1). An annotated version of this network with detailed information on gene sets and overlaps can be found in Additional File 3. This file can be read by the Cytoscape software (http://www.cytoscape.org)  for easy access and exploration. This same information is also provided as an Excel file (Additional File 4). This network highlights correlations across expression signatures of diverse biological processes, diseases, and cellular stimuli. This big network thus constitutes a "molecular signature map," in which individual perturbations are placed in the context defined by all others.
This is a highly connected network with an average of 7.74 connections per gene set. Surprisingly, most (949) of the 958 gene sets are connected to a dominant main network. In this network, while most nodes are connected to a small number of other gene sets, there are a small number of gene sets that significantly overlap with a large number of gene sets. This is similar to what has been observed in many biological networks.
Table 2. Summary of 22 modules consisting of groups of heavily interconnected gene sets
Most frequently shared genes between gene sets
Most significantly enriched GO term
P21_P53_ANY_DN_49 (Sup. Figure 1)
Cell cycle, especially M phase
DER_IFNB_UP_93 (Figure 3)
MX2, MX1, OAS1, STAT1, OAS2, ISG15, IFITM1, IFIT3, IRF7, IFI27
Response to virus
Immune system process
TNFA_NFKB_DEP_UP_18 (Figure 5)
Inflammatory response/TNFa related
Immune system process
GENOTOXINS_ALL_4HRS_REG_27 (Figure 4)
Cell cycle, DNA damage response
Inflammatory response/blood cells
Cell adhesion, differentiation
HYPOXIA_REG_UP_38 (Figure 6)
Cell cycle arrest
cell cycle arrest
Down-regulated by UV, TNFa
SCHUMACHER_MYC_UP_54 (Figure 2)
MYC target genes
Intracellular organelle lumen
Lipid metabolic process
Lipid metabolic process
Response to DNA damage, UV
Response to stimulus
E2F1 target genes
Stem cell enriched
Obesity down, adipocyte up
Many of these highly connected sub-networks reveal clusters of gene sets derived from biologically similar perturbations. This is evident from the coherent GO terms enriched in genes shared by gene sets within sub-networks (Table 2). We extracted the 70 most frequently appearing genes in each sub-network and conducted enrichment analyses based on GO terms. See Additional File 5 for the full list of these top genes in each module.
Table 3. A summary of overlaps that were discussed in detail in this paper
Overlapping Gene Sets
Explanation and supporting references
Chang_Serum_Response_up & Schumacher_MYC_up
Sana_IFNG_Endothelial_Dn & Zeller_MYC_Up, MYC_Targets
Interferon γ (IFNG) inhibits cell growth through suppression of c-MYC expression .
Takeda_NUP9_HOXA9_3d_Up and interferon α and β gene sets
Transduction of fusion protein NUP98-HOXA9 induces "up-regulation of IFNβ1 and is accompanied by marked up-regulation of IFN-induced genes" .
CMV (cytomegalovirus) infection & Various cytokine regulated gene sets
Host cell response to CMV infection might be mediated by these cytokines.
StemCell_Embryonic_up & BRCA_Prognosis_Neg
Aggressive tumors share some expression signature of embryonic stem cells .
P53_Genes_All & Zeller_MYC_Dn
p53 represses the oncogene MYC possibly through miRNA-145 .
Gay_YY1_up & P53_Genes_All
YY1 inhibits the activation of p53 .
Cancer_undifferentiated_Meta_up & IDX_TSA_UP_Cluster3
Genes involved in TSA-induced differentiation of fibroblasts into adipocytes are also upregulated in undifferentiated tumors.
Peng_Glutamine_Dn & several MYC upregulated gene sets
Glutamine starvation might suppress cell growth by repression of MYC pathway.
Manalo_hypoxia_Dn, StemCell_Embryonic_up, Le_Myelin_up
Cell cycle genes are regulated by hypoxia, stem cells, and growth after wounding.
c-MYC oncoprotein and its relationships to serum stimulation and interferon γ
This sub-network also highlights a gene set of serum response genes that overlaps with MYC gene sets. The c-Myc oncogene is known to mediate responses to serum stimulation [18, 19] and trigger proliferative growth in a favourable environment. The overlaps between two MYC target gene sets and genes downregulated by interferon γ (IFNG) were unexpected. However, as IFNG inhibits cell growth through suppression of c-MYC expression, upregulation of IFNG causes downregulation of MYC target genes. We could generalize that overlaps between a set of "X upregulated genes" and a set of "Y downregulated genes" possibly indicate repressive interactions between factors X and Y. Such overlaps are highlighted by dashed red lines in the networks.
We conclude that most of the gene sets in this sub-network are directly or indirectly related to MYC protein. Figure 2B shows the list of 15 genes that appear three times or more in these seven gene sets. We think this could be a reliable list of MYC target genes based on multiple publications.
A sub-network for pathogen response
Correlations between several other gene sets and interferon α and β pathways are not obvious. The "TAKEDA_NUP9_HOXA9_3D_UP" gene set includes genes upregulated by the fusion protein NUP98-HOXA9, which occurs in acute myeloid leukemia. Takeda et al. noted that transduction of this fusion protein induces "upregulation of IFNβ1 and is accompanied by marked upregulation of IFN-induced genes". Thus, their gene list is expected to contain IFNB target genes. The "BENNETT_SLE_UP" gene set includes genes significantly up-regulated in systemic lupus erythematosus (SLE) patients. The major conclusion and surprising finding of that study is that the SLE active expression profile is "distinguished by a remarkably homogeneous gene expression pattern with overexpression of granulopoiesis-related and interferon (IFN)-induced genes". Finally, seven gene lists in this sub-network are related to CMV (cytomegalovirus) infection. The finding that these gene sets are highly significantly related to IFN-induced genes indicates that host cell response to CMV infection might be mediated by these cytokines.
We further investigated whether the genes that are frequently shared by gene sets in this sub-network have coherent biological functions. The most significantly enriched functional category is "Response to virus" (P < 1.3 × 10^-19 after Benjamini correction of multiple testing), followed by "Immune response" (P < 1.5 × 10^-12). Out of the 70 genes, 18 and 24 are associated with "Response to virus" and "Immune response", respectively (Table 2). These results indicate that the gene lists in this sub-network are dominated by immune responses triggered by various conditions.
Stem cell related genes as predictors of poor prognosis for breast cancer
The overlap between stem cell and breast cancer prognosis genes is highly significant: 42 (44%) of the 95 genes in "BRCA_Prognosis_Neg" are highly expressed in embryonic stem cells ("StemCell_Embryonic_up", with FDR < 1 × 10^-11). These 42 genes include 11 cell cycle related genes (Benjamini corrected P < 0.00001) and 20 genes related to organelle parts of cell structure (Benjamini corrected P < 0.008). The significant overlap between breast cancer prognosis genes and stem cell genes thus highlights the similarity in expression profiles between aggressive tumors and stem cells. This is supported by a more in-depth meta-analysis of gene expression data.
Another interesting overlap is between stem cell gene lists and genes down-regulated by hypoxia. Thirty-eight (42%) of the 91 genes in the "Manalo_hypoxia_Dn" set are included in "StemCell_Embryonic_up" with FDR < 1 × 10^-12. Of these 38 genes, 12 are related to the GO term "DNA replication" with Benjamini P value < 8.5 × 10^-9. Cell cycle genes are also enriched. One of the overlapped genes is BRCA1. Other lists in this cluster include "Genotoxins_All_4hrs_Reg," which is a list of genes that are commonly regulated by six types of genotoxins. The overlapped genes are also mostly cell cycle related, including BUB1, CDC20, CCNB1, etc. The "Le_MYELIN_Up" set contains genes upregulated after sciatic nerve injury. Thus, these genes might be related to growth after wounding.
We also compared gene lists in this sub-network with sets of genes (NRC-1 to NRC-9) recently identified as breast cancer prognostic markers by Li et al. We identified modestly significant (unadjusted P value < 1 × 10^-4) overlaps between three gene sets in this sub-network and two gene sets related to cell cycle (NRC-1 and NRC-5) and one related to cell growth (NRC-9). See Additional File 1: Figure S3 for more information. These overlaps again suggest that cell cycle genes are important in predicting breast cancer survival. But further study is clearly needed to systematically compare the NRC and other breast cancer related gene sets, many of which are not included in version 2.5 of the MSigDB database.
Glutamine starvation strongly downregulates MYC target genes
We focus our attention on the "Peng_Glutamine_Dn" gene list that is associated with glutamine starvation in human BJAB B-lymphoma cells . An unexpected connection is that genes downregulated by glutamine starvation contain many MYC target genes. In the whole network this is most evident as the "Peng_Glutamine_Dn" list significantly overlaps with almost all MYC related gene sets. The neighborhood of this gene set is in the molecular signature map in Additional File 1: Figure S2. Yuneva et al. showed that glutamine but not glucose starvation induces MYC-dependent apoptosis in human cancer cells , but the mechanism is unknown. On the other hand, Wise et al. found that overexpression of MYC promotes glutaminolysis and leads to cellular addiction to glutamine in cancer cells . These study results may lead to the development of targeted killing of cancer cells that rely on high levels of glutamine uptake. We found no report on whether glutamine starvation inhibits the MYC pathway. If this is indeed true, as suggested by the overlapping of these gene sets, then the closely related nature of glutamine metabolism and the MYC pathway will need to be evaluated more closely.
To further confirm the link between glutamine deprivation and the MYC pathway, we downloaded and re-analyzed the raw DNA microarray data on glutamine starvation. Using the GSEA program, we analyzed the whole dataset for enriched gene sets. The enriched gene sets are shown in Additional File 1: Table S1. One pathway that showed up is the proteasome degradation pathway, in which nutrient deficient cells suppress protein degradation as a means for survival. The most noticeable pathways are multiple MYC target gene sets downregulated at highly significant levels, confirming our observation based on gene set overlaps.
Glutamine starvation triggers a complex network of transcription factors including ATFs and C/EBP factors, and such response might be cell line- or species-dependent (See  for review). Indeed, our further analysis of another set of DNA microarray data  suggests that glutamine starvation does not cause downregulation of Myc target genes in mouse hepatoma cells (data not shown). However, for this specific B-lymphoma cell line studied by Peng et al., the suppression of the MYC pathway is strongly supported by gene set overlaps and raw DNA microarray data analysis.
Repressive interactions between pathways
Interestingly, we identified thousands of overlaps corresponding to repressive interactions between different pathways. These are marked by overlaps between a set of genes downregulated by factor "X" (e.g., interferon gamma) and another set of genes upregulated by factor "Y" (e.g., the MYC oncogene). Among the total of 7,419 significant overlaps identified, 1,369 (18.4%) belong to this category (up-down). For comparison, 2,762 (37.2%) overlaps are explicitly in the same direction (i.e., up-up or down-down).
Besides IFNG and MYC, several examples are discussed in previous sections (see Table 3 for a full list). These include the overlap between P53_Genes_All and Zeller_MYC_Dn, which is supported by the fact that p53 represses the MYC oncogene. Additional File 4 includes many repressive overlaps not discussed here. One of the very significant repressive overlaps, for example, is between Alzheimers_Disease_Dn and StemCell_Neural_Up. There are 276 genes that were found to be downregulated in Alzheimer's disease but were upregulated in neural stem cells. Detailed GO analysis shows that these genes are enriched with ubiquitin-dependent protein catabolic process (P < 10^-7). This is consistent with the notion that Alzheimer's disease is one of the disorders related to the ubiquitin protein catabolic process.
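One way to sort overlaps into these categories is to read the direction from the gene set name. The sketch below assumes that names ending in "_UP" or "_DN" encode the direction of regulation; this naming rule and the function names are assumptions made for illustration, since the text does not describe how the categories were assigned.

```python
def direction(gene_set_name):
    """Infer regulation direction from a gene set name suffix, if any."""
    upper = gene_set_name.upper()
    if upper.endswith("_UP"):
        return "up"
    if upper.endswith("_DN") or upper.endswith("_DOWN"):
        return "down"
    return None  # direction is not encoded in the name


def classify_overlap(name_a, name_b):
    """Label an overlap as 'same', 'opposite' or 'unknown' direction."""
    dir_a, dir_b = direction(name_a), direction(name_b)
    if dir_a is None or dir_b is None:
        return "unknown"
    return "same" if dir_a == dir_b else "opposite"


label = classify_overlap("Sana_IFNG_Endothelial_Dn", "Zeller_MYC_Up")
```

Here `label` is "opposite", matching the IFNG and MYC example. A name such as P53_Genes_All carries no direction suffix and is reported as "unknown", so a suffix rule alone cannot capture every repressive relationship.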
The prevalence of repressive interactions among various molecular pathways highlights the complexity of cellular control machinery. This result also suggests the necessity of paying close attention to the downregulated genes and cross-checking them with upregulated genes in other situations.
The highly connected nature of the 1,186 gene sets is surprising. An average gene set overlaps with more than seven gene sets at a significance level of FDR < 0.001. Moreover, the majority (80%) of the 1,186 gene sets are connected directly or indirectly as one big network. In other words, any newly defined gene set would have approximately an 80% chance of having at least one significant overlap with a gene set in the MSigDB database. Our results suggest that many seemingly unrelated stimuli/perturbations may activate or deactivate the same molecular pathways. We have discussed several unexpected overlaps in our paper while analyzing sub-networks in previous sections. One example is the shared genes among MYC target genes, serum stimulation, and interferon gamma over-expression. Our data-driven analysis confirms the connection between them: serum stimulation and interferon gamma up- and down-regulate MYC target genes, respectively.
The observation that most of the gene sets are connected to one dominant network can be explained in several ways. Researchers might be biased and focus on a small set of essential processes in cells, which would give rise to a connected network. Similarly, the MSigDB could have been selectively compiled. Another explanation for the observation is that cells respond to diverse perturbations with overlapping genes, resulting in the observed heavily connected networks. This explains the MYC pathway involvement in response to diverse stimuli. We believe that all of these factors contribute to the connectivity of the network.
An implication of this finding is that new gene lists obtained from genomics studies should be compared with large databases of previously published gene sets. Interpretation of gene lists remains a challenge in high-throughput genomics studies. Algorithms and databases are available and can be used to detect overrepresented genes belonging to the same pathway, GO category, target genes of transcription factors, etc. Alternatively, new gene lists can be compared with all published gene lists. Our analysis showed that very different biological processes can share a gene expression signature. Comparison with thousands of published gene sets will help in the interpretation of new gene lists by placing them in the context of the molecular perturbation map. This is indeed similar to queries of nucleic acid sequence databases for the annotation of new sequence entries. MSigDB already has a user-friendly interface that enables users to upload their gene lists and compare them with all archived gene sets.
One of the drawbacks of this study is that we used gene sets from both human and mouse studies, and comparisons within the same species often involved different types of tissues or even cell lines. We included as many gene sets as possible based on the rationale that a) overlaps between divergent molecular pathways in these species/tissues would not be detected and b) significant overlaps, once detected, would suggest conserved molecular mechanisms across species/tissues. There is some evidence based on studies of yeast and bacteria suggesting that gene regulatory networks are remarkably flexible, and large scale rewiring is possible. Another limitation of this study is that our results, the highly connected modules of gene lists, were mainly validated through speculative discussions based on literature. We discussed only a subset of the modules that we deemed interesting. Two additional sub-networks related to p53 signalling and cell differentiation are discussed in Additional File 1. Further study is clearly needed to verify the identified links between diverse biological perturbations.
Despite the fact that DNA microarray studies might be inconsistent across laboratories , we identified thousands of statistically significant overlaps between published gene lists. Summarized as a molecular signature map, our results provide key insights into underlying connections of diverse perturbations. We have found evidence that the molecular signature map is 1) highly interconnected, suggesting that overlapping sets of genes are used over and over again by cells to respond to various stimuli, and 2) modularly organized, suggesting that different responses are coordinated via functional modules.
We downloaded "C2" gene set files (v2.5 updated April 7, 2008) of the MSigDB  that contain 1,186 gene sets that represent chemical and genetic perturbations manually extracted from publications. This database also includes gene sets contributed by individual researchers and other similar databases such as the List of List Annotated (L2L) database .
Statistical and network analyses
We developed a set of Perl scripts to analyze the original gene set database and evaluate the overlapping genes between all pairs. The P value for determining the significance of overlaps between two gene sets is calculated based on the hypergeometric distribution using the statistical computing software R (http://www.r-project.org). The original P values are then converted into false discovery rate (FDR). Overlaps with FDR < 0.001 were considered significant. Our approach is similar to the method used by Newman and Weiner, except that they used the binomial distribution to approximate the hypergeometric distribution for faster calculation.
We used undirected graphs to represent the overlapping information across thousands of gene sets. A significant overlap defines an edge between the two nodes that represent the gene sets. In the network file, each edge has properties representing the number of common genes, names of the common genes and FDR value. Each node has a name, a one-sentence description and the entire gene set. The network file, available as Additional File 3, thus includes a comprehensive account of all "C2" gene sets in MSigDB. The network is visualized using Cytoscape software version 2.6.3, and highly interconnected sub-networks were identified using MCODE version 1.3 with default settings.
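The network representation can be illustrated with a small sketch that builds an undirected graph from significant overlaps and lists its connected components, the largest of which corresponds to a main network. This is not the MCODE clustering used in the study, only a plain breadth-first search; the function names and the edges in the usage example are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Undirected adjacency map: gene set name -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return components as sets of node names, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Made-up edges standing in for significant overlaps.
toy_edges = [("SET_A", "SET_B"), ("SET_B", "SET_C"), ("SET_D", "SET_E")]
toy_components = connected_components(build_graph(toy_edges))
```

In this toy input the first component contains SET_A, SET_B and SET_C, and the second contains SET_D and SET_E. Gene sets without any significant overlap never appear as nodes, which mirrors how only connected gene sets enter the network.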
To identify statistically enriched GO terms we selected the top 70 most frequently appearing genes in each sub-network and analyzed these gene lists with the DAVID web site (http://david.abcc.ncifcrf.gov/) [43, 44]. If the number of genes shared by gene sets was smaller than 70, only the genes that appeared at least twice were used. The most significant terms for all GO biological process terms are listed in Table 2.
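The gene selection rule can be sketched as follows. The function name and the toy gene sets are illustrative assumptions, and ties are broken alphabetically only to make the result deterministic; the text does not say how ties were handled.

```python
from collections import Counter


def top_shared_genes(gene_sets, limit=70, min_count=2):
    """Most frequently appearing genes across the gene sets of a module.

    Only genes seen in at least `min_count` sets are kept, and at most
    `limit` of them are returned, most frequent first.
    """
    counts = Counter(gene for genes in gene_sets for gene in set(genes))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    shared = [gene for gene, count in ranked if count >= min_count]
    return shared[:limit]


# Toy module of three gene sets (illustrative content only).
toy_module = [
    ["MYC", "NCL", "NPM1"],
    ["NCL", "NPM1", "CDK4"],
    ["NPM1", "ODC1"],
]
toy_top = top_shared_genes(toy_module)
```

Because genes appearing in two or more sets always rank above genes appearing once, truncating the shared list to 70 covers both cases described above: the top 70 when enough shared genes exist, and all shared genes otherwise.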
DNA microarray data analysis
The DNA microarray dataset (Affymetrix .CEL files) of glutamine starvation was downloaded from the homepage of the research group (http://jura.wi.mit.edu/sabatini_public/rapachip2/frameset1.html). The data were re-analyzed using the RMA algorithm. Genes were ranked by average fold change over 12 hours and 24 hours of glutamine starvation compared to normal control. The ranked gene list was used for pathway analysis with the GSEA algorithm.
The author would like to thank Administrative and Research Computing at South Dakota State University for providing computational resources. He is also indebted to Jill Mesirov's group at the Broad Institute for making the gene sets data available. This work was supported by National Institutes of Health [GM083226]. The content is solely the responsibility of the author and does not necessarily represent the official views of NIGMS or NIH.
- Xiong L, Zhu JK: Molecular and genetic aspects of plant responses to osmotic stress. Plant Cell Environ. 2002, 25 (2): 131-139. 10.1046/j.1365-3040.2002.00782.xView ArticlePubMedGoogle Scholar
- Ramana CV, Grammatikakis N, Chernov M, Nguyen H, Goh KC, Williams BR, Stark GR: Regulation of c-myc expression by IFN-gamma through Stat1-dependent and -independent pathways. EMBO J. 2000, 19 (2): 263-272. 10.1093/emboj/19.2.263PubMed CentralView ArticlePubMedGoogle Scholar
- Chang HY, Sneddon JB, Alizadeh AA, Sood R, West RB, Montgomery K, Chi JT, van de Rijn M, Botstein D, Brown PO: Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol. 2004, 2 (2): E7- 10.1371/journal.pbio.0020007PubMed CentralView ArticlePubMedGoogle Scholar
- Fambrough D, McClure K, Kazlauskas A, Lander ES: Diverse signaling pathways activated by growth factor receptors induce broadly overlapping, rather than independent, sets of genes. Cell. 1999, 97 (6): 727-741. 10.1016/S0092-8674(00)80785-0View ArticlePubMedGoogle Scholar
- Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, Weinberg RA: An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet. 2008, 40 (5): 499-507. 10.1038/ng.127PubMed CentralView ArticlePubMedGoogle Scholar
- Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet JP, Subramanian A, Ross KN, et al.: The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006, 313 (5795): 1929-1935. 10.1126/science.1132939View ArticlePubMedGoogle Scholar
- Cahan P, Ahmad AM, Burke H, Fu S, Lai Y, Florea L, Dharker N, Kobrinski T, Kale P, McCaffrey TA: List of lists-annotated (LOLA): a database for annotation and comparison of published microarray gene lists. Gene. 2005, 360 (1): 78-82. 10.1016/j.gene.2005.07.008View ArticlePubMedGoogle Scholar
- Newman JC, Weiner AM: L2L: a simple tool for discovering the hidden significance in microarray expression data. Genome Biol. 2005, 6 (9): R81- 10.1186/gb-2005-6-9-r81PubMed CentralView ArticlePubMedGoogle Scholar
- Rhodes DR, Kalyana-Sundaram S, Tomlins SA, Mahavisno V, Kasper N, Varambally R, Barrette TR, Ghosh D, Varambally S, Chinnaiyan AM: Molecular concepts analysis links tumors, pathways, mechanisms, and drugs. Neoplasia. 2007, 9 (5): 443-454. 10.1593/neo.07292PubMed CentralView ArticlePubMedGoogle Scholar
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al.: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102 (43): 15545-15550. 10.1073/pnas.0506580102PubMed CentralView ArticlePubMedGoogle Scholar
- Coller HA, Grandori C, Tamayo P, Colbert T, Lander ES, Eisenman RN, Golub TR: Expression analysis with oligonucleotide microarrays reveals that MYC regulates genes involved in growth, cell cycle, signaling, and adhesion. Proc Natl Acad Sci USA. 2000, 97 (7): 3260-3265. 10.1073/pnas.97.7.3260PubMed CentralView ArticlePubMedGoogle Scholar
- Benjamini Y, Hochberg Y: Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B Met. 1995, 57 (1): 289-300.
- Benjamini Y, Yekutieli D: The control of the false discovery rate in multiple testing under dependency. Ann Stat. 2001, 29 (4): 1165-1188. 10.1214/aos/1013699998
- Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, et al.: Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007, 2 (10): 2366-2382. 10.1038/nprot.2007.324
- Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2. 10.1186/1471-2105-4-2
- Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005, 37 (4): 382-390. 10.1038/ng1532
- Schuhmacher M, Kohlhuber F, Holzel M, Kaiser C, Burtscher H, Jarsch M, Bornkamm GW, Laux G, Polack A, Weidle UH, et al.: The transcriptional program of a human B cell line in response to Myc. Nucleic Acids Res. 2001, 29 (2): 397-406. 10.1093/nar/29.2.397
- Armelin HA, Armelin MC, Kelly K, Stewart T, Leder P, Cochran BH, Stiles CD: Functional role for c-myc in mitogenic response to platelet-derived growth factor. Nature. 1984, 310 (5979): 655-660. 10.1038/310655a0
- Chandriani S, Frengen E, Cowling VH, Pendergrass SA, Perou CM, Whitfield ML, Cole MD: A core MYC gene expression signature is prominent in basal-like breast cancer but only partially overlaps the core serum response. PLoS One. 2009, 4 (8): e6693. 10.1371/journal.pone.0006693
- Takeda A, Goolsby C, Yaseen NR: NUP98-HOXA9 induces long-term proliferation and blocks differentiation of primary human CD34+ hematopoietic cells. Cancer Res. 2006, 66 (13): 6628-6637. 10.1158/0008-5472.CAN-06-0458
- Bennett L, Palucka AK, Arce E, Cantrell V, Borvak J, Banchereau J, Pascual V: Interferon and granulopoiesis signatures in systemic lupus erythematosus blood. J Exp Med. 2003, 197 (6): 711-723. 10.1084/jem.20021553
- Zhu H, Cong JP, Mamtora G, Gingeras T, Shenk T: Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc Natl Acad Sci USA. 1998, 95 (24): 14470-14475. 10.1073/pnas.95.24.14470
- Ramalho-Santos M, Yoon S, Matsuzaki Y, Mulligan RC, Melton DA: "Stemness": transcriptional profiling of embryonic and adult stem cells. Science. 2002, 298 (5593): 597-600. 10.1126/science.1072530
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al.: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415 (6871): 530-536. 10.1038/415530a
- Hu T, Gibson DP, Carr GJ, Torontali SM, Tiesman JP, Chaney JG, Aardema MJ: Identification of a gene expression profile that discriminates indirect-acting genotoxins from direct-acting genotoxins. Mutat Res. 2004, 549 (1-2): 5-27.
- Li J, Lenferink AE, Deng Y, Collins C, Cui Q, Purisima EO, O'Connor-McCourt MD, Wang E: Identification of high-quality cancer prognostic markers and metastasis network modules. Nat Commun. 2010, 1: 34.
- Appel S, Rupf A, Weck MM, Schoor O, Brummendorf TH, Weinschenk T, Grunebach F, Brossart P: Effects of imatinib on monocyte-derived dendritic cells are mediated by inhibition of nuclear factor-kappaB and Akt signaling pathways. Clin Cancer Res. 2005, 11 (5): 1928-1940. 10.1158/1078-0432.CCR-04-1713
- Li CM, Guo M, Borczuk A, Powell CA, Wei M, Thaker HM, Friedman R, Klein U, Tycko B: Gene expression in Wilms' tumor mimics the earliest committed stage in the metanephric mesenchymal-epithelial transition. Am J Pathol. 2002, 160 (6): 2181-2190. 10.1016/S0002-9440(10)61166-2
- Peng T, Golub TR, Sabatini DM: The immunosuppressant rapamycin mimics a starvation-like signal distinct from amino acid and glucose deprivation. Mol Cell Biol. 2002, 22 (15): 5575-5584. 10.1128/MCB.22.15.5575-5584.2002
- Yuneva M, Zamboni N, Oefner P, Sachidanandam R, Lazebnik Y: Deficiency in glutamine but not glucose induces MYC-dependent apoptosis in human cells. J Cell Biol. 2007, 178 (1): 93-105. 10.1083/jcb.200703099
- Wise DR, DeBerardinis RJ, Mancuso A, Sayed N, Zhang XY, Pfeiffer HK, Nissim I, Daikhin E, Yudkoff M, McMahon SB, et al.: Myc regulates a transcriptional program that stimulates mitochondrial glutaminolysis and leads to glutamine addiction. Proc Natl Acad Sci USA. 2008, 105 (48): 18782-18787. 10.1073/pnas.0810199105
- Zeller KI, Jegga AG, Aronow BJ, O'Donnell KA, Dang CV: An integrated database of genes responsive to the Myc oncogenic transcription factor: identification of direct genomic targets. Genome Biol. 2003, 4 (10): R69. 10.1186/gb-2003-4-10-r69
- Wall M, Poortinga G, Hannan KM, Pearson RB, Hannan RD, McArthur GA: Translational control of c-MYC by rapamycin promotes terminal myeloid differentiation. Blood. 2008, 112 (6): 2305-2317. 10.1182/blood-2007-09-111856
- Brasse-Lagnel C, Lavoinne A, Husson A: Control of mammalian gene expression by amino acids, especially glutamine. FEBS J. 2009, 276 (7): 1826-1844. 10.1111/j.1742-4658.2009.06920.x
- Wong MS, Raab RM, Rigoutsos I, Stephanopoulos GN, Kelleher JK: Metabolic and transcriptional patterns accompanying glutamine depletion and repletion in mouse hepatoma cells: a model for physiological regulatory networks. Physiol Genomics. 2004, 16 (2): 247-255.
- Gronroos E, Terentiev AA, Punga T, Ericsson J: YY1 inhibits the activation of the p53 tumor suppressor in response to genotoxic stress. Proc Natl Acad Sci USA. 2004, 101 (33): 12165-12170. 10.1073/pnas.0402283101
- Layfield R, Alban A, Mayer RJ, Lowe J: The ubiquitin protein catabolic disorders. Neuropathol Appl Neurobiol. 2001, 27 (3): 171-179. 10.1046/j.1365-2990.2001.00335.x
- Li H, Johnson AD: Evolution of transcription networks--lessons from yeasts. Curr Biol. 2010, 20 (17): R746-753. 10.1016/j.cub.2010.06.056
- Lozada-Chavez I, Janga SC, Collado-Vides J: Bacterial regulatory networks are extremely flexible in evolution. Nucleic Acids Res. 2006, 34 (12): 3434-3445. 10.1093/nar/gkl423
- Fortunel NO, Otu HH, Ng HH, Chen J, Mu X, Chevassut T, Li X, Joseph M, Bailey C, Hatzfeld JA, et al.: Comment on "'Stemness': transcriptional profiling of embryonic and adult stem cells" and "a stem cell molecular signature". Science. 2003, 302 (5644): 393; author reply 393.
- L2L Microarray Analysis Tool. http://depts.washington.edu/l2l/statistics.html
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11): 2498-2504. 10.1101/gr.1239303
- Huang da W, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4 (1): 44-57.
- Huang da W, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009, 37 (1): 1-13. 10.1093/nar/gkn923
- Sachdeva M, Zhu S, Wu F, Wu H, Walia V, Kumar S, Elble R, Watabe K, Mo YY: p53 represses c-Myc through induction of the tumor suppressor miR-145. Proc Natl Acad Sci USA. 2009, 106 (9): 3207-3212. 10.1073/pnas.0808042106
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.