New network topology approaches reveal differential correlation patterns in breast cancer
© Bockmayr et al.; licensee BioMed Central Ltd. 2013
Received: 15 April 2013
Accepted: 6 August 2013
Published: 15 August 2013
Analysis of genome-wide data is often carried out using standard methods such as differential expression analysis, clustering analysis and heatmaps. Beyond that, differential correlation analysis was suggested to identify changes in the correlation patterns between disease states. The detection of differential correlation is a demanding task, as the number of entries in the gene-by-gene correlation matrix is large. Currently, there is no gold standard for the detection of differential correlation and statistical validation.
We developed two untargeted algorithms (DCloc and DCglob) that identify differential correlation patterns by comparing the local or global topology of correlation networks. Construction of networks from correlation structures requires fixing of a correlation threshold. Instead of a single cutoff, the algorithms systematically investigate a series of correlation thresholds and permit to detect different kinds of correlation changes at the same level of significance: strong changes of a few genes and moderate changes of many genes. Comparing the correlation structure of 208 ER- breast carcinomas and 208 ER+ breast carcinomas, DCloc detected 770 differentially correlated genes with a FDR of 12.8%, while DCglob detected 630 differentially correlated genes with a FDR of 12.1%. In two-fold cross-validation, the reproducibility of the list of the top 5% differentially correlated genes in 140 ER- tumors and in 140 ER+ tumors was 49% for DCloc and 33% for DCglob.
We developed two correlation network topology based algorithms for the detection of differential correlations in different disease states. Clusters of differentially correlated genes could be interpreted biologically and included the marker genes hydroxyprostaglandin dehydrogenase (PGDH) and acyl-CoA synthetase medium chain 1 (ACSM1) of invasive apocrine carcinomas that were differentially correlated, but not differentially expressed. Using random subsampling and cross-validation, DCloc and DCglob were shown to identify specific and reproducible lists of differentially correlated genes.
KeywordsDifferential correlation Microarray data Breast cancer Molecular subtypes Differential co-expression
Over the last 15 years, global gene expression profiling using microarrays has been established as a common tool for disease research. With this approach, disease mechanisms may be studied by comparative expression profiling of disease and healthy tissues or two disease states A and B. In recent years, this approach helped to discover prognostic markers and signatures and to identify target structures for drug intervention. Alterations of gene regulation often result in up- or down-regulated genes. Therefore, looking for differentially expressed genes using statistical tests is one of the most common strategies for the comparative analysis of microarray data .
However, this approach ignores the fact that most of the biological processes require orchestrated action of many genes. Therefore, gene correlation and co-expression have been intensively studied since the early days of microarrays and the seminal work of Eisen et al. . Today, hierarchical clustering and heatmaps are ubiquitous in studies of microarray data. Heatmaps usually serve as a tool for visualization of the results. Clustering has also been shown to be useful for the identification of disease subtypes, such as, for instance, defining molecular subtypes of breast cancer .
Complementary to differential expression analysis, the study of differential correlation or differential co-expression aims at a deeper understanding of the expression patterns in diseased tissues. As an example, a number of downstream targets could be regulated by a master gene, for example a transcription factor. In tissues where the regulatory mechanism is functional, the module of the target genes will be expressed in an ordered pattern. However, in diseased tissues where the regulatory mechanism is dysfunctional, the expression of the gene module will be unordered or random. Correlation changes of this kind can be detected by differential correlation (DC) analysis, but might be overlooked by differential expression (DE) analysis.
The number of pairwise correlations in global expression data of human tissues is quadratic in the number of genes and exceeds one million. Case-by-case testing would lead to a multiple testing problem, connected with searching for a few differentially correlated gene pairs within a huge number of unregulated correlations. Therefore, it should be more efficient not to study each gene pair separately, but to take into account the overall structure of correlations. Shortly after the microarray technology became common, algorithms for the detection of differential co-expression and differential correlation were developed [4–6]. Meanwhile, a multitude of algorithms were published  that can be divided into targeted, semi-targeted and untargeted approaches .
In targeted approaches, predefined gene modules are analyzed for correlation changes between the two disease states. Frequent choices for the modules are Gene Ontology (GO) categories, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, or clusters from additional external expression data sets. For example, in gene set co-expression analysis (GSCA) a dispersion index is calculated for each of the modules and the significance is assessed using a permutation test . In , a difference network framework is developed and a test statistics is defined by averaging over the edge weights between members of the modules. In another kind of targeted approach, the analysis of correlations is restricted to a predefined network, for example to the human interactome . In , the expression pattern of breast cancer on the interactome network was analyzed and it was shown that the metastatic cancer phenotype is characterized by an increase of randomness of the local information flux patterns.
In semi-targeted approaches, modules in one of the disease states are defined using clustering, and these modules are investigated for correlation in the other disease state. The differential clustering algorithm (DCA) starts with clustering of the tissues in the reference disease state and proceeds with reordering the genes in the reference clusters according to the correlations in the second disease state . CoXpress starts with hierarchical clustering of the reference samples and proceeds with a resampling-based approach to find those modules that are co-expressed in one state, but not in the other .
Untargeted approaches do not depend on externally defined modules or modules defined by clustering of a reference data set. Therefore, untargeted algorithms are capable of detecting DC in more general situations where differential regulation neither occurs within predefined external nor internal modules. Many of the untargeted approaches start with constructing correlation (or interaction) networks of each of the disease states and proceed with identification of differentially correlated subnetworks [15–18]. The recently published DICER algorithm  is able to address two different scenarios of DC: differentially correlated clusters, but also differentially correlated meta-modules. Here, a meta-module is defined as a pair of gene sets with genes inside the sets being correlated in both disease states, but with differing correlations between the gene sets.
Transformation of a correlation structure into a network requires fixing of a threshold. Whenever a correlation exceeds the threshold, the corresponding two network nodes are joined. A novelty of the current study is to investigate changes in network topology, but at the same time to evaluate a series of correlation thresholds that comprehensively cover the range of correlations in the data. In this way, different kinds of correlation changes can be detected at the same level of significance: strong changes of a few genes, but also moderate changes of many genes.
Worldwide, breast cancer is classified into molecular subtypes based on estrogen receptor (ER) and HER2 status. Determination of the molecular subtype is essential to tailor adjuvant treatment and to estimate of the risk of recurrence after surgery. In the last decade, DE between molecular subtypes of breast cancer was extensively investigated [20–22]. However, the literature on DC analysis of breast cancer is limited and includes a de-novo partitioning method  and a targeted analysis of KEGG pathways . Therefore, we tested the new developed untargeted algorithms in a large gene expression data set of 208 ER-, 208 ER+, 208 HER2- and 208 HER2+ breast carcinomas.
The algorithms DCglob and DCloc (Additional file 1) were implemented using the statistical programming language R. While the global algorithm focuses on comparison of the connected components of the networks, the local algorithm compares the next neighbors of the gene under consideration. The general workflow of the algorithms is illustrated in Figure 1.
Calculations were done on a Linux computer including 16 GB RAM and an Intel Core i7-2600 processor, 3.40 GHz. In the first step, the gene correlation matrix was calculated and used as input for both of the algorithms. Because this calculation did not significantly contribute, the computing time was independent of the number of samples. The time to calculate the strength of differential correlation for 12703 genes was 52 minutes for DCglob (200 thresholds) and 69 minutes for DCloc (100 thresholds). Including FDR calculation by subsampling analysis (100 subsamples), the calculation time was 88 hours for DCglob and 116 hours for DCloc.
Global topology algorithm
Step 1 The algorithm compares gene expression data of n samples in disease condition A with gene expression data of n samples in disease condition B. First, the correlation matrix C q comprising the Pearson correlations of all pairs of genes is calculated for both disease conditions q=A,B. The Fisher-transformed correlation matrix is the starting point for all further calculations.
Step 2 Correlation networks are computed for a comprehensive series of correlation thresholds. In the breast cancer data set, the highest correlation between genes is c=0.985 and corresponds to a Fisher-transformed value of z=2.5. Therefore, we choose the set of thresholds T to be the series of 200 equidistant values between 0 and 2.5. For each threshold t∈T, we obtain correlation networks and corresponding to the disease conditions A and B.
Step 3 The decomposition of the networks and into connected components is computed and all connected components containing 3 or more genes are selected. We restricted to clusters of 3 or more genes, because this is the minimum number where the network topology comes into play. Changes of pairwise correlations could be more effectively studied using a direct approach. Formally, let and denote the sets of these connected components in and respectively. After these preparations, we remove the genes that are contained in a connected component for both networks and from the set of all genes G, yielding . Then, we build subnetworks resp. of resp. induced by the genes in and compute the sets and of genes that are contained in connected components of and with cardinality greater or equal 3. Because we removed the genes that are contained in correlation clusters for both disease conditions, the remaining genes that are in correlation clusters for one of the disease conditions are considered as differentially correlated for correlation threshold t.
As an example, indicates that gene j is member of a connected component in the network but not in the network .
Step 4 Finally, genes that are members of connected components in only one of the networks over a large range of correlation thresholds are selected. To this end, an interval of maximal length [ a,b] is chosen such that for all t∈ [ a,b]∩T. Thus, the interval contains as series of threshold values for which a gene is correlated in one of the networks, but not in both networks. The interval length b−a is converted to a p-value using Steiger’s test for the comparison of correlation coefficients  and used to measure the strength of differential correlation. A list of differentially correlated genes includes all genes with p-values p<S below a threshold S.
Local topology algorithm
Step 1 This step is identical to Step 1 performed for DCglob.
In this definition, the number of common next neighbors in both networks is divided by the total number of next neighbors. To focus on changes that affect correlation clusters of at least 3 genes, we set if .
Thus, we obtain a value in [0,1], which represents the strength of differential correlation for each gene. A list of differentially correlated genes includes all genes with d > s above a threshold s.
Estimation of false discovery rates
Statistical evaluation of the algorithms was performed by a repeated random subsampling analysis. We wanted to falsify the null hypothesis that both patients groups exhibit the same gene correlation structure. Therefore, we randomly subsampled arbitrary breast cancer patients to generate the null distribution. This procedure mixes ER+ and ER- patients (as well as HER2+ and HER2- patients) and therefore is appropriate to assess the significance of differential correlations between the ER+ and ER- subtype (as well as the HER2+ and HER2- subtype). Then, we compared the number of differentially correlated genes between breast cancer subtypes n AB to the number of differentially correlated genes between randomly sampled sets of breast cancer n0.
wherein π0 denotes the proportion of not differentially correlated genes. This is a standard method for estimating the FDR from a subsampling or permutation analysis, see for example . For breast cancer, the number of differentially correlated genes turned out to be small compared to the total number of genes. Therefore, slightly overestimating the FDR, we used the approximation π0=1.
We generated a large gene expression data set of breast cancer (1317 samples) by fusion of publicly available microarray data sets. Raw data of GSE1456, GSE2034, GSE4922, GSE6532, GSE7390 and GSE11121 with respectively 159, 286, 327, 578, 198 and 200 samples were downloaded from the Gene Expression Omnibus (GEO) website . All the samples were analyzed using the Affymetrix Human Genome U133A microarray. As remarked in  some of the samples were contained in two or more data sets. Thus, we removed 431 samples and ended up with a breast cancer gene expression data set of 1317 unique samples. The raw data were preprocessed using the mas5 protocol as implemented in the R package affy and transformed to log2 scale. All samples consisted of surgical collected fresh-frozen tissue of primary tumors without neoadjuvant treatment.
A large number of genes was represented by more than one microarray probe. In this case, we selected the probe with the highest expression level resulting in a gene expression data set of 12703 unique genes. Immunohistochemistry (IHC) and in situ hybridization (ISH) where necessary are the gold standard for the determination of the ER and HER2 status. However, immunohistochemical data of ER and HER2 protein expression were not available for all samples. Hence, ER and HER2 classification was performed using the expression level of the estrogen receptor 1 gene (probe 205225_at) and the HER2 gene (probe 216836_s_at) from the microarray data (Additional file 2). A high concordance between RNA based determination of ER and HER2 states and the IHC based standard method was demonstrated before [22, 31]. A value of 10 was chosen as threshold for the ER status and a value of 12 as threshold for the HER2 status.
Visualization and functional analysis
Heatmaps were generated using the R function heatmap. Hierarchical clustering was executed using the average linkage method with Pearson correlations as similarity measure. Prior to the analysis, the expression level of each gene was centered to mean 0 and standard deviation 1. Construction and analysis of networks was carried out using the R package igraph. Visualization of the networks was realized using Cytoscape. Gene enrichment analysis was executed using DAVID [34, 35] with the genes represented by the microarray as background.
Two algorithms were developed for the detection of differential correlation in different disease states (Figure 1). The algorithms are based on the detection of either global (DCglob) or local (DCloc) changes in the topology of the correlation network. Both algorithms include the analysis of correlation networks corresponding to a series of correlation thresholds that covers the range of correlations in the data.
Identification of differentially correlated genes
Distribution of ER and HER2 status
The genes were ranked according to the strength of differential correlation p (DCglob) and d (DCloc), see Additional files 3 and 4. The statistic p can be interpreted as the significance of the range of correlations, where the gene under consideration takes part in a change of global topology. The statistic d can be interpreted as topological dissimilarity of the networks in the neighborhood of the gene under consideration. Stronger differential correlation corresponds to lower p, but higher d.
Numbers of detected genes by DCglob and by DCloc
Variation of the cutoff parameters
Differential correlation vs. differential expression
We looked for differential expression between the breast cancer subtypes using the standard approach of Welch’s test. After using the Benjamini-Hochberg (BH) method for multiple testing correction and a FDR of 5%, 55% of all genes were differentially expressed between ER+ and ER-, and 31% were significantly differentially expressed between HER2+ and HER2-. Among the differentially correlated genes identified by DCglob (p<0.1), 76% were differentially expressed between ER+ and ER- and 48% were differentially expressed between HER2+ and HER2-. For DCloc (d>0.25), the percentages were similar (73% and 47%). Thus, DC analysis provided additional information beyond DE analysis. As an example, the marker genes for the invasive apocrine subtype of breast cancer acyl-CoA synthetase medium chain 1 (ACSM1) and hydroxyprostaglandin dehydrogenase (PGDH) exhibit strong differential correlation (DCglob p=8.0E-05 and p=0.0004; DCloc d=0.26 and d=0.27), but they would not be detected in a differential expression analysis (p=0.21 and p=0.55 after Benjamini-Hochberg correction).
Functional analysis of the differentially correlated genes
Gene enrichment analysis (ER+ vs. ER-)
Growth factor binding
Gene enrichment analysis (HER2+ vs. HER2-)
Mitotic cell cycle
Regulation of ubiquitin-protein
ligase activity during
mitotic cell cycle
ent protein catabolic process
Mitochondrial membrane part
MHC protein complex
Growth factor binding
NADH dehydrogenase activity
ATP synthesis coupled
Cluster 1 (orange) included ACSM1 and PGDH, two genes that were described as markers for invasive apocrine carcinomas (IACs), a subgroup of ER- breast cancer recently studied by Celis et al. [37, 38]. It was enriched with other markers of IAC (Table 5). Cluster 2 (dark-blue) consisted of genes that are up-regulated in HER2+ breast cancer, including ERBB2, GRB7, STARD3 and PSMD3 that are located in the HER2 amplicon . It was significantly enriched for genes of the HER2 amplicon (Table 5). Cluster 3 (red) was highly enriched with marker genes for the androgen-responsive subgroup of ER- breast cancer described by Doane et al. . Cluster 4 (blue-green) contained some cell cycle genes (CDC16, TFDP1). Cluster 5 (light-blue) contained FOXC1, a gene with regulatory and prognostic relevance in triple-negative breast cancer . Cluster 6 (green) contained many genes that are related to ATPase activity (ATP5J, DHX15, CCT8, PSMC6 and ATP5O). Finally, Cluster 7 (purple) contained genes coding for keratins (KRT18, KRT19, KRT7, KRT8), claudins (CLDN3, CLDN4) and CD24. While part of the correlations in Cluster 2 (HER2), Cluster 4 and Cluster 6 (ATPase activity) are preserved in ER+ breast cancer, Cluster 1 (IAC markers), Cluster 3 (AR signaling) and Cluster 5 are completely rearranged in ER+ breast cancer.
DCloc identified only two different clusters. Among others, Cluster 1 (light-blue) contains genes related to the immune response (PTPRC, SIT1, CXCL13, FAIM3, IFI35). Like the third cluster identified by DCglob, the second cluster includes AR and FOXA1. Among the 25 genes present in this cluster, 18 are also present in Cluster 3 of the DCglob analysis. The two clusters identified by DCloc are not completely disarranged in the ER+ subtype, but the correlation between the genes in the clusters is much weaker in the ER+ compared to the ER- subtype.
Further, we investigated the reproducibility of lists of differentially correlated genes in dependence of the sample size. When comparing the list of the top 5% genes in the training data set to the list of the top 5% genes in the validation data set, the reproducibility raised up to 33% for DCglob and to 49% for DCloc (Figure 5C). When relaxing the reproducibility condition to the list of the top 20% genes in the validation set, this percentage reached 64% for DCglob and 80% for DCloc (Figure 5D).
Discussion and conclusions
We developed two novel algorithms for the detection of differential correlation (DC) in high-dimensional molecular data. Both approaches are untargeted in the sense that they do not depend on predefined gene modules and start with mapping of the correlation structures to networks. DCglob analyzes global changes of network topology, while DCloc analyzes local changes of the network topology. An innovative ingredient of the algorithms is the analysis of a series of networks that are constructed from a series of correlation thresholds. Therefore, detection of different kinds of correlation changes (strong changes of a few genes or moderate changes of many genes) is feasible. As default setting, the networks are constructed from an equidistant (after Fisher-transformation) series of 100 correlation thresholds between 0 and the maximal correlation in the data set. For the global algorithm, we increased the number of thresholds to 200 to achieve a more precise ranking of the genes. Using 200 instead of 100 thresholds did not significantly change the lists of DC genes. For both algorithms, a statistic for the strength of DC (p or d) is calculated as average of topological changes over all correlation thresholds. Using averaging as method for summary ensures that changes at different correlation thresholds are taken into account with equal weight. This i appropriate in a situation without prior knowledge on the strength and biological relevance of correlation changes.
An essential part of the analysis is to work with two equal-sized sample groups that are compared for differential correlations. The number of available HER2+ samples was 208 and therefore we randomly drew 208 ER+, 208 ER- and 208 HER2- samples for analysis. Unequal samples size would lead to a different sensitivity and/or specificity for the detection of correlations and therefore would lead to false positive discoveries in the DC analysis. We therefore recommend to work with equal-sized sample groups.
The results of DCloc and DCglob were consistent: Pearson correlations between −p and d were strong and highly significant (p<2E-16) for the analysis of estrogen receptor status (R = 0.59) and the analysis of HER2 status (R = 0.64). Extensive subsampling analysis was carried out to demonstrate the significance of the findings, and to estimate FDRs associated with the lists of differentially correlated genes. Both algorithms detected a significant number of differentially correlated genes in molecular subtypes of breast cancer compared with the null model. In the analysis of breast cancer subtypes, DCloc turned out to be more sensitive than DCglob (Figure 2C). This was in line with a higher number of significantly enriched functional categories for DCloc(Table 3). Additionally, the performance of DCloc was considerable better than DCglob in terms of reproducibility when analyzing two independent breast cancer data sets (Figure 5). Assuming that 1/3 of the detected genes should reproduce in two-fold cross-validation, 90 patients in each group were appropriate for DCloc and 130 patients in each group were appropriate for DCglob. However, the results of the global algorithm were more easily interpretable in the heatmap analysis (Figure 3).
Gene enrichment analysis revealed significantly enriched terms within the lists of differentially correlated genes. We tested heatmaps and networks as tools for visualization and in-depth analysis of the correlation changes. Clustering and heatmap analysis turned out to be particularly useful for a biological interpretation of the correlation patterns. ER- tumors tend to be more aggressive and are more difficult to treat than ER+ tumors. Therefore, we analyzed the correlation patterns in ER- tumors in more detail. Interestingly, among the clusters of genes detected by DCglob, there were many genes that could serve markers for substratification of the ER- or the triple-negative subtype (see Table 5).
We detected different types of changes in the correlation patterns between ER+ and ER- breast cancer. For example, the structure of the cluster associated with the HER2 amplicon (Cluster 2) was relatively well preserved. However, there was a difference in correlation strength, which can be possibly explained by the unequal distribution of the HER2 tumors between the ER+ and the ER- subtype (see Table 1). In contrast, there were gene clusters with strong correlation in ER- breast cancer and almost no correlation in ER+ breast cancer (Clusters 1, 3 and 5).
As an example, the genes in Cluster 1 were strongly correlated in ER- breast cancer, but most of them were uncorrelated in ER+ breast cancer. In the marker gene enrichment analysis, we found a highly significant overlap between these genes and marker genes of invasive apocrine carcinomas (IACs). In fact, the genes of Cluster 1 were highly expressed in a small subgroup of triple-negative (ER- and HER2-) breast cancer (10 tumors), but poorly expressed in the remaining ER- tumors and in all ER+ tumors. The two major marker genes for IACs , PGDH and ACSM1, were highly differentially correlated, but not differentially expressed between the ER- and the ER+ subtype.
Interestingly, the genes in Cluster 3 were highly expressed in the ER-/HER2+ tumors and the IACs, but not in the remaining triple-negative tumors. The marker gene enrichment analysis demonstrated a highly significant overlap of this cluster with AR signaling. The AR gene itself turned out to be considerably higher expressed in ER+ tumors compared to ER- tumors (fold change =3.01, p=1.2E-30). This result is in agreement with immunohistochemical data showing that the number of androgen receptor positive tumors is larger in the ER+ subtype . Further analysis showed that the expression of most of the genes of the AR signaling cluster was high in the ER+ tumors, but variable in the ER- tumors. Thus, the low correlation of the AR signaling genes in ER+ tumors is a consequence of the missing variance of the pathway in these tumors, where it is always highly expressed. These observations suggest that AR signaling is always active in ER+ tumors, while it is under regulation (active or inactive) in ER- tumors. AR signaling based stratification is of interest in the ER- subtype, but not in the ER+ subtype, in line with a recent result that high AR protein expression was associated with better survival in triple-negative breast cancer .
In summary, functionally relevant pathways (Table 3, 4 and 5) could be identified that show highly correlated gene expression in one of the subtypes, but not in the complementary subtype. Co-expression of a pathway is likely to be a consequence of pathway regulation, for example by transcription factors. A pathway highly expressed in a cancer subtype is potentially supporting or even essential for the growth of tumors cells in this kind of tumor. Thus, the clusters identified by DC analysis can be beneficial for patient stratification and can represent interesting targets for new therapies. In this context we re-identified the HER2-amplicon that is successfully targeted by trastuzumab or other anti-HER2 drugs .
These examples illustrate that DC analysis, in particular network topology based approaches, can help to identify biologically important gene clusters beyond the results of conventional DE analysis. Using DCloc and DCglob, we detected hundreds of differentially correlated genes in the molecular subtypes of breast cancer, while keeping the FDR under control. In a two-fold cross-validation approach we showed that results of both algorithms were reproducible in an independent data set.
Within the last decade, a multitude of methods and algorithms were developed for the detection of differential correlations. The algorithms address different biological questions and it is difficult to decide which of the algorithms works best in terms of power and of biological interpretability. However, statistical properties like specificity and reproducibility could be evaluated in a comparable way for many of the algorithms. We believe that statistical evaluation should be stronger emphasized in future studies of differential correlations.
False discovery rate
Human epidermal growth factor receptor 2
Invasive apocrine carcinoma
This project was funded by the European Commission, FP7 grant #278659 (RESPONSIFY).
- Allison D, Cui X, Page G, Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006, 7: 55-65. 10.1038/nrg1749.PubMedView ArticleGoogle Scholar
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Nat Acad Sci. 1998, 95 (25): 14863-14868. 10.1073/pnas.95.25.14863.PubMedPubMed CentralView ArticleGoogle Scholar
- Sørlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Lønning PE, Børresen-Dale AL: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Nat Acad Sci. 2001, 98 (19): 10869-10874. 10.1073/pnas.191367098.PubMedPubMed CentralView ArticleGoogle Scholar
- Li KC: Genome-wide coexpression dynamics: Theory and application. Proc Nat Acad Sci. 2002, 99 (26): 16875-16880. 10.1073/pnas.252466999.PubMedPubMed CentralView ArticleGoogle Scholar
- Lai Y, Wu B, Chen L, Zhao H: A statistical method for identifying differential gene-gene co-expression patterns. Bioinformatics. 2004, 20 (17): 3146-3155. 10.1093/bioinformatics/bth379.PubMedView ArticleGoogle Scholar
- Kostka D, Spang R: Finding disease specific alterations in the co-expression of genes. Bioinformatics. 2004, 20 (Suppl 1): i194-i199. 10.1093/bioinformatics/bth909.PubMedView ArticleGoogle Scholar
- de la Fuente A: From ‘differential expression’ to ‘differential networking’ – identification of dysfunctional regulatory networks in diseases. Trends Genet. 2010, 26 (7): 326-333. 10.1016/j.tig.2010.05.001.PubMedView ArticleGoogle Scholar
- Tesson B, Breitling R, Jansen R: DiffCoEx: a simple and sensitive method to find differentially coexpressed gene modules. BMC Bioinformatics. 2010, 11: 497-10.1186/1471-2105-11-497.PubMedPubMed CentralView ArticleGoogle Scholar
- Choi Y, Kendziorski C: Statistical methods for gene set co-expression analysis. Bioinformatics. 2009, 25 (21): 2780-2786. 10.1093/bioinformatics/btp502.PubMedPubMed CentralView ArticleGoogle Scholar
- Southworth L, Owen A, Kim S: Aging mice show a decreasing correlation of gene expression within genetic modules. PLoS Genet. 2009, 5 (12): e1000776-10.1371/journal.pgen.1000776.PubMedPubMed CentralView ArticleGoogle Scholar
- Taylor I, Linding R, Warde-Farley D, Liu Y, Pesquita C: Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat Biotechnol. 2009, 27: 199-204. 10.1038/nbt.1522.PubMedView ArticleGoogle Scholar
- Teschendorff A, Severini S: Increased entropy of signal transduction in the cancer metastasis phenotype. BMC Syst Biol. 2010, 4: 104-10.1186/1752-0509-4-104.PubMedPubMed CentralView ArticleGoogle Scholar
- Ihmels J, Bergmann S, Berman J, Barkai N: Comparative gene expression analysis by differential clustering approach: application to the Candida albicans transcription program. PLoS Genet. 2005, 1: e39-10.1371/journal.pgen.0010039.PubMedPubMed CentralView ArticleGoogle Scholar
- Watson M: CoXpress: differential co-expression in gene expression data. BMC Bioinformatics. 2006, 7: 509-509. 10.1186/1471-2105-7-509.PubMedPubMed CentralView ArticleGoogle Scholar
- Choi J, Yu U, Yoo O, Kim S: Differential coexpression analysis using microarray data and its application to human cancer. Bioinformatics (Oxford, England). 2005, 21 (24): 4348-4355. 10.1093/bioinformatics/bti722.View ArticleGoogle Scholar
- Voy B, Schar J, Perkins A, Saxton A, Borate B, Chesler E, Branstetter L, Langston M: Extracting gene networks for low-dose radiation using graph theoretical algorithms. PloS Comput Biol. 2006, 2 (7): e89-10.1371/journal.pcbi.0020089.PubMedPubMed CentralView ArticleGoogle Scholar
- Zhang B, Li H, Riggins RB, Zhan M, Xuan J, Zhang Z, Hoffman EP, Clarke R, Wang Y: Differential dependency network analysis to identify condition-specific topological changes in biological networks. Bioinformatics. 2009, 25 (4): 526-532. 10.1093/bioinformatics/btn660.PubMedPubMed CentralView ArticleGoogle Scholar
- Altay G, Asim M, Markowetz F, Neal D: Differential C3NET reveals disease networks of direct physical interactions. BMC Bioinformatics. 2011, 12: 296-10.1186/1471-2105-12-296.PubMedPubMed CentralView ArticleGoogle Scholar
- Amar D, Safer H, Shamir R: Dissection of regulatory networks that are altered in disease via differential co-expression. PLoS Comput Biol. 2013, 9 (3): e1002955-10.1371/journal.pcbi.1002955.PubMedPubMed CentralView ArticleGoogle Scholar
- Gruvberger S, Ringnér M, Chen Y, Panavally S, Saal LH, Borg A, Fernö M, Peterson C, Meltzer PS: Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res. 2001, 61 (16): 5979-5984.PubMedGoogle Scholar
- van’t Veer L, Dai H, van de Vijver M, He Y, Hart A, Mao M, Peterse H, van der Kooy K, Marton M, Witteveen A, Schreiber G, Kerkhoven R, Roberts C, Linsley P, Bernards R, Friend S: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415 (6871): 530-536. 10.1038/415530a.View ArticleGoogle Scholar
- Budczies J, Weichert W, Noske A, Müller B, Weller C, Wittenberger T, Hofmann H, Dietel M, Denkert C, Gekeler V: Genome-wide gene expression profiling of formalin-fixed paraffin-embedded breast cancer core biopsies using microarrays. J Histochem Cytochem. 2011, 59 (2): 146-157. 10.1369/jhc.2010.956607.PubMedPubMed CentralView ArticleGoogle Scholar
- Freudenberg J, Sivaganesan S, Wagner M, Medvedovic M: A semi-parametric Bayesian model for unsupervised differential co-expression analysis. BMC Bioinformatics. 2010, 11: 234-10.1186/1471-2105-11-234. [http://www.biomedcentral.com/1471-2105/11/234] 10.1186/1471-2105-11-234PubMedPubMed CentralView ArticleGoogle Scholar
- Tegge AN, Caldwell CW, Xu D: Pathway correlation profile of gene-gene co-expression for identifying pathway perturbation. PLoS ONE. 2012, 7 (12): e52127-10.1371/journal.pone.0052127.PubMedPubMed CentralView ArticleGoogle Scholar
- R Development Core Team: R: A Language and Environment for Statistical Computing. 2011, Vienna, Austria: R Foundation for Statistical Computing, [http://www.R-project.org/] [ISBN 3-900051-07-0]Google Scholar
- Steiger J: Test for comparing elements of a correlation matrix. Psychol Bull. 1980, 87: 245-251.View ArticleGoogle Scholar
- Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Nat Acad Sci. 2003, 100 (16): 9440-9445. 10.1073/pnas.1530509100.PubMedPubMed CentralView ArticleGoogle Scholar
- Edgar R, Domrachev M, Lash AE: Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30: 207-210. 10.1093/nar/30.1.207.PubMedPubMed CentralView ArticleGoogle Scholar
- Györffy B, Lanczky A, Eklund AC, Denkert C, Budczies J, Li Q, Szallasi Z: An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients. Breast Cancer Res Treat. 2010, 123 (3): 725-731. 10.1007/s10549-009-0674-9.PubMedView ArticleGoogle Scholar
- Gautier L, Cope L, Bolstad BM, Irizarry RA: Affy—analysis of affymetrix GeneChip data at the probe level. Bioinformatics. 2004, 20 (3): 307-315. 10.1093/bioinformatics/btg405.PubMedView ArticleGoogle Scholar
- Müller BM, Kronenwett R, Hennig G, Euting H, Weber K, Bohmann K, Weichert W, Altmann G, Roth C, Winzer KJ, Kristiansen G, Petry C, Dietel M, Denkert C: Quantitative determination of estrogen receptor, progesterone receptor, and HER2 mRNA in formalin-fixed paraffin-embedded tissue–a new option for predictive biomarker assessment in breast cancer. Diagn Mol Pathol. 2011, 20: 1-10. 10.1097/PDM.0b013e3181e3630c.PubMedView ArticleGoogle Scholar
- Csardi G, Nepusz T: The igraph software package for complex network research. InterJournal Complex Systems. 2006,, 1695-Google Scholar
- Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T: Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011, 27 (3): 431-432. 10.1093/bioinformatics/btq675.PubMedPubMed CentralView ArticleGoogle Scholar
- Huang DW, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID Bioinformatics resources. Nat Protoc. 2009, 4 (1): 44-57.View ArticleGoogle Scholar
- Huang DW, Sherman BT, Lempicki RA: Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009, 37 (1): 1-13. 10.1093/nar/gkn923.PubMed CentralView ArticleGoogle Scholar
- Bauer K, Parise C, Caggiano V: Use of ER/PR/HER2 subtypes in conjunction with the 2007 St Gallen consensus statement for early breast cancer. BMC Cancer. 2010, 10: 228-10.1186/1471-2407-10-228.PubMedPubMed CentralView ArticleGoogle Scholar
- Celis JE, Gromov P, Cabezon T, Moreira JMA, Friis E, Jirstrom K, Llombart-Bosch A, Timmermans-Wielenga V, Rank F, Gromova I: 15-Prostaglandin dehydrogenase expression alone or in combination with ACSM1 defines a subgroup of the apocrine molecular subtype of breast carcinoma. Mol Cell Proteomics. 2008, 7 (10): 1795-1809. 10.1074/mcp.R800011-MCP200.PubMedView ArticleGoogle Scholar
- Celis JE, Cabezon T, Moreira JM, Gromov P, Gromova I, Timmermans-Wielenga V, Iwase T, Akiyama F, Honma N, Rank F: Molecular characterization of apocrine carcinoma of the breast: Validation of an apocrine protein signature in a well-defined cohort. Mol Oncol. 2009, 3 (3): 220-237. 10.1016/j.molonc.2009.01.005.PubMedView ArticleGoogle Scholar
- Staaf J, Jonsson G, Ringner M, Vallon-Christersson J, Grabau D, Arason A, Gunnarsson H: High-resolution genomic and expression analyses of copy number alterations in HER2-amplified breast cancer. Breast Cancer Res. 2010, 12 (3): R25-10.1186/bcr2568.PubMedPubMed CentralView ArticleGoogle Scholar
- Doane A, Danso M, Lal P, Donaton M, Zhang L, Hudis C, Gerald W: An estrogen receptor-negative breast cancer subset characterized by a hormonally regulated transcriptional program and response to androgen. Oncogene. 2006, 25 (28): 3994-4008. 10.1038/sj.onc.1209415.PubMedView ArticleGoogle Scholar
- Ray PS, Wang J, Qu Y, Sim MS, Shamonki J, Bagaria SP, Ye X, Liu B, Elashoff D, Hoon DS, Walter MA, Martens JW, Richardson AL, Giuliano AE, Cui X: FOXC1 is a potential prognostic biomarker with functional significance in basal-like breast cancer. Cancer Res. 2010, 70 (10): 3870-3876. 10.1158/0008-5472.CAN-09-4120.PubMedView ArticleGoogle Scholar
- Loibl S, Müller B, von Minckwitz G, Schwabe M, Roller M, Darb-Esfahani S, Ataseven B, du Bois A, Fissler-Eckhoff A, Gerber B, Kulmer U, Alles J, Mehta K, Denkert C: Androgen receptor expression in primary breast cancer and its predictive and prognostic value in patients treated with neoadjuvant chemotherapy. Breast Cancer Res Treat. 2011, 130 (2): 477-87. 10.1007/s10549-011-1715-8.PubMedView ArticleGoogle Scholar
- Smith I, Procter M, Gelber R, Guillaume S, Feyereislova A, Dowsett M, Goldhirsch A, Untch M, Mariani G, Baselga J: 2-year follow-up of trastuzumab after adjuvant chemotherapy in HER2-positive breast cancer: a randomised controlled trial. Lancet. 2007, 369 (9555): 29-36. 10.1016/S0140-6736(07)60028-2.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.