- Research article
- Open Access
Regulation patterns in signaling networks of cancer
BMC Systems Biology volume 4, Article number: 162 (2010)
Formation of cellular malignancy results from the disruption of fine tuned signaling homeostasis for proliferation, accompanied by mal-functional signals for differentiation, cell cycle and apoptosis. We wanted to observe central signaling characteristics on a global view of malignant cells which have evolved to selfishness and independence in comparison to their non-malignant counterparts that fulfill well defined tasks in their sample.
We investigated the regulation of signaling networks with twenty microarray datasets from eleven different tumor types and their corresponding non-malignant tissue samples. Proteins were represented by their coding genes and regulatory distances were defined by correlating the gene-regulation between neighboring proteins in the network (high correlation = small distance). In cancer cells we observed shorter pathways, larger extension of the networks, a lower signaling frequency of central proteins and links and a higher information content of the network. Proteins of high signaling frequency were enriched with cancer mutations. These proteins showed motifs of regulatory integration in normal cells which was disrupted in tumor cells.
Our global analysis revealed a distinct formation of signaling-regulation in cancer cells when compared to cells of normal samples. From these cancer-specific regulation patterns novel signaling motifs are proposed.
Endogenous signal transduction in cancer cells is systematically disturbed to redirect the cellular decisions from differentiation and apoptosis to proliferation and, later, invasion. Cancer cells acquire their malignancy through accumulation of advantageous gene mutations by which the necessary steps to malignancy are obtained. These selfish adaptations to independence can be described as the result of an evolutionary process of diversity and selection. We were interested in observing the resulting cellular signal transduction from a global view. Experimental high-throughput methods such as gene expression profiling with microarrays enable investigating the pathogenic function of tumors on a mesoscopic level. Large-scale gene expression profiles were successfully used to predict clinical outcome [4, 5] and improved risk estimation. However, these studies did not relate genes and their expression to a functional context. To gain an understanding from a systems view, gene expression can be mapped onto cellular networks. Several studies have used gene expression data from microarrays to describe specific characteristics of signaling networks in cancer. Discriminative components of a protein-protein interaction network were identified by comparing gene expression patterns of metastatic and non-metastatic tumors in breast cancer and were suited as risk markers for metastasis of breast cancer. New genetic mediators for prostate cancer were found with networks that were reverse engineered from gene expression profiles. Besides this, insights into evolutionary principles were gained by the analysis of gene expression profiles. Gene expression differences were used to define phylogenetic relationships of several Drosophila species and a molecular clock for primates. Furthermore, the regulation of signaling in yeast was investigated on a global scale to observe regulatory adaptation to the cellular environment. Yeast responded to exogenous signals by shorter regulatory cascades to enable fast signal propagation.
The aim of our work was to detect characteristic signaling properties of cancer cells on a global scale. We compared the regulation of signaling pathways in cancer with normal cells and mapped gene expression data of tumors and their corresponding non-malignant ("normal") samples onto a comprehensive protein-protein interaction network. For inferring regulation principles in cellular signal transduction, we used a graph searching algorithm that tracked pathways with the highest correlation in regulation. We investigated twenty tumor datasets comprising acute myeloid leukemia, esophageal squamous cell-, lung adeno- and renal clear cell carcinoma, breast-, cervical-, head-and-neck-, oral-tongue-, pancreas- and prostate cancer, and vulva interstitial neoplasia. The investigated tumors showed shorter pathways, but a larger extension of the network. The tumors displayed a lower frequency of central proteins and links and a higher information entropy (Shannon's information content) in their network. These findings were embedded into a novel signal-regulation motif which was observed considerably more often in normal cells when compared to tumor cells (Figure 1). Similar to the study of Cui and co-workers, central proteins (hubs) were enriched with cancer mutations. We observed that these proteins showed higher regulation-integrity in the normal samples whereas the tumor samples showed motifs of regulatory maintenance of the neighbors of hubs.
Constructing the signaling networks
We assembled our signaling network employing a comprehensive data repository of known protein-protein interactions from the literature (HPRD: Human Protein Reference Database [13, 14], version 9 from April 13th, 2010). Proteins were represented by their coding genes and are also denoted as nodes of the networks in the following. Gene expression data of each cancer dataset (malignant cells) and the corresponding set of normal samples (non-malignant cells) were mapped onto the nodes of the network. Depending on the coverage of the probes on the microarray chips, the intersection with the HPRD network comprised 5574 to 8651 nodes, including 559 to 706 receptors and 505 to 617 transcription factors (Table 1). Similar to Luscombe and co-workers, we assumed that signal propagation is most likely where the genes of two neighboring proteins in the network are highly co-regulated. We calculated protein-protein distances for each link (link-distances) from the co-regulation (one minus the absolute value of Pearson's correlation) of the two interacting proteins (Additional file 1: Supplemental Figure S1). The link-distances were higher (lower absolute correlation) in cancer cells compared to normal cells (average of average link-distances in normal: 0.34, and tumor: 0.52, P = 1.53E-05, Table 1). We defined pathways for each pair of receptors (signal-operator) and transcription factors (signal-receiver) by their shortest paths, yielding a range of 282,295 to 435,602 pathways for each of the investigated cancer datasets. The tumor cells showed a distinctly higher coverage of the original protein-interaction network for these pathways. Table 1 gives an overview of the network data for the different datasets we analyzed and also the network-coverage of all receptor-transcription-factor pathways for the tumors and the reference samples. From these pathways we constructed specific networks for each tumor and reference sample. For each tumor and normal sample, the constructed networks consisted only of those links and nodes that appeared at least once in their receptor-transcription-factor pathways. Links and nodes that did not appear were discarded (Figure 2 shows the number of nodes in all constructed networks of normal and cancer tissues). We were interested in whether these networks were specific for the respective tumor type. For this, we extracted all somatically mutated genes for specific cancer tissues from a database (COSMIC) and tested whether our tumor networks contained genes that have been described specifically for the respective tumors. We performed enrichment tests (Fisher's exact tests) and found that all tumor networks showed a significant enrichment of their corresponding mutated tumor genes (Additional file 1: Supplemental Table S1).
Tumors use shorter paths, more links and fewer hubs
We calculated a variety of network features to characterize specific differences in signaling-regulation of tumor cells and non-malignant cells. The results are given in Table 2 and Table 3 and are explained in the following. To get a reasonable estimate of the general tendency of tumors, we calculated the average over all datasets for cancer and normal networks and performed a significance test of the pair-wise differences between tumor and normal (paired, non-parametric Wilcoxon signed-rank test).
The average path-length of cancer networks was shorter than that of non-malignant networks (average for cancer: 4.58, and normal: 5.50, P = 3.82E-05). We wanted to know how often the same links (interactions) were used for different signaling pathways. For this, we defined the frequency of a link (link-frequency) as the number of receptor-transcription-factor pathways it was involved in. The average link-frequency was obtained by summing the number of links used in every single pathway from each receptor to each transcription factor and dividing by the number of distinct links used. The average link-frequency was higher in normal cells (average of average link-frequency for cancer: 122.6, and normal: 234.4, P = 1.53E-05). Similarly, the node frequency was calculated and showed the same tendency (average for cancer: 524.3, and normal: 723.4, P = 2.29E-05). Hence networks of normal cells used the same central proteins and interactions more often for different signaling tasks. Such a hub-like structure is the central characteristic of scale-free networks. We were interested in whether the networks for cancer and normal samples followed these characteristics and whether there were distribution differences between them. Indeed, the link-frequency distribution of the networks of both entities followed a power law (the probability to draw a link with frequency f is proportional to $f^{-\alpha}$ with $\alpha > 1$). In comparison to the networks from normal cells, the distributions of tumors showed a steeper decline. We calculated the exponent α of the distribution and observed larger exponents for cancer networks (P = 1.91E-06). As an example, Figure 3 shows the distributions and the regression function for cervical cancer 1; the distributions for all datasets are given in Additional file 1: Supplemental Figure S2. This agrees with the lower average of their link-frequency. These distributions also show that proteins of high connectivity (hubs) are more abundant in the networks of normal cells (Additional file 1: Supplemental Figure S3 shows some illustrations of networks). The clustering coefficient has been employed as a measure of connectedness of networks. We calculated the clustering coefficient and obtained lower values for the networks of cancer cells, supporting our finding that cancer showed a tendency towards a less centralized, less hub-dependent formation (average of cancer: 0.118, and normal: 0.125, P = 4.20E-04). Specifically, the number of nodes with a clustering coefficient greater than zero was distinctively higher in cancer cells (average for cancer: 2208 and normal: 1956, P = 7.63E-05).
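The link-frequency and node-frequency defined above can be illustrated with a minimal sketch. It assumes that each pathway is available as an ordered list of node names; the function names and the toy pathways are hypothetical and not taken from the study. Whether receptors and transcription factors themselves count towards the node frequency is not stated in the text, so counting every node on a path is an assumption here.

```python
from collections import Counter

def link_and_node_frequency(paths):
    """Count in how many pathways each undirected link and each node occurs."""
    link_freq = Counter()
    node_freq = Counter()
    for path in paths:
        node_freq.update(path)  # assumption: end points of a path are counted too
        for a, b in zip(path, path[1:]):
            link_freq[frozenset((a, b))] += 1  # undirected link
    return link_freq, node_freq

def average_frequency(freq):
    """Total number of uses divided by the number of distinct links (or nodes)."""
    return sum(freq.values()) / len(freq)

# Hypothetical toy pathways (receptor -> ... -> transcription factor)
toy_paths = [["R1", "A", "TF1"], ["R2", "A", "TF1"], ["R1", "B", "TF2"]]
toy_link_freq, toy_node_freq = link_and_node_frequency(toy_paths)
```

In this toy example the link A-TF1 is shared by two pathways, which is the kind of reuse that raises the average link-frequency in a centralized network.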
Frequently involved genes are enriched with cancer mutated genes
Cui and co-workers compiled a selective list of 284 cancer mutated genes which were derived from large scale sequencing and the literature (Supplementary Table S10 in ). We compared this list with the 50 most frequently involved nodes (our hubs) of each network and found significant enrichment for 19 out of 20 normal and tumor datasets (Additional file 1: Supplemental Table S2). We then defined gene-lists of cancer mutated hubs for every cancer by intersecting the hubs of our network with the list of cancer mutated genes of Cui et al. (Additional file 1: Supplemental Table S3). Interestingly, most of the genes which showed up in the tumor networks were also present in the normal networks. This may indicate that normal cells intrinsically pave the way for their specific evolvement into malignancy.
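An enrichment test of this kind can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below assumes that the gene universe is the set of all nodes of a network; this choice, the function name and the example numbers are illustrative assumptions and not details given in the text.

```python
from math import comb

def enrichment_p_value(n_universe, n_cancer_genes, n_hubs, n_overlap):
    """One-sided Fisher's exact test: probability of drawing at least
    n_overlap cancer genes when n_hubs genes are drawn from n_universe
    genes, of which n_cancer_genes are cancer mutated."""
    total = comb(n_universe, n_hubs)
    upper = min(n_cancer_genes, n_hubs)
    tail = sum(
        comb(n_cancer_genes, k) * comb(n_universe - n_cancer_genes, n_hubs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 3000 network genes, 120 cancer genes among them,
# 50 hubs of which 8 are cancer genes
p = enrichment_p_value(3000, 120, 50, 8)
```

A small p-value would indicate that the hubs contain more cancer mutated genes than expected by chance under these assumptions.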
Signaling-regulation in cancer is detached at cancer mutated hubs but maintained in their vicinity
Uri Alon and his co-workers studied occurrences of direction-motifs in triangles and revealed a large variety of substantial characteristics in signaling networks characterized by consistent and non-consistent feed-forward and feedback loops. We were interested in local regulation patterns of the networks at cancer mutated hubs. For this, we analyzed regulation motifs of every triangle consisting of exactly one hub and two of its neighbors which themselves also interact. We defined two regulation motifs. The first motif reflected the degree of regulatory integration of a hub and its network-vicinity and was defined by a high correlation of all pairs of nodes in the triangle (integrated motif, motif A in Figure 4). We found this motif significantly more often in normal cells (P = 1.7E-03, Table 3). The second motif (maintenance motif, motif B in Figure 4) described triangles in which the hub-neighbor pairs (hub-n1, hub-n2) showed high correlation in one tissue type and no correlation in the other, while the mutual correlation of the neighbors n1-n2 stayed in the same category (no, low or high correlation). Such a scenario is reasonable for a mutated cancer protein with loss of function that leaves its neighbors unaffected. Indeed, this motif occurred more often in the cancer networks (P = 6.34E-04, Table 3).
Tumor networks are more robust against directed attacks
Albert and co-workers showed that scale-free networks are error tolerant only against attacks on randomly selected nodes but not against directed removals of central nodes (hubs). We were interested in the robustness of the networks when removing their hubs. For this, we removed the most frequently involved nodes of every network and calculated the average of pair-wise distances (average network diameter) as an estimate of the fragility of the networks. The relative increase of the network diameter due to the removal was larger in normal cells compared to cancer cells (average for cancer: 1.59, average for normal: 1.64, P = 0.021, Table 2), indicating higher robustness of the tumor networks against directed attacks at their hubs.
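The hub-removal experiment can be sketched as follows. The sketch assumes an undirected network stored as a dictionary of neighbor sets, measures distances as unweighted hop counts, and averages only over node pairs that remain connected; these choices and all function names are assumptions made for illustration, since the text does not specify them.

```python
from collections import deque

def average_path_length(adj):
    """Average shortest-path length (in hops) over all connected node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def remove_nodes(adj, removed):
    """Return a copy of the network without the given nodes."""
    removed = set(removed)
    return {u: {v for v in nbrs if v not in removed}
            for u, nbrs in adj.items() if u not in removed}

def diameter_increase(adj, node_freq, fraction=0.1):
    """Ratio of the average path length after and before removing the most
    frequently used nodes (at least one node is always removed)."""
    n_remove = max(1, int(len(adj) * fraction))
    hubs = sorted(node_freq, key=node_freq.get, reverse=True)[:n_remove]
    return average_path_length(remove_nodes(adj, hubs)) / average_path_length(adj)
```

A ratio close to one means that the network barely depends on its hubs, whereas a large ratio indicates a fragile, hub-dependent topology.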
Lower information content in normal cells
We used the number of pathways each single link was involved in (link-frequency) as an estimate of the probability that information (such as a phosphorylation) was passed through this link. In this simplified model, every pathway was treated equally. As a measure of disorder, Shannon's information entropy was calculated for each network. The cancer networks exhibited a higher information entropy (average for cancer: 11.98, average for normal: 11.38, P = 3.28E-04, Table 3), indicating their higher degree of dispersal.
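As an illustration, the entropy of a network can be computed from its link-frequencies by normalizing them to probabilities. The sketch below uses the logarithm to base 2 (entropy in bits), which is an assumption, as the base is not stated in the text; the function name is hypothetical.

```python
import math

def link_entropy(link_freq):
    """Shannon entropy of the link-usage distribution.
    link_freq maps each link to the number of pathways passing through it.
    Assumption: logarithm to base 2, i.e. the result is given in bits."""
    total = sum(link_freq.values())
    probs = [f / total for f in link_freq.values() if f > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical toy networks with four links each
dispersed = {"l1": 1, "l2": 1, "l3": 1, "l4": 1}    # all links used equally
centralized = {"l1": 5, "l2": 1, "l3": 1, "l4": 1}  # one link dominates
```

When all links are used equally often the entropy is maximal, whereas a network that routes most pathways through a few links has a lower entropy, in line with the lower values reported for normal cells.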
A comparative network motif
Inspired by the results described above, we designed a comparative network-motif which is illustrated in Figure 1. We wanted to put up a model in which cancer cells use different pathways for different tasks whereas normal cells use common signaling interactions for different tasks. Therefore a model was designed such that two pathways (two operator-receiver pairs, R1 - TF and R2 - TF in Figure 1) of the normal tissue shared at least one common link, whereas the same operator-receiver pairs for the tumor did not share any link. We compared the abundance of this motif with the abundance of its counterpart in which the cancer cells used at least one common link and the normal cells did not share any link. We found a significantly higher number of our motif in which the normal cells share a common link (average counts for cancer: 15,333,384, average for normal: 29,618,238, P = 9.54E-06, Table 3).
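Counting the comparative motif can be sketched as below. The sketch assumes that the pathways of both tissues are stored as dictionaries mapping a (receptor, transcription factor) pair to its node list, and it compares every pair of operator-receiver pairs present in both tissues. Whether the two pathways must end at the same transcription factor, as drawn in Figure 1, is not clear from the text, so this general pairing is an assumption, and all names are hypothetical.

```python
from itertools import combinations

def path_links(path):
    """Set of undirected links along a pathway."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def count_comparative_motif(normal_paths, tumor_paths):
    """Return (motif, counterpart): the number of pathway pairs that share
    a link in normal but not in tumor, and the number for the reverse case."""
    common = sorted(set(normal_paths) & set(tumor_paths))
    normal_links = {k: path_links(normal_paths[k]) for k in common}
    tumor_links = {k: path_links(tumor_paths[k]) for k in common}
    motif = counterpart = 0
    for a, b in combinations(common, 2):
        shared_normal = bool(normal_links[a] & normal_links[b])
        shared_tumor = bool(tumor_links[a] & tumor_links[b])
        if shared_normal and not shared_tumor:
            motif += 1
        elif shared_tumor and not shared_normal:
            counterpart += 1
    return motif, counterpart
```

This pairwise comparison grows quadratically with the number of pathways, so for networks of the size analyzed here a more economical counting scheme would be needed in practice.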
We investigated network properties of cancer signaling by looking at co-regulation patterns of genes for different cancer types. We analyzed the general regulatory behavior of correlating gene expression samples of one tumor type and study, rather than analyzing the regulatory behavior of single patients. For this, we calculated a gene to gene distance metric for all samples (patients) of normal and cancerous tissues. The networks of the investigated tumors showed distinctive mechanisms in the regulation of signal transduction when compared to normal cells and had shorter path lengths. Luscombe and co-workers analyzed the dynamics of regulatory networks in yeast . In comparison to endogenously caused changes, they discovered a different topological adaptation of the network when yeast responded to environmental changes. For having quick responses, yeast reacted to environmental changes (nutrition depletion, stress response) by short regulatory cascades. Our investigated cancer cells showed a similar tendency as yeast under stress at which fine tuned endogenous homeostasis is of minor importance. Interestingly, for yeast, Luscombe et al. discovered a higher frequency of hubs for stress responses whereas we discovered that the tumors used hubs less frequently. Cells of normal sample had a more centralized network to regulate signals via common nodes and links. This was reflected by a smaller network, higher frequency of hubs, lower entropy and a higher number of our signaling motif in which the number of pathway-pairs with common links was counted. This makes sense, as fine-tuning and integrating diverse signals need to be coordinately transferred to the respective transcriptional response which is substantial for fine grained signaling homeostasis of normal cells to co-ordinate their signals in accordance to their cellular community in the tissue. Degenerated tumor cells do not need this any more. 
In turn, the tumors showed a higher connectedness of the whole network which may strengthen their independency of exogenic perturbations.
Similar to Cui and co-workers , we observed with our model that cancer specific mutations occur distinctively more often at hubs for signal transduction. Such a mutation can cause a loss of function. This is beneficial for the cancer if the protein gets insensitive to upstream-signals and fires constitutively an oncogenic signal as e.g. the ABL-BCR fusion protein in chronic myelogenous leukemia . If the protein acts as a tumor suppressor, a complete loss of function is beneficial for oncogenesis. In both scenarios, the regulation for signaling homeostasis of the local network environment is detached from this mal-functional protein and a coordinated regulation between the environment and this protein is not necessary any more. We observed this by counting distinctively less integration-motifs in tumors (motif A in Figure 4). Interestingly, tumors seem to sustain the original signals between the environment. We observed this by higher counts of the disruption motif in tumors which reflects the disruption of co-regulation of the hub, but maintained regulation between the neighbors of the hubs (motif B in Figure 4). Even though tumors may exhibit de-regulation of mal-functional hubs with their neighbors, such a maintained co-regulation of their neighbors gives evidence that bypass regulations are still necessary. Ma'ayan and co-workers observed an accumulation of feedback and feed-forward loops at such hubs  which supports this idea. Tumors need to maintain the direct signal of e.g. a feed-forward loop which is necessary for the effect of the constitutive signal of an oncogenic hub (Figure 4C). Such oncogenic signaling motifs may have implications to drug therapy. If an oncogenic hub is treated (as e.g. ABL-BCR with imatinib ) resistance can occur by mutations of the target protein which reduce the affinity of the drug to the target. A combined therapy may avoid this evolvement by additionally blocking the signaling-maintenance of the neighbors. 
In addition, we found that the observed cancer networks showed higher error tolerance against directed attacks of hub removals. Hence, some maintenance signals may not only support cancer mutated hubs but also pave the way for the signaling network to get independent of them, specifically for proteins of cancer mutated genes with loss-of-function. It is challenging but highly relevant to shed light into these effects experimentally with cell lines exhibiting drug resistances at such hubs. We analyzed networks based on cohorts of patients and used the correlation of expression between gene pairs for the whole cohorts. This approach does not allow the analysis of a single sample and therefore can't be employed for diagnosis of a single patient, but rather for the analysis of tumor subgroups. It may be worthwhile developing distance metrics of gene pairs for single samples with which the investigated topology features can be employed supporting diagnosis.
We proposed a novel comparative signaling-motif for malignant signaling-regulation which sums up our findings (Figure 1). There have been elaborated studies on network motifs . Our comparative cancer motif is different from these motifs in that it shows signaling-regulation in cancer reflecting less centralized formation. The comparative cancer motif agrees with our findings of non-integration (motif A, Figure 4) but signaling-maintenance (motif B, Figure 4) of proteins with higher involvement in signal propagation.
We analyzed network models that were based on the correlation of gene expression between interacting proteins, which enabled us to track basic principles of signaling by its regulation. The malignant signaling networks showed more diverse signaling pathways (average number of nodes in the networks of tumor: 3324, and normal tissue: 2973, P = 2.3E-03, Figure 2) and shorter pathways (average path-length for cancer: 4.58, and normal: 5.50, P = 3.82E-05, Figure 2). The networks were less centralized (average clustering-coefficient of cancer: 0.118, and normal tissue: 0.125, P = 4.20E-04) and less dependent on hubs (average increase of network-diameter after hub-removal, for cancer: 1.59, and normal tissue: 1.64, P = 0.021). The cancer networks indicated signaling maintenance and increased error tolerance to punctual attacks even at hubs, which makes cancer treatment at specific targets challenging.
The general workflow of our approach is outlined in Figure 5. To investigate if our network features showed a statistically significant difference we performed paired Wilcoxon tests. We set the significance level to P ≤ 0.05 and considered all p-values below this threshold as statistically significant.
Gene expression analysis
We analyzed twenty different datasets of cancer and their corresponding normal or reference samples. For most of the tumor types (8 tumor types), we analyzed two datasets each. We used two AML (acute myeloid leukemia) datasets containing 18 normal and 25 tumor samples (AML-1) and 4 normal and 52 cancer samples (AML-2). The first breast cancer dataset (breast-1) was obtained from cancer and normal samples of 43 patients each; breast-2 consisted of 143 normal and 42 cancer samples. We analyzed two cervical cancer sets, cervical-1 and cervical-2, comprising data from 8 and 24 normal and 20 and 31 cancer samples, respectively. Data of esophageal squamous cell carcinomas (ESCC) was obtained from cancerous and normal tissue of 53 patients (taken from the NCBI database Gene Expression Omnibus, accession code GSE23400). We used a glioma dataset containing 23 normal and 153 cancer samples. A head-and-neck dataset was taken from a study of head-and-neck squamous carcinoma consisting of data from 22 normal and cancer samples. We used two lung cancer datasets, denoted as "lung-1" and "lung-2". Lung-1 was taken from a study by Bhattacharjee and co-workers and contained data from 17 normal and 13 cancer samples of adenocarcinoma. Bhattacharjee and co-workers clustered the tumor datasets in their study. To obtain the most relevant data subsets with the necessary homogeneity, we selected their cluster of highly aggressive adenocarcinomas (cluster C2 of their cluster analysis) for our study. Lung-2 contained gene expression data of normal samples and adenocarcinoma tumors from 27 patients. We analyzed two oral-tongue cancer datasets comprising data from 26 normal and 31 cancer samples (oral-tongue-1) and 12 and 26 normal and cancer samples, respectively (oral-tongue-2). We analyzed two datasets for pancreas cancer, pancreas-1 consisting of 39 normal and tumor tissues and pancreas-2 having 15 normal and 36 cancer samples. The first prostate cancer dataset (prostate-1) comprised data from 50 normal samples and 52 cancer samples, and the second (prostate-2) consisted of 50 normal and 52 cancer samples (taken from the NCBI database Gene Expression Omnibus, accession code GSE17951). The dataset renal-1 contained 23 normal renal samples and 69 samples of renal cancer, and renal-2 had 5 normal and 62 cancer samples. For the first renal dataset we selected homogeneous samples by performing hierarchical clustering (Euclidean distance, complete linkage), yielding sets of nine clustered samples for normal tissue and 10 for cancerous tissue. We analyzed data from vulva interstitial neoplasia consisting of 10 normal and 9 cancer samples. All datasets were stratified by randomly deleting samples of the overrepresented class, yielding an equal number of tumor and normal samples. For breast-1, ESCC, head-and-neck, lung-2, pancreas-1, and oral-tongue-1, normal and cancer samples were from the same patients (which was not the case for the other analyzed datasets). The data had been obtained using microarrays from Affymetrix of the following versions: HG-U133A for AML-1, breast-1, cervical-2, ESCC, lung-2 and renal-1; HG-U133 Plus 2 for breast-2, cervical-1, glioma, oral-tongue-2, pancreas-1, pancreas-2, prostate-2, renal-2 and vulva; HG-U95Av2 for AML-2, head-and-neck, lung-1, oral-tongue-1 and prostate-1. We normalized all datasets by Variance Stabilization Normalization [40, 41].
The protein-protein-interaction network was constructed using the Human Protein Reference Database [13, 14] (version 9 from April 13th, 2010). Interacting proteins were represented by their coding genes. The network was constructed for every gene that could be mapped to a microarray probe-set using BioMart. Interactions were not taken into account if probe information for at least one gene was missing. For a link between node (gene) x and y, we defined a link-distance $d_{xy}$ by Pearson's correlation coefficient $\rho_{xy}$ from gene expression values of the interacting proteins x and y
$$d_{xy} = 1 - |\rho_{xy}|, \qquad \rho_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
for n samples (patients) and gene expression $x_i$ and $y_i$ for gene x and y of sample i, respectively, with $\bar{x}$ and $\bar{y}$ denoting the mean expression over all samples. These distances were calculated for each dataset of normal and cancer tissues and used for the networks of the respective datasets. To equally handle induction and inhibition events, we used the absolute values of all correlation coefficients. Correlation values were subtracted from one to obtain low distances for paths with high correlation. Genes with the molecular function term "receptor activity" from the definitions of Gene Ontology were used as receptors in the network. The definitions of transcription factors were taken from TRANSFAC. We used Dijkstra's algorithm for calculating the shortest paths for every pair of receptors and transcription factors in the normal and tumor networks. These shortest paths of all receptor-transcription factor pairs served as the predicted pathways for each dataset and defined our tumor-specific interaction networks. Links and nodes that were not used by any shortest path were removed. The analyses were then performed on the largest connected component of the interaction network.
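The two steps described above, computing link-distances from expression data and tracking the shortest receptor-to-transcription-factor path, can be sketched in a few lines. This is a minimal illustration and not the implementation used in the study: the data layout (a dictionary of expression vectors and a list of interacting gene pairs), the function names and the treatment of constant genes as uncorrelated are all assumptions.

```python
import heapq
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: constant genes are treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def link_distances(expr, interactions):
    """Map each link (x, y) to 1 - |rho_xy|, clamped at zero against rounding."""
    return {(x, y): max(0.0, 1.0 - abs(pearson(expr[x], expr[y])))
            for x, y in interactions}

def shortest_path(dist, source, target):
    """Dijkstra's algorithm on the undirected weighted network.
    Returns the node list from source to target, or None if unreachable."""
    adj = {}
    for (x, y), d in dist.items():
        adj.setdefault(x, []).append((y, d))
        adj.setdefault(y, []).append((x, d))
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > best.get(u, math.inf):
            continue  # outdated heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]
```

Calling `shortest_path` for every receptor and transcription factor pair and keeping only the links and nodes on the returned paths would yield the pruned, tissue-specific network described in the text.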
Defining the network features
Path length, link and node frequency, and the signaling motif are explained in the results. Note that link (and node) frequency is similar to betweenness centrality, which is the number of shortest paths going through the link (and node). While betweenness centrality considers shortest paths between all pairs of nodes, node and link frequency as defined here count only the shortest paths between pairs of receptors and transcription factors. The (average) network diameter has been described as a measure for the error tolerance of a network against removals of nodes in scale-free networks and was used here in a similar way. The diameters of the networks were obtained as the average of the shortest paths over all pairs of nodes in the network. The network diameter was calculated for undisturbed (whole) networks and for networks in which the top 10% of the hubs were removed. The ratio of these values was calculated to yield the increase of the average network diameter after hub removal. The calculation of the information content was based on the assumption that signals enter the network at any receptor with equal probability within a certain time interval. These signals are passed by the links of the network to the transcription factors via the defined pathways from the receptors, again with equal probability. We assumed that the signals vanish from the signaling network after having entered the corresponding transcription factor at the end of the path. Signals enter the receptors with a certain frequency, resulting in an equal distribution, and therefore we assumed a uniform density of the signals in each pathway. The probability of a signal to pass through the link between nodes i and j is then proportional to the number of pathways passing through this link. With this, we calculated the information content by Shannon's definition
$$H = -\sum_{i=1}^{n} p_i \log p_i$$
in which n denotes the number of links and $p_i$ the probability of a signal to be passed through link i. The clustering coefficient $C_i$ for node i was given by
$$C_i = \frac{2\, n_{\mathrm{links}}}{k\,(k-1)}$$
in which $n_{\mathrm{links}}$ is the number of links connecting the neighbors of node i and k is the number of neighbors. This feature described how well the neighbors were mutually connected. If they were fully connected, the clustering coefficient was one; if they were not connected at all, the clustering coefficient was zero.
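The clustering coefficient of a single node follows directly from this definition. The sketch below assumes an undirected network without self-loops, stored as a dictionary of neighbor sets; setting the coefficient to zero for nodes with fewer than two neighbors is a common convention and an assumption here.

```python
def clustering_coefficient(adj, node):
    """Fraction of possible links among the neighbors of node that exist."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined case is set to zero
    # every link between two neighbors is seen from both ends, hence // 2
    n_links = sum(1 for u in neighbors for v in adj[u] if v in neighbors) // 2
    return 2 * n_links / (k * (k - 1))

# Hypothetical toy network: triangle A-B-C with an extra node D attached to A
toy_adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
```

In the toy network the two neighbors of B are connected to each other, which gives B a coefficient of one, whereas only one of the three possible links among the neighbors of A exists.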
The link-frequency distributions of normal and tumor cells followed a power law, i.e. the probability P(f) of links with link-frequency f was approximately given by
$$P(f) \propto f^{-\alpha}.$$
To estimate the exponent α we applied the method proposed by Newman  which determines the exponent of the cumulative distribution avoiding noisy data at the tail of the original distribution (see tail of the link frequency distribution in Figure 3). For visualization we plotted the distribution and the corresponding linear function with slope α on a log-log scale. The intersection with the y-axis of the plotted line was calculated using a least squared fit (see Figure 3 and Additional file 1: Supplemental Figure S2).
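One way to estimate the exponent from the cumulative distribution is a least-squares line through the complementary cumulative distribution on a log-log scale. If P(f) is proportional to f^(-α), the fraction of links with frequency at least f falls off as f^(-(α-1)), so α equals one minus the fitted slope. The sketch below follows this idea under stated assumptions; it is not necessarily the exact procedure of the cited method, and the function name is hypothetical.

```python
import math

def ccdf_exponent(values):
    """Estimate the power-law exponent alpha from positive values by a
    linear fit of log P(F >= f) against log f.
    Requires at least two distinct values."""
    n = len(values)
    ordered = sorted(values)
    points = []
    idx = 0
    for v in sorted(set(values)):
        while idx < n and ordered[idx] < v:
            idx += 1  # idx = number of values smaller than v
        points.append((math.log(v), math.log((n - idx) / n)))
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return 1.0 - slope  # the cumulative distribution has exponent alpha - 1
```

Working with the cumulative distribution avoids binning and smooths the sparse counts at the tail, which is the motivation given in the text for this kind of estimate.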
Defining and counting the integration and the maintenance motif
We defined three correlation categories based on intervals of the absolute value of the correlation coefficient $|\rho_{xy}|$: no correlation for absolute values between zero and 0.3, low correlation for absolute values between 0.3 and 0.5, and high correlation for absolute values above 0.5. Hubs of cancer mutated genes were defined by intersecting the list of cancer genes from Cui and co-workers (their Supplementary Table S10) with the nodes that appeared in both tissue types (normal and tumor). From this intersection we selected the top 50 most frequently involved nodes from the normal and from the tumor network, resulting in up to 100 cancer mutated hubs for every cancer dataset. Hubs that were selected in both tissue types and as such appeared twice in the union set were used only once. For each dataset, we collected all triangles in which one node was such a cancer mutated hub and that appeared in both the normal and the tumor network, ensuring the comparability of our motif counts. Out of these triangles, we selected triangles having the motifs for integration (motif A in Figure 4) and maintenance (motif B in Figure 4). For motif A, we selected triangles in which the absolute correlations $|\rho_{xy}|$ between all pairs of nodes (hub-n1, hub-n2 and n1-n2, where n1 and n2 are the two other nodes in the triangle) were high. For motif B, we counted the triangles in which the hub-neighbor pairs showed high correlation in one tissue type and no correlation in the other, while the correlation of n1-n2 stayed in the same category (no correlation, low correlation or high correlation).
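The classification of a triangle into the two motifs can be sketched as follows. The sketch assumes that the three correlation coefficients of a triangle are stored per tissue in a dictionary with the hypothetical keys "hub-n1", "hub-n2" and "n1-n2". Two points are not specified in the text and are assumptions here: values of exactly 0.3 and 0.5 are assigned to the low category, and motif B requires both hub-neighbor pairs to switch between high and no correlation.

```python
def category(rho):
    """Correlation category of a coefficient based on its absolute value."""
    r = abs(rho)
    if r > 0.5:
        return "high"
    if r >= 0.3:
        return "low"
    return "none"

def is_integration_motif(triangle):
    """Motif A: all three pairs of the triangle are highly correlated."""
    return all(category(triangle[pair]) == "high"
               for pair in ("hub-n1", "hub-n2", "n1-n2"))

def is_maintenance_motif(normal, tumor):
    """Motif B: hub-neighbor pairs switch between high and no correlation
    across tissues while the neighbor pair n1-n2 keeps its category."""
    hub_pairs = ("hub-n1", "hub-n2")
    hub_normal = {category(normal[p]) for p in hub_pairs}
    hub_tumor = {category(tumor[p]) for p in hub_pairs}
    switched = ((hub_normal == {"high"} and hub_tumor == {"none"}) or
                (hub_normal == {"none"} and hub_tumor == {"high"}))
    return switched and category(normal["n1-n2"]) == category(tumor["n1-n2"])
```

Applying both functions to every triangle that appears in the normal and in the tumor network would give motif counts per dataset that can then be compared between the tissues.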
Acknowledgements
We thank Tim Beissbarth for his suggestions for the statistical analysis, and Tobias Bauer for technical support. This work was funded by the Helmholtz Alliance on Systems Biology of Signaling in Cancer, the Nationales Genom-Forschungs-Netz (NGFN+) for the project ENGINE and the Helmholtz International Graduate School for Cancer Research at the German Cancer Research Center.
References
Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med. 2004, 10 (8): 789-799. 10.1038/nm1087
Hanahan D, Weinberg RA: The hallmarks of cancer. Cell. 2000, 100 (1): 57-70. 10.1016/S0092-8674(00)81683-9
Goymer P: Natural selection: The evolution of cancer. Nature. 2008, 454 (7208): 1046-1048. 10.1038/4541046a
Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, van't Veer LJ, Perou CM: Concordance among gene-expression-based predictors for breast cancer. The New England journal of medicine. 2006, 355 (6): 560-569. 10.1056/NEJMoa052933
van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al.: Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002, 415 (6871): 530-536. 10.1038/415530a
Oberthuer A, Berthold F, Warnat P, Hero B, Kahlert Y, Spitz R, Ernestus K, König R, Haas S, Eils R, et al.: Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification. J Clin Oncol. 2006, 24 (31): 5070-5078. 10.1200/JCO.2006.06.1879
Chuang HY, Lee E, Liu YT, Lee D, Ideker T: Network-based classification of breast cancer metastasis. Molecular systems biology. 2007, 3: 140. 10.1038/msb4100180
Ergun A, Lawrence CA, Kohanski MA, Brennan TA, Collins JJ: A network biology approach to prostate cancer. Molecular systems biology. 2007, 3: 82. 10.1038/msb4100125
Rifkin SA, Kim J, White KP: Evolution of gene expression in the Drosophila melanogaster subgroup. Nature genetics. 2003, 33 (2): 138-144. 10.1038/ng1086
Khaitovich P, Enard W, Lachmann M, Paabo S: Evolution of primate gene expression. Nature reviews. 2006, 7 (9): 693-702. 10.1038/nrg1940
Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, Gerstein M: Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004, 431 (7006): 308-312. 10.1038/nature02782
Cui Q, Ma Y, Jaramillo M, Bari H, Awan A, Yang S, Zhang S, Liu L, Lu M, O'Connor-McCourt M, et al.: A map of human cancer signaling. Molecular systems biology. 2007, 3: 152. 10.1038/msb4100200
Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TK, Gronborg M, et al.: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome research. 2003, 13 (10): 2363-2371. 10.1101/gr.1680803
Mishra GR, Suresh M, Kumaran K, Kannabiran N, Suresh S, Bala P, Shivakumar K, Anuradha N, Reddy R, Raghavan TM: Human protein reference database--2006 update. Nucleic acids research. 2006, 34 (Database issue): D411-414.
Forbes SA, Bhamra G, Bamford S, Dawson E, Kok C, Clements J, Menzies A, Teague JW, Futreal PA, Stratton MR: The Catalogue of Somatic Mutations in Cancer (COSMIC). Current protocols in human genetics/editorial board, Jonathan L Haines [et al]. 2008, Chapter 10: Unit 10.11.
Barabasi AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet. 2004, 5 (2): 101-113. 10.1038/nrg1272
Albert R, Jeong H, Barabasi AL: Error and attack tolerance of complex networks. Nature. 2000, 406 (6794): 378-382. 10.1038/35019019
Shannon C: A Mathematical Theory of Communication. The Bell System Technical Journal. 1948, 27: 379-423, 623-656.
Druker BJ: Translation of the Philadelphia chromosome into therapy for CML. Blood. 2008, 112 (13): 4808-4817. 10.1182/blood-2008-07-077958
Ma'ayan A, Jenkins SL, Neves S, Hasseldine A, Grace E, Dubin-Thaler B, Eungdamrong NJ, Weng G, Ram PT, Rice JJ, et al.: Formation of regulatory patterns during signal propagation in a Mammalian cellular network. Science. 2005, 309 (5737): 1078-1083. 10.1126/science.1108876
Alon U: Network motifs: theory and experimental approaches. Nat Rev Genet. 2007, 8 (6): 450-461. 10.1038/nrg2102
Stirewalt DL, Meshinchi S, Kopecky KJ, Fan W, Pogosova-Agadjanyan EL, Engel JH, Cronk MR, Dorcy KS, McQuary AR, Hockenbery D, et al.: Identification of genes with abnormal expression changes in acute myeloid leukemia. Genes, chromosomes & cancer. 2008, 47 (1): 8-20.
Yagi T, Morimoto A, Eguchi M, Hibi S, Sako M, Ishii E, Mizutani S, Imashuku S, Ohki M, Ichikawa H: Identification of a gene expression signature associated with pediatric AML prognosis. Blood. 2003, 102 (5): 1849-1856. 10.1182/blood-2003-02-0578
Pau Ni IB, Zakaria Z, Muhammad R, Abdullah N, Ibrahim N, Aina Emran N, Hisham Abdullah N, Syed Hussain SN: Gene expression patterns distinguish breast carcinomas from normal breast tissues: the Malaysian context. Pathology, research and practice. 206 (4): 223-228.
Chen DT, Nasir A, Culhane A, Venkataramu C, Fulp W, Rubio R, Wang T, Agrawal D, McCarthy SM, Gruidl M: Proliferative genes dominate malignancy-risk gene signature in histologically-normal breast tissue. Breast cancer research and treatment. 119 (2): 335-346.
Pyeon D, Newton MA, Lambert PF, den Boon JA, Sengupta S, Marsit CJ, Woodworth CD, Connor JP, Haugen TH, Smith EM, et al.: Fundamental differences in cell cycle deregulation in human papillomavirus-positive and human papillomavirus-negative head/neck and cervical cancers. Cancer research. 2007, 67 (10): 4605-4619. 10.1158/0008-5472.CAN-06-3619
Scotto L, Narayan G, Nandula SV, Arias-Pulido H, Subramaniyam S, Schneider A, Kaufmann AM, Wright JD, Pothuri B, Mansukhani M, et al.: Identification of copy number gain and overexpressed genes on chromosome arm 20q by an integrative genomic approach in cervical cancer: potential role in progression. Genes, chromosomes & cancer. 2008, 47 (9): 755-765.
Sun L, Hui AM, Su Q, Vortmeyer A, Kotliarov Y, Pastorino S, Passaniti A, Menon J, Walling J, Bailey R, et al.: Neuronal and glioma-derived stem cell factor induces angiogenesis within the brain. Cancer cell. 2006, 9 (4): 287-300. 10.1016/j.ccr.2006.03.003
Kuriakose MA, Chen WT, He ZM, Sikora AG, Zhang P, Zhang ZY, Qiu WL, Hsu DF, McMunn-Coffran C, Brown SM, et al.: Selection and validation of differentially expressed genes in head and neck cancer. Cell Mol Life Sci. 2004, 61 (11): 1372-1383. 10.1007/s00018-004-4069-0
Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, et al.: Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences of the United States of America. 2001, 98 (24): 13790-13795. 10.1073/pnas.191502998
Su LJ, Chang CW, Wu YC, Chen KC, Lin CJ, Liang SC, Lin CH, Whang-Peng J, Hsu SL, Chen CH, et al.: Selection of DDX5 as a novel internal control for Q-RT-PCR from microarray data using a block bootstrap re-sampling scheme. BMC genomics. 2007, 8: 140. 10.1186/1471-2164-8-140
Estilo CL, O-charoenrat P, Talbot S, Socci ND, Carlson DL, Ghossein R, Williams T, Yonekawa Y, Ramanathan Y, Boyle JO, et al.: Oral tongue cancer gene expression profiling: Identification of novel potential prognosticators by oligonucleotide microarray analysis. BMC cancer. 2009, 9: 11. 10.1186/1471-2407-9-11
Ye H, Yu T, Temam S, Ziober BL, Wang J, Schwartz JL, Mao L, Wong DT, Zhou X: Transcriptomic dissection of tongue squamous cell carcinoma. BMC genomics. 2008, 9: 69. 10.1186/1471-2164-9-69
Badea L, Herlea V, Dima SO, Dumitrascu T, Popescu I: Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies genes specifically overexpressed in tumor epithelia. Hepato-gastroenterology. 2008, 55 (88): 2016-2027.
Pei H, Li L, Fridley BL, Jenkins GD, Kalari KR, Lingle W, Petersen G, Lou Z, Wang L: FKBP51 affects cancer cell response to chemotherapy by negatively regulating Akt. Cancer cell. 2009, 16 (3): 259-266. 10.1016/j.ccr.2009.07.016
Singh D, Febbo PG, Ross K, Jackson DG, Manola J, Ladd C, Tamayo P, Renshaw AA, D'Amico AV, Richie JP, et al.: Gene expression correlates of clinical prostate cancer behavior. Cancer cell. 2002, 1 (2): 203-209. 10.1016/S1535-6108(02)00030-2
Jones J, Otu H, Spentzos D, Kolia S, Inan M, Beecken WD, Fellbaum C, Gu X, Joseph M, Pantuck AJ, et al.: Gene signatures of progression and metastasis in renal cell cancer. Clin Cancer Res. 2005, 11 (16): 5730-5739. 10.1158/1078-0432.CCR-04-2225
Yusenko MV, Kuiper RP, Boethe T, Ljungberg B, van Kessel AG, Kovacs G: High-resolution DNA copy number and gene expression analyses distinguish chromophobe renal cell carcinomas and renal oncocytomas. BMC cancer. 2009, 9: 152. 10.1186/1471-2407-9-152
Santegoets LA, Seters M, Helmerhorst TJ, Heijmans-Antonissen C, Hanifi-Moghaddam P, Ewing PC, van Ijcken WF, van der Spek PJ, van der Meijden WI, Blok LJ: HPV related VIN: highly proliferative and diminished responsiveness to extracellular signals. International journal of cancer. 2007, 121 (4): 759-766. 10.1002/ijc.22769
Huber W, von Heydebreck A, Sueltmann H, Poustka A, Vingron M: Parameter estimation for the calibration and variance stabilization of microarray data. Statistical applications in genetics and molecular biology. 2003, 2: Article 3
Huber W, von Heydebreck A, Sultmann H, Poustka A, Vingron M: Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics (Oxford, England). 2002, 18 (Suppl 1): S96-104.
Haider S, Ballester B, Smedley D, Zhang J, Rice P, Kasprzyk A: BioMart Central Portal--unified access to biological data. Nucleic Acids Res. 2009, 37 (Web Server issue): W23-27.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556
Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, et al.: TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic acids research. 2003, 31 (1): 374-378. 10.1093/nar/gkg108
Cormen TH, Leiserson CE, Rivest RL: Introduction to algorithms. 1995, New York: McGraw-Hill
Newman MEJ: Power laws, Pareto distributions and Zipf's law. Contemporary Physics. 2006, 46 (5): 323-351. 10.1080/00107510500052444
Authors' contributions
GS, NK and RK conceived the study and drafted the manuscript. RK guided the study and proof-read the manuscript. All authors read and approved the final manuscript.