Complex-based analysis of dysregulated cellular processes in cancer
BMC Systems Biology volume 8, Article number: S1 (2014)
Abstract
Background
Differential expression analysis of (individual) genes is often used to study their roles in diseases. However, diseases such as cancer are a result of the combined effect of multiple genes. Gene products such as proteins seldom act in isolation, but instead constitute stable multi-protein complexes performing dedicated functions. Therefore, complexes aggregate the effect of individual genes (proteins) and can be used to gain a better understanding of cancer mechanisms. Here, we observe that complexes show considerable changes in their expression, in turn directed by the concerted action of transcription factors (TFs), across cancer conditions. We seek to gain novel insights into cancer mechanisms through a systematic analysis of complexes and their transcriptional regulation.
Results
We integrated large-scale protein-interaction (PPI) and gene-expression datasets to identify complexes that exhibit significant changes in their expression across different conditions in cancer. We devised a log-linear model to relate these changes to the differential regulation of complexes by TFs. The application of our model on two case studies involving pancreatic and familial breast tumour conditions revealed: (i) complexes in core cellular processes, especially those responsible for maintaining genome stability and cell proliferation (e.g. DNA damage repair and cell cycle) show considerable changes in expression; (ii) these changes include decrease and countering increase for different sets of complexes indicative of compensatory mechanisms coming into play in tumours; and (iii) TFs work in cooperative and counteractive ways to regulate these mechanisms. Such aberrant complexes and their regulating TFs play vital roles in the initiation and progression of cancer.
Conclusions
Complexes in core cellular processes display considerable decreases and countering increases in expression, strongly reflective of compensatory mechanisms in cancer. These changes are directed by the concerted action of cooperative and counteractive TFs. Our study highlights the roles of these complexes and TFs and presents several case studies of compensatory processes, thus providing novel insights into cancer mechanisms.
Background
Transcriptional regulation is a fundamental mechanism by which all cellular systems mediate the activation or repression of genes, thereby setting up striking patterns of gene expression across diverse cellular conditions - e.g. across cell-cycle phases [1–3], normal vs cancer states [3] or stress conditions [4]. Such regulation of gene expression is executed by the concerted action of transcription factors (TFs) that bind to specific regulatory DNA sequences associated with target genes [5, 6]. Deciphering the roles of TFs is a significant challenge and has been the focus of numerous studies, with great interest being recently shown in cancer [3, 4, 7–9]. For example, Bar-Joseph et al. [3] identified periodically expressed cell-cycle genes in human foreskin fibroblasts to understand their differential regulation between normal and cancer conditions. Nebert [7] surveyed TF activities in cancer, emphasizing the roles of TFs as proto-oncogenes (gain-of-function) that serve as accelerators to activate the cell cycle, and as tumour suppressors (loss-of-function) that serve as brakes to slow the growth of cancer cells. Darnell [8] classified TFs having cancerous or oncogenic potential into three main kinds - steroid receptors (e.g. oestrogen receptors in breast cancer and androgen receptors in prostate cancer), resident nuclear proteins activated by serine kinase cascades (e.g. JUN and FOS), and latent cytoplasmic factors normally activated by receptor-ligand interaction at the cell surface (e.g. STATs and NFκB). Darnell [8] also discussed the signalling pathways of these TFs (including Wnt-β-catenin, Notch and Hedgehog signalling) as potential drug targets in cancer. Karamouzis and Papavassiliou [9] discussed rewiring of transcriptional regulatory networks in breast tumours, focusing on subnetworks of oestrogen receptor (ER) and epidermal growth factor receptor (EGFR) family members.
Most studies focus on transcriptional regulation of individual target genes. However, diseases such as cancer are a result of the combined effect of multiple genes. Gene products such as proteins seldom act in isolation, but instead physically interact to constitute complexes that perform specialized functions [10, 11]. Studying protein complexes therefore provides a more aggregative or "systems level" view of gene function and regulation than studying individual proteins (genes). Here we integrate large-scale protein-interaction (PPI) and gene-expression datasets to examine the differential regulation of complexes across cancer conditions.
An initial analysis
We compiled a list of protein complexes by clustering a network of human PPIs. Co-functional (interacting) proteins are encoded by genes showing high mRNA co-expression [12, 13]. Therefore, we quantified the "functional activity" for each of these complexes by aggregating pairwise co-expression values between their constituent proteins. Analysis for two pancreatic-tissue conditions, viz. normal and ductal adenocarcinoma (PDAC) tumour, revealed significant changes in co-expression for these complexes between the two conditions. For example (Figure 1), CHUK-ERC1-IKBKB-IKBKG showed a change in co-expression, interestingly coinciding with changes in its transcriptional regulation by the NFκB family of TFs. This complex constitutes the serine/threonine kinase family, while the TFs play essential roles in the NFκB signalling pathway (http://www.genecards.org) [14], which is implicated in PDAC [15, 16].
Based on these observations, here we seek to understand differential co-expression of complexes and its relationship with differential regulation by TFs between cancer conditions. Therefore we:
- devise a computational model to identify complexes showing significant differential co-expression and the TFs regulating these complexes; and
- apply the model on two case studies - normal vs PDAC tumour and BRCA1 vs BRCA2 familial breast tumour conditions - to decipher their roles in these tumours.
In summary (see Methods for details) we compute co-expression values for each of the complexes under different cancer conditions. We then introduce a log-linear model to relate changes in co-expression of complexes to changes in their regulation by TFs between these conditions. We apply the model to identify influential TFs and complexes and validate their roles in cancer.
Results
Experimental datasets
PPI data: We gathered Homo sapiens PPIs identified from multiple low- and high-throughput experiments deposited in Biogrid (v3.1.93) [17] and HPRD (2009 update) [18]. To minimize false positives in these PPI datasets, we employed the scoring scheme Iterative-CD [19] (with 40 iterations) to assign a reliability score (between 0 and 1) to every interaction, and then discarded all low-scoring interactions (< 0.20) to build a dense high-quality PPI network of 29600 interactions among 5824 proteins (average node degree 10.16).
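The filtering step and the reported average node degree can be illustrated with a minimal sketch. The function names and the toy scores below are hypothetical; the reliability scores themselves would come from Iterative-CD, which is not reimplemented here.

```python
def filter_network(scored_edges, min_score=0.20):
    """Keep interactions whose reliability score is at least min_score."""
    return [(a, b) for a, b, score in scored_edges if score >= min_score]

def average_degree(edges):
    """Average node degree of an undirected network: 2 * |E| / |V|."""
    nodes = {protein for edge in edges for protein in edge}
    return 2 * len(edges) / len(nodes) if nodes else 0.0

# Toy scored interactions (hypothetical values, not taken from the study).
scored = [("A", "B", 0.85), ("B", "C", 0.15), ("A", "C", 0.40), ("C", "D", 0.20)]
kept = filter_network(scored)
toy_degree = average_degree(kept)

# Consistency check on the figures quoted in the text:
# 29600 interactions among 5824 proteins.
reported_degree = round(2 * 29600 / 5824, 2)
```

The last line recovers the average degree of 10.16 quoted for the filtered network.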
Gene expression data: We have performed one of the largest gene expression profiling analyses of familial breast tumours (n = 74) and stratified them based on BRCA mutation status as BRCA1-, BRCA2- and non- BRCA1/2 tumours [20]. Among these, BRCA1 and BRCA2 tumours are phenotypically most different [21] and we consider these two for our analysis here; our dataset contains 19 BRCA1 and 30 BRCA2 expression samples (GEO accession GSE19177). In addition, we also gathered expression samples from pancreatic tumours - normal and PDAC matched (39 in each) - from the Badea et al. study [22] (GSE15471).
Sporadic breast tumours constitute 93-95% of all breast tumours and most studies classify these into the four molecular subtypes, luminal-A, luminal-B, basal-like and HER2-enriched [23–25]. Broadly, basal-like tumours do not express the ER, PR and HER2 receptors, and exhibit high aggressiveness and poor survival attributed to distant metastasis, compared to luminal tumours. However, much less is known about familial tumours (the remaining 5-7%), although studies [20, 21, 25] have noted that BRCA1 tumours are predominantly basal-like while BRCA2 tumours are more heterogeneous and may be HER2-enriched or luminal-like.
Pancreatic tumours, on the other hand, are more uniform: PDAC accounts for most (95%) pancreatic tumours and is predominantly characterized by mutational dysfunction of the KRAS oncogene and of the CDKN2A, SMAD4 and TP53 tumour-suppressor genes [16].
Transcription factors: We gathered 1391 TFs from Vaquerizas et al. [26], manually curated from a combined assessment of DNA-binding capabilities, evolutionary conservation and integration of multiple sources. Benchmark complexes: For independent validation, we used manually curated human complexes from CORUM [27], a total of 1843 complexes of which we used 722 having size at least 4.
Benchmark genes and TFs in cancer: For validation we used known (mutation-driver) genes (total 118) from COSMIC [28] and known TFs (total 82) in cancer from [29].
Analysis of PPI networks highlights considerable rewiring between tumour conditions
By integrating PPI and gene expression datasets (see Methods) we obtained two pairs of conditional PPI networks - normal-PDAC for pancreatic and BRCA1-BRCA2 for breast tumours. Figure 2 shows the co-expression-wise distribution for protein pairs in these networks. Normal vs PDAC displayed striking differences in these distributions (KS test: D_NP = 23.11 > K_{α=0.05} = 1.36), reflecting considerable rewiring of PPIs. PDAC showed significant loss in co-expression for both positively co-expressed as well as negatively co-expressed interactions compared to normal, indicative of both disruption as well as emergence of interactions in the tumour. Such rewiring has also been noted in earlier studies [30, 31].
Strikingly, BRCA1 vs BRCA2 tumours also showed significant differences in PPI distributions (Figure 2) (KS test: D_B12 = 22.85 > K_{α=0.05} = 1.36), reflecting considerable differences in PPI wiring between the two breast tumours. BRCA1 tumours displayed higher co-expression compared to BRCA2 tumours, with ∼15700 PPIs showing higher correlations.
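The comparison against K = 1.36 can be sketched as follows. The value 1.36 matches the standard two-sample Kolmogorov-Smirnov critical constant at α = 0.05, which suggests (as an assumption, since the text does not state it) that the reported D values are the maximum gap between the two empirical distributions scaled by sqrt(nm/(n+m)). The sample values below are invented for illustration.

```python
import math
from bisect import bisect_right

def ks_statistic(xs, ys):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)  # fraction of xs <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of ys <= v
        gap = max(gap, abs(fx - fy))
    return gap

def scaled_ks(xs, ys):
    """KS gap scaled by sqrt(n*m/(n+m)) so it can be compared with K_alpha."""
    n, m = len(xs), len(ys)
    return math.sqrt(n * m / (n + m)) * ks_statistic(xs, ys)

def critical_value(alpha=0.05):
    """Asymptotic two-sample KS critical constant, sqrt(-ln(alpha/2) / 2)."""
    return math.sqrt(-0.5 * math.log(alpha / 2))

# Hypothetical co-expression values for the same PPIs under two conditions.
normal = [0.1, 0.2, 0.3, 0.4]
tumour = [0.6, 0.7, 0.8, 0.9]
differs = scaled_ks(normal, tumour) > critical_value(0.05)
```

Under this reading, the two distributions are declared different at the 5% level whenever the scaled statistic exceeds the critical constant.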
DAVID-based [32] functional analysis of these rewired interactions (Δ ≥ 0.50) showed significant (p ≤ 1.1) enrichment for the Biological Process (BP) terms - Cell cycle, Chromatin organisation, DNA repair and RNA splicing, indicating considerable rewiring in core cellular processes responsible for genome stability. Among these were interactions involving the tumour suppressors TP53 and SMAD4 in PDAC, which are known genes mutated in the tumour, and the DNA double-strand break (DSB) repair proteins BRE and BRCC3 along with BRCA1, BRCA2 and TP53, in breast tumours.
Analysis of complexes highlights disruption to core cellular mechanisms in tumours
Matching of complexes using t_J = 0.67 and δ = 0.10 (Methods) resulted in a total of 256 and 277 matched complexes (M) for normal-PDAC and BRCA1-BRCA2 conditions, respectively (Table 1). The co-expression-wise distributions (Figure 3) revealed significant differences for both normal vs PDAC as well as BRCA1 vs BRCA2 conditions (KS test: D_NP = 1.69 > K_{α=0.05} = 1.36 in pancreatic and D_B12 = 5.48 > K_{α=0.05} = 1.36 in breast), indicating that rewiring in PPI networks had considerable impact on these complexes. Overall, we noticed a considerable drop in co-expression for PDAC vis-a-vis normal, whereas BRCA1 tumours showed higher co-expression vis-a-vis BRCA2 tumours (Figures 3 and 4). These differences were larger towards the higher co-expression ranges which correspond better to active complexes (Figure 3), indicating that cellular functions were considerably impacted in these tumours. These observations were reproducible using an independent set of complexes from CORUM (Figures 3 and 4) and were significantly (p < 0.001) greater than expected by chance (using 500 random complexes generated 1000 times).
DAVID-based analysis for complexes displaying changes ≥ 0.4 indicated significant (p < 0.001) enrichment for core cellular pathways involved in genome stability including Cell cycle and DNA repair (Table 2). The complexes in PDAC were enriched for TGF-β, Wnt and NFκB signalling, all of which are implicated in pancreatic cancer [16, 33–35]. The complexes in breast tumours reflected aberration in Homologous recombination (HR), a key DSB-repair pathway which includes the breast cancer susceptibility genes BRCA1 and BRCA2.
Analysis of complexes reveals compensatory mechanisms activated in tumours
We next divided the set of matched complexes into two subsets:
- those with higher co-expression in normal vis-a-vis PDAC, or higher co-expression in BRCA1 tumours vis-a-vis BRCA2 tumours; and
- those with lower co-expression in normal vis-a-vis PDAC, or lower co-expression in BRCA1 tumours vis-a-vis BRCA2 tumours.
Table 3 shows changes in co-expression (ΔC) observed for these two subsets. While most complexes showed a decrease in co-expression from normal to PDAC (159 out of 256) and from BRCA1 to BRCA2 tumours (225 out of 277), interestingly a considerable number of complexes showed an increase (96 and 52). However, the decrease was steeper compared to the increase (max: 0.969 vs 0.421 and 0.761 vs 0.543; avg: 0.336 vs 0.192 and 0.281 vs 0.197). Similar trends were observed using CORUM complexes and were significantly (p < 0.001) greater than expected by chance. We suspect these observations are indicative of compensatory mechanisms coming into play in these tumours, as explained below.
In the classical work on "hallmarks of cancer", Hanahan and Weinberg [36] describe seven to ten key distinguishing hallmarks of tumour cells, among which are limitless replicative potential and self-sufficiency in growth signals. Cellular mechanisms including cell cycle and DDR are considerably weakened in tumour cells, but these cells survive on last-standing mechanisms (weak links) to continue proliferation. This is due to the activation of compensatory or back-up mechanisms. Although these compensatory mechanisms cannot completely substitute for the weakened or disrupted ones, these are sufficient to enhance the survival of tumour cells [36, 37]. Our analysis reflects such compensatory trends - a fraction of complexes showed an increase in co-expression, but the increase was not as steep as the decrease for the remaining faction. However, a straightforward Gene Ontology analysis is too general to delineate the roles of the two factions because both originate from the same or similar processes. We therefore investigated a few specific cases (below).
Examples of compensatory mechanisms and validation for roles in cancer
Normal vs PDAC tumour (Figure 5a): DSB-repair functionality is severely impacted in PDAC [38, 39], with inactivating mutations in RAD50 and NBS1 attributed to loss of DSB-repair functionality increasing the risk of pancreatic cancer [38]. DSBs are detected by the MRE11-RAD50-NBS1 (MRN) and Ku70/Ku80 (XRCC6/XRCC5) complexes in the HR and non-homologous end-joining (NHEJ) pathways, respectively. In HR, the repair process involves recruitment of the BRCA1-A complex (BRCA1-BARD1-FAM175A-UIMC1-BRE-BRCC3-MERIT40) to sites of DSBs. We observed a decrease in co-expression for all three complexes, indicating considerable weakening of the DSB machinery. On the other hand, we noticed an increase in co-expression for the single-strand break (SSB) and mismatch (MMR) repair complexes XRCC1-POLB-PNKP-LIG3 and MSH6-MLH1-MSH2-PMS2-PCNA, respectively. The XRCC1 complex is responsible for SSB repair through sister chromatid exchange following DNA damage by ionizing radiation, while the MSH6 complex is involved in the recognition and repair of mispairs. Together these observations suggest the activation of SSB and MMR machinery compensating for the loss in DSB-repair machinery; such a functional relationship has been observed previously between DSB and SSB repair pathways [40].
The NFκB signalling pathway has been strongly implicated in KRAS signalling and pancreatic tumorigenesis [41, 42]. Consistent with this, we noticed considerable changes in co-expression for several NFκB complexes including the NFκB1/REL family, which plays important roles in programmed cell death and proliferation control and is critical in tumour initiation and progression [42]. The calcium-binding proteins S100A2, S100A8 and S100A9 are known to modulate P53 activity [43] and their over-expression has been associated with metastatic phenotype of pancreatic cancer [33]. The inactivation of the RAS-associated RASSF1A and RASSF5 complexes, which act as tumour suppressors [44, 45], is frequent in pancreatic cancer [44]. The complex DDX20-GEMIN4-PPP4C-PPP4R2 associated with the SMN (survival of motor neuron), and SNAP23-STX4-VAMP3-VAMP8 associated with vesicular transport, docking and/or fusion of synaptic vesicles with the presynaptic membrane (http://www.genecards.org) [14], support tumorigenic invasion of neural cells in pancreatic cancer [35].
BRCA1 vs BRCA2 tumours (Figure 5b): We observed a lower co-expression for the MMR complex MLH1-MSH6-MSH2-PMS2-PCNA in BRCA1 tumours compared to BRCA2 tumours; we think this is due to the parallel roles of BRCA1. BRCA1 has a key role in DSB repair, and BRCA1-deficient cells have defects in the two DSB repair pathways HR and NHEJ [46]. BRCA1 associates with PCNA and the mismatch repair proteins MSH2, MSH6 and MLH1 to form the BASC complex, a genome-surveillance complex required to sense and repair DNA damages [47], thereby also playing a role in the MMR pathway. On the other hand, BRCA2 has been associated with functions only in HR [48–51]. Therefore, we suspect that although MMR pathway is compensatorily activated in response to DSB-repair deficiency, BRCA1 tumours exhibit a weaker MMR pathway compared to BRCA2 tumours because of the direct involvement of BRCA1 in the MMR pathway.
The DSS1 complex consisting of BRCA2, DSS1 and the integrator subunits mediates the 3'-end processing of small nuclear RNAs [52], and BRCA2 deficiency could result in a reduced stability of this complex. The expression of replication factor C complex (RFC2, RFC3 and RFC4) is indicative of proliferative potential (high cell division rates) of BRCA1 tumours. We noticed over-expression of this complex in BRCA1 compared to BRCA2 tumours.
Finally, a considerable number of cancer genes from COSMIC Classic were represented in complexes showing changes ΔC ≥ 0.10 (Figure 6), suggesting that differential co-expression of complexes is a strong indicator of tumorigenic processes.
Relating changes in co-expression of complexes to their transcriptional regulation
We computed Pearson and Spearman rank coefficients between changes in co-expression of complexes and their transcriptional regulation as follows. For each complex-pair {S_s, T_t} ∈ M(S, T), we measured its change in correlation ΔC(S_s, T_t), and the total change in its regulation by the TFs F_f (see Methods). This resulted in 226 complex-TF pairs in pancreatic and 241 in breast with non-zero ΔC and ΔR. Note that we lose at most 13% of complexes (pancreatic: 256 down to 226, breast: 277 down to 241) as a result of our requirement that TFs interact with at least one complexed protein (Methods). We observed positive Pearson and Spearman coefficients which were supported by CORUM complexes (Table 4). The Spearman coefficients were higher than Pearson in both cases, indicating a non-linear relationship; this supports our use of a log-linear model (Methods).
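The reasoning that a Spearman coefficient exceeding the Pearson coefficient points to a monotone but non-linear relationship can be illustrated with a small sketch. The data below are invented (a square-root power law, not values from Table 4), and the helper names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Invented example: delta_c follows a power law of delta_r (monotone, non-linear).
delta_r = [1.0, 2.0, 4.0, 8.0, 16.0]
delta_c = [r ** 0.5 for r in delta_r]
r_pearson = pearson(delta_r, delta_c)
r_spearman = spearman(delta_r, delta_c)
```

For a power-law relationship the ranks agree perfectly while the raw values do not fall on a line, so the Spearman coefficient is higher; a power law is also exactly the form that becomes linear after taking logarithms.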
Analysing influential TFs in pancreatic and breast tumours
Table 5 lists the TFs with non-zero overall influence identified using our model (see Methods). Extrapolating from the simplified example (see Methods), the + and − signs can be interpreted as cooperative and counteractive action of TFs in regulating complexes. As these are overall influence values (that is, across all complexes and TFs), they are difficult to interpret directly. Therefore, we restrict our focus to only STAT1 and STAT3. These two TFs are directly involved in pancreatic tumorigenesis and proliferation, and are thought to play opposite roles - while STAT1 promotes apoptosis, STAT3 is essential for the proliferation and survival of tumour cells [53]. Solving Equation 5 for STAT1 and STAT3 using only the subset of complexes they share (90 complexes), we obtained γ(STAT1) = 1.714 and γ(STAT3) = −1.582, i.e. these are counteractive TFs (Methods). Their shared complexes were enriched for Cell cycle, Apoptosis and RAS signalling, consistent with the counteractive roles for STAT1 and STAT3 [53].
Differential expression analysis using limma [54] for normal vs PDAC indicated that most of the influential TFs were significantly up- or down-regulated (Table 5). But, a few influential TFs did not show such differential expression, for example heat shock factor-1 (HSF1). Investigation into the complexes regulated by HSF1 revealed considerable changes in co-expression for the cysteine-aspartic acid protease (caspase) family including CASP10-CASP8-FADD-FAS (from 1.28 to − 0.019), documented in CORUM [27] under the functional category '40.10.02: Apoptosis'. Caspases are involved in signal transduction pathways of apoptosis, necrosis and inflammation (http://www.genecards.org) [14], and the role of HSF1 in regulating caspases thereby contributing to the pathogenesis of pancreatic cancer has been investigated [55].
In the case of BRCA1 vs BRCA2 tumours, only four of the influential TFs (GATA3, ESR1, FOXA1 and XBP1) were identified as differentially expressed. These four TFs are ER targets. BRCA1 tumours, being predominantly basal-like, do not express ER and therefore show lower expression of ER targets compared to BRCA2 tumours, which are predominantly luminal-like and express ER [21]. Additionally, Joshi et al. [56], using a pathway-based analysis, have noted over-representation of ESR1, GATA3, MYC, XBP1, FOXA1 and MSX2 in luminal tumours, and NFκB1, C/EBPβ, FOXO3, JUN, POU2F3 and FOXO1 in basal-like tumours. We also found higher expression of the NFκB-signalling TFs in BRCA1 tumours - the complex NFκB1-NFκB2-REL-RELA-RELB, composed entirely of NFκB TFs, showed a higher correlation in BRCA1 tumours than BRCA2 tumours. This is consistent with earlier findings [56, 57] that ER-negative tumours (BRCA1 tumours) display aberrant expression of NFκB which makes these tumours highly aggressive.
These observations also suggest that differential expression is not sensitive enough to identify all the genes (here, TFs) involved in tumours. Many of the TFs may not be differentially expressed themselves, but are differentially co-expressed with their target genes. One such possible situation occurs when the TFs themselves are not mutated or (epigenetically) silenced, but their target genes are.
Finally, 12 of 37 TFs in pancreatic, and 14 of 40 TFs in breast tumours were among the 82 cancer TFs listed in [29]. DAVID-based functional analysis of TFs showed significant enrichment for several pathways in cancer (p < 1.1E-05, 23.1% genes), in particular the JAK-STAT pathway (p < 1.9E-02, 10.3% genes), a known driver pathway in cancer [53].
Discussion
We had observed considerable PPI rewiring via differential co-expression analysis (Figure 2). In Figure 7, we now show the PPI network for normal vs PDAC with interactions weighted by the differential co-expression values. Figure 7a highlights the largest component (558 proteins and 519 interactions), which shows an overall decrease in co-expression. A considerable number of genes in this component are targets of ubiquitination (UBC) and sumoylation (SUMO1 and SUMO2) (Figure 7b), possibly causing their inactivation. However, there are several pockets showing an increase in co-expression. Interestingly, some of the genes topologically central to these pockets are known drug targets in PDAC (Figure 7c), e.g. PLK1 [58] and ANXA2 [59]. Similarly, PELP1, which interacts directly with STAT3 and is responsible for cell proliferation and survival in several tumours [53], is likewise an identified drug target in PDAC [60]. A similar analysis in BRCA1 vs BRCA2 tumours highlighted an increase in PPI co-expression around the mitotic regulators CDK1, CDC20 and CKS1B and the histone deacetylases HDAC1 and HDAC6; these are known drug targets for which inhibitors have been developed [61, 62].
We clustered this normal vs PDAC network using MCL (inflation 2.3) both with and without the weights as input, and we observed that most clusters predominantly contain only one kind of interaction, either those showing an increase or those showing a decrease in co-expression - of the 30 clusters of sizes ≥ 4, in 17 at least two-thirds of the interactions show a decrease, and in 9 at least two-thirds of the interactions show an increase. Among these, PLK1 belonged to a cluster in which all interactions showed an increase (Figure 7d). Similarly, in the BRCA1 vs BRCA2 network, CDK1 and CKS1B belonged to a cluster that showed an increase for all its interactions. These observations suggest that identifying clusters (complexes) that show an increase in co-expression could identify new therapeutic targets in cancer.
Conclusion
Proteins seldom act in isolation, but instead interact to constitute specialized complexes driving key processes. We integrated PPI and gene-expression datasets to perform a large-scale unbiased evaluation of complexes in PDAC and familial breast tumours. These complexes showed considerable changes in expression, in particular decreases and countering increases, reflecting compensatory processes coming into play in the tumours. These complexes enable us to explain the possible underlying mechanisms, which is otherwise difficult when analysing only individual genes. These complexes are driven by the concerted action of influential TFs, which themselves work in cooperative and counteractive ways. Network-based analysis shows that complexes could have therapeutic potential in cancer.
Methods
The workflow for our computational approach is depicted in Figure 8, building on our earlier work [63].
From earlier work [63] (upper portion of Figure 8): We first assemble a high-confidence network of human PPIs to identify human protein complexes. These PPIs are largely devoid of contextual (conditional) information, and therefore we overlay mRNA expression data of the coding genes, assigning a confidence score to each protein pair under normal and tumour conditions. These scores reflect the presence or absence of interactions under these conditions. Complexes are extracted from these conditional PPI networks by network clustering; for a detailed background on PPI networks and the complex-extraction procedure, see [19, 64–67].
In this work (lower portion of Figure 8): The contribution of this work is to relate changes in co-expression of complexes to changes in their transcriptional regulation by TFs between cancer conditions by introducing a log-linear model. This enables us to identify influential TFs and to validate their roles in cancer. This procedure is described in the following subsections.
Measuring changes in co-expression of complexes between conditions
Let H = (V, E) be the human PPI network, where V is the set of proteins and E is the set of interactions among these proteins, and let S = {S_1, S_2, ...} and T = {T_1, T_2, ...} be the sets of protein complexes extracted from H under any two conditions, say normal and tumour. For each complex S_s ∈ S, we calculate its co-expression C(S_s) by aggregating the pairwise values ρ(p, q) over its constituent proteins p, q ∈ S_s,
where ρ(p, q) is the Pearson correlation for the protein pair (p, q). The ρ-values are Fisher-transformed, given by $z = \tfrac{1}{2}\log_{10}\frac{1+\rho}{1-\rho}$, which emphasizes the extreme ρ-values; for example, if ρ = ±0.10 then z = ±0.043, but if ρ = ±0.99 then z = ±1.149. The co-expression values C(T_t) for the complexes T_t ∈ T are calculated similarly.
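A minimal sketch of the co-expression calculation follows. The numerical examples in the text (ρ = 0.10 giving z ≈ 0.043 and ρ = 0.99 giving z ≈ 1.149) are consistent with a Fisher transform that uses a base-10 logarithm, so that form is used here. The aggregation rule (a mean over all protein pairs) and the function names are assumptions made for illustration; the source states only that pairwise values are aggregated.

```python
import math
from itertools import combinations

def fisher_z(rho):
    """Fisher transform with a base-10 logarithm: 0.5 * log10((1+rho)/(1-rho))."""
    return 0.5 * math.log10((1 + rho) / (1 - rho))

def complex_coexpression(proteins, rho):
    """Mean Fisher-transformed correlation over all protein pairs.

    `rho` maps an alphabetically sorted protein pair to its Pearson correlation.
    Taking the mean is an assumed aggregation, not confirmed by the source.
    """
    pairs = list(combinations(sorted(proteins), 2))
    return sum(fisher_z(rho[pair]) for pair in pairs) / len(pairs)

# Hypothetical three-protein complex with invented pairwise correlations.
rho = {("P1", "P2"): 0.9, ("P1", "P3"): 0.5, ("P2", "P3"): 0.1}
c_value = complex_coexpression({"P1", "P2", "P3"}, rho)
```

Because the transform grows steeply as |ρ| approaches 1, strongly co-expressed pairs dominate the aggregate, which is the emphasis on extreme values described above.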
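To identify complexes that have changed co-expression between the conditions, we construct the set of matching complex pairs M(S, T) such that every pair {S_s, T_t} ∈ M(S, T) satisfies (a) a differential co-expression ΔC(S_s, T_t) ≥ δ > 0, and (b) a minimum Jaccard similarity in protein composition J(S_s, T_t) ≥ t_J, where
$$\Delta C(S_s, T_t) = |C(S_s) - C(T_t)|, \qquad J(S_s, T_t) = \frac{|S_s \cap T_t|}{|S_s \cup T_t|}. \tag{2}$$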
We expect complexes disrupted between the two conditions to have changed their co-expression (including complete dissolution or new formation) or have gained or lost a few proteins (rewiring within complexes) and therefore we use a δ > 0 and a high t_J.
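The matching step can be sketched as below, using the thresholds quoted in the Results (t_J = 0.67 and δ = 0.10). Treating the differential co-expression as an absolute difference and applying both thresholds as "at least" conditions are assumptions of this sketch, as are the function names and the toy complexes.

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two protein sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_complexes(cond_s, cond_t, t_j=0.67, delta=0.10):
    """Pair complexes across two conditions.

    cond_s / cond_t map a complex (frozenset of proteins) to its co-expression.
    A pair is kept when the compositions overlap enough (Jaccard >= t_j)
    and the co-expression differs enough (absolute change >= delta).
    """
    matches = []
    for s, c_s in cond_s.items():
        for t, c_t in cond_t.items():
            change = abs(c_s - c_t)
            if jaccard(s, t) >= t_j and change >= delta:
                matches.append((s, t, change))
    return matches

# Toy complexes with invented co-expression values.
normal = {frozenset("ABCD"): 0.90, frozenset("EFGH"): 0.50}
tumour = {frozenset("ABC"): 0.30, frozenset("EFGH"): 0.45}
matched = match_complexes(normal, tumour)
```

In this toy case the first complex loses one protein (Jaccard 0.75) and drops sharply in co-expression, so it is matched; the second is unchanged in composition but its co-expression change falls below δ, so it is not reported.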
Relating changes in co-expression to changes in transcriptional regulation
Let F = {F_1, ..., F_k} be the set of TFs. The regulation R(S_s, F_f) by a TF F_f ∈ F of a complex S_s is measured as
(3)
where {p : p ∈ S_s, (p, F_f) ∈ E} is the set of proteins of S_s with which F_f physically interacts in the network H. The regulation R(T_t, F_f) by F_f of the complexes T_t ∈ T is measured similarly.
Here we consider a TF to regulate a complex only if the TF physically interacts (in the PPI network) with at least one protein in the complex. From the classical view of transcriptional regulation, this assumption means that we consider a TF to regulate a set of genes encoding a complex only if the TF physically interacts with at least one protein from that complex. Although this assumption may be valid for only a subset of TFs or complexes, we employ it here to simplify our model. Indeed (see Results) we only lose at most 13% TF-complex pairs due to this assumption.
We then relate changes in regulation by TFs to changes in co-expression of complexes for every matched pair {S_s, T_t} ∈ M(S, T) between the two conditions using a log-linear model
$$\Delta C(S_s, T_t) = \prod_{f=1}^{k} \Delta R(S_s, T_t, F_f)^{\gamma_f}, \tag{4}$$
where ΔR(S_s, T_t, F_f) = |R(S_s, F_f) − R(T_t, F_f)| is the differential regulation of the complex-pair (S_s, T_t) by F_f, and γ_f is the influence coefficient of F_f in regulating the change ΔC(S_s, T_t). Log-linear models are widely used to approximate non-linear systems because they inherit the benefits of linear models yet allow a restricted non-linear relationship between inputs and outputs [68].
Equation 4 can be written in matrix form after taking the logarithm as
$$[\log \Delta C] = [\log \Delta R]\,[\Gamma], \tag{5}$$
where [log ΔC] is a |M| × 1 matrix of (log) differential co-expression of complexes, [log ΔR] is a |M| × k matrix of (log) differential regulation by the TFs, and [Γ] is a k × 1 matrix of influence coefficients for the TFs. Given this combinatorial regulation model, our purpose is to compute the influence coefficients γ_f (1 ≤ f ≤ k) by solving Equation 5, and for this we employ singular-value decomposition (SVD), arriving at a least-squares solution [69]. The TF displaying the highest (absolute) coefficient |γ| has the highest overall influence in regulating changes in co-expression of complexes.
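A minimal sketch of the fitting step is given below. The study obtains the least-squares solution through SVD; to stay dependency-free, this sketch instead solves the normal equations by Gaussian elimination, which yields the same solution when the log-regulation matrix has full column rank. All names and the synthetic data are hypothetical, and every ΔC and ΔR value must be non-zero for the logarithm to be defined.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_influence(delta_c, delta_r):
    """Least-squares gamma for log10(dC) = sum_f gamma_f * log10(dR_f).

    delta_c: one differential co-expression value per matched complex pair.
    delta_r: one row per complex pair, one column per TF.
    """
    y = [math.log10(v) for v in delta_c]
    x = [[math.log10(v) for v in row] for row in delta_r]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic data generated from known coefficients (two TFs, four complex pairs).
true_gamma = [0.5, -1.0]
rows = [[2.0, 3.0], [4.0, 5.0], [8.0, 2.0], [3.0, 9.0]]
changes = [r[0] ** true_gamma[0] * r[1] ** true_gamma[1] for r in rows]
gamma = fit_influence(changes, rows)
```

With noise-free synthetic data the fitted coefficients recover the generating ones; with real data the system is overdetermined and the fit is only a best approximation, which is where an SVD-based solver is the more robust choice.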
A simplified example to demonstrate our model
Solving Equation 5 can give positive as well as negative γ values. The absolute value |γ| indicates the magnitude of the influence, while the sign indicates the direction: TFs of the same sign regulating a set of complexes work cooperatively, while those of opposite signs work counteractively with each other. To understand this, consider the following simplified example in which two TFs with influences γ1 and γ2 regulate two complexes A and B as per the following set of equations:
$$0.50 = 5^{\gamma_1} \cdot 10^{\gamma_2}, \qquad 0.60 = 6^{\gamma_1} \cdot 20^{\gamma_2},$$
which after taking log10 becomes
$$\log_{10} 0.50 = \gamma_1 \log_{10} 5 + \gamma_2 \log_{10} 10, \qquad \log_{10} 0.60 = \gamma_1 \log_{10} 6 + \gamma_2 \log_{10} 20.$$
Here we see that the second TF performs at least twice as much regulation as the first TF on the two complexes (10 and 20 vs 5 and 6), the regulation by the second TF is doubled (from 10 to 20) as against a smaller increase for the first TF (from 5 to 6) between A and B, and yet A and B show roughly the same change in co-expression (0.50 vs 0.60). This intuitively means that the first TF has a greater influence than the second TF, and that it counteracts the second TF to keep the co-expression changes of the two complexes similar. Indeed, upon solving the equations we get γ1 = −1.293 and γ2 = 0.603, which is interpreted as the first TF being about twice as influential as the second, with the two TFs working counteractively in regulating A and B. It is easy to construct a similar example for the cooperative action of TFs.
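The two-TF example can be checked numerically. The sketch below solves the 2 × 2 system in log10 space with Cramer's rule, a choice made here for brevity rather than the SVD route used for the full model; the variable names are illustrative.

```python
import math

# Log10 regulation values for complex A (first row) and complex B (second row).
a1, a2 = math.log10(5), math.log10(10)
b1, b2 = math.log10(6), math.log10(20)
# Log10 changes in co-expression for A and B.
y1, y2 = math.log10(0.50), math.log10(0.60)

# Cramer's rule for the system
#   a1*g1 + a2*g2 = y1
#   b1*g1 + b2*g2 = y2
det = a1 * b2 - a2 * b1
gamma1 = (y1 * b2 - a2 * y2) / det
gamma2 = (a1 * y2 - y1 * b1) / det

# Substituting back into the original (non-log) equations.
check_a = 5 ** gamma1 * 10 ** gamma2
check_b = 6 ** gamma1 * 20 ** gamma2
```

The solution agrees with the quoted coefficients up to rounding in the third decimal place, has opposite signs for the two TFs, and reproduces the co-expression changes 0.50 and 0.60 when substituted back.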
Availability of supporting data
The datasets used in this study are available at http://www.bioinformatics.org.au/tools-data under CONTOURv2.
References
Elkon R, Linhart C, Sharan R, Shamir R, Shiloh Y: Genome-wide in silico identification of transcriptional regulators controlling the cell cycle in human cells. Genome Research. 2003, 13: 773-780. 10.1101/gr.947203.
Srihari S, Leong HW: Temporal dynamics of protein complexes in PPI networks: a case study using yeast cell cycle dynamics. BMC Bioinformatics. 2012, 13 (Suppl 17): S16.
Bar-Joseph Z, Siegfried Z, Brandeis M: Genome-wide transcriptional analysis of the human cell cycle identifies genes differentially regulated in normal and cancer cells. Proceedings of the National Academy of Sciences USA. 2007, 105 (3): 955-960.
Spriggs KA, Bushell M, Willis AE: Translational regulation of gene expression during conditions of cell stress. Molecular Cell. 2010, 40 (2): 228-237. 10.1016/j.molcel.2010.09.028.
Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, Gerstein M: Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004, 431: 308-312. 10.1038/nature02782.
Balaji S, Babu MD, Lakshminarayan MI, Luscombe NM, Aravind L: Comprehensive analysis of combinatorial regulation using the transcriptional regulatory network of yeast. Journal of Molecular Biology. 2006, 360: 213-227. 10.1016/j.jmb.2006.04.029.
Nebert DW: Transcription factors and cancer: an overview. Toxicology. 2002, 181-182: 131-141.
Darnell JE: Transcription factors as targets for cancer therapy. Nature Reviews Cancer. 2002, 2: 740-749. 10.1038/nrc906.
Karamouzis MV, Papavassiliou AG: Transcription factor networks as targets for therapeutic intervention of cancer: the breast cancer paradigm. Molecular Medicine. 2011, 17 (11-12): 1133-1136.
Vanunu O, Magger P, Ruppin E, Shlomi T, Sharan R: Associating genes and protein complexes with disease via network propagation. PLoS Computational Biology. 2009, 6 (1): e1000641.
Zhao J, Lee SH, Huss M, Holme P: The network organization of cancer-associated protein complexes in human tissues. Scientific Reports. 2013, 3: 1583.
Jansen R, Greenbaum D, Gerstein M: Relating whole-genome expression data with protein-protein interactions. Genome Research. 2002, 12: 37-46. 10.1101/gr.205602.
Grigoriev A: A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Research. 2001, 29: 3513-3519. 10.1093/nar/29.17.3513.
Safran M, Solomon I, Shmueli O, Lapidot M: GeneCards 2002: towards a complete, object-oriented, human gene compendium. Bioinformatics. 2002, 18 (11): 1542-1543. 10.1093/bioinformatics/18.11.1542.
Fujioka S, Sclabas GM, Schmidt C, Frederick WA: Function of nuclear factor kappaB in pancreatic cancer metastasis. Clinical Cancer Research. 2003, 9 (1): 346-354.
Jones S, Zhang X, Parsons WD: Core signalling pathways in human pancreatic cancers revealed by global genomic analysis. Science. 2008, 321 (5897): 1801-1806. 10.1126/science.1164368.
Stark C, Breitkreutz B, Chatr-aryamontri A: The BioGRID Interaction Database: 2011 update. Nucleic Acids Research. 2011, 39: D698-D704. 10.1093/nar/gkq1116.
Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S: Human Protein Reference Database-2009 update. Nucleic Acids Research. 2009, 37: D767-D772. 10.1093/nar/gkn892.
Liu G, Wong L, Chua HN: Complex discovery from weighted PPI networks. Bioinformatics. 2009, 25 (15): 1891-1897. 10.1093/bioinformatics/btp311.
Waddell N, Arnold J, Cocciardi S, da Silva L: Subtypes of familial breast tumours revealed by expression and copy number profiling. Breast Cancer Research and Treatment. 2010, 123: 661-677. 10.1007/s10549-009-0653-1.
Lakhani SR, Jacquemier J, Sloane JP, Gusterson BA, Anderson TJ: Multifactorial analysis of differences between sporadic breast cancers and cancers involving BRCA1 and BRCA2 mutations. Journal of the National Cancer Institute. 1998, 90 (15): 1138-1145. 10.1093/jnci/90.15.1138.
Badea L, Herlea V, Dima SO: Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies gene specifically overexpressed in tumor epithelia. Hepato-Gastroenterology. 2008, 55: 2015-2026.
Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS: Molecular portraits of human breast tumours. Nature. 2000, 406 (6797): 747. 10.1038/35021093.
Sorlie T, Perou CM, Tibshirani R, Aas T: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proceedings of the National Academy of Sciences USA. 2001, 98 (19): 10869. 10.1073/pnas.191367098.
Taherian-Fard A, Srihari S, Ragan MA: Breast cancer classification: linking molecular mechanisms to disease prognosis. Briefings in Bioinformatics. 2014, doi: 10.1093/bib/bbu020
Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM: A census of human transcription factors: function, expression and evolution. Nature Reviews Genetics. 2009, 10 (4): 252-63. 10.1038/nrg2538.
Ruepp A, Waegele B, Lechner M: CORUM: the comprehensive resource of mammalian protein complexes. Nucleic Acids Research. 2009, 38: D497-D501.
Bamford S, Dawson E, Forbes S, Clements J: The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. British Journal of Cancer. 2004, 91: 355-358.
Patel MN, Halling-Brown MD, Tym JE, Workman P, Al-Lazikani B: Objective assessment of cancer genes for drug discovery. Nature Reviews Drug Discovery. 2013, 12: 35-50.
Chu LH, Chen BS: Construction of a cancer-perturbed protein-protein interaction network for discovery of apoptosis drug targets. BMC Systems Biology. 2008, 2: 56. 10.1186/1752-0509-2-56.
Srihari S, Raman V, Leong HW, Ragan MA: Evolution and controllability of cancer networks: a Boolean perspective. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2014, 11 (1): 83-94.
Dennis G, Sherman BT, Hosack DA: DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biology. 2003, 4: R60. 10.1186/gb-2003-4-9-r60.
Biankin AV, Kench JG, Colvin EK, Segara D: Expression of S100A2 calcium-binding protein predicts response to pancreatectomy for pancreatic cancer. Gastroenterology. 2009, 137: 558-568. 10.1053/j.gastro.2009.04.009.
Zeng G, Germinaro M, Micsenyi A: Aberrant Wnt/β-Catenin signaling in pancreatic adenocarcinoma. Neoplasia. 2006, 8 (4): 279-289. 10.1593/neo.05607.
Biankin AV, Waddell N, Kassahn KS, Gingras MC, Muthuswamy LB: Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature. 2012, 491 (7424): 399-405. 10.1038/nature11547.
Hanahan D, Weinberg RA: The Hallmarks of Cancer. Cell. 2000, 100 (1): 57-70. 10.1016/S0092-8674(00)81683-9.
Logue JS, Morrison DK: Complexity in the signaling network: insights from the use of targeted inhibitors in cancer therapy. Genes and Development. 2012, 26: 641-650. 10.1101/gad.186965.112.
Wang X, Szabo C, Qian C, Amadio PG, Thibodeau SN: Mutational analysis of thirty-two double-strand DNA break repair genes in breast and pancreatic cancers. Cancer Research. 2008, 68: 971. 10.1158/0008-5472.CAN-07-6272.
Li D, Liu H, Jiao L, Chang DZ: Significant impact of homologous recombination DNA repair gene polymorphisms on pancreatic cancer survival. Cancer Research. 2006, 66 (6): 3323-3330. 10.1158/0008-5472.CAN-05-3032.
Gottipati P, Vischioni B, Schultz N, Solomons J, Bryant HE: Poly(ADP-ribose) polymerase is hyperactivated in homologous recombination-defective cells. Cancer Research. 2010, 70: 5389. 10.1158/0008-5472.CAN-09-4716.
Campbell SL, Khosravi-Far R, Rossman KL, Clark GJ, Der CJ: Increasing complexity of Ras signaling. Oncogene. 1998, 17: 1395-1413. 10.1038/sj.onc.1202174.
Koorstra JBM, Hustinx SR, Offerhaus JA, Maitra A: Pancreatic carcinogenesis. Pancreatology. 2008, 8 (2): 110-125. 10.1159/000123838.
Mueller A, Schafer BW, Ferrari S, Weibel M: The calcium-binding protein S100A2 interacts with p53 and modulates its transcriptional activity. Journal of Biological Chemistry. 2005, 280: 29186-29193. 10.1074/jbc.M505000200.
Dammann R, Schagdarsurengin U, Liu L, Otto N, Gimm O, Dralle H, Boehm BO, Pfeifer GP, Hoang-Vu C: Frequent RASSF1A promoter hypermethylation and K-ras mutations in pancreatic carcinoma. Oncogene. 2003, 22: 3806-3812. 10.1038/sj.onc.1206582.
Park J, Kang SI, Lee SY, Zhang XF, Kim MS: Tumor suppressor RAS association domain family 5 (RASSF5/NORE1) mediates death receptor ligand-induced apoptosis. Journal of Biological Chemistry. 2010, 285 (45): 35029-35038. 10.1074/jbc.M110.165506.
Liu C, Srihari S, Lê Cao KA, Chenevix-Trench G, Simpson PT, Ragan MA, Khanna KK: A fine-scale dissection of the DNA double-strand break repair machinery and its implications for breast cancer therapy. Nucleic Acids Research. 2014, 42 (10): 6106-6127. 10.1093/nar/gku284.
Wang Y, Cortez D, Yazdi P, Neff N, Elledge SJ, Qin J: BASC, a super complex of BRCA1-associated proteins involved in the recognition and repair of aberrant DNA structures. Genes and Development. 2000, 14: 927-939.
Khanna KK, Jackson SP: DNA double-strand breaks: signaling, repair and the cancer connection. Nature Genetics. 2001, 27: 247-254. 10.1038/85798.
Zhuang J, Zhang J, Willers H, Wang H: Checkpoint kinase 2-mediated phosphorylation of BRCA1 regulates the fidelity of nonhomologous end-joining. Cancer Research. 2006, 66: 1401. 10.1158/0008-5472.CAN-05-3278.
Thompson EG, Fares H, Dixon K: BRCA1 requirement for the fidelity of plasmid DNA double-strand break repair in cultured breast epithelial cells. Environmental and Molecular Mutagenesis. 2012, 53 (1): 32-43. 10.1002/em.21674.
Jiang G, Plo I, Wang T, Rahman M, Cho JH, Yang E, Lopez BS, Xia F: BRCA1-Ku80 protein interaction enhances end-joining fidelity of chromosomal double-strand breaks in the G1 phase of the cell cycle. Journal of Biological Chemistry. 2013, 288: 8966-8976. 10.1074/jbc.M112.412650.
Baillat D, Hakimi MA, Naar AM, Shilatifard A, Cooch N, Shiekhattar R: Integrator, a multiprotein mediator of small nuclear RNA processing, associates with the C-terminal repeat of RNA polymerase II. Cell. 2005, 123 (2): 265-276. 10.1016/j.cell.2005.08.019.
Pensa S, Regis G, Boselli D, Novelli F, Poli V: STAT1 and STAT3 in tumorigenesis: two sides of the same coin? 2009, JAK-STAT Pathway in Disease, 8.
Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology. 2004, 3 (1): 3.
Dudeja V, Chugh RK, Sangwan V, Skube SJ: Prosurvival role of heat shock factor 1 in the pathogenesis of pancreatobiliary tumors. American Journal of Physiology Gastrointestinal and Liver Physiology. 2011, 300 (6): G949-G955.
Joshi H, Nord SH, Frigessi A, Borresen-Dale AL, Kristensen VN: Overrepresentation of transcription factor families in the genesets underlying breast cancer subtypes. BMC Genomics. 2012, 13: 199. 10.1186/1471-2164-13-199.
Biswas DK, Shi Q, Baily S, Strickland I: NF-kappa B activation in human breast cancer specimens and its role in cell proliferation and apoptosis. Proceedings of the National Academy of Sciences USA. 2004, 101 (27): 10137-10142. 10.1073/pnas.0403621101.
Zhang C, Sun X, Ren Y, Lou Y, Zhou J, Liu M, Li D: Validation of Polo-like kinase 1 as a therapeutic target in pancreatic cancer cells. Cancer Biology and Therapy. 2012, 13 (12): 1214-1220. 10.4161/cbt.21412.
Zheng L, Jaffee EM: Annexin A2 is a new antigenic target for pancreatic cancer immunotherapy. Oncoimmunology. 2012, 1 (1): 112-114. 10.4161/onci.1.1.18017.
Kashiwaya K, Nakagawa H, Hosokawa M: Involvement of the tubulin tyrosine ligase-like family member 4 polyglutamylase in PELP1 polyglutamylation and chromatin remodeling in pancreatic cancer cells. Cancer Research. 2010, 70 (10): 4024-4033. 10.1158/0008-5472.CAN-09-4444.
Sutherland RL, Musgrove EA: CDK inhibitors as potential breast cancer therapeutics: new evidence for enhanced efficacy in ER+ disease. Breast Cancer Research. 2009, 11 (6): 112. 10.1186/bcr2454.
Marks PA, Richon VM, Rifkind RA: Histone deacetylase inhibitors: inducers of differentiation or apoptosis of transformed cells. Journal of the National Cancer Institute. 2000, 92 (15): 1210-16. 10.1093/jnci/92.15.1210.
Srihari S, Ragan MA: Systematic tracking of dysregulated modules identifies novel genes in cancer. Bioinformatics. 2013, 29 (12): 1553-1561. 10.1093/bioinformatics/btt191.
Srihari S, Ning K, Leong HW: MCL-CAw: a refinement of MCL for detecting yeast complexes from weighted PPI networks by incorporating core-attachment structure. BMC Bioinformatics. 2010, 11: 504. 10.1186/1471-2105-11-504.
Srihari S, Leong HW: A survey of computational methods for protein complex prediction from protein interaction networks. Journal of Bioinformatics and Computational Biology. 2013, 11 (2): 1230002. 10.1142/S021972001230002X.
Srihari S, Leong HW: Employing functional interactions for characterisation and detection of sparse complexes from yeast PPI networks. International Journal of Bioinformatics Research and Applications. 2012, 8 (3): 286-304.
Ning K, Ng HK, Srihari S, Leong HW, Nesvizhskii AI: Examination of the relationship between essential genes in PPI network and hub proteins in reverse nearest neighbor topology. BMC Bioinformatics. 2010, 11 (1): 505. 10.1186/1471-2105-11-505.
Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP: Network component analysis: reconstruction of regulatory signals in biological systems. Proceedings of the National Academy of Sciences USA. 2003, 100 (26): 15522-15527. 10.1073/pnas.2136632100.
Yeung MKS, Tegner J, Collins JJ: Reverse engineering gene networks using singular value decomposition and robust regression. Proceedings of the National Academy of Sciences USA. 2002, 99 (9): 6163-6168. 10.1073/pnas.092576199.
Acknowledgements
This study was supported by an Australian National Health and Medical Research Council (NHMRC) project grant 1028742 to PTS and MAR. PTS is supported by a fellowship (ECF-10-12) from the National Breast Cancer Foundation (NBCF) Australia.
Declarations
Publication costs for this article were funded by The University of Queensland.
This article has been published as part of BMC Systems Biology Volume 8 Supplement 4, 2014: Thirteenth International Conference on Bioinformatics (InCoB2014): Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/8/S4.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
SS1 designed the study, performed the experiments and analysis, and wrote and revised the manuscript. KK helped in reviewing the results and providing biological interpretation for them. PBM, SS2, CL and PTS helped in data collection and statistical analysis. MAR and KK supervised the project. All authors have read and approved the manuscript.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Srihari, S., Madhamshettiwar, P.B., Song, S. et al. Complex-based analysis of dysregulated cellular processes in cancer. BMC Syst Biol 8 (Suppl 4), S1 (2014). https://doi.org/10.1186/1752-0509-8-S4-S1
DOI: https://doi.org/10.1186/1752-0509-8-S4-S1