Composite functional module inference: detecting cooperation between transcriptional regulation and protein interaction by Mantel test
BMC Systems Biology, volume 4, Article number: 82 (2010)
Functional modules are basic units of cell function, and exploring them is important for understanding the organization, regulation and execution of cell processes. Functional modules in single biological networks (e.g., the protein-protein interaction network) have been the focus of recent studies. Functional modules in an integrated network are composite functional modules, which imply complex relationships involving multiple types of biological interaction, and detecting them will help us understand the complexity of cell processes.
We aimed to detect composite functional modules containing both co-transcriptional regulation interactions and protein-protein interactions in our pre-constructed integrated network of Saccharomyces cerevisiae. We computationally extracted 15 composite functional modules and found that the structural consistency between the co-transcriptional regulation sub-network and the protein-protein interaction sub-network was well correlated with their functional hierarchy. Composite functional modules of this type were compact in structure and were found to participate in essential cell processes such as oxidative phosphorylation and RNA splicing.
The structure of composite functional modules containing co-transcriptional regulation interactions and protein-protein interactions reflected the cooperation between transcriptional regulation and the implementation of protein function, and was indicative of their important roles in essential cell functions. In addition, their structural and functional characteristics were closely related, suggesting the complexity of the cell regulatory system.
Functional modules are basic units of cells that consist of molecules that work together to perform a desired biological function. The investigation of functional modules facilitates the understanding of the organization, regulation and execution of cell processes. Currently, several functional modules have been computationally extracted from the structural characteristics of biological networks, such as the transcriptional regulation networks, protein-protein interaction networks and metabolic networks [1–10]. However, these studies have mainly been performed on single networks, and cooperation between different types of networks is seldom considered.
The global cell network integrates single networks, such as the one governing transcriptional regulation, that appear to interact rather than operate independently [12–16]. Recently, substantial cooperative structures called composite motifs have been discovered within integrated networks [13–15], and these show functional relatedness [13, 15]. Composite motifs include two-node, three-node and four-node motifs, such as composite pairs of co-transcriptional regulation and protein-protein interaction (CT-PPI). Three reports [13, 15, 17] showed that composite pairs of CT-PPI (C-pairs of CT-PPI) played important roles in cell function, especially in protein complexes, which are themselves one kind of functional module. However, not all protein complexes show high consistency between co-transcriptional regulation interactions (CTs) and protein-protein interactions (PPIs). Using yeast as a model, Simonis et al. and Tan et al. found that protein complexes in the cell were not significantly co-regulated.
Thus, we wished to investigate cooperation among different networks at a higher level of the network structure hierarchy. In this work, we investigated the composite functional module of co-transcriptional regulation and protein-protein interaction (CT-PPI modules) and explored its structural and functional characteristics. CTs and PPIs are basic regulatory structures of transcriptional regulation and protein function, respectively. Our results showed that CTs and PPIs were highly consistent within the CT-PPI modules, indicating the important role of CT-PPI modules in the cooperation between transcriptional regulation and the implementation of protein function. We detected 15 CT-PPI modules that participated in essential cell processes, including the oxidative phosphorylation pathway, RNA splicing, and DNA-dependent positive regulation of transcription.
Results and Discussion
We constructed an S. cerevisiae integrated network of 1107 nodes and 39,785 links (38,351 CTs and 1434 PPIs). In Figure 1, nodes represent genes, and coloured edges represent different types of links. Genes with the same GO annotation were regarded as a functional module, giving a total of 300 functional modules in the integrated network, comprising 100 cellular component (CC) terms and 200 biological process (BP) terms.
Structural significance and functional coherence of composite CT-PPI pairs
Composite pairs of CT-PPI (C-pairs) are the basic units that represent consistency between CTs and PPIs in the integrated network. Our representation of C-pairs differs from that of earlier work [13, 15, 17] (Additional file 1, Figure S1A): our integrated network comprised a CT network and a PPI network, whereas theirs comprised a PPI network and a transcriptional regulation interaction (TI) network. We expected this representation to make composite structures built from C-pairs (e.g., CT-PPI modules) more concise (Additional file 1, Figure S1B) and to help us detect CT-PPI modules. C-pairs behaved as composite motifs in the integrated network because they occurred significantly more often than in randomized networks (Figure 2A), a result that agrees with the earlier work [13, 15, 17].
C-pairs also behaved as functionally coherent units. A C-pair was considered to be functionally coherent if both genes were annotated under the same GO term. Using a background of general (GO terms in our integrated network) or narrow (leaf terms) annotations, and considering only the CC and BP branches, we compared the functionally coherent fraction of the 168 C-pairs with that of the 38,351 CT pairs and the 1434 PPI pairs. A higher fraction of the C-pairs was functionally coherent than of the CT or PPI pairs (Figure 2B).
This result demonstrated that cooperation between CTs and PPIs had important effects on network structure and cell function. We therefore went on to search for CT-PPI modules and investigate their characteristics.
Detecting CT-PPI modules
Functional modules in single networks are usually detected from "structure to function", meaning that modules are first identified from network structure and then characterized by functional annotation analysis [1–9]. We chose a "function to structure" method to detect CT-PPI modules, first defining the functional module and then conducting topological analysis for consistency between its CT and PPI sub-networks. The procedure is shown in Figure 1. First, we constructed an integrated network of CTs and PPIs in Saccharomyces cerevisiae. Second, proteins were grouped into functional modules according to their gene ontology (GO) annotations. Finally, we used a network structure comparison method, the Mantel test, to detect CT-PPI modules by the structural consistency between the CT and PPI sub-networks of a given functional module.
We obtained 47 functional modules with a significant r value. We investigated the structural consistency of these functional modules to detect CT-PPI modules.
Association between structural consistency and functional hierarchy of CT-PPI modules
The structural consistency of CT-PPI modules was associated with their functional annotation hierarchy. According to their ascendant/descendant relationships in GO, we paired 41 of the 47 functional modules into 124 ascendant/descendant functional module pairs; the other 6 were excluded because they had no such relationship. Except for the GO:0009165-GO:0009260 pair, the r values of the descendant functional modules were greater than those of the ascendant modules (Figure 3). GO:0009165 and GO:0009260 shared the same number (17) of C-pairs, but the inconsistency of the other CTs and PPIs influenced the r value more in GO:0009260 than in GO:0009165. We used 0.2 as the threshold for consistency between the CT and PPI sub-networks of functional modules and obtained 25 functional modules. Among these, all r values were higher in the descendant than in the ascendant functional modules. Choosing the descendant and isolated modules (functional modules with r > 0.2 but no ascendant/descendant relationships) yielded 15 CT-PPI modules (Table 1, Figure 4).
Global structural consistency of CT-PPI modules
C-pairs are the basic elements and locally consistent structures between CTs and PPIs in the integrated network. They also play important roles in the construction of CT-PPI modules, as CT-PPI modules were enriched with C-pairs (Table 2). However, CT-PPI modules were detected from the global consistency of the CT and PPI sub-networks rather than from local consistency (enrichment of C-pairs).
Considering only local consistency generated many functional modules enriched with C-pairs. Of the 198 functional modules containing C-pairs in the integrated network, 140 were enriched with C-pairs (p < 0.01). This relatively large number changed little as p decreased: even at p < 1 × 10⁻¹⁰, 41 functional modules were still found to be enriched with C-pairs (see Additional file 2), although this level of cooperation between CTs and PPIs associated with cell functions is implausible (Additional file 3). In addition, no clear relationship was found between p (representing the degree of enrichment of C-pairs in functional modules) and the functional hierarchy of these functional modules (Additional file 4).
Structure compactness of CT-PPI modules
In single networks, links between genes in a module are more compact than links to genes in other modules. If the inner link density C_in was greater than the outer link density C_out, we considered the module compact (see Materials and Methods for detailed definitions of C_in and C_out). For the compactness analysis of functional modules, we separated the integrated network into CT and PPI networks and then examined the compactness of the corresponding sub-networks of each functional module. If both sub-networks were compact, we considered the integrated sub-network compact.
Compared with all functional modules in the integrated network, and with those enriched for C-pairs (p < 0.01), CT-PPI modules were more compact (Figure 5). Of the 15 CT-PPI modules, 9 were compact in the integrated network (Table 1). This fraction (0.6) was much higher than that of the functional modules in the integrated network (0.07) and that of the modules enriched for C-pairs (0.12). This showed that CT-PPI modules were modules not only in function but also in structure.
CT-PPI modules involved in essential functions
The nine structurally compact CT-PPI modules were annotated for oxidative phosphorylation (Table 3), a critical metabolic pathway that produces adenosine triphosphate, and their GO annotations were all closely related to this process. The nine modules were also closely related in structure, sharing 164/21 C-pairs, where the numerator is the total number of C-pairs summed over the nine CT-PPI modules and the denominator is the number of unique C-pairs among them. When the nine CT-PPI modules were combined and their 73 genes annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG), these genes were again enriched in the oxidative phosphorylation pathway (p < 1 × 10⁻³²). Furthermore, of the 15 genes that make up the 21 unique C-pairs, 14 were annotated in complexes III, IV or V of the oxidative phosphorylation pathway; the remaining gene, YDL181W, lacked an annotation in the KEGG pathway system.
We investigated the transcription factors (TFs) regulating the C-pairs in the nine compact CT-PPI modules. Although many TFs regulated more than one gene in the nine CT-PPI modules, only HAP4 regulated C-pairs (Table 3). This suggested that HAP4 plays an important role in the regulation of the oxidative phosphorylation pathway, especially complexes III, IV and V, consistent with a previous report.
CT-PPI modules GO:0005684 and GO:0030532 appeared to be involved in RNA splicing, and shared the TF genes STE12 and DIG1 with the CT-PPI module GO:0045893 (Table 3), whose GO term description is "positive regulation of transcription, DNA-dependent". Transcription and RNA splicing occur sequentially, so shared TFs help ensure the coordination of the two processes. We conclude that these CT-PPI modules participate in transcription, which, if interrupted, prevents successful production of mRNA and protein.
CT-PPI modules GO:0007131 and GO:0000794 both appeared to have roles in changes of DNA structure during meiosis. CT-PPI module GO:0000790 was annotated as "nuclear chromatin", which is also involved in chromosome formation. These three CT-PPI modules seemed to be involved in the maintenance and transmission of genetic material.
The above analysis shows that CT-PPI modules are involved in essential eukaryotic cellular functions. Their network structure, evaluated as the consistency of CTs and PPIs, reflects the cooperation between transcriptional regulation and the implementation of protein function, and this type of structure ensures the stable regulation of those functions.
Conclusions
Our results indicated that cooperation between CT and PPI is important to cell regulation. CT-PPI modules, which reflect the cooperation between CT and PPI in a module, were involved in essential cell functions. In addition, C-pairs, which reflected cooperation between CT and PPI motifs, were functionally coherent.
Our results also suggest that the structure and function of CT-PPI modules are closely related. Their network structure appeared to be conserved, as it coordinated two basic regulatory structures (CT and PPI). This type of structure could help ensure the stability of essential functions. Structural consistency and functional hierarchy were associated in CT-PPI modules, and the modules showed both functional and structural modularity. These findings reflect a close relationship between the structure and function of CT-PPI modules and show the complexity of cell regulation.
Many studies have investigated the relationship between the structure and function of special structures within networks, but findings have differed and the relationships have been ambiguous [13–15, 25–31]. In eukaryotes, cell networks have been under evolutionary pressure for billions of years, which has generated special structures. Molecular evolution theory holds that most evolutionary events are non-directional, so special structures that occur in the network do not always carry out corresponding functions. Therefore, we propose that the most logical way to study network structures is to investigate the biological meaning implied in the structures before exploring their functions.
Materials and Methods
Experimentally identified interactions between TFs and their target genes in S. cerevisiae were extracted from ChIP-chip experiments, with the data treated as in Liao et al. (p < 0.001). We obtained 4433 TIs for 113 TFs and 2400 target genes. Target genes that shared one or more TFs were considered co-regulated. In total, 167,708 CTs were found among 2376 genes.
Experimentally identified PPIs were extracted from the Database of Interacting Proteins (downloaded in September 2007), yielding 17,491 PPIs among 4392 genes, excluding homomultimeric proteins.
By overlapping the two data sets, we found 1856 genes with both types of links. For these genes, we selected GO terms (layer > 5, annotated genes > 9, in the BP and CC branches) and performed GO annotation analysis (GO data downloaded in September 2007). Gene sets with ascendant and descendant GO terms were filtered for the descendant. We obtained 1107 genes annotated with 300 terms (100 BP, 200 CC), with 38,351 CTs and 1434 PPIs. We defined a functional module as a group of genes annotated with the same GO term in the integrated network.
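The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the toy TF names and the toy gene names are all hypothetical, and the GO filtering step is left out. It derives CT pairs from a TF-to-target map, drops self-interactions from the PPI list, and takes C-pairs to be the gene pairs that carry both a CT link and a PPI link.

```python
from itertools import combinations


def pair_key(g1, g2):
    """Canonical (sorted) form of an unordered gene pair."""
    return (g1, g2) if g1 < g2 else (g2, g1)


def co_regulated_pairs(tf_targets):
    """Gene pairs that share at least one TF (the CT links)."""
    pairs = set()
    for targets in tf_targets.values():
        # combinations over a sorted list already yields canonical pairs
        for g1, g2 in combinations(sorted(set(targets)), 2):
            pairs.add((g1, g2))
    return pairs


def composite_pairs(ct_pairs, ppi_pairs):
    """Gene pairs linked by both a CT and a PPI (the C-pairs)."""
    # a != b drops homomultimeric (self) interactions
    ppi = {pair_key(a, b) for a, b in ppi_pairs if a != b}
    return {pair_key(a, b) for a, b in ct_pairs} & ppi


# Toy, made-up inputs (not data from the study).
toy_tf_targets = {
    "TF_A": ["g1", "g2", "g3"],
    "TF_B": ["g2", "g3", "g4"],
}
toy_ppis = [("g3", "g2"), ("g4", "g5"), ("g1", "g1")]

toy_cts = co_regulated_pairs(toy_tf_targets)
toy_c_pairs = composite_pairs(toy_cts, toy_ppis)
```

Storing every pair in sorted order keeps the CT and PPI sets directly comparable, so the C-pairs fall out of a plain set intersection.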
Structure significance analysis of C-pairs
We defined the number of C-pairs in the integrated network as N_real and the number of C-pairs in a randomized integrated network as N_rand. The integrated network was randomized according to Yeger-Lotem et al.: it was separated into CT and PPI networks, each of which was randomized while keeping the degree of every node unchanged, and the two were then re-integrated. This was repeated for a total of 1000 randomizations. In our work, N_real = 168, N_rand = 109.8 ± 9.66 and the corresponding Z score = 6.03.
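As a rough sketch of this kind of analysis, the Python below randomizes an undirected network by repeated double-edge swaps, which keep every node's degree unchanged, and computes a Z score from a list of randomized counts. The swap-based procedure, the function names and the use of the sample standard deviation are assumptions made for illustration; the randomization actually used follows Yeger-Lotem et al. and may differ in detail.

```python
import random
from statistics import mean, stdev


def _key(u, v):
    """Canonical (sorted) form of an undirected edge."""
    return (u, v) if u < v else (v, u)


def degrees(edges):
    """Degree of every node in an undirected edge collection."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a simple undirected graph by double-edge swaps."""
    edge_list = sorted({_key(u, v) for u, v in edges})
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # try the other orientation of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1, new2 = _key(a, d), _key(c, b)
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate edge
        edge_set.difference_update((edge_list[i], edge_list[j]))
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
    return edge_set


def z_score(n_real, rand_counts):
    """(N_real - mean(N_rand)) / sd(N_rand), using the sample sd."""
    return (n_real - mean(rand_counts)) / stdev(rand_counts)


# Toy network (made up) to show that degrees survive the shuffle.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"),
             ("d", "e"), ("e", "a"), ("a", "c")]
shuffled = degree_preserving_shuffle(toy_edges, 100, random.Random(1))
```

In a full run, the CT and PPI networks would each be shuffled this way, the C-pairs of the re-integrated pair counted, and the resulting 1000 counts passed to `z_score` together with N_real.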
Functional coherence analysis of C-pairs
Using the annotation information of the 1107 genes with 300 GO terms, we defined a gene pair (CT, PPI or C-pair) as a functionally coherent pair if both genes were annotated with at least one common term. The hypergeometric distribution was used to test whether C-pairs had a higher functional coherence proportion than either CT or PPI pairs, using the formula below:
\[ p = \sum_{i=x}^{\min(K,N)} \frac{\binom{K}{i}\binom{M-K}{N-i}}{\binom{M}{N}} \]
When comparing C-pairs with CT pairs,
x was the number of functionally coherent C-pairs in the integrated network,
M was the number of CT pairs,
K was the number of functionally coherent CT pairs, and
N was the number of C-pairs.
C-pairs were compared to PPI pairs in the same way.
Replacing the 300 GO terms with leaf GO terms and repeating the procedure gave the results under the narrow annotation system.
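A hypergeometric test of this kind can be computed with exact integer arithmetic from the Python standard library. The sketch below assumes the test is the upper-tail probability P(X ≥ x), with x, M, K and N as defined above; the function name is hypothetical and the numbers in the usage line are toy values, not counts from the study.

```python
from math import comb


def hypergeom_upper_tail(x, M, K, N):
    """P(X >= x) when N items are drawn without replacement from a
    population of M items of which K are 'successes'."""
    upper = min(K, N)
    total = comb(M, N)
    favourable = sum(comb(K, i) * comb(M - K, N - i)
                     for i in range(x, upper + 1))
    return favourable / total


# Toy numbers: population of 10 pairs, 3 coherent, sample of 2,
# at least 1 coherent pair observed in the sample.
toy_p = hypergeom_upper_tail(1, 10, 3, 2)
```

Because `math.comb` works on arbitrary-precision integers, the same function can be applied to populations of tens of thousands of pairs without overflow.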
Detecting CT-PPI modules
To detect CT-PPI modules, we used the Mantel test, which measures the correlation between two distance matrices, to quantify the consistency between the CT and PPI sub-networks of a functional module. The simple Mantel test calculates the similarity of two symmetric matrices. The parameter r measures the similarity between the matrices and is the Pearson correlation coefficient of the corresponding elements in the lower (or upper) triangular parts of the two matrices. The parameter p, which measures the significance of r, was calculated as the proportion of randomized networks in which r was equal to or greater than that in the real network.
The parameter r was calculated using the formula:
\[ r = \frac{\sum_{i<j} (A_{ij} - \bar{A})(B_{ij} - \bar{B})}{\sqrt{\sum_{i<j} (A_{ij} - \bar{A})^{2}} \, \sqrt{\sum_{i<j} (B_{ij} - \bar{B})^{2}}} \]
where A was the adjacency matrix representing the PPIs among the 1107 genes, and A_ij was the Boolean value representing the interaction between proteins i and j: if A_ij = 1, a PPI existed between i and j, and if A_ij = 0 it did not. B was the adjacency matrix representing the CTs among the 1107 genes, and B_ij was the Boolean value representing co-regulation between genes i and j: if B_ij = 1, a CT existed between genes i and j, and if B_ij = 0 it did not. \bar{A} and \bar{B} were the means of the off-diagonal elements of A and B, respectively.
In this study, r represented the consistency between the CT and PPI sub-networks of a functional module, and p represented the significance of the consistency.
We used the zt software, designed for Mantel tests, to calculate the consistency between the CT and PPI sub-networks of the 300 functional modules. To test the significance of consistency, 10,000 randomizations were performed. We used p < 0.01 (FDR q value < 0.05) as the threshold.
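For readers without zt, a simple Mantel test can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not the zt code: r is taken as the Pearson correlation over the upper-triangular elements, and p as the fraction of permutations (rows and columns of one matrix relabelled together) whose r is at least the observed r. The function names and the toy matrices are hypothetical.

```python
import random
from math import sqrt


def upper_triangle(mat, order):
    """Upper-triangular elements of mat after relabelling by `order`."""
    n = len(order)
    return [mat[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation; fails if either input has zero variance
    (e.g. a sub-network with no links at all)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mantel_test(A, B, n_perm=10000, seed=0):
    """Return (r, p) for two symmetric 0/1 adjacency matrices."""
    rng = random.Random(seed)
    idx = list(range(len(A)))
    a = upper_triangle(A, idx)
    r_obs = pearson(a, upper_triangle(B, idx))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # permute node labels of B only
        if pearson(a, upper_triangle(B, idx)) >= r_obs:
            hits += 1
    return r_obs, hits / n_perm


# Toy 4-gene module (made up): PPI links 0-1 and 1-2; CT adds 2-3.
toy_ppi = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
toy_ct = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
toy_r, toy_p = mantel_test(toy_ppi, toy_ct, n_perm=200)
```

Permuting node labels rather than individual matrix entries keeps each permuted matrix a valid network with the same structure, which is what distinguishes the Mantel test from an ordinary correlation test on the entries.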
Enrichment analysis of C-pairs
We used a hypergeometric distribution to test the enrichment of C-pairs in a functional module, using the following formula:
\[ p = \sum_{i=a}^{\min(Y,Z)} \frac{\binom{Y}{i}\binom{X-Y}{Z-i}}{\binom{X}{Z}} \]
When analyzing the enrichment of C-pairs in a functional module m,
a was the number of C-pairs in the functional module m,
X was equal to \binom{n}{2}, where n was the total number of genes in the integrated network,
Y was the number of C-pairs in the integrated network, and
Z was equal to \binom{n_m}{2}, where n_m was the number of genes in the functional module m.
Structure compactness analysis of CT-PPI modules
All the genes in a functional module were designated inner genes, and those outside the functional module were designated outer genes. For a functional module, the inner link density was defined as C_in = L_in/G_in and the outer link density as C_out = L_out/G_out, where
L_in was the number of links between inner genes,
G_in was the number of inner genes with links to other inner genes,
L_out was the number of links between the G_in inner genes and outer genes, and
G_out was the number of outer genes with links to the G_in inner genes.
If C_in was greater than C_out, we recognized the functional module as compact.
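The definitions above translate directly into a short Python sketch. The function names and the toy network are hypothetical, links are assumed to be listed once each as undirected pairs, and returning a density of 0 when a denominator is empty is a choice made here for illustration, since the text does not say how that case was handled.

```python
def link_densities(edges, module):
    """Return (C_in, C_out) for one network and one functional module."""
    module = set(module)
    inner_links = [(u, v) for u, v in edges
                   if u in module and v in module]
    # G_in: inner genes that have at least one link to another inner gene
    g_in = {g for link in inner_links for g in link}
    # L_out: links joining a G_in inner gene to an outer gene
    cross = []
    for u, v in edges:
        if u in g_in and v not in module:
            cross.append((u, v))
        elif v in g_in and u not in module:
            cross.append((v, u))
    g_out = {outer for _, outer in cross}
    c_in = len(inner_links) / len(g_in) if g_in else 0.0
    c_out = len(cross) / len(g_out) if g_out else 0.0
    return c_in, c_out


def is_compact(ct_edges, ppi_edges, module):
    """Compact in the integrated network only if compact in both."""
    return all(c_in > c_out
               for c_in, c_out in (link_densities(ct_edges, module),
                                   link_densities(ppi_edges, module)))


# Toy network (made up): a 4-gene clique with one link to an outer gene.
toy_module = {"a", "b", "c", "d"}
toy_links = [("a", "b"), ("a", "c"), ("a", "d"),
             ("b", "c"), ("b", "d"), ("c", "d"),
             ("d", "x"), ("x", "y")]
toy_c_in, toy_c_out = link_densities(toy_links, toy_module)
```

Here the clique gives C_in = 6/4 = 1.5 and the single outward link gives C_out = 1/1 = 1.0, so the toy module counts as compact in this network.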
References
Zhang S, Jin G, Zhang XS, Chen L: Discovering functions and revealing mechanisms at molecular level from biological networks. Proteomics. 2007, 7 (16): 2856-2869. 10.1002/pmic.200700095
Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N: Revealing modular organization in the yeast transcriptional network. Nat Genet. 2002, 31 (4): 370-377.
Petti AA, Church GM: A network of transcriptionally coordinated functional modules in Saccharomyces cerevisiae. Genome Res. 2005, 15 (9): 1298-1306. 10.1101/gr.3847105
Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003, 34 (2): 166-176. 10.1038/ng1165
Spirin V, Mirny LA: Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci USA. 2003, 100 (21): 12123-12128. 10.1073/pnas.2032324100
Tornow S, Mewes HW: Functional modules by relating protein interaction networks and gene expression. Nucleic Acids Res. 2003, 31 (21): 6283-6289. 10.1093/nar/gkg838
Chen J, Yuan B: Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics. 2006, 22 (18): 2283-2290. 10.1093/bioinformatics/btl370
Pereira-Leal JB, Enright AJ, Ouzounis CA: Detection of functional modules from protein interaction networks. Proteins. 2004, 54 (1): 49-57. 10.1002/prot.10505
Hu Z, Killion PJ, Iyer VR: Genetic reconstruction of a functional transcriptional regulatory network. Nat Genet. 2007, 39 (5): 683-687. 10.1038/ng2012
Guimera R, Nunes Amaral LA: Functional cartography of complex metabolic networks. Nature. 2005, 433 (7028): 895-900. 10.1038/nature03288
Barabasi AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet. 2004, 5 (2): 101-113. 10.1038/nrg1272
Yeger-Lotem E, Margalit H: Detection of regulatory circuits by integrating the cellular networks of protein-protein interactions and transcription regulation. Nucleic Acids Res. 2003, 31 (20): 6053-6061. 10.1093/nar/gkg787
Yeger-Lotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, Alon U, Margalit H: Network motifs in integrated cellular networks of transcription-regulation and protein-protein interaction. Proc Natl Acad Sci USA. 2004, 101 (16): 5934-5939. 10.1073/pnas.0306752101
Mazurie A, Bottani S, Vergassola M: An evolutionary and functional assessment of regulatory network motifs. Genome Biol. 2005, 6 (4): R35. 10.1186/gb-2005-6-4-r35
Zhang LV, King OD, Wong SL, Goldberg DS, Tong AH, Lesage G, Andrews B, Bussey H, Boone C, Roth FP: Motifs, themes and thematic maps of an integrated Saccharomyces cerevisiae interaction network. J Biol. 2005, 4 (2): 6. 10.1186/jbiol23
Gunsalus KC, Ge H, Schetter AJ, Goldberg DS, Han JD, Hao T, Berriz GF, Bertin N, Huang J, Chuang LS, et al.: Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature. 2005, 436 (7052): 861-865. 10.1038/nature03876
Yu H, Xia Y, Trifonov V, Gerstein M: Design principles of molecular networks revealed by global comparisons and composite motifs. Genome Biol. 2006, 7 (7): R55. 10.1186/gb-2006-7-7-r55
Simonis N, van Helden J, Cohen GN, Wodak SJ: Transcriptional regulation of protein complexes in yeast. Genome Biol. 2004, 5 (5): R33. 10.1186/gb-2004-5-5-r33
Tan K, Shlomi T, Feizi H, Ideker T, Sharan R: Transcriptional regulation of protein complexes within and across species. Proc Natl Acad Sci USA. 2007, 104 (4): 1283-1288. 10.1073/pnas.0606914104
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556
Mantel N: The detection of disease clustering and a generalized regression approach. Cancer Res. 1967, 27 (2): 209-220.
Anthony NM, Johnson-Bawe M, Jeffery K, Clifford SL, Abernethy KA, Tutin CE, Lahm SA, White LJ, Utley JF, Wickings EJ, et al.: The role of Pleistocene refugia and rivers in shaping gorilla genetic diversity in central Africa. Proc Natl Acad Sci USA. 2007, 104 (51): 20432-20436. 10.1073/pnas.0704816105
Newman ME: Modularity and community structure in networks. Proc Natl Acad Sci USA. 2006, 103 (23): 8577-8582. 10.1073/pnas.0601602103
Kanehisa M: The KEGG database. Novartis Found Symp. 2002, 247: 91-101; discussion 101-103, 119-128, 244-252.
Ingram PJ, Stumpf MP, Stark J: Network motifs: structure does not determine function. BMC Genomics. 2006, 7: 108. 10.1186/1471-2164-7-108
Dobrin R, Beg QK, Barabasi AL, Oltvai ZN: Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics. 2004, 5: 10. 10.1186/1471-2105-5-10
Meshi O, Shlomi T, Ruppin E: Evolutionary conservation and over-representation of functionally enriched network patterns in the yeast regulatory network. BMC Syst Biol. 2007, 1: 1. 10.1186/1752-0509-1-1
Wuchty S, Oltvai ZN, Barabasi AL: Evolutionary conservation of motif constituents in the yeast protein interaction network. Nat Genet. 2003, 35 (2): 176-179. 10.1038/ng1242
Kaplan S, Bren A, Dekel E, Alon U: The incoherent feed-forward loop can generate non-monotonic input functions for genes. Mol Syst Biol. 2008, 4: 203. 10.1038/msb.2008.43
Shen-Orr SS, Milo R, Mangan S, Alon U: Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 2002, 31 (1): 64-68. 10.1038/ng881
Mangan S, Alon U: Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci USA. 2003, 100 (21): 11980-11985. 10.1073/pnas.2133841100
Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al.: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298 (5594): 799-804. 10.1126/science.1075090
Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP: Network component analysis: reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci USA. 2003, 100 (26): 15522-15527. 10.1073/pnas.2136632100
Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The Database of Interacting Proteins: 2004 update. Nucleic Acids Res. 2004, 32 (Database issue): D449-451.
Bonnet E, Van de Peer Y: zt: a software tool for simple and partial Mantel tests. J Stat Softw. 2002, 7 (10).
Platzer A, Perco P, Lukas A, Mayer B: Characterization of protein-interaction networks in tumors. BMC Bioinformatics. 2007, 8: 224. 10.1186/1471-2105-8-224
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 30871394, 30370798 and 30571034), the National High Tech Development Project of China, the 863 Program (Grant No. 2007AA02Z329), the National Basic Research Program of China, the 973 Program (Grant No. 2008CB517302) and the National Science Foundation of Heilongjiang Province (Grant Nos. ZJG0501, 1055HG009, GB03C602-4, JC200711 and BMFH060044).
Authors' contributions
CW, FZ and XL collected the data, carried out the detection and analysis of CT-PPI modules and wrote the paper. SZ and JL proposed and assisted with the analysis of C-pairs. FS, KL and YY participated in the design and coordination of the study and helped draft the paper.
Chao Wu and Fan Zhang contributed equally to this work.