Crosstalk between transcription factors and microRNAs in human protein interaction network
© Lin et al; licensee BioMed Central Ltd. 2012
Received: 16 November 2011
Accepted: 13 March 2012
Published: 13 March 2012
Gene regulatory networks control global gene expression and the dynamics of protein output in living cells. In multicellular organisms, transcription factors and microRNAs are the major families of gene regulators. Recent studies have suggested that these two kinds of regulators share similar regulatory logic and participate in cooperative activities in the gene regulatory network; however, their combinatorial regulatory effects and preferences on the protein interaction network remain unclear.
In this study, we constructed a global human gene regulatory network comprising both transcriptional and post-transcriptional regulatory relationships, and integrated the protein interactome into this network. We then screened the integrated network for four types of regulatory motifs: single-regulation, co-regulation, crosstalk, and independent, and investigated their topological properties in the protein interaction network.
Among the four types of network motifs, the crosstalk motif was found to have the greatest enrichment of protein-protein interactions among its downstream regulatory targets. The topological properties of these motifs also revealed that they target crucial proteins in the protein interaction network and may play important roles in biological functions.
Altogether, these results reveal the combinatorial regulatory patterns of transcription factors and microRNAs on the protein interactome, and provide further evidence to suggest the connection between gene regulatory network and protein interaction network.
A gene regulatory network (GRN) is a comprehensive collection of regulatory relationships that controls the global gene expression and the dynamics of protein output in a living cell [1–6]. These regulatory relationships may be derived from different layers in the gene regulatory system. Hence, a GRN can be roughly separated into two major levels: the transcriptional and the post-transcriptional levels.
At the transcriptional level, a class of DNA-binding proteins, known as transcription factors (TFs), plays a major role in regulating gene expression. By binding to specific regions of DNA sequences, TFs can control the transcription activities of target genes, thus regulating the production of mRNA transcripts [7–9]. Since TFs have been widely regarded as the primary regulators of gene expression, previous research on GRNs has mainly focused on the regulatory relationships at the transcriptional level [5, 10, 11]. However, there is increasing evidence suggesting that, at the post-transcriptional level, microRNAs (miRNAs) may also contribute to the modulation of gene expression on a large scale [1–3]. miRNAs are small, non-coding, single-stranded RNAs of ~22 nucleotides in length that are abundant in eukaryotic cells [1–3]. By binding to complementary sequences (miRNA binding sites) on target messenger RNA transcripts (mRNAs), miRNAs can trigger translational repression or gene silencing, thus regulating the expression of their target genes at the post-transcriptional level [12, 13]. In recent years, miRNAs have been reported to control many biological processes, such as development, differentiation, growth, and even cancer development and progression [1–3]. Therefore, it has become critical to construct an integrated GRN that comprises both transcriptional and post-transcriptional regulatory interactions.
Similar to other biological networks, a GRN usually consists of several types of sub-network patterns known as network motifs, such as feedback and feedforward loops. Previous studies [5, 10, 11] have shown that certain types of network motifs are overrepresented in GRNs. These network motifs, such as feedback loops and co-regulation, have been found to play pivotal roles in gene regulation [15–17]. For example, in E. coli, ~35% of TFs participate in negative autoregulation motifs, which can significantly speed up the transcriptional response time and smooth the fluctuations of protein expression. In addition to TFs, miRNAs may also form specific network motifs in the GRN. Previous studies [17–20] investigating the co-regulation between miRNAs and TFs found a variety of significant network motifs overrepresented in the co-regulation network, suggesting that the gene regulatory system requires close cooperation between the transcriptional and post-transcriptional layers. Each of these studies proposed that the network motifs might be used as building blocks in GRNs. To understand how these motifs in the GRN influence downstream biological processes, further studies on the protein interactome are essential.
Proteins are the major functional units in living cells, and usually do not work alone. Protein-protein interactions (PPIs), formed by two physically interacting proteins, are fundamental to most biological processes. In addition, proteins are translated from mRNAs, and therefore their abundance may be affected by upstream miRNAs and TFs. Consequently, investigating the correlations between PPIs and their upstream regulators could facilitate the understanding of biological mechanisms within living cells. Recently, the correlations between miRNAs and PPIs have been investigated [21, 22]. Liang and Li revealed that proteins regulated by more miRNAs tend to have a higher degree (i.e., more interacting partners) in a protein interaction network (PIN). Furthermore, Hsu et al. provided a comprehensive analysis and suggested that miRNAs could influence specific biological processes by regulating a small number of selected proteins in a PIN, such as hub and bottleneck proteins. These studies have revealed some connection principles between upstream regulators and downstream PINs. However, the specifics of the cooperation between TFs and miRNAs and their combinatorial regulatory effects on human PINs remain unclear.
To construct the human GRN for our analysis, we collected TF and miRNA regulatory relationships from three online databases: TRED (Transcriptional Regulatory Element Database), the UCSC genome browser at http://genome.ucsc.edu/, and TargetScanHuman (release 6.0, November 2011) [24–27]. TRED contains transcriptional regulation information from experimental evidence and computational prediction. We collected 6,764 transcriptional regulation relationships between 133 TFs and 2,937 target genes from TRED. Additionally, we obtained the conserved binding sites of 125 TFs from the UCSC genome browser. To identify the targets of these 125 TFs, the annotations of 21,368 human genes were downloaded from the UCSC genome browser. We assigned a target gene to a TF if its promoter region (1000 bp upstream and 500 bp downstream of the transcription start site) covered at least one conserved binding site of the TF. After this process, we identified 52,301 regulations between 125 TFs and 12,383 targets. The union of these two transcriptional regulatory networks from TRED and UCSC was then taken as the transcriptional-level GRN in this study, containing 58,711 regulations between 211 TFs and 13,402 targets. Among miRNA target prediction programs, a previous study noted that TargetScan possessed relatively higher precision and sensitivity than other programs. We collected 144,490 post-transcriptional regulatory relationships between 153 miRNA families and 11,161 target genes with conserved binding sites of the corresponding miRNAs. Next, the regulatory relationships collected from the three databases were merged to construct our global GRN, in which nodes represent regulators (TFs/miRNAs) or target genes/proteins, and edges represent the regulatory relationships between regulators and targets.
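The promoter-based target assignment and the union of the two transcriptional networks described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy coordinates, the strand-aware window, and the reading of "covered" as "the binding site lies entirely inside the promoter window" are all assumptions made for the example.

```python
from collections import defaultdict

def promoter_window(tss, strand, upstream=1000, downstream=500):
    """Return (start, end) of the promoter region around a TSS.

    Assumption: the window is strand-aware, so "upstream" means lower
    coordinates on the + strand and higher coordinates on the - strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    return tss - downstream, tss + upstream

def assign_targets(genes, binding_sites):
    """Map each TF to the genes whose promoter covers one of its sites.

    genes:         {gene_name: (chrom, tss, strand)}   (hypothetical layout)
    binding_sites: [(tf_name, chrom, start, end), ...] (hypothetical layout)
    """
    targets = defaultdict(set)
    for tf, chrom, start, end in binding_sites:
        for gene, (g_chrom, tss, strand) in genes.items():
            if g_chrom != chrom:
                continue
            lo, hi = promoter_window(tss, strand)
            # Assumption: "covered" means the site is fully inside the window.
            if lo <= start and end <= hi:
                targets[tf].add(gene)
    return targets

# Toy data, invented for illustration only.
genes = {"GENE_A": ("chr1", 5000, "+"), "GENE_B": ("chr1", 9000, "-")}
sites = [
    ("TF1", "chr1", 4200, 4210),
    ("TF1", "chr1", 9800, 9812),
    ("TF2", "chr1", 7000, 7010),
]
ucsc = assign_targets(genes, sites)

# A curated network (standing in for TRED) merged by set union per TF.
tred = {"TF1": {"GENE_C"}}
merged = {
    tf: tred.get(tf, set()) | ucsc.get(tf, set())
    for tf in set(tred) | set(ucsc)
}
```

The nested loop is quadratic and only suitable for toy inputs; a genome-scale run would more plausibly use an interval index per chromosome.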
Human PPI data were obtained from HPRD (Human Protein Reference Database), which contains experimentally validated physical interactions among human proteins. In this study, we collected 37,080 interactions between 9,465 proteins.
Considering the incompleteness of current human PPI data, we performed an analogous analysis with an expanded PIN, a union of BioGRID and HPRD PPI data, to verify our results. Additionally, since limited reproducibility of miRNA target prediction has also been reported [32–35], we independently repeated our study with another miRNA target prediction database, miRBase, to confirm the robustness of our conclusions.
We screened the GRN for four types of regulatory motifs: single-regulation, co-regulation, crosstalk, and independent, taking into account possible synergistic regulation between regulators. These regulatory motifs are depicted in Figure 1. Here, two regulators are considered synergistic if they share at least two common targets. A single-regulation motif consists of one regulator and its targets. The other three motifs consist of two regulators. The co-regulation motif is formed by two synergistic regulators and their shared targets. The crosstalk motif is formed by two synergistic regulators and their private (non-shared) target sets. The independent motif contains two non-synergistic regulators and their respective target sets.
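The two-regulator motif definitions above can be expressed with plain set operations. The sketch below is illustrative only: `classify_pair`, the dictionary layout, and the toy regulator and gene names are hypothetical, and for non-synergistic pairs it returns the private (non-shared) target sets, which is an assumption about how the independent motif is handled.

```python
def classify_pair(targets_a, targets_b, min_shared=2):
    """Classify a pair of regulators by their target sets.

    Two regulators are treated as synergistic when they share at least
    `min_shared` targets (two in the text). A synergistic pair yields
    both a co-regulation motif (shared targets) and a crosstalk motif
    (the two private target sets); a non-synergistic pair yields an
    independent motif.
    """
    shared = targets_a & targets_b
    private_a = targets_a - shared
    private_b = targets_b - shared
    if len(shared) >= min_shared:
        return {
            "synergistic": True,
            "co_regulation": shared,
            "crosstalk": (private_a, private_b),
        }
    return {"synergistic": False, "independent": (private_a, private_b)}

# Toy GRN, invented for illustration: regulator -> set of target genes.
grn = {
    "TF1": {"a", "b", "c", "d"},
    "miR-X": {"c", "d", "e"},
    "TF2": {"f", "g"},
}

syn = classify_pair(grn["TF1"], grn["miR-X"])   # shares c and d
ind = classify_pair(grn["TF1"], grn["TF2"])     # shares nothing
```

A single-regulation motif needs no classification step: it is simply one entry of `grn`, a regulator together with its full target set.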
Next, the PPI enrichment for each type of regulatory motif was analyzed. Specifically, for single-regulation motifs, PPIs between every pair of target genes were analyzed; for co-regulation motifs, only PPIs between common target genes were analyzed; and for crosstalk and independent motifs, only PPIs between the two private target gene sets were analyzed. Additionally, the PPI enrichment analysis was performed from two directions: top-down and bottom-up. In the bottom-up model, genes were first classified into four categories analogous to the four types of regulatory motifs, and each category was assigned one significance score. In the top-down model, significance scores were assigned to each regulator (for single-regulation motifs) or to every pair of regulators (for the other three types of motifs). In this study, a significance score was defined as the z-score (standard score) derived from statistical analysis (Methods in Additional file 1). Furthermore, we analyzed the significance of several selected network properties (Methods in Additional file 1) for each type of regulatory motif using approaches similar to those adopted in the PPI enrichment analysis. The procedures of regulatory motif screening and analysis are depicted in Figure 1. In addition, a functional enrichment analysis of crosstalk motifs was performed to investigate the underlying biological roles of crosstalk motifs in human PINs (Methods in Additional file 1).
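The counting rules for each motif type (all target pairs for single-regulation, shared-target pairs for co-regulation, and pairs spanning the two private sets for crosstalk and independent motifs) might be sketched as below. The helper names and the toy interaction set are hypothetical, and the scoring step that turns these counts into z-scores is left out because it is specified in Additional file 1.

```python
from itertools import combinations, product

def ppi_within(genes, ppi):
    """Count PPIs among all unordered pairs of genes in one set.

    Used here for single-regulation (all targets of one regulator)
    and co-regulation (shared targets of two regulators).
    """
    return sum(
        1 for a, b in combinations(sorted(genes), 2)
        if frozenset((a, b)) in ppi
    )

def ppi_between(set_a, set_b, ppi):
    """Count PPIs linking one gene from each of two disjoint sets.

    Used here for crosstalk and independent motifs, where only
    interactions between the two private target sets are considered.
    """
    return sum(
        1 for a, b in product(set_a, set_b)
        if a != b and frozenset((a, b)) in ppi
    )

# Toy undirected PPI set, invented for illustration.
ppi = {frozenset(p) for p in [("a", "b"), ("c", "d"), ("a", "e")]}

single = ppi_within({"a", "b", "c", "d"}, ppi)   # targets of one regulator
co_reg = ppi_within({"c", "d"}, ppi)             # shared targets
cross = ppi_between({"a", "b"}, {"e"}, ppi)      # private target sets
```

Storing each interaction as a `frozenset` makes the lookup independent of the order in which the two proteins are listed.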
To investigate the synergistic relationships between regulators, we further analyzed the distributions of the number of synergistic partners of miRNAs and/or TFs (Figure 2C-F). Herein, we defined two regulators as having a synergistic relationship if they shared at least two common targets. Most TFs and miRNAs have at least one synergistic TF and/or miRNA partner. In other words, they tended to form synergistic regulations with other regulators. Although we noticed that a small fraction of TFs did not form synergistic regulations with other TFs or miRNAs (Figure 2D and 2F), this could be due to the lack of sufficient TF-regulation information.
From the global GRN, we screened four types of regulatory motifs: single-regulation, co-regulation, crosstalk, and independent. Next, the PPI enrichment of regulatory motifs was investigated from two directions: top-down and bottom-up. Based on the combinations of regulators, the regulatory motifs of TF-TF, miRNA-miRNA, and TF-miRNA were analyzed separately.
PPI z-score and the coverage of gene sets involved in regulatory motifs
According to the results of the top-down and the bottom-up analyses, we came to three conclusions: 1) single-regulation motifs tend to regulate genes with PPIs; 2) regulatory motifs with synergistic relationships (i.e., co-regulation and crosstalk) favor gene regulation with PPIs, especially crosstalk motifs; and 3) gene pairs regulated by independent regulators (i.e., without synergistic relationships), in contrast, show no preference to form PPIs.
Z-scores of each network property for regulatory motifs
With respect to individual genes, most regulatory motifs tend to regulate genes with higher degree and closeness centrality (z-score > 1). Degree represents the connectivity of proteins in a PIN, and closeness centrality represents how close proteins are to the center of a PIN. These results suggest that the regulatory motifs tend to regulate hub and central proteins. With respect to gene sets, most regulatory motifs tend to regulate sets with higher density (z-score > 1), larger clique levels (z-score > 1), and significantly shorter path lengths (z-score < -2). Density provides a quantitative measure of how tested gene sets group together to form a community in a PIN, clique level represents the size of the largest clique that a tested gene can join, and path length describes how close tested proteins are to each other in a PIN. These three network properties are commonly used to evaluate the modularity of tested proteins. Hence, the results presented here imply that the regulatory motifs tend to control proteins that form biological communities. Notably, the crosstalk motifs showed more significant z-scores than the other types of regulatory motifs, suggesting that they play more important roles in PINs.
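For readers unfamiliar with these measures, the sketch below computes degree, closeness centrality, density, and mean shortest path length on a small undirected graph. It uses common textbook definitions, which are assumed here and may differ in detail from those in Additional file 1; clique level is omitted, and the graph and function names are invented for the example.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def degree(adj, node):
    """Number of interaction partners of a protein."""
    return len(adj[node])

def closeness(adj, node):
    """Reachable nodes divided by the sum of distances to them."""
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def density(adj, genes):
    """Fraction of possible pairs within a gene set that interact."""
    genes = sorted(set(genes))
    n = len(genes)
    if n < 2:
        return 0.0
    edges = sum(1 for a, b in combinations(genes, 2) if b in adj[a])
    return 2 * edges / (n * (n - 1))

def mean_path_length(adj, genes):
    """Average shortest-path length over connected pairs in a gene set."""
    lengths = []
    for a, b in combinations(sorted(set(genes)), 2):
        d = bfs_distances(adj, a).get(b)
        if d is not None:
            lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else float("inf")

# Toy PIN as an adjacency map, invented for illustration.
adj = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}
```

In this toy graph, `B` has the highest degree and closeness, and the set `{"B", "C", "D"}` forms a fully connected module with density 1.0.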
PPI and network property z-score of crosstalk motifs with zero and non-zero functional similarity
In this study, we incorporated miRNAs into a traditional GRN to investigate the correlations between PPIs and regulatory motifs formed by miRNAs, TFs, and target proteins/genes. The regulatory motifs were classified into four types: single-regulation, co-regulation, crosstalk, and independent. Random sampling methods are usually applied to evaluate the significance of PPI numbers among a group of proteins, but they are very time-consuming. In addition, random sampling is not well suited to analyzing complicated regulatory networks, because the whole process must be redesigned for different motif members. To improve the efficiency of the evaluation process without loss of general applicability, we calculated the significance of PPI enrichment for different motifs based on the Bernoulli distribution; in other words, we regarded PPI gain and loss as a Bernoulli process. This allowed the whole evaluation process to run in constant time (O(1)).
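One way to read the Bernoulli-process idea is as a closed-form z-score under a binomial model. The sketch below is an assumption-laden illustration rather than the authors' formula, which is given in Additional file 1: it treats every candidate gene pair as an independent trial that is a PPI with a fixed background probability `p`, and it estimates `p` as the overall edge density of the PIN. Both function names are hypothetical.

```python
import math

def background_probability(n_proteins, n_interactions):
    """Assumed background PPI probability: edges over all possible pairs."""
    possible_pairs = n_proteins * (n_proteins - 1) / 2
    return n_interactions / possible_pairs

def ppi_z_score(observed, n_pairs, p):
    """Z-score of an observed PPI count under a binomial model.

    observed: number of PPIs found among the candidate pairs
    n_pairs:  number of candidate gene pairs in the motif
    p:        assumed per-pair probability of an interaction
    Returns 0.0 when the variance is undefined or zero.
    """
    if n_pairs <= 0 or p <= 0.0 or p >= 1.0:
        return 0.0
    expected = n_pairs * p
    sd = math.sqrt(n_pairs * p * (1.0 - p))
    return (observed - expected) / sd

# Background estimated from the HPRD counts quoted in the text:
# 37,080 interactions among 9,465 proteins.
p_hprd = background_probability(9465, 37080)

# Toy motif, invented for illustration: 5 PPIs among 1,000 candidate pairs.
z = ppi_z_score(5, 1000, p_hprd)
```

Given the observed count and the number of candidate pairs, the score is a single arithmetic expression, which is what makes the evaluation constant-time, in contrast to repeated random sampling.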
Among the four types of motifs, the strong correlation between single-regulation and PINs has been well discussed [21, 22], and a correlation with the co-regulation type has also been reported [37, 38]. The single-regulation motifs analyzed here yielded conclusions consistent with previous studies. Our investigation into co-regulation motifs provided a complementary analysis and insights that have not been addressed in previous studies. More importantly, we proposed that the third type of motif, the crosstalk motif, could be another prominent pattern in GRNs. Crosstalk motifs were defined as the private target gene sets of two corresponding regulators, TFs and/or miRNAs, which shared at least two targets. In human PINs, crosstalk motifs were significantly enriched in PPI contents and network properties. To summarize the analysis of network properties, crosstalk motifs displayed several features: 1) high degree, 2) high closeness, 3) high density, 4) high clique level, and 5) short characteristic path length. In PINs, proteins with a high degree are usually called "hub proteins", those with high closeness centrality are usually called "central proteins", and those with high density, short characteristic path length, and high clique level are usually called "modular proteins". Therefore, the regulators that participate in crosstalk motifs tend to regulate hub proteins, which are usually more essential than non-hub proteins [39–41], and modular proteins, which usually form important protein complexes or modules in human PINs [42–44]. Additionally, we investigated the enriched functions of the crosstalk motifs. For all three types of regulator pairs, the majority of enriched crosstalk functions are associated with positive/negative regulation of cellular metabolic processes. Notably, miRNA-miRNA crosstalk motifs are associated not only with regulation-related functions, but also with response to insulin stimulus.
This is consistent with previous findings that miRNAs preferentially regulate downstream components, such as TFs, in signaling networks. Moreover, we demonstrated the functional features within the crosstalk motifs with the highest PPI z-score and proposed a potential cancer-related motif, TP53-miR-200bc/429/548a. Consequently, this crosstalk motif might play an important role in living cells by regulating essential or pivotal proteins in human PINs.
Since our analysis relies on limited data sources from online databases to construct human PINs and GRNs, we carried out further examinations to test the robustness of our conclusions. With respect to miRNA regulation, all current online databases that provide predicted human miRNA targets still have room for improvement in both approach and performance [32–35]. Accordingly, we repeated our analysis with another database, miRBase, and reached a consistent conclusion (Figure S8-S13, Table S1 and S2 in Additional file 1). Considering the incomplete and noisy human PPI data, we performed the same analysis with combined PPI data from the HPRD and BioGRID databases and also obtained consistent conclusions (Figure S14-S22, Table S3-S5 in Additional file 1). These re-analyses therefore provide further evidence to support the robustness of our conclusions. With ongoing efforts to improve the completeness of PPI data and GRNs, we will be able to further investigate and confirm the correlations between PPIs and regulatory motifs in the future.
In summary, we proposed a computational approach to investigate the significance of regulatory motifs formed by TFs/miRNAs and their corresponding targets in human PINs. With this approach, we screened four types of regulatory motifs, single-regulation, co-regulation, crosstalk, and independent, from human GRNs and investigated their correlations with PPIs. Among the four types of motifs, the crosstalk motif emerged as a potentially significant motif with important roles in PINs, which has not been previously reported. We suggested that this motif might play an important role in living cells because of its strong correlations with PPIs and significant network properties in human PINs.
This work was supported by grants from the National Science Council of Taiwan (99-2621-B-010-001-MY3 and 99-2621-B-002-005-MY3), National Taiwan University Frontier and Innovative Research Projects (99R70437), and National Health Research Institutes (NHRI-EX101-9819PI).