Gene regulation is governed by a core network in hepatocellular carcinoma
© Gu et al.; licensee BioMed Central Ltd. 2012
Received: 14 February 2012
Accepted: 1 May 2012
Published: 1 May 2012
Skip to main content
© Gu et al.; licensee BioMed Central Ltd. 2012
Received: 14 February 2012
Accepted: 1 May 2012
Published: 1 May 2012
Hepatocellular carcinoma (HCC) is one of the most lethal cancers worldwide, and the mechanisms that lead to the disease are still relatively unclear. However, with the development of high-throughput technologies it is possible to gain a systematic view of biological systems to enhance the understanding of the roles of genes associated with HCC. Thus, analysis of the mechanism of molecule interactions in the context of gene regulatory networks can reveal specific sub-networks that lead to the development of HCC.
In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our method for constructing gene regulatory network is based on predicted target interactions, experimentally-supported interactions, and co-expression model. Regulators in the network included both transcription factors and microRNAs to provide a complete view of gene regulation. Analysis of gene regulatory network revealed that gene regulation in HCC is highly modular, in which different sets of regulators take charge of specific biological processes. We found that microRNAs mainly control biological functions related to mitochondria and oxidative reduction, while transcription factors control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core network that organizes regulations between different modules and maintains the robustness of the whole network. There is direct experimental evidence for most of the regulators in the core gene regulatory network relating to HCC. We infer it is the central controller of gene regulation. Finally, we explored the influence of the core gene regulatory network on biological pathways.
Our analysis provides insights into the mechanism of transcriptional and post-transcriptional control in HCC. In particular, we highlight the importance of the core gene regulatory network; we propose that it is highly related to HCC and we believe further experimental validation is worthwhile.
Hepatocellular carcinoma (HCC) is the major histological subtype of liver cancer, and is among the most lethal cancers worldwide. The high cancer rates are especially found in the East, South-East Asia and sub-Saharan Africa . Infection with hepatitis B (HBV) or C (HCV) viruses was found to be the main cause of the development of HCC in developing countries [1, 2]. However, the current knowledge regarding the mechanisms of molecule interactions that lead to the disease pathogenesis is still quite limited .
With the development of high-throughput technologies such as microarray and next-generation sequencing, it is possible to create a systematic view of biological systems to improve our understanding of the roles of genes associated with diseases . Since the abnormal state of proteins involved in diseases results from the altered expression of genes, analysis of the mechanisms of molecule interactions in the context of gene regulatory networks (GRNs) can reveal the specific sub-networks that lead to the dysfunction of regular biological systems .
GRNs are modelled as directed networks where interactions are directed from regulators to targets. Gene regulation is controlled by both transcription factors (TFs) and microRNAs (miRNAs). Transcription factors are proteins that bind to the promoter regions of target genes, and function by activating or inhibiting the expression of targets. For example, P53 , c-Myc  and E2F-1  are frequently reported to be dysfunctional TFs in HCC. Moreover, miRNAs, a type of short non-coding RNAs, are involved in the post-transcriptional regulation of genes, either by degrading target mRNAs or by inhibiting the translation procedure [8, 9]. It is known that miRNAs play a critical role in human cancer generation by various mechanisms [10, 11]. Two representative miRNAs, miR-122 and miR-21, are highly expressed in liver tissue, where miR-122 is down-regulated and miR-21 is up-regulated in HCC . One of the experimentally validated targets of miR-122 is Cyclin G1, thus, repression of miR-122 expression would enhance the cell cycle process and promote cell proliferation . In turn, oncogenic miR-21 blocks the expression of apoptosis-related genes . MiRNAs are transcribed from the genome contained in the nucleus, and hence expression of miRNAs is also regulated by TFs. As a result of mutual regulation by both miRNAs and TFs, gene regulation is assembled within the structure of a network.
Several studies have focused on the construction of GRNs. The first category of methods is the utilization of interactions from target predictions . In this category, the relationships between TFs/miRNAs and their targets are predicted through sequence alignment or thermodynamics models. However, a major drawback of target prediction methods is the high false-positive rate, and as a result, GRNs constructed in this way contain a lot of noise. Therefore, analysis of GRNs can only provide the global attributes of the system, while the predictions for local regulations may not be reliable. The second category of methods is the integration of both target predictions and gene expression data. It can be regarded as the intersection of GRNs constructed by target predictions and GRNs constructed by co-expression models. Target prediction results only provide information regarding potential physical interactions between regulators and targets. The accuracy of the regulations can be validated by the correlations between regulators and targets on the expression level. However, utilizing expression data alone cannot fully capture the real regulations because correlations cannot elucidate whether the interactions are directly with the regulators or indirect. The most used methods are Pearson correlation  or multivariate linear regression model [17, 18]. Additionally, with the rapid increase in the last few years in the validation of regulatory relationships using experimental approaches such as microarray, deep sequencing and ChIP-seq [19–21], these high quality data will no doubt contribute to the construction of networks.
In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our model for GRN construction is based on predicted target interactions, experimentally-supported interactions and co-expression modeling. The network includes both TFs and miRNAs. We found the regulation strengths were quite different from TFs to miRNAs, thus cutoffs of correlation were set for TFs and miRNAs independently. A topological criterion was applied to select proper cutoffs for correlations in order to ensure that the final GRN made biological sense.
Analysis of the GRN revealed that gene regulation in HCC is highly modular, where different sets of regulators take charge of specific biological processes. We found that miRNAs mainly control biological functions related to mitochondria and oxidative reduction, while TFs control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core GRN that regulates different modules. The core GRN was critically important in maintaining the stability and robustness of the network. We postulate that it is the central controller of gene regulation in this context. In the core GRN, most of the regulators have been previously reported to relate to HCC, thus validating our findings. Finally, we explored the influence of the core GRN on biological pathways.
We focused on the dysfunction of gene regulation in HCC in which HBV is endemic. Microarray data was downloaded from the GEO database [GEO: GSE22058] , and genome-wide expression profiles of both miRNAs and mRNAs were examined.
First, a candidate network was established by combining predicted target interactions and experimentally-supported interactions involving both TFs and miRNAs. Since this kind of network contains a lot of noise and does not relate to specific tissue, we re-filtered interactions using a co-expression model based on microarray data.
Co-expression models are frequently used to establish relationships between genes expressed in specific tissues . In these models, if two genes share similar expression profiles, as measured by significant Pearson correlation coefficients, the two genes are connected in the network. In this step, we only calculated correlation coefficients between regulators and targets in the candidate network. If a TF/miRNA has similar or reversed expression patterns to some genes, then there is high probability that the TF/miRNA regulates these genes. Since the co-expression model cannot tell whether the regulation is direct or indirect, and interactions from the candidate network can only provide potential physical interactions, it is necessary for the integration of both data sources to provide stronger evidence for the gene regulations. Thus, the final GRN can be regarded as the intersection of the candidate network and the GRN constructed by the co-expression model. We first eliminated outliers in the expression profile data, and then calculated the correlation coefficients between regulators and targets using the Pearson method. The final regulations between regulators and targets must satisfy the following three conditions: 1) there exists a predicted target interaction or experimentally-supported interaction; 2) the correlation coefficient between miRNA and its targets should be negative; 3) the absolute value of the correlation coefficient is larger than the cutoff.
The selection of cutoffs was processed through a topological criterion , which means the final GRN must be approximately scale-free. The scale-free network is common in biological networks where a very small amount of nodes connect to many neighbor nodes, while the remaining majority of nodes have extremely small connections . The nodes with high numbers of connections are called hub nodes, and they are important within the network. It is known that some important TFs and miRNAs regulate many targets that result in cancer generation, and they are the hub regulators in GRNs.
Summary of GRN modules
Target gene functions
RUNX3, RUNX2, POU2AF1, POU2F2, FLI1, BHLHB3, PRDM1
HAND2, TCF4, FOXF1, FOXF2, ARID5B, FOXL1
miR-150, miR-142, miR-155, miR-181a, miR-342, miR-27a, miR-146a, miR-199a, miR-214, HNF4A
AR, miR-127, miR-377, miR-323, miR-299, miR-221, miR-433, miR-376a, miR-136, miR-18a, miR-296, miR-154, miR-431, miR-382, miR-369, miR-200b
Cofactor metabolic process,
Steroid metabolic process
NR1I3, NR1I2, ESR1
Fatty acid metabolic process
3, 4, 5
NR1I3,miR-150, miR-142, miR-155, miR-181a, AR, NR1I2, miR-342, miR-27a, miR-146a, HNF4A, miR-199a, miR-218, miR-214, miR-127, miR-132, ESR1, miR-377, SOX4, miR-323, miR-299, miR-221, miR-23a
For HBV-induced HCC, the immune response of the host to clear abnormal pathogens is inhibited. Target genes in Module 1 are highly related to the immune response, and most of genes are down-regulated. There are seven main regulators in Module 1, in which RUNX3, POU2AF1, POU2F2, FLI1 and PRDM1 were significantly down-regulated (p-value < 1e-6). RUNX3 has been suggested to be a tumor suppressor, and its gene is frequently transcriptional silenced in cancer . POU2F2, with its factor POU2AF1, acts as a cell survival factor in immune cells, and plays a central role in lymphoid-specific transcription of immunoglobulin genes . FLI1 can affect apoptosis in tumor cells , and PRDM1 is a candidate tumor suppressor gene related to immune systems .
Cell adhesion is generally suppressed in cancers. Reduced cell adhesion allows cancer cells to disrupt the histological structure, resulting in the morphological features of malignant tumors . Target genes in Module 2 related to extracellular activities are down-regulated. There are six main regulators in Module 2, in which HAND2, TCF4, FOXF1 and ARID5B (p-value < 1e-3) were significantly down-regulated, and FOXF2 was significantly up-regulated (p-value < 1e-4). HAND2 is reported to regulate extracellular matrix remodeling . TCF4 is a key factor in the Wnt pathway and is involved in HCC cell proliferation . FOXF1 deficiency was reported to decrease cell adhesion , and FOXF2 is important for extracellular matrix production .
The mitochondrion is a key organelle in cell metabolism. It is not only a power factory, but also regulates cell death pathways. In cancer cells, as a result of rapid proliferation, oxidative phosphorylation is suppressed in order that mitochondria consume less oxygen . In our results, targets in Modules 3–5 are mostly related to the functions of mitochondria, such as oxidative reduction and metabolism. Among the regulators, miR-150, miR-146a, miR-199a, miR-214, together with NR1I3, AR, NR1I2 and ESR1 were significantly down-regulated (p-value < 1e-6) and miR-221 was significantly up-regulated (p-value < 1e-12). MiR-150 has been reported to inhibit liver cancer by negative regulation of c-Myb . A polymorphism in miR-146a is associated with risk of HCC , while miR-199a induces apoptosis and inhibits the ERK pathway . MiR-214 induces cell survival by targeting the PTEN/Akt pathway to suppress apoptosis , and mir-221 overexpression contributes to liver tumorigenesis . Androgen is related to HCC, and thus its receptor, AR, also plays an important role . NR1I2 and NR1I3 are related to lipid metabolism and HCC generation . Finally, ESR1 is associated with susceptibility to HCC in HBV carriers .
It is a common characteristic that cell proliferation is activated in cancer tissues, thus it would be expected that genes related to the cell cycle are all up-regulated (Module 6). Two regulators, E2F1 and E2F7, were significantly up-regulated (p-value < 1e-16), and are well-known TFs in E2F family that controls the cell cycle .
To summarize, gene regulation is modular in that each set of regulators regulate specific biological processes. Additionally, the two types of regulators have a clear division of control. We showed that miRNAs control biological functions related to mitochondria and oxidative reduction, while TFs control the immune response, extracellular activities and the cell cycle.
Enriched miRNA-associated functions in the GRN
Human embryonic stem cell (hESC) regulation
MiRNA tumor suppressors
Although each network module can provide specific control of biological functions, to maintain the integrity of the biology system, dependency exists among modules. Beyond the modularity of gene regulation, there should be a central mechanism to regulate the expression pattern of each module at a higher level. Thus, we introduced the concept of the core GRN that contains the most important regulations among regulators ankd behaviors as a control center for the global GRN.
The number of edges in the core GRN only covers 1.0 % of all edges in the global GRN, and deletion of these edges does not affect the global GRN's connectivity. Thus, it can be inferred that the local attributes of the network will not be affected by the core GRN. However, the sum of the edge-betweenness takes up 65.8 % of the sum of edge-betweenness in the global GRN. This means that most of the information is controlled by the core GRN, and would affect most of the nodes in the global GRN. When deleting these important edges, the global attributes of the network would be altered and the system would be susceptible to failure.
The core GRN’s role has two aspects. First, it adjusts the regulatory network on the top level. It divides the whole network into two layers with a clear boundary. In the bottom layer, proteins are synthesized under the regulation of TFs and miRNAs, to play roles within or outside of cells. While in the top layer, the core GRN controls what type of proteins would be expressed at what time and at what cellular location. As a result, the entire GRN is organized as a controllable and distributed system. Second, the core GRN can improve the redundancy of the regulatory network. Regulators and regulatory relationships in the core GRN can control more than one module, and regulations of the non-regulator proteins are influenced by the core GRN through a variety of paths. Therefore, when a regulation path does not work for some proteins, the system will assign other paths to process regulations in order to avoid the overall collapse caused by a small portion of damage. In addition, a large number of feedforward and feedback loops exist in the core GRN, which contribute to the flexibility, resiliency and stability of the core GRN, and further to the stability of the whole regulatory network.
In the core GRN, most of the regulators are related to cancers. PBX1 , TWIST1 , HNF4A , ERG , FOXA2 , NR2F2 , FLI1 , GLI2 , RARB , RUNX3 , BHLHB3 , RUNX2 , TCF4  and FOXF1  are reported TFs related to cancers. After querying the human microRNA disease database (HMDD) , we found miR-21 , miR-199a , miR-155 , miR-142 , miR-181a , miR-146a , and miR-150  are reported miRNAs related to cancers. Especially, there is direct evidence for the involvement of TWIST1, HNF4A, GLI2, RARB, RUNX3, TCF4, FOXF1, miR-21, miR-199a, miR-155, miR-142, miR-146a, miR-181a, and miR-150 in HCC generation.
Enriched KEGG pathways of genes in the GRN
hsa00071:Fatty acid metabolism
hsa04660:T cell receptor signaling pathway
hsa04514:Cell adhesion molecules (CAMs)
hsa04640:Hematopoietic cell lineage
hsa04610:Complement and coagulation cascades
hsa03320:PPAR signaling pathway
hsa00280:Valine, leucine and isoleucine degradation
hsa04666:Fc gamma R-mediated phagocytosis
In this study, we established the gene regulatory network that is involved in HCC generation. We revealed that the GRN is modular, where different sets of regulators take charge of specific biological functions. Among them, miRNAs mainly regulate mitochondria and oxidative reduction, while TFs regulate the immune responses, extracellular activity and the cell cycle. At a higher level, a core GRN exists to organize regulations among different modules and to maintain the robustness of the whole network.
Our strategy for constructing the GRN has two advantages. First, the interactions were integrated from three data sources, which are target prediction, experimentally-supported interactions and microarray data. Target prediction provides potential evidence for direct physical interactions, while microarray data can measure the correlations between regulators and targets. Moreover, experimentally-supported interactions can help to improve the quality of the network. Three data sources assign relations to regulators and targets from different aspects, and the integration of the three data sources can make the result more reliable. Second, cutoff of correlations for TFs and miRNAs were set independently of one another. We found that the regulation strength differed from TFs to miRNAs, and the reason for this unequal strength is probably due to the mechanisms of the two kinds of regulators. The selection of cutoffs was processed by a scale-free topological criterion to ensure the final network is biologically sensible.
HCC is a complex disease that involves various molecule interactions. Analysis of the gene regulatory network can help to reveal the mechanisms involved in the development of HCC. In this article we applied a strategy to reveal the most important regulations at the transcriptional level and post-transcriptional level from a systematic view. The core gene regulatory network proposed is highly related to HCC, and we believe it will provide valuable insights for further experimental validations.
Since there are multiple probes for some genes, the ratio values for genes were merged by averaging the ratio of their probes. After merging multiple probes, in total there were 18503 genes and 202 miRNAs. These genes and miRNAs were taken as the candidate list in which we looked for gene regulations.
Promoters of genes and miRNAs were retrieved from UCSC genome browser (hg19) , and the upstream 5 kb sequences were taken. We used TFs identified in the TRANSFAC database  and utilized MATCH  to predict regulations of genes by TFs and miRNAs by TFs. MATCH was executed using the default settings. Regulations of genes by miRNAs were predicted by TargetScan 6.0 . We first obtained miRNA lists containing broadly conserved miRNA families, conserved miRNA families and poorly conserved miRNA families, and then retrieved web pages of conserved targets for each miRNA family using the TargetScan on-line search tool. We analyzed web pages with details of miRNA targets, and noted all the interactions listed. In the target prediction procedure, after mapping to existing gene lists and miRNA lists obtained through microarrays, we identified 1450405 TF regulations of genes, 27077 TF regulations of miRNAs and 277392 miRNAs regulations of genes. We did not use species conservation information in the TF target prediction procedure and did not make any restriction on miRNA targets predictions because some regulatory steps are not conserved and the use of the species conservation would miss these positive interactions.
Experimentally-supported regulations of targets by TFs were downloaded from the ChEA manual database (a collection of TF-to-gene regulations from ChIP-chip and ChIP-seq publications) . Regulations from miRNAs to targets were downloaded from TarBase 6.0 . Regulations from TFs to miRNAs were downloaded from TransmiR 1.1 . After mapping to existing gene lists and miRNA lists obtained through microarrays, there were 85025 interactions from TFs to genes, 15500 interactions from miRNAs to genes and 202 interactions from TFs to miRNAs that have been experimentally validated.
We only considered correlations between regulators and targets that have predicted target interactions. We assumed the expression vector for a certain regulator as y and the expression vector for one of its targets as x. First univariate linear regression was applied to y and x, and then Cook’s distance was calculated to estimate the validity of the data points. We set data points as outliers where Cook’s distance were larger than 0.5. The outliers were then eliminated, and the Pearson correlation coefficient between regulator and target was calculated.
Functional enrichment was applied to evaluate whether a group of genes share common biological functions. We used DAVID  to perform Gene Ontology enrichment and pathway enrichment. For Gene Ontology enrichment of genes in each module, we selected GOTERM_BP_FAT, GOTERM_CC_FAT and GOTERM_MF_FAT as the gene set categories. For pathway enrichment of whole genes in the GRN, KEGG was selected as the gene set category.
Functional enrichment for miRNAs was applied by TAM  to evaluate whether a group of miRNAs regulate common functions or are involved in common diseases. We used miRNAs in the GRN as the candidate miRNA list, and looked for over-represented functions under the category of ‘miRNA function’. We selected all miRNAs in TAM as background, and the analysis was performed under miRNA set version 2. Since we did not take all miRNA categories implemented in TAM, the false discovery rates (FDRs) were re-calculated using BH method .
gene regulatory network
human microRNA disease database
false discovery rate
This work was supported by grants from the National Natural Science Foundation of China (30890044, 31071232,31170751), International Bureau of the Federal Ministry of Education and Research (CHN08/031) and Jiangsu Province Innovation Fund for PhD Candidates (CX10B_014Z).