Protein interaction network topology uncovers melanogenesis regulatory network components within functional genomics datasets
© Ho et al. 2010
Received: 4 November 2009
Accepted: 15 June 2010
Published: 15 June 2010
Skip to main content
© Ho et al. 2010
Received: 4 November 2009
Accepted: 15 June 2010
Published: 15 June 2010
RNA-mediated interference (RNAi)-based functional genomics is a systems-level approach to identify novel genes that control biological phenotypes. Existing computational approaches can identify individual genes from RNAi datasets that regulate a given biological process. However, currently available methods cannot identify which RNAi screen "hits" are novel components of well-characterized biological pathways known to regulate the interrogated phenotype. In this study, we describe a method to identify genes from RNAi datasets that are novel components of known biological pathways. We experimentally validate our approach in the context of a recently completed RNAi screen to identify novel regulators of melanogenesis.
In this study, we utilize a PPI network topology-based approach to identify targets within our RNAi dataset that may be components of known melanogenesis regulatory pathways. Our computational approach identifies a set of screen targets that cluster topologically in a human PPI network with the known pigment regulator Endothelin receptor type B (EDNRB). Validation studies reveal that these genes impact pigment production and EDNRB signaling in pigmented melanoma cells (MNT-1) and normal melanocytes.
We present an approach that identifies novel components of well-characterized biological pathways from functional genomics datasets that could not have been identified by existing statistical and computational approaches.
Identifying the complete set of genes that regulate a biological phenotype is a challenge of systems biology. Availability of systems-level protein-protein interaction (PPI), gene expression, and functional genomics (FG) data has facilitated the development of integrative computational approaches to uncover genes involved in biological processes . Integration of C. elegans FG data  with existing gene expression and PPI data has facilitated the discovery of co-expressed gene networks , early embryogenesis control networks , and large-scale protein function networks . Integrating Drosophila RNAi datasets with PPI networks helped identify novel functional regulators of biological phenotypes, demonstrating that PPI networks and RNAi datasets can be effectively integrated to derive additional functional information from RNAi screens . Application of these methods to mammalian RNAi datasets has been more problematic secondary to higher false positive and false negative rates of mammalian RNAi screens . Biological pathways are distinct, experimentally-validated subnetworks of proteins within the larger PPI network that interact with each other by well defined mechanisms to regulate a specific biologic phenotype. While currently available methods can identify components of RNAi datasets that interact with each other within PPI networks , no method currently exists to determine which of these screen "hits" are novel components of well defined pathways known to regulate the process under study.
Numerous studies have identified molecular determinants of pigment variation: 127 mouse coat color genes have been identified  that coordinately regulate the transcription, translation, and intracellular trafficking of melanogenic enzymes . These studies have identified the master regulator of melanocyte transcription microphthalmia-associated transcription factor (MITF) , several melanogenic enzymes , and regulators of melanosome formation and trafficking . Despite these advances, our current understanding of skin and eye color variability is incomplete .
Recently, we utilized a systems-level FG platform to identify 92 novel genes that regulate melanin production, novel regulators of melanin secretion, and novel depigmenting agents . Notably, our approach failed to identify many known regulators of melanogenesis among our top tier hits, and annotation data failed to identify connections between many screen targets and biological pathways known to regulate melanogenesis. In this study, we apply PPI network topology-based computational methods to identify genes within our FG dataset that are novel components of biological pathways known to regulate melanogenesis.
To find proteins topologically similar to known melanogenesis regulators in the context of the PPI network, we compute GDVs for pigment regulators identified in our siRNA screen (henceforth called Screen Pigment Regulators (SPRs)). As previous studies demonstrated that many known pigment regulators (KPRs) were not present in our top tier targets but did have some impacts on pigment production , we define SPRs as genes whose siRNAs have some impact on pigment production (Z-score < -1 or >1) (Additional File 2 Table S1). 1,244 SPRs are present in the existing PPI network. Also, the existing PPI network contains 41 KPRs from the ESPCR database, 10 of which are SPRs at the same time . We note that a choice of a higher Z-score threshold might be more appropriate; however, such a choice omitted several KPRs from the analyzed dataset and resulted in a limited number of SPRs in the dataset that was insufficient for any statistical analysis.
An alternative approach to finding new pathway members would be to consider simply an SPR and its direct neighbors as cluster members, since it has been shown that proteins that are closer in the network (i.e., direct neighbors) are more likely to be functionally similar (i.e., involved in a same biological process) than proteins that are further apart in the network . Every cluster defined in this way would topologically be somewhere between a "star-like" network structure centered at the chosen SPR (as illustrated in Additional File 4 Figure S2 that presents the EDNRB cluster obtained by this "direct neighborhood" approach) and a fully connected network. That is, network clusters obtained with the direct neighborhood approach would have the diameter of either 1 or 2, where the diameter is defined as the maximum of shortest path lengths between all node pairs in the network. However, we tested all biological pathways from KEGG  mapped onto the human PPI network to compare their diameters with the diameters of the direct neighborhood-based clusters. Due to the incompleteness of the PPI network data, about 30% of KEGG pathways were consisting mainly of isolated edges after they were mapped onto the human PPI network; hence, we removed these pathways from our analysis. For the remaining 70% of the mapped pathways, we found that 93% of them have diameters larger than 2. The distribution of diameters of these pathways was bell-shaped, with the average of 6.24 and the standard deviation of 2.6. Therefore, since pathways correspond to parts of PPI networks that are of relatively large diameters, the simple direct neighborhood approach is not appropriate for finding new pathway members, since it can only produce subgraphs of small diameters. Hence, other approaches should be sought. In contrast, our GDV-based method generates EDNRB network of diameter 9. Also, since GDVs are based on all up to 5-node graphlets hence covering parts of the PPI network of diameter < 9 around an SPR of interest, the GDV-based approach is more appropriate for finding new pathway members than the direct neighborhoods.
With our GDV-based approach, in addition to the proteins in the cluster formed for a given SPR, we include into the SPR's "network" (defined above) the direct neighbors of proteins in the cluster to make the network connected; note that by doing so, we are indirectly taking into account the fact that direct neighbors are more likely to be involved in a same biological process than proteins that are further apart in the network . For comparison, we perform the equivalent procedure for the direct neighborhood approach and evaluate whether this increases the diameters of such "expanded" direct neighborhood-based subnetworks. However, even if we include direct neighbors of the SPR's direct neighbors into the cluster, the diameter of the resulting networks would be between 1 and 4. Even if the resulting subnetworks would have the diameter of 4, that diameter is almost one standard deviation below the average diameter of known biological pathways (which is 6.24, as stated above). In contrast, our GDV-based approach produces PPI subnetworks with diameters between 5 and 9, with the average of 6.33 and the standard deviation of 1.5, thus mimicking well the diameters of known biological pathways.
Topological clustering identified both SPRs and potentially novel pigment regulators that are components of the EDNRB network. As EDNRB and GNA11 siRNA impact pigment formation in our screen and regulate melanogenesis by well-characterized mechanisms, we next sought to experimentally validate novel EDNRB pathway components identified by our method. EDNRB, a g-protein coupled receptor mutated in Waardenburg-Shah syndrome , binds to EDN3  and activates an intracellular signaling cascade via GNA11 . Activation of EDNRB leads to PKC activation, and subsequent ERK phosphorylation. Phosphorylated ERK can either activate MITF by protein phosphorylation, or up-regulate MITF transcription through CREB phosphorylation [25, 26]. MITF directly activates the expression of many melanogenesis regulators, such as tyrosinase, the rate-limiting enzyme in melanin synthesis . In addition to regulating melanogenesis, EDNRB plays a critical role in neural crest development . As some proteins that interact with EDNRB may impact neural crest development as opposed to melanogenesis, we focused on validating proteins in the EDNRB network most likely to directly impact melanogenesis. For this analysis, we select half of the SPRs most likely to directly regulate melanogenesis (Z-score < -1) from Figure 3: two SPRs in the cluster (EDNRB, CACNA1B), and four SPRs adjacent to the cluster (PTK2B, AGTR1, PLCD1, OPRM1). We then validate known regulators of pigmentation in our experimental system (PAX2, EDNRB, GNA11, EDN3), and examine other proteins in the EDNRB network not identified in our FG screen (FHOD1, DPYSL2, TAOK2, FHL1, RING1, CXCR4).
Annotation of genes identified in EDNRB network.
Endothelin receptor type B. Mutation causes Hirschsprung disease type 2 (HSCR2) [MIM:600155]
G-protein activates phospholipase C in endothelin pathway
Ligand for endothelin receptor type B
Phospholipase produces diacylglycerol and inositol 1,4,5-trisphosphate in endothelin pathway
PH-PLC, C2_2, EFh
Kinase activates the p38 MAP kinase pathway in endothelin pathway
Kinase regulates ion channel and activation of the MAP kinase signaling pathway in endothelin pathway
PTKc_FAK, FERM_M. Focal_AT, B41, Pkinase_Tyr
Receptor for angiotensin II; defect causes renal tubular dysgenesis (RTD) [MIM:267430]
E3 ubiquitin-protein ligase for histone H2A
Required for the assembly of F-actin structures
Signaling by class 3 semaphorins and subsequent remodeling of the cytoskeleton
To determine the role of individual targets in EDNRB signaling, we examined whether the 9 genes (EDNRB, GNA11, EDN3, FHOD1, TAOK1, AGTR1, RING1, PTK2B, and PLCD1) that impact MITF and tyrosinase mRNA expression also impact CREB phosphorylation. Previous studies have determined that MNT-1 melanoma cells contain a BRAF mutation resulting in constitutive ERK activity . Therefore, we would predict that the majority of EDNRB pathway components that impact pigment production in MNT-1 cells would do so via impacts on CREB phosphorylation, as the ERK pathway leading to MITF phosphorylation is constitutively active. Treatment of MNT-1 cells with EDNRB, EDN3, and GNA11 siRNAs impacts CREB phosphorylation but not ERK phosphorylation (Figure 5b). Similarly, TAOK1, PTK2B, FHOD1 and PLCD1 siRNAs inhibit CREB phosphorylation (Figure 5b). Published studies support the notion that these four genes are components of the EDNRB signaling network upstream of CREB phosphorylation. Serine/threonine-protein kinase TAO1 (TAOK1) activates the p38 MAP kinase pathway through the specific activation of the upstream MKK3 kinase . This signaling pathway activates p38 leading to CREB phosphorylation in melanocytes [33, 34]. It also is known that Protein tyrosine kinase 2 beta (PTK2B) activation leads to ERK1/2 activation . In turn, ERK1/2 phosphorylates p90rsk, which phosphorylates CREB [34, 36]. Endothelins promote dendrite formation facilitating melanosome transport from melanocytes to keratinocytes [37, 38], processes that require extensive actin cytoskeletal reorganization [39, 40]. Formin homology 2 domain containing 1 (FHOD1), a protein with several potential PKC phosphorylation sites , directly binds to F-actin and plays a role in actin cytoskeleton organization  and cell elongation . Additionally, FHOD1 activates ERK inducing changes in gene expression . Our studies implicate FHOD1 as a potential player in endothelin mediated actin-cytoskeletal reorganization.
Upon endothelins binding to endothelin receptors of melanocytes, phospholipase C is activated  and initiates subsequent activation of protein kinase C and ERK pathways. Phospholipase C delta-1 (PLCD1) hydrolyzes phosphatidylinositol 4,5-bisphosphate (PIP2) to generate two second messengers: diacylglycerol (DG) and inositol 1,4,5-trisphosphate (IP3) . DG mediates the activation of protein kinase C (PKC), and IP3 releases calcium from intracellular stores, which in turns activate ERK . In melanocytes, PLC antagonist U73122 inhibits endothelin-1-induced intracellular calcium rise and ERK phosphorylation [48, 49], providing direct evidence that phospholipase C activity impacts endothelin-induced signaling in primary melanocytes. Analysis of gene annotation data reveals that PLCD1 is a phospholipase C homologue that may function directly downstream of GNA11 (Table 1). As PLC activity impacts EDNRB signaling in melanocytes, we sought to examine whether PLCD1 was the PLC isoform that impacts EDNRB signaling in normal melanocytes. Depletion of PLCD1 in normal melanocytes impacts MITF protein levels (Figure 5c). Additionally, we demonstrate that PLCD1 siRNA inhibits CREB phosphorylation in the context of EDNRB receptor stimulation, consistent with the hypothesis that PLCD1 is upstream of the CREB phosphorylation event. These results demonstrate that some components in our EDNRB network participate in EDNRB signaling in normal and cancer cells (Figure 5d).
In this study, we apply a PPI network topology-based approach that utilizes GDVs to identify novel components of melanogenesis regulatory pathways. Related approaches that detect biological function from PPI network topology rely on simpler network properties such as direct neighbors of nodes and shortest path distances [20, 50]. Since none of the 6 SPRs in EDNRB cluster are adjacent in the PPI network, but are at distance three or four from each other, they could not have been topologically clustered together by analyzing only their neighbors or distances in the PPI network. Additionally, as mentioned above, examining only direct neighbors of an SPR is not appropriate for constructing its underlying melanogenesis-related pathway.
Although GDV-based computational method has already been used to demonstrate that topologically similar proteins share common biological properties such as protein function or involvement in cancer [17, 18], in this paper, we provide a novel application of the method, namely identifying novel components of biological pathways known to regulate melanogenesis. We use this graph theoretic method to identify proteins with similar topological signatures to known pigment regulators and to construct the corresponding melanogenesis-related subnetworks. Most importantly, we experimentally validate that the novel components of these melanogenesis-related subnetworks are components of the EDNRB pathway in both pigmented melanoma cells and normal melanocytes.
In this study, we utilize a PPI network topology-based computational approach to identify a set of novel EDNRB pathway components. Using RNAi-based approaches, we validate that some of the proteins in EDNRB network impact melanogenesis, demonstrating that topological clustering approaches can uncover additional components of known regulatory networks. SPRs that cluster with EDNRB have similar impacts on MITF protein levels, on MITF and tyrosinase expression, and on CREB phosphorylation, indicating that they are likely components of the same biological pathway. Validation of PLCD1, a potent regulator of MITF in MNT-1 cells, reveals that novel EDNRB pathway component identified in MNT-1 cells impact EDNRB signaling in normal melanocytes. Our study demonstrates that integration of PPI network topology-based algorithms with functional genomics datasets is a fruitful method to enhance our systems-level understanding of biological phenotypes. In the future this method may be used to place targets identified in RNAi screens within the context of known biological pathways that regulate the system under study, deriving additional relevance from systems-level RNAi datasets.
MNT-1 melanoma cells were a gift of M. Marks (University of Pennylvania) and were cultured in DMEM (Invitrogen) with 15% fetal bovine serum (Hyclone), 10% AIM-V medium (Invitrogen), 1xMEM (Invitrogen) and 1 × antibiotic/antimycotic (Invitrogen). Darkly pigmented human neonatal epidermal melanocytes (Cascade Biologics) were cultured in Medium 254 with the melanocyte specific HMGS supplement (Cascade Biologics) and 0.1 mM 3-Isobutyl-1-methylxanthine (IBMX).
In our study, we focus on proteins with more than three interacting partners in the PPI network, since poorly connected proteins are more likely to reside in incomplete parts of PPI networks [17, 18]. However, to account for the entire topology of the PPI network, when GDVs are computed, all proteins and interactions in the PPI network are taken into account. There are 5,423 proteins with degrees higher than three in the network, out of which 1,244 are SPRs, and 41 are KPRs contained in the ESPCR database ; 10 SPRs are KPRs at the same time. The code for computing GDVs  is available in GraphCrunch , an open source software tool for large network modeling and analyses.
We perform the GDV-based clustering as follows: for each of the 1,244 SPRs with degrees greater than 3 in the PPI network, we identify a cluster containing that protein and all other proteins in the network having degrees higher than 3 that are topologically similar to it, i.e., that have GDV similarities with the protein of interest above a threshold; thresholds between 90% and 95% have been used previously . We find the highest threshold at which a node is clustered with at least one more node by performing the following: (1) we initially set the signature similarity threshold to 96%; (2) if the SPR for which the cluster is formed is signature-similar with at least one more node at the given threshold, we create the cluster containing these nodes and the SPR; (3) if a protein of interest is not signature-similar with any other protein at a given threshold, we decrease the threshold and repeat steps 2 and 3; (4) we stop the clustering process for a given SPR when its cluster has been formed; we decrease the threshold down to 90% searching for non-empty clusters, but not below 90%; if a protein is not signature-similar with any other protein at the lowest threshold of 90%, we do not form a cluster for that protein. We try to form clusters for all 1,244 SPRs. However, 263 of them are not 90% signature similar to any other nodes in the PPI network. Additional 237 of them cluster only with one other node. Thus, we analyze clusters of size 3 or more formed for 743 SPRs. The minimum threshold of 90% for signature similarity was chosen because it has been empirically shown that for higher thresholds, only a few small clusters are obtained, indicating too high stringency in signature similarities, whereas for lower thresholds, clusters are very large, indicating a loss of signature similarity .
Depending on a method and its application, sensible cut-offs for p-values were reported to range from 10-2 to 10-8 . We find that 26 out of 743 clusters formed for SPRs are enriched with SPRs with p-values of SPR enrichment below 0.05. Since we perform statistical testing on 743 clusters, we believe that the reported p-values do not need to be corrected for multiple testing.
Given that 26 of our clusters are statistically significantly SPR-enriched at p-value threshold of 0.05, we compute the probability of observing 26 statistically significantly SPR-enriched clusters by chance (i.e., statistical significance). For each of the 743 SPRs, we create "random" clusters of the same size as the corresponding GDV-based clusters. Then, we measure statistical significance of SPR enrichment in these random clusters. Since the clusters are random, we repeat the randomization procedure 1,000 times. We find that only in 9 out of the 1,000 runs, 26 or more clusters are statistically significantly enriched with SPRs, leading to empirical (frequency) probability of 0.009 for observing 26 or more statistically significantly SPR-enriched clusters by chance. Furthermore, since the distribution of the number of statistically significantly SPR-enriched random cluster over the 1,000 runs is bell-shaped, we use the formula for computing Z-score to standardize our observation in the data of having 26 statistically significantly SPR-enriched clusters. Then, we use the normal distribution table to compute the p-value corresponding to the resulting Z-score. If we denote by x = 26 the raw score to be standardized, by μ= 15.45 the mean of the (random) population, i.e., the average over the 1,000 runs of the number of random clusters that are statistically significantly SPR-enriched, and by σ = 4.03 the standard deviation over the 1,000 runs, we get Z-score of 2.62. Using the normal distribution table, we find that this Z-score corresponds to the p-value of 0.0044; that is, in this way, we find that the probability of observing 26 or more statistically significantly SPR-enriched clusters at random is lower than 0.0044.
For western blot analysis, 6 × 104 MNT-1 cells were transfected with 75 nM pooled siRNAs using a Dharmafect-2 transfection reagent as previously described . Similarly, 6 × 104 melanocytes were transfected with 12.5 nM pooled siRNAs (two) toward each gene using a HiPerFect transfection reagent as previously described . For pigment measurement, 1 × 104 MNT-1 cells were transfected with 8 nM pooled siRNAs (two siRNAs) using a Dharmafect-2 transfection reagent as described. We performed each transfection in triplicate. For RT-qPCR, we used 4 × 103 MNT-1 cells in the same 96-well format. We obtained pooled siRNAs for tyrosinase from Dharmacon, and all the other siRNAs from Ambion. A list of all siRNA sequences are contained in Additional File 5 Table S3.
For pigment measurement, we transfected MNT-1 cells with the corresponding siRNA in 96 well plates and incubated them for 120 hours at 37°C/5% CO2. Subsequently, we removed 100 ul of the medium from each well and added 15 ul of CellTiter-Glo Reagent (CTG) (Promega) to each well according to the manufacturer's protocol. We measured luminescence and absorbance at 405 nm for each well of the 96 well plate using a DTX800 Multimode Detector (Beckman Coulter). We normalized absorbance values with raw luminescence values that are proportional to cell number to determine relative pigment per cell. We calculated a normalized percent inhibition from these values to account for the background absorbance of MNT-1 cells and variability in transfection. Normalized percent inhibition (NPI) = [(Pigment Index non-targeting control siRNA -Pigment Index sample)/(Pigment Index non-targeting control -Pigment Index tyrosinase siRNA)] × 100%. We use a student's t-test to calculate the p-values of each sample v.s. non-targeting control.
For our immunoblotting experiments, MNT-1 cells 96 hrs post-transfection and melanocytes 144hrs post-transfection were lysed in RIPA buffer supplied with proteases inhibitors cocktail, PMSF and sodium orthovanadate (Santa Cruz Biotechnology). For experiments in melanocytes looking at downstream EDNRB signaling, we treated melanocytes with 10 μM Endothelin-1 for 5mins. After boiling with sample buffer, we loaded the samples to 4-20% SDS-PAGE and subjected them to western blotting with anti-MITF (C5, Santa Cruz Biotechnology), anti-phospho-ERK1/2 (Cell Signaling Technologics), anti-phospho-CREB (Cell Signaling Technologics), anti-alpha/beta-tubulin (Cell Signaling Technologics) and anti-beta-actin (Cell Signaling Technologics) antibodies.
96hrs after siRNA transfection, we prepared cDNA from MNT-1 cells using a Cells-to-Ct kit (Ambion) according to the manufacterer's protocol. For measurement of siRNA knockdown efficiency, we prepared cDNA from MNT-1 cells 48 hrs after transfection for most of the genes except RING1, FHOD1, TAOK1 at 24 hrs to capture the impact in different mRNA kinetics. We added an aliquot of each cDNA reaction to each Taqman master mix reaction along with the appropriate primer and probe set purchased from Applied Biosystems per the manufacturer's protocol (Applied Biosystems). Primers and probes sequences are listed in Additional File 6 Table S4. We utilized a 7900HT Fast Real-Time PCR System (Applied Biosystems) to determine Ct values. We normalized values by human beta-actin Ct values and analyzed our data using the relative quantification mathematical model (Pfaffl).
small interfering RNA
reverse-transcription polymerase chain reaction
Endothelin receptor type B
microphthalmia-associated transcription factor
screen pigment regulator
known pigment regulator
graphlet degree vector
cAMP response element binding
extracellular signal-regulated kinase.
Anand Ganesan, Hsiang Ho, and Jayavani Aruri were supported by two NIH awards (1K08AR056001-01A1 and 1R03AR057150-01). Nataša Pržulj, Tijana Milenković, and Vesna Memišević were supported by the NSF CAREER IIS-0644424 grant.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.