MicroRNA regulate mRNA and protein levels by cleavage and/or translational/transcriptional repression in a tissue-specific manner [1–4]. By modulating key cellular processes such as metabolism, division, differentiation, development and apoptosis, they can simultaneously regulate both oncogenes and tumor suppressor genes [5–7]. Aberrant microRNA profiles have been noted in many cancers [5–11], including renal cell carcinoma [12–16]. Almost half the known microRNA are located in cancer-associated chromosomal fragile sites, which are susceptible to point mutation, amplification, deletion, or translocation [17, 18]. Recent evidence demonstrates that microRNA play an important role in the pathophysiology of many cancers [19–22], and they are believed to be involved in the pathogenesis of ccRCC [20, 23]. MicroRNA are also being studied in various tumors to understand their significance for drug resistance [24, 25], diagnosis and prognosis [26–28], and for their therapeutic potential [29–40]. Their secondary structure preserves them better than mRNA in FFPE samples, making them easier to extract in intact form and resulting in higher identification accuracy in the analysis of archived clinical material. Their tissue specificity and tight regulation make them more reliable identifiers of tissue of origin in highly differentiated tumors. Because a single microRNA can regulate multiple mRNA, microRNA are both better identifiers of mechanism and possibly better drug targets. However, while it is clear that microRNA play an important role in the biology of many cancers, their complex biology and tissue specificity make it difficult to understand the precise role they play in the disease process and the genes affected by their dysregulation [42–47].
Mature microRNAs are produced in a multi-stage process. MicroRNA genes are transcribed by RNA Pol II or Pol III to create capped and polyadenylated primary transcripts (Pri-microRNAs), which are further processed in the nucleus by the enzyme Drosha, together with its partner Pasha (in flies) or DGCR8 (in humans), to produce ~60-nucleotide Pre-microRNA stem-loop molecules. These are then exported to the cytoplasm by Exportin and Ran-GTP, where they are further processed by Dicer to ~22 nt double-stranded RNA duplexes. The duplexes form complexes with RISC (RNA-induced silencing complex), which leads to their unwinding into single-stranded microRNAs. MicroRNAs bound to RISC can down-regulate protein levels using at least two alternative pathways: 1) If the microRNA has imperfect complementarity with a matching sequence in the 3'UTR of its target mRNA, the microRNA-RISC complex can bind the complementary mRNA sequence and cause translational repression. 2) If, on the other hand, the microRNA and its mRNA target have perfect or near-perfect complementarity, binding of the microRNA-RISC complex to the target mRNA can result in cleavage and degradation of the mRNA by Argonaute2 (Ago2) [1–7, 35].
Although many studies have identified signatures of microRNA dysregulation, identifying the tissue-specific targets of aberrantly regulated microRNA is difficult. Putative targets identified using seed sequence complementarity and free energy predictions of RNA-RNA duplexes [48–55] are available in databases such as TargetScan (http://www.targetscan.org). However, the false positive rate for such matches is unacceptably high, and different algorithms identify different mRNA targets for the same microRNA [51–53, 56, 57]. The tissue specificity of microRNA regulation is known only in some specific cases (e.g. see Table one in ), and a general methodology for determining the targets, the tissue specificity of action and the specific biological role of microRNA in the initiation and progression of most cancers remains an open problem.
We describe a novel method to identify "direct mRNA targets" of microRNA in any cancer, based on measuring an anti-correlation signal between differentially expressed microRNA and mRNA in patient-matched tumor and normal samples. In this paper, the term "direct mRNA target" is used in a very strict and limited sense. A direct target is an mRNA for which: a) there is an exact seed sequence match in its 3'UTR to the corresponding microRNA, b) the seed sequence match is conserved across the mouse, human, rat and dog genomes, c) the expression levels of both the microRNA and the mRNA can distinguish tumor from normal with high statistical significance, and d) the mRNA and microRNA levels are strongly and significantly anti-correlated in tumor and/or normal. These requirements could be relaxed to find additional targets or eliminated altogether to find indirect regulations (see later discussion).
The method proceeds as follows: a) Identify significantly differentially expressed microRNA and mRNA between the two classes (e.g. normal and tumor); b) For each microRNA which is differentially expressed, identify all its putative target mRNA by restricting to those differentially expressed mRNA with a matching seed sequence in their 3'UTR, with the further requirement that it be conserved in human, mouse, rat and dog genomes; c) Compute the Pearson correlation between microRNA and mRNA expression levels for samples in each class (tumor and normal) and d) Retain only those microRNA/mRNA pairs whose expression levels are highly anti-correlated. These constraints remove spurious matches, reducing relatively speculative "putative" seed match based mRNA targets in databases to a highly robust subset of direct functional targets.
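Steps a) to d) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this study: the paired t statistic and the cutoffs `t_cut` and `r_cut` are stand-ins for whatever significance criteria are actually applied, the conserved seed matches are assumed to have been looked up beforehand and supplied as `seed_targets`, and all names and expression values (`miR-X`, `GENE_A`, etc.) are hypothetical toy data. The sketch also checks only the strength of the anti-correlation, not its statistical significance.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(tumor, normal):
    """Paired t statistic for patient-matched tumor/normal values (assumed test)."""
    diffs = [t - n for t, n in zip(tumor, normal)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def direct_targets(mirna, mrna, seed_targets, t_cut=3.0, r_cut=-0.7):
    """Return (microRNA, mRNA, r_tumor, r_normal) tuples passing steps a) to d).

    mirna, mrna:  {name: {"tumor": [...], "normal": [...]}}, same patient order.
    seed_targets: {microRNA name: set of mRNA names with a conserved seed match}.
    t_cut, r_cut: hypothetical cutoffs chosen for illustration only.
    """
    def is_de(v):
        return abs(paired_t(v["tumor"], v["normal"])) >= t_cut

    # a) differentially expressed microRNA and mRNA
    de_mirna = [m for m, v in mirna.items() if is_de(v)]
    de_mrna = {g for g, v in mrna.items() if is_de(v)}

    pairs = []
    for m in de_mirna:
        # b) differentially expressed mRNA with a conserved seed match
        for g in sorted(seed_targets.get(m, set()) & de_mrna):
            # c) Pearson correlation within each class
            r_t = pearson(mirna[m]["tumor"], mrna[g]["tumor"])
            r_n = pearson(mirna[m]["normal"], mrna[g]["normal"])
            # d) keep pairs anti-correlated in tumor and/or normal
            if min(r_t, r_n) <= r_cut:
                pairs.append((m, g, r_t, r_n))
    return pairs


# Toy data for four matched patients (invented values, not measurements).
mirna = {"miR-X": {"tumor": [8.0, 9.0, 10.0, 11.0], "normal": [2.0, 3.5, 3.0, 4.5]}}
mrna = {
    "GENE_A": {"tumor": [5.0, 4.2, 3.1, 2.0], "normal": [9.0, 8.0, 8.5, 7.0]},
    "GENE_B": {"tumor": [2.0, 2.5, 3.5, 4.0], "normal": [7.0, 8.2, 7.9, 9.1]},
    "GENE_C": {"tumor": [6.0, 5.0, 4.0, 3.0], "normal": [10.0, 9.0, 9.5, 8.0]},
}
seed_targets = {"miR-X": {"GENE_A", "GENE_B"}}  # GENE_C has no seed match

pairs = direct_targets(mirna, mrna, seed_targets)
for m, g, r_t, r_n in pairs:
    print(f"{m} -> {g}: r_tumor={r_t:.2f}, r_normal={r_n:.2f}")
```

In this toy example only `GENE_A` survives: `GENE_B` has a seed match but is positively correlated with the microRNA, and `GENE_C` is anti-correlated but lacks a seed match. A real analysis would also need a multiple-testing correction and handling of constant expression vectors, for which the functions above would raise a division error.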
Note that our method can be extended (with data on more samples) by removing constraint b) and looking for a correlation (or anti-correlation) signature in c). This allows the identification of indirect regulation. For example, if a microRNA up-regulated in cancer down-regulates a gene which is a transcriptional repressor of an oncogene, then the expression level of the microRNA will be correlated with the level of the oncogene without a seed sequence match. Since the direct gene target (the transcriptional repressor in the example above) of the microRNA should already be identified using our method, such an analysis would extend the regulation network beyond first order interactions. Note that, although the method as described above does not identify regulation by translation inhibition (because this would not significantly affect mRNA levels), if protein levels were also measured, the method could easily be extended to identify such regulation.
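The indirect-regulation extension described above can be sketched in the same spirit. This is a hypothetical illustration only: it drops the seed-match constraint, skips genes already called as direct targets, and keeps any mRNA whose level is strongly correlated or anti-correlated with the microRNA within one class. The cutoff `r_abs_cut`, the gene names and the expression values are invented for the example, and no significance test is applied.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def indirect_candidates(mirna_levels, mrna_levels, direct, r_abs_cut=0.8):
    """Return {gene: r} for genes strongly (anti-)correlated with one microRNA.

    mirna_levels: expression of the microRNA across samples of one class.
    mrna_levels:  {gene: expression across the same samples}.
    direct:       genes already identified as direct targets (skipped here).
    r_abs_cut:    hypothetical cutoff on |r|, chosen for illustration only.
    """
    hits = {}
    for gene, levels in mrna_levels.items():
        if gene in direct:
            continue  # first-order interaction, already accounted for
        r = pearson(mirna_levels, levels)
        if abs(r) >= r_abs_cut:
            hits[gene] = r
    return hits


# Toy tumor-class data: the microRNA represses REPRESSOR_R, which in turn
# represses ONCO_O, so ONCO_O rises with the microRNA (invented values).
mir_x = [8.0, 9.0, 10.0, 11.0]
mrna_levels = {
    "REPRESSOR_R": [5.0, 4.2, 3.1, 2.0],
    "ONCO_O": [3.0, 3.9, 5.2, 6.1],
    "UNRELATED_U": [4.0, 6.0, 3.5, 5.5],
}

hits = indirect_candidates(mir_x, mrna_levels, direct={"REPRESSOR_R"})
print(sorted(hits))
```

Here only `ONCO_O` is reported as a candidate for indirect regulation. Because the seed-match filter is removed, many more pairs are tested, which is one reason such an extension needs data on more samples.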
We demonstrate the use of our method on expression data from clear cell Renal Cell Carcinoma (ccRCC) and matched normal kidney samples. Renal Cell Carcinoma (RCC) represents ~3% of all malignancies in the US, with 50,000 new cases and 12,000 deaths each year (http://www.nci.nih.gov/cancertopics/types/kidney). The most common histological class is ccRCC, accounting for ~75% of kidney cancers. ccRCC is known to be characterized by the loss of the VHL gene, whose product, under normal oxygen conditions, binds to the α subunits of hypoxia-inducible factors (HIFs), inducing their poly-ubiquitinylation and subsequent degradation in the proteasome. In hypoxic conditions, or if HIF regulation is lost because of VHL inactivation, HIF accumulates to high levels and promotes the transcription of genes such as VEGF, PDGF-β, TGF-α and EPO, which trigger angiogenesis, cell growth, migration and proliferation [59, 60]. The spectrum of HIF target genes expressed in individual tumors and the factors which influence them are the object of active ongoing research. ccRCC tumors have a wide range of natural histories and varied responses to VEGF-targeted therapy. Early stage, low grade (Fuhrman grade 1) tumors tend to have significantly better disease-free survival after resection than higher stage and higher grade (Fuhrman grade 4) tumors. Although VHL mutation is associated with all grades of ccRCC, the other molecular factors associated with ccRCC initiation and progression are largely unknown. The molecular basis of the diversity in histologic grade, clinical behavior, and response to VEGF-targeted therapy is also unclear. This makes ccRCC a ripe target for studies investigating the molecular and genetic nature of these heterogeneities.
In RCC, various studies have identified panels of microRNA and mRNA that are differentially expressed between normal renal tissue and tumor or between histological subtypes of tumor [12, 14, 15, 63–66]. The present study extends these previous studies by linking the microRNA to some of their mRNA targets, thus elucidating a hitherto unknown part of the biology of ccRCC disease. Some of the identified microRNA/mRNA anti-correlations were validated on a new cohort of ccRCC/normal samples. SEMA6A was confirmed as a direct target of miR-141 by over-expressing miR-141 in a ccRCC cell line and showing strong down-regulation of the SEMA6A transcript.