MicroRNA 3' end nucleotide modification patterns and arm selection preference in liver tissues
© Li et al.; licensee BioMed Central Ltd. 2012
Published: 12 December 2012
Skip to main content
© Li et al.; licensee BioMed Central Ltd. 2012
Published: 12 December 2012
The expression of microRNA (miRNA) genes undergoes several maturation steps. Recent studies brought new insights into the maturation process, but also raised debates on the maturation mechanism. To understand the mechanism better, we downloaded small RNA sequence reads from NCBI SRA and quantified the expression profiles of miRNAs in normal and tumor liver tissues.
From these miRNA expression profiles, we studied several issues related to miRNA biogenesis. First of all, the 3' ends of mature miRNAs usually carried modified nucleotides, generated from nucleotide addition or RNA editing. We found that adenine accounted for more than 50% of all miRNA 3' end modification events in all libraries. However, uracil dominated over adenine in several miRNA types. Moreover, the miRNA reads in the HBV-associated libraries have much lower rates of nucleotide modification. These results indicate that miRNA 3' end modifications are miRNA specific and may differ between normal and tumor tissues. Secondly, according to the hydrogen-bonding theory, the expression ratio of 5p arm to 3p arm miRNAs, derived from the same pre-miRNA, should be constant over tissues. However, a comparison of the expression profiles of the 5p arm and 3p arm miRNAs showed that one arm is preferred in the normal liver tissue whereas the other is preferred in the tumor liver tissue. In other words, different liver tissues have their own preferences on selecting either arm to be mature miRNAs.
The results suggest that besides the traditional miRNA biogenesis theory, another mechanism may also participate in the miRNA biogenesis pathways.
MicroRNA (miRNA) genes are small non-protein-coding genes. Their final functional products are single-strand RNAs of ~22 nucleotides. MiRNAs repress the expression of their target genes. It has been commonly observed that in the tissues of genetic disorders or tumors, the expression levels of many miRNA genes are significantly different from those in the normal tissues [1–7]. In our previous study, we observed miRNA 3' end modification and arm selection preference in one pair of normal and tumor gastric tissues . To better understand these phenomena, we analyzed NGS data of 15 liver tissues in this study. We first characterized the expression profiles of mature miRNAs in liver tissues by analyzing small RNA sequence reads downloaded from NCBI SRA (SRP002272) . Then, we used the expression profiles to address the issues to be described below.
The messenger RNAs (mRNAs) of protein-coding genes are usually subject to modifications, including poly-adenylation, which prolongs the lives of mRNAs. Recently, similar sequence modifications at the 3' end of miRNAs were detected by small RNA cloning and sequencing . Further studies confirmed that such modifications were prevalent among animal and plant miRNAs and were caused by either RNA editing or nucleotide addition rather than by sequencing errors [11, 12]. In this study, using the miRNA expression profiles in liver tissues, we showed that different tissues and different miRNAs have their own preferred nucleotide modifications although adenine accounted for the majority of the modification events. Moreover, HBV (hepatitis B virus)-associated tissues exhibited lower adenine modification rates than normal liver tissues.
Although mature miRNAs are derived from pre-miRNAs, the expression of pre-miRNAs does not guarantee the expression of mature miRNA. In other words, not all pre-miRNAs are processed into mature miRNAs [13–15]. Moreover, the 5p arm and 3p arm of the same pre-miRNA usually have unequal likelihoods to be processed into mature miRNAs . Such arm selection preference is commonly thought to result from the fact that the RISC unwinds the miRNA/miRNA* duplex at the end with weaker hydrogen binding. Therefore, the arm with the freer 5' end is preferentially incorporated into RISC to serve as the mature miRNA [17, 18]. This hydrogen-bonding-based selection rule is currently the majority view.
However, recent studies showed that the orthologous pre-miRNAs, although highly similar to each other and thought to have the same miRNA/miRNA* duplex, showed preference of the 5p arm in one species but the 3p arm in another species. Therefore, the major and minor forms of mature miRNA in different species may be switched [19, 20]. In this study, we are curious whether the same pre-miRNAs have different arm selection preferences between normal and tumor tissues. Using miRNA expression profiles, we showed that the major and minor mature miRNAs of the same pre-miRNA are not always consistent with the miRBase annotation. Moreover, we found that the 5p arm and 3p arm miRNA derived from the same pre-miRNA have different tissue expression preferences, one preferring the normal tissue while the other preferring the tumor tissue. The result points to the existence of another selection mechanism in addition to the hydrogen-bonding theory.
Summary of miRNA reads and library information.
# of reads analyzed
# of miRNA reads
% miRNA reads
# of detected pre-miRNAs
# of detected miRNAs
# of detected miRNAs at opposite arm
HBV(-) HCV(-) adjacent
HBV(-) HCV(-) HCC
Vaz and colleagues observed a much lower miRNA percentage in small RNA collection from the K562 cell line, due to the reduced expression of the Dicer gene . Among all of the 15 libraries we studied, only SRX018968 and SRX018969 were hepatitis C virus (HCV) associated libraries: SRX018969 was sampled from HCV-positive HCC tissue and SRX018968 was sampled from the adjacent region of HCV-positive HCC tissue. Examining the miRNAs in these two libraries, we found that the first several abundant miRNAs are highly similar to the ones in the HCV-negative libraries. We wonder whether the infection of HCV represses the function of Drosha or Dicer so that the expressions of all miRNAs are evenly reduced, without altering the abundance ranking of individual miRNAs. Therefore, we examined the expression levels of Drosha and Dicer in HCV-positive and HCC-negative samples. Additional File 2 shows that the expression levels of Drosha and Dicer are not significantly different among samples. Thus, the lower numbers of miRNA reads in the two HCV-positive libraries could be simply due to sample preparation or the intrinsic characteristic of disease tissue.
In Morin's study, repeat elements accounted for about 16% of the total sequence reads from the human embryo stem cell . In this study, however, repeat elements accounted for a very low percentage in all libraries. The difference might have occurred because we analyzed only the reads with 3' adapter ligated and excluded the one-copy clean reads.
Table 1 shows that about 500 to 600 mature miRNAs were detected in most libraries; however, most of the sequence reads belonged to a few miRNAs. For example, hsa-miR-122 accounted for 50% of all reads in a library in most cases, except for SRX018961 (37.10%) and SRX018971 (2.57%). SRX018971 is the only HBV-negative and HCV-negative HCC sample. Hsa-miR-122 was reported to activate HCV translation  and to be highly expressed in HCV infected patients . However, comparing with other HCV-positive (SRX018968 and SRX018969) and HCV-negative samples, the expression level of hsa-miR-122 in SRX018971 dropped dramatically. This is an interesting observation for further study.
Among the 1,048 human pre-miRNAs, 730 were annotated to encode mature miRNAs at only one arm in miRBase 16. With the increased sequencing depth of NGS platforms, more mature miRNA can be detected at the opposite arm. As shown in Figure 2, we detected also the mature miRNA at the opposite arm, i.e., the 5p arm of hsa-mir-1307 from SRX018957. Furthermore, like common mature miRNAs, the opposite-arm miRNA of hsa-mir-1307 also has many isomiR types, highly overlapping with each other. Actually, in all of the 15 libraries, the opposite-arm miRNA of hsa-mir-1307 was detected.
As described in a previous study , the opposite-arm miRNAs are not necessarily kept at lower expression levels than the annotated ones. In this study, the opposite-arm miRNA of hsa-mir-1307 has a higher expression level than the annotated one in most libraries, except for SRX018968 and SRX018971. In other pre-miRNAs, this phenomenon can also be observed (Additional File 3), which challenges the nomenclature of miRBase. The finding of opposite-arm miRNAs was made because we mapped the sequence reads back to pre-miRNAs rather than to mature miRNAs. In this study, we detected many opposite-arm miRNAs in all 15 libraries (Table 1).
Previous studies reported nucleotide variations preferentially locating at the 3' end of miRNAs [10–12, 26, 27]. Such variations can arise from either nucleotide additions, elongating the miRNAs, or RNA editing that does not alter the miRNA length. Regardless of how the modified nucleotides occur, they can cause mismatches in the mapping procedure, making the originally perfect match reads fail to be mapped back to miRNAs. Therefore, we trimmed the terminal 3' end mismatch one nucleotide at one time until mapping succeeded, and then, analyzed the trimmed nucleotides. These trimmed nucleotides are thought to have occurred by modifications so that we call them modified fragments, which were recovered and presented in lower case (Figure 2 and Additional File 3).
Frequencies of the 3' end modified fragments in libraries.
miRNA reads with 3' end modi.
Figure 3a shows the global distribution patterns of modification events in libraries. However, we are more interested in whether individual miRNAs have preferred modifications different from the global pattern. Therefore, we examined 16 mature miRNAs with a high expression level in most libraries. We found that four mature miRNAs, miR-30d, miR-101, miR-140-3p and miR-378, have similar patterns to the global patterns (Figure 3a). However, in let-7a, let-7b, let-7c, miR-122 and miR148a, the patterns are somewhat different from the global pattern. For example, in miR-122 (Figure 3b), the frequencies of U and UU events decreased so dramatically in all libraries that U was even much rarer than AA, which was originally dominated by U in the global pattern.
Another interesting pattern can be illustrated by miR-192 and miR-29a. Contrary to the pattern of miR-122, the frequencies of A and AA events dramatically decreased in miR-192 (Figure 3c); in other words, U and UU dominated over other modification events except in the SRX018971 library. The A and U decreases in Figure 3c and 3b were consistently found in almost all libraries, demonstrating that different individual miRNAs may have their own preferred modifications.
Although the above modification patterns were consistent in almost all libraries, we observed inconsistent patterns in miR-99a, miR-100, miR-103, miR-191 and miR-199a-3p. For example, in miR-199a-3p (Figure 3d), the U modification has a significantly higher frequency than the A modification in SRX018960, ~61, ~62, ~63, ~64 and ~65, most of which are HBV associated libraries. However, the A modification obviously dominated over the U modification in the remaining libraries, most of which were not HBV associated libraries. The global pattern of modifications (Figure 3a) implied that HBV infection can repress the A modification, which seems to be consistent with the result for miR-199a-3p (Figure 3d).
Previous studies reported several types of RNA editing, such as A to G transition catalyzed by adenine deaminase and C to U transition catalyzed by cytidine deaminase [11, 28], responsible for generating 3' end variations without altering the miRNA length. However, nucleotide addition can also cause the same variations by elongating miRNA length. Owing to the existence of isomiR, the lengths of miRNAs can be dynamic . Therefore, it is difficult to determine whether nucleotide addition or RNA editing contributes to such 3' end variations. Hence, instead of RNA editing or nucleotide addition, we used the term "modifications" to avoid the debate. In summary, 3' end modifications can be miRNA dependent and can depend on the tissue condition (normal or abnormal), making the modification patterns more complicated.
In this study, we showed that the 3' end modification event of miRNA cab be library dependent and that HBV-positive samples tend to have a lower chance of A modification. MiRNAs recognize their target site mainly by the complementary pairing between their seed region, nucleotides 2~7, and the 3' UTR. The modification event occurring in the 3' end can hardly alter their selection of target gene. Therefore, 3' end modification may have little impact on miRNAs' target gene selection and function.
According to the hydrogen-bonding theory, the selection preference between pre-miRNA's 5p arm and 3p arm is an intrinsic characteristic of pre-miRNA . If this selection mechanism is the only one in deciding arm selection preference, the expression ratio of 5p arm to 3p arm should be largely constant in all tissues. To examine this theory, we investigated whether for a pre-miRNA the expression ratio of 5p arm to 3p arm differs between normal and tumor tissues. As shown in Figure 2, the 5p and 3p sequence reads of hsa-mir-1307 in SRX018957 were, respectively, 1325 and 402, resulting in a 3.30 expression ratio of 5p arm to 3p arm.
Although Figure 4a shows a significant difference, an analysis based on average values may neglect the diversities among tissues. Hence, we also examined pre-miRNAs' expression ratios of 5p arm to 3p arm among six normal HBV(+)HCC (NT) pairs. Several pre-miRNAs showed consistent patterns between NT pairs. Figure 4b shows that hsa-mir-22's expression ratios of 5p arm to 3p arm had the same tendency between NT pairs and that they were all higher in HBV(+)HCC tissues than in normal tissues. This result reflected the fact that the 5p arm of hsa-mir-22 was more preferred in HBV(+)HCC tissues than in normal tissues. A similar phenomenon was also observed in other pre-miRNAs such as hsa-mir-17 and hsa-mir-30b.
Although hsa-mir-193b's expression ratios of 5p arm to 3p arm also had the same tendency between NT pairs, the values were all higher in normal tissues rather than in HBV(+)HCC tissues (Figure 4c). Furthermore, the 5p arm of hsa-mir-193b was more preferred in normal tissues than in HBV(+)HCC tissues. Other pre-miRNAs such as hsa-mir-23b showed a similar phenomenon. The above cases showed consistent patterns of expression ratios between all NT pairs, but diversities among NT pairs were observed in several pre-miRNAs, such as has-mir30c-2 (Figure 4d).
In summary, the 5p and 3p arm selection preferences are not always consistent between normal and tumor tissues, implying there is another selection mechanism, in addition to the hydrogen-bonding-based selection rule. If so, this selection mechanism may play a regulatory role in the oncogenesis pathway of HCC.
In our previous study, we showed that the arm selection preference of the same orthologous pre-miRNAs can vary between species, so that some species preferred the 5p arm and the other preferred the 3p arm . In Figure 4a, the expression ratio bars of the same pre-miRNAs showed different directions (upward or downward) between libraries. An upward bar means that the 5p arm was the major arm, while a downward bar means that the 3p arm was the major arm. Therefore, we were curious about whether different arm selection preferences can be found between tissues, especially between the normal and HBV(+)HCC libraries.
The pre-miRNAs whose arm selection preferences are not consistent with the miRBase annotation.
For the pre-miRNAs originally annotated to encode miRNAs at both arms, the major arms of hsa-mir-30c-2, hsa-mir-30b, hsa-mir-106b and hsa-mir-500a were their 5p arms, while the major arms of hsa-mir-126, hsa-mir-377, hsa-mir-193b, hsa-mir-505 and hsa-mir-664 were their 3p arms. From the NGS expression data, however, we found that their major arms and minor arms were switched, contrary to the miRBase annotation. Among them, hsa-mir-500a, hsa-mir-548h-3, hsa-mir-664 and hsa-mir-1307 were the extreme cases, at which the major-minor switch was observed in all of the analyzed libraries, including three normal and two HBV(+)HCC libraries. In conclusion, the major and minor arms of pre-miRNA can vary among tissues.
In this study, we show that although the 5p arm and 3p arm miRNAs are derived from the same gene locus and transcribed by the same transcription factors, they have significant reversal expression preference between two tissues. We therefore conclude that there was another selection mechanism, in addition to the hydrogen-bonding-based selection rule. However, any factors affecting the precise measurement of miRNA's expression levels, such as PCR artifacts and unequal degradation rates, could also be possible explanations of the reversal tissue preference. During the sample preparation procedure of NGS, RNA molecules were amplified with PCR after adapter ligation. If the 5p arm and 3p arm miRNAs have different affinities with adapters, they may have unequal percentages of amplification, causing biased estimates of expression levels and false reversal tissue preference.
After miRNA/miRNA* (in terms of abundance) duplex is unwound, the miRNA is incorporated into RISC and usually has longer life, while the miRNA* is degraded soon. In addition to the protection of RISC, the nucleotide composition of the RNA molecule may also have unequal resistances to RNase. Therefore, the detected read counts by NGS may not really reflect the true miRNA's expression levels, causing biased expression preferences.
Although we can not completely rule out other factors causing biased measurement of miRNA's expression level, we presented a statistically significant result of miRNA's reversal preference in normal and tumor tissues. This new observation deserves further investigation.
We first downloaded sequence reads with the accession number SRP002272 from NCBI SRA. This accession contains 15 libraries from different liver tissues. All library information is given in Table 1. All of the 15 libraries were sequenced using the Illumina (Solexa) small RNA sequencing technology. After downloading, all sequence reads were first processed to remove the 3' end adapter and the ones with adapter detected and trimmed were called "clean reads". In view of the length distribution of mature miRNAs, only 18~25 nt clean reads were collected for analysis. Moreover, for higher confidence, only the unique clean reads with read count 2 were mapped back to human pre-miRNAs (miRBase 16).
In previous studies using NGS reads to quantify miRNA expression levels, nucleotide variations were usually allowed when mapping sequence reads back to the genomes or the reference miRNAs , resulting in ambiguously mapped loci caused by the high similarity between human mature miRNAs, such as hsa-miR-548a, hsa-miR-548b and hsa-miR-548z. In order to eliminate this type of ambiguity, we mapped the sequence reads back to pre-miRNAs with bowtie  by allowing no mismatch, as suggested previously [11, 19].
In this study, we map the sequence reads to pre-miRNAs rather than the mature miRNAs. We then have additional constraint to avoid random match. Mature miRNAs have many variants with different length, named isomiR. The isomiRs shift from their corresponding miRBase reference miRNAs in terms of location. When sequence reads were mapped back to mature miRNAs, the alignment shift may result in mismatches. Therefore, in addition to the perfect match constraint, we adopted an alternative procedure. In order to exclude random match, the difference in start position between mature miRNA and mapped reads must be equal to or less than two nucleotides and the difference in the end position between mature miRNA and mapped reads must be equal to or less than five nucleotides.
Previous studies reported nucleotide additions at the 3' end of miRNAs [10–12, 26, 27], which may cause mismatches. Therefore, following Fernandez-Valverde's strategy , we trimmed the last 3' end mismatch one by one until the perfect-match reads are at least 18 nucleotides in length.
The non-miRNA sequence reads were further classified into nine classes by mapping to different data sets with bowtie  by allowing one-nucleotide variation. The sequences of mRNAs came from the records with NM or XM accession prefix in RefSeq 47 . The sequences of tRNAs were downloaded from the Genomic tRNA database . The sequences of rRNAs were provided by the silva database . The sequences of snoRNAs, scaRNAs and snRNAs were downloaded from NONCODE . The sequences of other ncRNAs came from the records with NR or XR accession prefix in RefSeq 47 . The sequence reads not belonging to any RNA classes were uploaded to RepaetMasker for identifying repeat elements . The sequence reads not belonging to any of the previous classes were classified into the unknown class.
Total RNA was isolated from the frozen tissues using a guanidium isothiocyanate/CsCl method. RNA was quantified by spectrophotometry at 260 nm. Complementary DNA (cDNA) was prepared from the 2 microgram total RNA of paired HCCs and nontumorous liver samples. One microliter reverse transcription product, 1.25 units Pro Taq polymerase (Protech Technology Enterprise, Taipei, Taiwan), Pro Taq buffer, and 200 μ M dATP, dCTP, dGTP, and dTTP (each) were mixed with primer pairs for Dicer, Drosha, PBGD and S26 in a total volume of 30 μ l. PCR was performed in an automated DNA thermal cycler, 30 cycles for Dicer and Drosha, 28 cycles for PBDG and S26, with initial heating at 94 °C for 2 minutes, followed by the Touchdown PCR: 30 or 28 cycles of 94 °C for 30 seconds, annealing for 1 minute (the annealing temperature is reduced by 1 °C per 2 cycles from 65 °C to 55 °C for 20 cycles, than constant the annealing temperature at 55°C for final 10 or 8 cycles), 72 °C for 1 minute, and final 72 °C for 10 minutes. Primers for amplified genes were as follows: Dicer-F (5'- GTACGACTACCACAAGTACTTC -3'), Dicer-R (5'- ATAGTACACCTGCCAGACTGT -3'), Drosha-F (5'-GTGCTGTCCATGCACCAGATT -3'), Drosha-R(5'-TGCATAACTCAACTGTGCAG G -3'), S26-F (5'-CCGTGCCTCCAAGATGACAAAG-3') and S26-R (5'-TGTCTGGTAACGGCAATGCGGCT-3').
This work was supported by grants from National Science Council of Taiwan (NSC99-2628-B-001-009-MY3).
We thank NCBI SRA for organizing and releasing the sequence reads of liver samples.
This article has been published as part of BMC Systems Biology Volume 6 Supplement 2, 2012: Proceedings of the 23rd International Conference on Genome Informatics (GIW 2012). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/6/S2.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.