Clustering and evolutionary analysis of small RNAs identify regulatory siRNA clusters induced under drought stress in rice
© The Author(s) 2016
Published: 23 December 2016
Drought tolerance is an important trait related to growth and yield in crop. Until now, drought related research has focused on coding genes. However, non-coding RNAs also respond significantly to environmental stimuli such as drought stress. Unfortunately, characterizing the role of siRNAs under drought stress is difficult since a large number of heterogenous siRNA species are expressed under drought stress and non-coding RNAs have very weak evolutionary conservation. Thus, to characterize the role of siRNAs, we need a well designed biological and bioinformatics strategy. In this paper, to characterize the function of siRNAs we developed and used a bioinformatics pipeline that includes a genomic-location based clustering technique and an evolutionary conservation tool.
By comparing the wild type Nipponbare and two drought resistant rice varities, we found that 21 nt and 24 nt siRNAs are significantly expressed in the three rice plants but at different time points under a short-term (0, 1, and 6 hrs) drought treatment. siRNAs were up-regulated in the wild type at an early stage while the up-regulation was delayed in the two drought tolerant plants. Genes targeted by up-regulated siRNAs were related to oxidation reduction and proteolysis, which are well known to be associated with water deficit phenotypes. More interestingly, we found that siRNAs were located in intronic regions as clusters and were of high evolutionary conservation among monocot grass plants. In summary, we show that siRNAs are important respondents to drought stress and regulate genes related to the drought tolerance in water deficit conditions.
Plants are subject to long-term gradual water shortage in weeks or months, or to short-term water deficits in hours to days. Studies report that plants respond differently to long and short-term water deficit conditions. In long-term water shortage, the plant makes effort to optimize its use of resources and shortening their life cycle. In short-term water shortage, the plant minimizes water loss by shutting down respiratory procedures and triggers mechanisms to protect itself from damaging effects, e.g., oxidation that leads to proteolysis causing the break down of cells.
siRNAs are induced under drought condition in response to drought in all three rice plants but at a later stage in the drought tolerant plants,
a large number of siRNAs are enriched forming clusters within gene bodies,
the expression level of siRNA clusters negatively correlate with their target genes expression level and
intronic siRNA cluster regions show significant evolutionary conservation among the monocot grass plants.
With this parameter configuration, siRNA clusters with a minimum length of 100 nt with at least 10 siRNA transcripts are expected to be detected, which is a very strict criteria compared to other studies. In order to identify the expression correlation between siRNA clusters and their target genes, each cluster was quantified and associated to their target genes. Here, we define target genes as genes that embody one or multiple siRNA clusters. For multiple clusters that target a common gene, their expression levels were summed to compile a list of unique gene-siRNA cluster pairs.
The third step was to investigate further if genomic regions with overlapping siRNA clusters have conserved features. We did this by performing two bioinformatics analyses, genomic sequence conservation across multiple genomes and the motif analysis among siRNA cluster regions. To investigate on the evolutionary sequence conservation of siRNA clusters across five monocot grass plants, Brachypodium distachyon, Oryza sativa, Setaria italic, Sorghum bicolor and Zea mays, we measured the conservation strength of siRNA cluster regions. Since we are interested in gene regulatory siRNAs, siRNA clusters that have negative correlation with genes were selected as candidates. Orthologs of the selected genes targeted by siRNA clusters were searched for in the Plant Ensemble plant compara data . Genes with orthologs in at least four plants, including Oryza sativa, were selected. Each gene and its orthologs were aligned using Clustal Omega  in order to acquire the PhastCons score . The PhastCons scores were then binned with a bin size of 25 nt where each bin was assigned with the average PhastCons score of the bin. Bins were labeled as siRNA cluster bins and non-siRNA cluster bins depending on the presenece of an overlapping siRNA cluster. Per genomic feature, a t-test on the scores was performed between siRNA cluster bins and non-siRNA cluster bins for any significant difference (p≤0.05) in conservation. Since the number of siRNA cluster bins was fewer than non-siRNA cluster bins, the statistical test was repeated for 1000 sets of random selection of non-siRNA cluster bins to match the number of siRNA-cluster bins per feature. For the motif analysis, significantly conserved siRNA clusters were collected and used as input for motif discovery using MEME .
Plant material and dehydration stress treatments
We germinated seeds of wild type rice (Oryza sativa Japponica Nipponbare) plants and Nipponebare-backgrounded OsAP2 (Os02g0669100) over-expressed transgenic plants on petri-dishes with 200 mg/L Cefotaxime under 28, 16/8 h light/dark conditions for 5 days. We then cultured them for seedling in a 1.5 mL microfuge tube cellbox with Yoshida’s solution, under the same conditions for 4 days. After seedling, we moved them to an aquaculture container with Yoshida’s solution and cultured them under the same conditions for 8 days until the three-leaf stage. Afterward, we exposed them to drought stress conditions by removing the water in the aquaculture container. After a pilot experiment to determine the expression of DIP1 gene in the WT rice, we selected 0, 1, and 6 HATs as time points to measure omics data. The whole plants harvested at these time points were kept frozen in liquid nitrogen until DNA/RNA extraction.
Measuring gene expression levels
To quantify gene expressions, the transcriptome RNA-seq data were first clipped and mapped to the Rice IRGSP1.0 reference genome [17, 18] using Tophat  and Bowtie . Then the gene expressions (units in FPKM) were quantified using Cufflinks . Genes not exceeding 1 FPKM at any time point were removed from the analysis data set.
Delayed siRNA activation in drought resistant rice plants
siRNA clusters and their potential functional roles
From the siRNA clustering results the number of detected siRNA clusters differed among the three plants and time points. In WT, 6690, 6345 and 11128 siRNA clusters were detected genome wide in 0, 1 and 6 hat respectively. In ap2, 28429, 15715 and 19152 siRNA clusters were detected in 0, 1 and 6 hat respectively. In Vandana, 8799, 8816 and 14847 siRNA clusters were detected in 0, 1 and 6 hat respectively. Since a non-trivial number of clusters were plant and time specific, overlapping clusters were merged and collected for each plant. Furthermore, clusters residing in intergenic regions were filtered out. We found that 393, 781, and 488 genes were targeted by at least one siRNA cluster in WT, ap2 and Vandana respectively.
By performing gene set enrichment analysis of genes that are targeted by siRNAs, previously known drought responsive mechanisms, such as “Oxidation reduction-process (P-value= 3.88e −31)” , “Small molecule metabolic process (P-value= 2.75e −25)”, “Carbohydrate metabolic process (P-value= 2.83e −21)”  and “Photosynthesis (P-value= 6.96e −14)” , were found to be significantly enriched. The GO enrichment was performed using the PANTHER database (Release date: June 22 n d 2016). The full list of significantly enriched GO terms (178) are provided in the Additional file 1.
Characterizing siRNA clusters using conservation and motif analysis
Studies related to drought tolerance have shown that drought tolerant cultivars have higher constitutive level of anti-oxidative enzymes compared to drought sensitive one [29–31]. In addition, drought tolerant cultivars have higher level of proteolysis in roots as well as shoots compared to the drought sensitive ones , while during the recovery from drought, proteolytic activity diminishes . In  siRNAs are reported to be expressed in response to biotic and abiotic stress, such as high salt, temperature extremes, hypoxia, and immunity. However, there are no studies on the functional role of siRNAs in aspects of the drought tolerant mechanism. Here, we report, for the first time to the best of our knowledge, that siRNAs were induced in response to drought stress, but at a later stage in drought tolerant rice plants. Such differently expressed siRNAs have potentially regulatory roles on their target genes. Among the siRNA target genes, the gene set enrichment analysis results showed that oxidation reduction and photosynthesis related genes were differently regulated between WT and drought tolerant cultivars. While intron sequences are known to have low or no conservation, we found intronic regions with enriched overlapping siRNAs to have comparatively high conservation rates, especially in intronic regions (Fig. 8). Generally UTR regions are exptected to have higher conservation rate than introns. Compared to the 4258 intronic siRNA clusters, only 891 and 1677 siRNA clusters were detected in 5 ′ and 3 ′ UTR regions. Also shown in the results above, we found that siRNA clusters were majorly found within intronic regions. One possible explanation could be that UTR are less targeted since these regions are important in terms of biological mechanisms such as gene regulation. Another reason may be that siRNA clusters detected in the UTR region exhibit different characteristics to the intronic siRNA clusters. Due to this fact, we chose the intronic siRNA clusters for further characterization. By performing motif analysis, we found 4 motifs across 87 genes that were located within siRNA cluster regions. This was further supported by the motif analysis that showed that eight genes with all four motifs were Acyl-protein thioesterase 2, Ent-kaurene oxidase (Cytochrome P450), allergen-related, Subtilisin-like protease, TGF-beta receptor, type I/II extracellular region family protein, carboxylyase, Transcriptional factor B3 domain containing protein and NusB/RsmB/TIM44 domain containing protein.
In summary, we report that conserved siRNAs are responders indicating the drought related damages to the plant as the drought stress continues and their functional roles are related to well known drought related pathways.
This article has been published as part of BMC Systems Biology Volume 10, Supplement 4, 2016: Proceedings of the 27th International Conference on Genome Informatics: systems biology. The full contents of the supplement are available online at http://bmcsystbiol.biomedcentral.com/articles/supplements/volume-10-supplement-4.
This work was supported by the Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ01121102) Rural Development Administration, Republic of Korea, and the Collaborative Genome Program for Fostering New Post-Genome industry through the National Research Foundation of Korea (NRF) funded by the Ministry of Science ICT and Future Planning (NRF-2014M3C9A3063541), and the Agenda program (No. J010174082016 for WS Jung), Rural Development Administration, Republic of Korea.
Availability of data and materials
The datasets supporting the conclusions of this article are available in the GEO repository under the accession number GSE74465.
SK and HK designed the project. HK and SS designed and conducted the drought experiments. JK provided the AP2 overexpressed transgenic plant. IJ and HA processed the omics data and performed bioinformatics analysis. WJ performed biological interpretation of the results. IJ, HA, SK and WJ wrote the article. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Axtell MJ. Classification and comparison of small RNAs from plants. Annu Rev Plant Biol. 2013; 64:137–59.View ArticlePubMedGoogle Scholar
- Chen D, Meng Y, Yuan C, et al. Plant siRNAs from introns mediate DNA methylation of host genes. RNA. 2011; 17(6):1012–1024.View ArticlePubMedPubMed CentralGoogle Scholar
- Dunoyer P, Himber C, Voinnet O. DICER-LIKE 4 is required for RNA interference and produces the 21-nucleotide small interfering RNA component of the plant cell-to-cell silencing signal. Nat Genet. 2005; 37(12):1356–1360.View ArticlePubMedGoogle Scholar
- Lippman Z, Martienssen R. The role of RNA interference in heterochromatic silencing. Nature. 2004; 431(7006):364–70.View ArticlePubMedGoogle Scholar
- Kuang H, Padmanabhan C, Li F, et al. Identification of miniature inverted-repeat transposable elements (MITEs) and biogenesis of their siRNAs in the Solanaceae: new functional implications for MITEs. Genome Res. 2009; 19(1):42–56.View ArticlePubMedPubMed CentralGoogle Scholar
- Borsani O, Zhu J, Verslues P, et al. Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell. 2005; 123(7):1279–1291.View ArticlePubMedPubMed CentralGoogle Scholar
- Yi N, Kim YS, Jeong MH, Oh SJ, Jeong JS, Park SH, Jung H, Do Choi Y, Kim JK. Functional analysis of six drought-inducible promoters in transgenic rice plants throughout all stages of plant growth. Planta. 2010; 232(3):743–54.View ArticlePubMedGoogle Scholar
- Kozomara A, Griffiths JS. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011; 39(1):D152–7.View ArticlePubMedGoogle Scholar
- Burge SW, Daub J, Eberhardt R, et al. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2012:1005.Google Scholar
- Vazquez F, Vaucheret H, Rajagopalan R, et al. Endogenous trans-acting siRNAs regulate the accumulation of Arabidopsis mRNAs. Mol Cell. 2004; 16(1):69–79.View ArticlePubMedGoogle Scholar
- Jung I, Park JC, Kim S. piClust: a density based piRNA clustering algorithm. Comput Biol Chem. 2014; 50:60–7. doi:10.1016/j.compbiolchem.2014.01.008.View ArticlePubMedGoogle Scholar
- Ester M, Kriegel HP, Sander J, Xu X, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Kdd: 1996. p. 226–31.Google Scholar
- Kersey PJ, Allen JE, Christensen M, et al. Ensembl Genomes 2013: scaling up access to genome-wide data. Nucleic Acids Res. 2014; 42(D1):546–52.View ArticleGoogle Scholar
- Sievers F, Wilm A, Dineen D, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011; 7(1):539.View ArticlePubMedPubMed CentralGoogle Scholar
- Margulies EH, Blanchette M, Haussler D, et al. Identification and characterization of multi-species conserved sequences. Genome Res. 2003; 13(12):2507–18.View ArticlePubMedPubMed CentralGoogle Scholar
- Bailey TL, Johnson J, Grant CE, et al. The MEME suite. Nucleic Acids Res. 2015:416.Google Scholar
- Sakai H, Lee SS, Tanaka T, Numa H, Kim J, Kawahara Y, Wakimoto H, Yang C-C, Iwamoto M, Abe T, et al. Rice annotation project database (rap-db): an integrative and interactive database for rice genomics. Plant Cell Physiol. 2013; 54(2):6–6.View ArticleGoogle Scholar
- Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, et al. Improvement of the oryza sativa nipponbare reference genome using next generation sequence and optical map data. Rice. 2013; 6(1):4.View ArticlePubMedGoogle Scholar
- Trapnell C, Pachter L, Salzberg SL. Tophat: discovering splice junctions with rna-seq. Bioinformatics. 2009; 25(9):1105–1111.View ArticlePubMedPubMed CentralGoogle Scholar
- Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012; 9(4):357–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, Van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by rna-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010; 28(5):511–5.View ArticlePubMedPubMed CentralGoogle Scholar
- Suzuki N, Koussevitzky S, Mittler R, Miller G. Ros and redox signalling in the response of plants to abiotic stress. Plant Cell Environ. 2012; 35(2):259–70.View ArticlePubMedGoogle Scholar
- McDowell N, Pockman WT, Allen CD, Breshears DD, Cobb N, Kolb T, Plaut J, Sperry J, West A, Williams DG, et al. Mechanisms of plant survival and mortality during drought: why do some plants survive while others succumb to drought?. New Phytol. 2008; 178(4):719–39.View ArticlePubMedGoogle Scholar
- Chaves M, Flexas J, Pinheiro C. Photosynthesis under drought and salt stress: regulation mechanisms from whole plant to cell. Ann Bot. 2009; 103(4):551–60.View ArticlePubMedGoogle Scholar
- Vrbsky J, Akimcheva S, Watson JM, et al. siRNA-mediated methylation of Arabidopsis telomeres. PLoS Genet. 2010; 6:e1000986.View ArticlePubMedPubMed CentralGoogle Scholar
- Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. Meme suite: tools for motif discovery and searching. Nucleic Acids Res. 2009; 37:W202–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Bailey TL, Elkan C, et al. Fitting a mixture model by expectation maximization to discover motifs in bipolymers. 1994.Google Scholar
- Machanick P, Bailey TL. Meme-chip: motif analysis of large dna datasets. Bioinformatics. 2011; 27(12):1696–1697.View ArticlePubMedPubMed CentralGoogle Scholar
- Lawlor DW, Cornic G. Photosynthetic carbon assimilation and associated metabolism in relation to water deficits in higher plants. Plant Cell Environ. 2002; 25(2):275–94.View ArticlePubMedGoogle Scholar
- Sikuku PA, Netondo GW, Onyango JC, et al. Chlorophyll fluorescence, protein and chlorophyll content of three nerica rainfed rice varieties under varying irrigation regimes. ARPN JJ Agric Biol Sci. 2010; 5(2):19–25.Google Scholar
- Sharma P, Dubey RS. Drought induces oxidative stress and enhances the activities of antioxidant enzymes in growing rice seedlings. Plant Growth Regul. 2005; 46(3):209–21.View ArticleGoogle Scholar
- Pyngrope S, Bhoomika K, Dubey RS. Oxidative stress, protein carbonylation, proteolysis and antioxidative defense system as a model for depicting water deficit tolerance in indica rice seedlings. Plant Growth Regul. 2013; 69(2):149–65.View ArticleGoogle Scholar
- Simova SL, Vassileva V, Petrova T, et al. Proteolytic activity in wheat leaves during drought stress and recovery. Gen Appl Plant Physiol Spec. 2006; 91-100.Google Scholar
- Khraiwesh B, Zhu JK, Zhu J. Role of miRNAs and siRNAs in biotic and abiotic stress responses of plants. Biochim Biophys Acta (BBA)-Gene Regul Mech. 2012; 1819(2):137–48.View ArticleGoogle Scholar