Identification of transcription factor's targets using tissue-specific transcriptomic data in Arabidopsis thaliana
© Liu and Xu; licensee BioMed Central Ltd. 2010
Published: 13 September 2010
Transcription factors (TFs) regulate downstream genes in response to environmental stresses in plants. Identification of TF target genes can provide insight on molecular mechanisms of stress response systems, which can lead to practical applications such as engineering crops that thrive in challenging environments. Despite various computational techniques that have been developed for identifying TF targets, it remains a challenge to make best use of available experimental data, especially from time-series transcriptome profiling data, for improving TF target identification.
In this study, we used a novel approach that combined kinetic modelling of gene expression with a statistical meta-analysis to predict targets of 757 TFs using expression data of 14,905 genes in Arabidopsis exposed to different durations and types of abiotic stresses. Using a kinetic model for the time delay between the expression of a TF gene and its potential targets, we shifted a TF's expression profile to make an interacting pair coherent. We found that partitioning the expression data by tissue and developmental stage improved correlation between TFs and their targets. We identified consensus pairs of correlated profiles between a TF and all other genes among partitioned datasets. We applied this approach to predict novel targets of known TFs. Some of these putative targets were validated from the literature, for E2F's targets in particular, while others provide explicit genes as hypotheses for future studies.
Our method provides a general framework for TF target prediction with consideration of the time lag between initiation of a TF and activation of its targets. The framework helps make significant inferences by reducing the effects of independent noises in different experiments and by identifying recurring regulatory relationships under various biological conditions. Our TF target predictions may shed some light on common regulatory networks in abiotic stress responses.
Plants often respond and adapt to different environmental stresses, such as drought, cold and chemicals through various transcriptional regulatory systems . Identification of these regulations not only enhances our knowledge of biological processes in plants, but also helps a great deal in developing bio-engineered crops that can better sustain challenging environments . Typically, a handful of key transcription factors (TFs) control various biological pathways by regulating downstream target genes. In many cases, these target genes share functions or pathways. While basic ideas of these TFs and their target genes' general functions may be known, lack of knowing explicit target genes often limits the experimental design for validating intuitive hypotheses or developing new crop traits. A comprehensive list of putative targets of a TF could be used to provide more insight of a key TF through functional enrichment analysis or mapping these target genes into different biological pathways.
High-throughput expression profiling experiments  have generated large amounts of data that make it possible to develop computational approaches for predicting regulatory relations. Public repositories like NCBI Gene Expression Omnibus (GEO) [4, 5], SMD (Stanford Microarray Database) , TAIR , etc. contain extensive microarray data from time series, developmental stages, genetic interventions or manipulative treatments for Arabidopsis thaliana, a model organism for plants [8, 9]. These data as well as ChIP-chip data have been used to study interactions of TFs to their downstream genes [10–13]. However, mining microarray data for discovering complicated regulatory relationships is still challenging partially due to the fact that these data are often incomplete, noisy, and contain misleading outliers, all of which likely produce false positives in biological inferences.
Many computational approaches for predicting genome-wide targets of a TF are based on finding co-occurrence of TFs and their targets. These include Standard Pearson correlation technique to measure statistical significance of synchronous co-regulation of genes and order of regulation . However, correlation coefficient is a weak criterion for measuring dependence and can lead to many false positives in predicting TF targets . Another approach is Graphical Gaussian Model (GGM) based on the concept of partial correlation for learning high-dimensional dependency networks from genomic data [16, 17], which is valid when number of genes is comparable to number of samples in the microarray data [18, 19]. One way to avoid this limitation is to use GGM with regularization and moderation, which is implemented as an R package GeneNet[20, 21]. This method has been used to infer genome-scale regulatory network for A. thaliana transcriptome . Some other methods are based on probabilistic models, such as the Bayesian network  and regression tree method . Such methods cannot be directly applied to many time series expression profiling data, because the apparent time lag between initiation of a TF and activation of its targets is not accounted in these models. For example, a study suggests a clear time lag between the mRNA levels of a TF, CBF and its known targets . In part, the time lag is used to translate the mRNAs of a TF into proteins before the proteins can act on activating/repressing TF's targets. To address this issue, it is important to adjust time-series transcription profiling data for detection of TF-target relationship .
Another group of methods to identify TF-target genes are specifically designed for time-series expression profiling data, including a method based on Needleman- Wunsch algorithm  and a dynamic probabilistic model based on chemical kinetics and linear differential equations . The dynamic probabilistic model, introduced by Friedman et al. , is able to learn the kinetic parameters of TFs binding to their target promoters and the structure of gene regulation network simultaneously. However, it requires estimation of a large number of parameters, and it does not provide an explicit way of identifying TFs' targets from predicted active regulator's protein profiles. The linear differential equation model in Ref.  describes the production and degradation of all mRNAs and their corresponding proteins with equations of chemical kinetics. While it is an interesting and promising theoretical approach, it tends to be very complex and requires concentration measurements of both mRNA and protein, at least at the initial state.
Many existing studies for retrieving regulatory information use a large collection of microarray data. A potential problem in using microarray data this way is ignoring the heterogeneity in topology of regulatory network due to biological/experimental factors, which could be different tissues, developmental stages or artificial treatments.
Results and discussion
We used a kinetic model combined with statistical meta-analysis to identify TF targets and reconstructed an Arabidopsis global regulatory network using large-scale expression profiles of 14,905 genes. We then evaluated our strategy by comparative and functional analysis of predicted E2F target genes and by comparing our method with other existing methods. Finally, we analyzed the reconstructed network to infer some novel features from the network.
• Network construction
List of all tissue groups used for meta-analysis.
Number of Samples
Number of Experiments
• E2F Network evaluation
DNA recombination / DNA repair / DNA replication
Regulation of DNA replication and cell cycle
DNA synthesis and replication
DNA synthesis and replication
Transcriptional control/chromatin modification
DNA or RNA metabolism/ transferase activity
Cell cycle, Replication control, DNA synthesis
N-terminal protein myristoylation/ protein amino acidPhosphorylation
DNA replication, synthesis of RNA primer
DNA synthesis and replication, cell cycle control
Chromatin silencing / DNA methylation
DNA synthesis and replication
DNA replication initiation
Transcriptional control, chromatin modification
Regulation of circadian rhythm
Replication protein A-like
Error-prone postreplication DNA repair / replication
Chloroplast organization / DNA replication
Response to DNA damage stimulus / cell cycle
Cell division control protein
DNA replication initiation / GINS complex
DNA or RNA metabolism / other cellular processes
Cell organization / DNA or RNA metabolism
DNA or RNA metabolism / nucleobase, nucleoside, nucleotide and nucleic acid metabolic process
Negative regulation of flower development / leaf morphogenesis
helicase activity/ hydrolase activity / DNA binding
nucleic acid binding
DNA metabolic process / DNA unwinding duringReplication
GO term enrichment analysis of 178 predicted E2F-target genes.
Number of Genes
DNA metabolic process
DNA-dependent DNA replication
DNA replication initiation
Cellular macromolecule biosynthetic process
Macromolecule biosynthetic process
Cellular biopolymer biosynthetic process
Biopolymer biosynthetic process
Cellular macromolecule metabolic process
Macromolecule metabolic process
Cellular biopolymer metabolic process
Biopolymer metabolic process
Cellular biosynthetic process
Primary metabolic process
Nucleobase, nucleoside, nucleotide and nucleic acid metabolic process
Cellular metabolic process
Regulation of DNA metabolic process
DNA methylation on cytosine
Cell cycle process
Cellular response to DNA damage stimulus
Cellular protein metabolic process
Protein metabolic process
Response to DNA damage stimulus
Ribonucleoprotein complex biogenesis
Cellular component biogenesis
• Network evaluation and comparative analysis
In order to compare performance of meta-analysis with other methods for identifying TF targets, we prepared a benchmark dataset of TF-target pairs in Arabidopsis, which were obtained from the AGRIS database and AtRegNet . The benchmark set has 348 pairs in total. Some of the well-known methods to identify TF target and build regulatory network, including causal regression method, standard Pearson correlation method, and Graphical Gaussian model were used for comparative analysis. To make direct comparison of various methods, we used the exactly same microarray datasets as input to these methods and also exactly the same benchmark data. While using Pearson correlation method, Graphical Gaussian model and regression method, we did not partition the data rather we followed the procedure as previously done in the literature.
Performance comparison of various methods with Arabidopsis networks of less than 40,000 edges (the numbers before "/") and less than 70,000 edges (the numbers after "/") .
Pearson Correlation (Cutoff=0.70)
Causal Linear Regression Model
Graphical Gaussian Model
GeneNet: Static method
GeneNet: Dynamic Method
Meta-analysis (Microarray data partition)
The results show that our method with partitioning microarray data into tissue- specific datasets and then performing tissue-wide meta-analysis contains the most confirmed edges. Particularly, the network of less than 40,000 edges obtained using tissue-wide meta-analysis is 1/3 in size compared to other networks in the same category, but with more confirmed edges than any other network. The comparison clearly demonstrates that tissue-wide partition performs much better than experiment- wide or development-based partition. This indicates that the gene expression is driven more by tissue types than by development stages and experimental conditions, at least in Arabidopsis. It also shows that the tissue-wide meta-analysis could greatly improve network constructions over other methods. Interestingly, a simple method using Pearson correlation cutoff of 0.70, although not as good as meta-analysis, outperformed sophisticated methods of causal linear regression model and graphical Gaussian model. This may be because microarray data are often noisy and sophisticated methods could amplify noises to give incorrect predictions in gene regulatory relationships.
• Network feature analysis
Global regulators from medium size network having most target genes. A complete list of putative targets for each TF is available in Additional file 2.
Response to drought, heat, chitin, osmotic stress, salt, cold etc., defense response to fungus, bacterium
Response to wounding, salicylic acid, chitin, defense response to bacterium, fungus etc
Response to chitin
Response to gamma radiation, regulation of cell growth
Regulation of meristem organization
Response to chitin, regulation of transcription, DNA- dependent
Embryo sac & pollen development, karyogamy, double fertilization forming a zygote and endosperm
Response to chitin, heat, UV-B, wounding, oxidative stress, cold, photosynthesis, hyperosmatic salinity response
Response to chitin
Nucleic acid & zinc ion binding, transcription factor activity
Response to abscisic acid, drought, light, cold, chitin, salt etc
Beside network feature analysis using Cytoscape, we analyzed TFs that target significantly more genes than other TFs across different tissues as shown in Table 5. Not surprisingly, most of these TFs are annotated with response to different stimulus in Arabidopsis, such as response to chitin and external stress, given that the microarray data we used were measured in response of Arabidopsis exposed to different abiotic stresses. Some of well-connected TFs are also present in hubs as recognized by the MCODE plugin from Cytoscape and are known to work together for gene regulation. For example, Zat6, Zat10, and Zat12 in the hub of Fig. 3 are activated together in cold and osmotic stresses . WRKY33 and WRKY40 in Table 5 both function as activators of jasmonic acid-dependent defence pathways and repressors of salicylic acid signalling .
In this paper, we proposed a meta-analysis method for identifying TF targets. The novelty of the proposed method lies in combining two models that is (1) adjusting time lag between a TF and its target and (2) finding consensus regulatory interactions from different experimental sources/conditions including tissue types, developmental stages and experimental settings. Our study shows that tissue-wide partition performs much better than experiment-wide or development-based partition for predicting TF targets. The method successfully identified more known TF-target pairs in Arabidopsis than other methods.
There are some limitations of this study. Like other approaches, our method has both false positives and false negatives in predictions. Without performing a large-scale experimental validation, there is no reliable way to assess the prediction accuracies of the methods like ours. Hence, the value of our study is mainly to provide hypotheses for experimentalists to explore specific gene regulations of their interest, especially as most of the predicted TF targets with high confidence were not reported or predicted previously. Furthermore, our method may not be able to distinguish TF targets from other co-expressed non-target genes, although meta-analysis across multiple tissues reduces such a possibility. From the meta-analysis point of view, tissue-wide meta- analysis does not consider specific regulatory relations in particular tissue types. In plant, some regulations are specific to different tissue types or developmental stages. Since such relations do not exhibit significant correlation across different microarray data, meta-analysis may ignore them. Nevertheless, meta-analysis is more robust to find correlations that are consistent across different tissues. Typically, global regulations are those that are fundamental for the existence of all tissues in general.
In the context of our study, we only applied gene expression data of Arabidopsis exposed to different abiotic stresses. It is known that there are common regulatory mechanisms for abiotic stresses. For example, certain heat-shock proteins are commonly elicited in response to various stress conditions in multiple plants . Conserved regulatory mechanisms among responses to drought, salinity, and extreme temperature in Arabidopsis were identified, such as the DREB transcription factors . Characterizing common gene expression patterns under various abiotic stress conditions in plants can help elucidate these conserved regulatory mechanisms . Hence, the meta-analysis that we provided on gene expression data under different abiotic stress treatments may shed some light on common regulatory networks in abiotic stress responses. In our future studies, we will explore more into meta-analysis of microarray data by applying different statistics like meta correlation instead of chi- square statistics. Another dimension of improvement is to include inferences from other types of data such as promoter motif analysis.
Materials and methods
We used publically available microarray data of A. thaliana from NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/) and TAIR (http://www.arabidopsis.org/). The microarray gene expression data were normalized and preprocessed in the databases. We removed genes with missing expression measurement in any tissue type and averaged the replicated expression data. Consequently, we applied our method on 497 arrays in total measuring whole-genome response of Arabidopsis exposed to different durations and types of abiotic stresses. Some 14,905 genes from Arabidopsis genome including 757 TFs were chosen for the analysis as each of these genes was differentially expressed in at least one of the stress conditions. The datasets consist of 27 different microarray experiments, out of which 10 experiments are time series (see Additional file 3).
•Chemical kinetics models to identify regulator-target relationships
f (Rp= 0) is equal to zero, assuming target gene transcription should not be activated when there is no regulator protein. is the activation rate of regulator protein on the target gene. If it is replaced by parameter K act for simplicity, f (R P ) takes the following form:
f (R p ) = K act R p .(4)
The basal level target transcription rate should satisfy the following condition:
B t + f (R pbasal ) — K t T mbasal = 0 (5)
where R pbasal and T mbasal are the basal concentrations of the regulator protein and target mRNA, respectively.
Usually, what is reported in transcription profiling experiment is not the absolute concentration of mRNA, but rather a fold change compared to basal transcription level of that gene. Thus, we define relative changes of R m and T m as R m ' and T m '
R m ' = R m ⁄ R mbasal — 1 ;(6)
T m ' = T m ⁄ T mbasal — 1 .(7)
where γ = K act K tran R mbasal ⁄ T mbasel .
Given all the model parameters, the relationship between the relative mRNA levels of the regulator and its target, R m ' and T m ', is defined by Equation (8). In other words, for a target gene of a regulator, its relative mRNA level T m ' has to satisfy Equation (8), given the model parameters and the relative regulator mRNA level R m '. It is interesting to note that the regulator protein concentration, a key variable in the original model equations, is not involved explicitly in the final equation relating the relative mRNA levels of regulator and target. To predict the target of a specific regulator, we can solve Equation (8) to obtain the theoretical target behaviour curve, and then find the genes with mRNA levels similar to the theoretical curve, which will be identified as the potential targets of that regulator.
Because the target gene mRNA and the regulator protein should be at their basal levels at the onset of stress condition (t=0). It is apparent from Equations (2) and (5) that initial condition (10) should be true.
To approximate R m , a stepwise linear model can be fit as follows:
R m ' I (t) = α i + β i t t i ≤ t ≤t i + 1i = 0, ··· ,n-1 (11)
Where D i = β i γ/K p K t and Ci = [α i γ – (K p + K t )D i ]/ K p K t .
The contiguous restrictions on T m ' are stated in the following equations:
After substituting Equation (12) into Equations (9), (10), (13) and (14), A i and B i can be obtained by solving sets of linear algebra equations, and are functions of α i , ß i , γ, K t and K P .
Learning model parameters and transforming the time series profiles of transcription factors
For each regulator-target pair, there are three parameters involved in Equation (8), the target mRNA turnover rate K t , the active regulator turnover rate K P , and γ, which is equal to K act K trαn R mbαsl / T mbαsal . K αct represents the strength of regulator protein effect on the target gene; K tram is the translation rate of regulator mRNA. They lump together with the ratio of basal mRNA concentrations of regulator and target to form parameter γ, which determines the magnitude of the relative target mRNA level but not its shape. It is the parameters K t and K P that determine the shape of the relative target mRNA level, such as how fast the target gene responds to the regulator. For gene expression experiments under stress conditions in plants, the kinetics model can be trained with known regulator-target pair reported in the literature (e.g., CBF and RD17 in Arabidopsis under cold stress) with a non-linear regression model . When the normalized expression profile of a target gene with its maximal response is considered, there is no need to keep γ as a free model parameter (γ1 = nγ2 leads to Tm 1'= nTm 2' when other parameters are kept the same in Equations (8), (9) and (10)). Therefore, only two parameters K t and K P are estimated from the non-linear regression model, and are used to predict other regulators and their targets in plant stress response.
The theoretical TF-target mRNA expression profiles are calculated for all the genes annotated as TFs and are substituted in place of TFs' profiles during further computation for co-expression calculation. The theoretical target profile of any TF in terms of relative expression levels among different time points is independent of actual targets of that TF as it is solely calculated based on the kinetic model. According to the model, the theoretical target profile of a TF should match the profile of its actual targets in the trend of expression although not in the absolute abundance. With this assumption, we can use Pearson correlation coefficient to find similarity of co-expression between the theoretical/shifted profile of a TF and rest of the genes to find potential targets of this TF.
• Co-expression statistics
where is a chi-square random variable with 2n degrees of freedom. We calculate significance level of the gene pair from multiple datasets. The significance level of gene pair represents the count of datasets in which that gene pair has significant correlation (p-value < 0.01) based on Equation (15). We used meta p-value statistics (Equation (17)) combined with significance level to rank potential targets for a TF .
• Regulatory network reconstruction
The meta p-value combined with significance level and the Pearson correlation coefficient were used as co-expression statistics for finding putative targets for a TF. For a single dataset (without partitioning of microarray data), we ranked all the potential targets of a TF based on Pearson correlation coefficient and select targets such that TF-target correlation > 0.75 (medium size network) or 0.70 (large size network). For multiple datasets, we ranked all TF-target pairs based on the number of individual p-values that are smaller than 0.01 across multiple datasets; for pairs that have the same number of significant p-values, they were ranked by the corresponding meta chi-square statistics defined in Equation (16). Here we used meta chi-square instead of meta p-value since the meta p-value for many gene pairs are very close to zero and hard to distinguish computationally; both meta chi-square and meta p-value should result in the same order when the degrees of freedom for each gene pair is same. In the end, a fixed number of TF-target pairs were selected based on ranking.
In case of meta-analysis, number of target genes for a TF was determined in two methods, i.e., (1) selecting fixed number of targets from top (50 or 75) or (2) choosing targets form top-ranked genes that shows significance correlation as TF-target pair in at least certain number of microarray datasets used for meta-analysis. For example, we used significance cutoff 9 (out of 9 datasets) for small network and cutoff 8 (out of 9) for medium network and cutoff 7 (out of 9) for large network. The second method worked better in general.
We would like to thank Jing Qiu from Department of Statistics, and Trupti Joshi and Jingfen Zhang from Christopher S. Bond Life Sciences Center, University of Missouri for helpful discussions and technical assistance. Part of this work was supported by Monsanto systems biology internship, Shumaker fellowship from Computer Science Department, University of Missouri, and a research fund from United Soybean Board.
This article has been published as part of BMC Systems Biology Volume 4 Supplement 2, 2010: Selected articles from the Third International Symposium on Optimization and Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1752-0509/4?issue=S2
- Shinozaki K, Yamaguchi-Shinozaki K, Seki M: Regulatory network of gene expression in the drought and cold stress responses. Curr Opin Plant Biol. 2003, 6 (5): 410-417. 10.1016/S1369-5266(03)00092-XPubMedView ArticleGoogle Scholar
- Kasuga M, Liu Q, Miura S, Yamaguchi-Shinozaki K, Shinozaki K: Improving plant drought, salt, and freezing tolerance by gene transfer of a single stress-inducible transcription factor. Nat Biotechnol. 1999, 17 (3): 287-291. 10.1038/7036PubMedView ArticleGoogle Scholar
- Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995, 270 (5235): 467-470. 10.1126/science.270.5235.467PubMedView ArticleGoogle Scholar
- Barrett T, Suzek TO, Troup DB, Wilhite SE, Ngau WC, Ledoux P, Rudnev D, Lash AE, Fujibuchi W, Edgar R: NCBI GEO: mining millions of expression profiles--database and tools. Nucleic Acids Res. 2005, 33 (Database issue): D562-566. 10.1093/nar/gki022PubMedPubMed CentralView ArticleGoogle Scholar
- Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucleic Acids Res. 2007, 35 (Database issue): D760-765. 10.1093/nar/gkl887PubMedPubMed CentralView ArticleGoogle Scholar
- Marinelli RJ, Montgomery K, Liu CL, Shah NH, Prapong W, Nitzberg M, Zachariah ZK, Sherlock GJ, Natkunam Y, West RB, et al.: The Stanford Tissue Microarray Database. Nucleic Acids Res. 2008, 36 (Database issue): D871-877.PubMedPubMed CentralGoogle Scholar
- Poole RL: The TAIR database. MethodsMolBiol. 2007, 406: 179-212.Google Scholar
- Kilian J, Whitehead D, Horak J, Wanke D, Weinl S, Batistic O, D'Angelo C, Bornberg-Bauer E, Kudla J, Harter K: The AtGenExpress global stress expression data set: protocols, evaluation and model data analysis of UV- B light, drought and cold stress responses. Plant J. 2007, 50 (2): 347-363. 10.1111/j.1365-313X.2007.03052.xPubMedView ArticleGoogle Scholar
- Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU: A gene expression map of Arabidopsis thaliana development. Nat Genet. 2005, 37 (5): 501-506. 10.1038/ng1543PubMedView ArticleGoogle Scholar
- Buck MJ, Lieb JD: ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics. 2004, 83 (3): 349-360. 10.1016/j.ygeno.2003.11.004PubMedView ArticleGoogle Scholar
- de la Fuente A, Brazhnik P, Mendes P: Linking the genes: inferring quantitative gene networks from microarray data. Trends Genet. 2002, 18 (8): 395-398. 10.1016/S0168-9525(02)02692-6PubMedView ArticleGoogle Scholar
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al.: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002, 298 (5594): 799-804. 10.1126/science.1075090PubMedView ArticleGoogle Scholar
- Yugi K, Nakayama Y, Kojima S, Kitayama T, Tomita M: A microarray data-based semi-kinetic method for predicting quantitative dynamics of genetic networks. BMC Bioinformatics. 2005, 6: 299- 10.1186/1471-2105-6-299PubMedPubMed CentralView ArticleGoogle Scholar
- Markowetz F, Spang R: Inferring cellular networks--a review. BMC Bioinformatics. 2007, 8 (Suppl 6): S5- 10.1186/1471-2105-8-S6-S5PubMedPubMed CentralView ArticleGoogle Scholar
- Brazhnik P, de la Fuente A, Mendes P: Gene networks: how to put the function in genomics. Trends Biotechnol. 2002, 20 (11): 467-472. 10.1016/S0167-7799(02)02053-XPubMedView ArticleGoogle Scholar
- Toh H, Horimoto K: Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modeling. Bioinformatics. 2002, 18 (2): 287-297. 10.1093/bioinformatics/18.2.287PubMedView ArticleGoogle Scholar
- Wichert S, Fokianos K, Strimmer K: Identifying periodically expressed transcripts in microarray time series data. Bioinformatics. 2004, 20 (1): 5-20. 10.1093/bioinformatics/btg364PubMedView ArticleGoogle Scholar
- Schafer J, Strimmer K: An empirical Bayes approach to inferring large- scale gene association networks. Bioinformatics. 2005, 21 (6): 754-764. 10.1093/bioinformatics/bti062PubMedView ArticleGoogle Scholar
- Wille A, Buhlmann P: Low-order conditional independence graphs for inferring genetic networks. Stat Appl Genet Mol Biol. 2006, 5: Article1-PubMedGoogle Scholar
- Schafer J, Strimmer K: A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat Appl Genet Mol Biol. 2005, 4: Article32-PubMedGoogle Scholar
- Opgen-Rhein R, Strimmer K: From correlation to causation networks: a simple approximate learning algorithm and its application to high- dimensional plant gene expression data. BMC Syst Biol. 2007, 1: 37- 10.1186/1752-0509-1-37PubMedPubMed CentralView ArticleGoogle Scholar
- Ma S, Gong Q, Bohnert HJ: An Arabidopsis gene network based on the graphical Gaussian model. Genome Res. 2007, 17 (11): 1614-1625. 10.1101/gr.6911207PubMedPubMed CentralView ArticleGoogle Scholar
- Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data. J Comput Biol. 2000, 7 (3-4): 601-620. 10.1089/106652700750050961PubMedView ArticleGoogle Scholar
- Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition- specific regulators from gene expression data. Nat Genet. 2003, 34 (2): 166-176. 10.1038/ng1165PubMedView ArticleGoogle Scholar
- Seki M, Ishida J, Narusaka M, Fujita M, Nanjo T, Umezawa T, Kamiya A, Nakajima M, Enju A, Sakurai T, et al.: Monitoring the expression pattern of around 7, 000 Arabidopsis genes under ABA treatments using a full- length cDNA microarray. Funct Integr Genomics. 2002, 2 (6): 282-291. 10.1007/s10142-002-0070-6PubMedView ArticleGoogle Scholar
- Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycle- regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 3273-3297.PubMedPubMed CentralView ArticleGoogle Scholar
- Filkov V. SS, Zhi J: Identifying gene regulatory networks from experimental data. In Proceedings of RECOMB. 2001Google Scholar
- Chen T, He HL, Church GM: Modeling gene expression with differential equations. Pac Symp Biocomput. 1999, 29-40.Google Scholar
- Bracken AP, Ciro M, Cocito A, Helin K: E2F target genes: unraveling the biology. Trends Biochem Sci. 2004, 29 (8): 409-417. 10.1016/j.tibs.2004.06.006PubMedView ArticleGoogle Scholar
- Ramirez-Parra E, Frundt C, Gutierrez C: A genome-wide identification of E2F-regulated genes in Arabidopsis. Plant J. 2003, 33 (4): 801-811. 10.1046/j.1365-313X.2003.01662.xPubMedView ArticleGoogle Scholar
- Vandepoele K, Vlieghe K, Florquin K, Hennig L, Beemster GT, Gruissem W, Van de Peer Y, Inze D, De Veylder L: Genome-wide identification of potential plant E2F target genes. Plant Physiol. 2005, 139 (1): 316-328. 10.1104/pp.105.066290PubMedPubMed CentralView ArticleGoogle Scholar
- Boyle EI, Weng S, Gollub J, Jin H, Botstein D, Cherry JM, Sherlock G: GO: TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004, 20 (18): 3710-3715. 10.1093/bioinformatics/bth456PubMedPubMed CentralView ArticleGoogle Scholar
- Reiser L, Rhee SY: Using the Arabidopsis Information Resource (TAIR) to find information about Arabidopsis genes. Curr Protoc Bioinformatics. 2005, Chapter 1:Unit 1 11Google Scholar
- Palaniswamy SK, James S, Sun H, Lamb RS, Davuluri RV, Grotewold E: AGRIS and AtRegNet. a platform to link cis-regulatory elements and transcription factors into regulatory networks. Plant Physiol. 2006, 140 (3): 818-829. 10.1104/pp.105.072280PubMedPubMed CentralView ArticleGoogle Scholar
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11): 2498-2504. 10.1101/gr.1239303PubMedPubMed CentralView ArticleGoogle Scholar
- Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2- 10.1186/1471-2105-4-2PubMedPubMed CentralView ArticleGoogle Scholar
- Mittler R, Kim Y, Song L, Coutu J, Coutu A, Ciftci-Yilmaz S, Lee H, Stevenson B, Zhu JK: Gain- and loss-of-function mutations in Zat10 enhance the tolerance of plants to abiotic stress. FEBS Lett. 2006, 580 (28-29): 6537-6542. 10.1016/j.febslet.2006.11.002PubMedPubMed CentralView ArticleGoogle Scholar
- Xu X, Chen C, Fan B, Chen Z: Physical and functional interactions between pathogen-induced Arabidopsis WRKY18, WRKY40, and WRKY60 transcription factors. Plant Cell. 2006, 18 (5): 1310-1326. 10.1105/tpc.105.037523PubMedPubMed CentralView ArticleGoogle Scholar
- Sørensen JG, Kristensen TN, Loeschcke V: The evolutionary and ecological role of heat shock proteins. Edited by: Letters E. 2003, 6: 1025-1037. (1013), Blackwell PublishingGoogle Scholar
- Mauch-Mani B, Mauch F: The role of abscisic acid in plant-pathogen interactions. Curr Opin Plant Biol. 2005, 8 (4): 409-414. 10.1016/j.pbi.2005.05.015PubMedView ArticleGoogle Scholar
- Swindell WR: The association among gene expression responses to nine abiotic stress treatments in Arabidopsis thaliana. Genetics. 2006, 174 (4): 1811-1824. 10.1534/genetics.106.061374PubMedPubMed CentralView ArticleGoogle Scholar
- Yang X, Li P, Zhou Y, Liu J: Discovering Transcriptional Regulation Networks in Stressed Plants. Proceedings of the 2nd International Rice Congress. 2006, 753-763.Google Scholar
- Srivastava GP, Qiu J, Xu D: Genome-wide functional annotation by integrating multiple microarray datasets using meta-analysis. Int J Data Min Bioinform. 2009Google Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.