Volume 9 Supplement 2
Bicluster Sampled Coherence Metric (BSCM) provides an accurate environmental context for phenotype predictions
© Danziger et al.; licensee BioMed Central Ltd. 2015
Published: 15 April 2015
Biclustering is a popular method for identifying under which experimental conditions biological signatures are co-expressed. However, the general biclustering problem is NP-hard, offering room to focus algorithms on specific biological tasks. We hypothesize that conditional co-regulation of genes is a key factor in determining cell phenotype and that accurately segregating conditions in biclusters will improve such predictions. Thus, we developed a bicluster sampled coherence metric (BSCM) for determining which conditions and signals should be included in a bicluster.
Our BSCM calculates condition and cluster size specific p-values, and we incorporated these into the popular integrated biclustering algorithm cMonkey. We demonstrate that incorporation of our new algorithm significantly improves bicluster co-regulation scores (p-value = 0.009) and GO annotation scores (p-value = 0.004). Additionally, we used a bicluster based signal to predict whether a given experimental condition will result in yeast peroxisome induction. Using the new algorithm, the classifier accuracy improves from 41.9% to 76.1% correct.
We demonstrate that the proposed BSCM helps determine which signals ought to be co-clustered, resulting in more accurately assigned bicluster membership. Furthermore, we show that BSCM can be extended to more accurately detect under which experimental conditions the genes are co-clustered. Features derived from this more accurate analysis of conditional regulation result in a dramatic improvement in the ability to predict a cellular phenotype in yeast. The latest cMonkey is available for download at https://github.com/baliga-lab/cmonkey2. The experimental data and source code featured in this paper are available at http://AitchisonLab.com/BSCM. BSCM has been incorporated in the official cMonkey release.
Biclustering is a technique for examining mRNA expression data and discovering genes that are conditionally co-regulated, i.e., genes that have common expression patterns under certain conditions, but not under others. Thus biclustering is a valuable tool for analysing large gene expression datasets, particularly when those data have been generated under multiple experimental conditions. As mRNA expression data have become ever more plentiful, many diverse public datasets have become available. While it remains difficult to make the most biological sense of this largess, biclustering has been successfully used to mine it for novel biological relationships, to correlate environmental conditions with expression patterns, and to predict gene expression under new conditions not in the original datasets.
cMonkey is a particularly powerful biclustering tool that finds putatively co-regulated genes by combining mRNA expression levels (or similar measurements), de novo detected TF binding motifs, and networks of known gene associations. It was originally developed to reconstruct regulatory networks for Halobacterium salinarum. Since then, cMonkey has been continuously developed and has been applied to discover novel biology in other organisms such as humans and Saccharomyces cerevisiae (S. cerevisiae), revealing novel challenges. One challenge is building biclusters on consortium datasets containing expression data generated in multiple labs using different mRNA measurement technologies and yeast grown under drastically different conditions. These compendium experiments potentially have different noise levels and can be difficult to compare.
While cMonkey is an effective tool for these circumstances, we found that the mRNA expression evaluation model used by existing versions of cMonkey does not handle such situations as well as it could. It quantifies bicluster coherence by comparing the measured distribution for each gene in a bicluster to an idealized normal distribution, which is based upon the mean expression of the other genes in the bicluster, and the expected variance for each experiment with a uniform systematic error constant. This uniform variance assumption is often inaccurate for expression compendia, because multiple measurement technologies applied in multiple labs will almost certainly have different errors associated with them.
Biclustering of gene expression measurements continues to be an active area of research, and there has been significant progress in improving gene expression biclustering; however, very little of it has focused on combining multiple datasets from disparate sources, such as are available from GEO (the Gene Expression Omnibus) [7, 8]. Classical gene expression biclustering methods, based upon co-expression heuristics such as the Cheng and Church mean-squared-residue, have achieved impressive methodological diversity and results. However, the original cMonkey implementation instead used a probabilistic model that enabled a more rigorous integration of co-expression with bicluster evidence based on non-gene expression data types. Other methods have focused on biclustering in the context of specific biological problems. Reference gene biclustering finds biclusters that match the expression pattern for a single reference gene. Differential co-expression biclustering finds biclusters that are differentially co-expressed between two conditions. Time series biclustering finds genes that follow common temporal co-expression patterns as revealed in time series data. However, none of these methods is well suited to analyse variable compendium data and discover globally relevant biclusters. Reference gene biclustering will only find biclusters relevant for a single reference gene; differential co-expression biclustering requires exactly two well annotated datasets; and time series biclustering requires time series data. As variable compendium data can contextualize behaviour and reveal novel biology that a single condition specific dataset cannot, it is important to develop a metric appropriate for analysing these diverse data sets.
Results & discussion
In cMonkey, the coherence p-value for a gene i in cluster k is referred to as r_ik. Mathematically, cMonkey improves the coherence of its biclusters by minimizing r_ik for all genes in each cluster (subject to other constraints). BSCM changes how r_ik is calculated. By thus improving the co-expression p-value function with BSCM, we were able to improve the overall quality of the biclusters. We assess this improvement using three metrics: 1) we use cMonkey's internal scoring, which calculates overall cluster quality using the non-BSCM r_ik, and test on M. pneumoniae; 2) we use a GO term enrichment score and test on S. cerevisiae; and 3) we use the experiments included in clusters to build a classifier that predicts peroxisome proliferation in S. cerevisiae.
Bicluster Sampled Coherence Metric (BSCM) improves M. pneumoniae model
BiCluster Quality Score on M. pneumoniae (MPN)
Bicluster Sampled Coherence Metric (BSCM) improves S. cerevisiae model
We further tested BSCM using an S. cerevisiae dataset consisting of 26 public sets resulting in 1455 experiments [8–10, 19–41] (Additional File 1). S. cerevisiae has over 6,000 genes compared to 688 for M. pneumoniae, so it was impractical to run cMonkey 125 times for the entire S. cerevisiae dataset. However, because S. cerevisiae is much better annotated, it was possible to use a GO annotation enrichment based scoring metric (Equation 5) that was independent of cMonkey's scoring function. We identified 29 random experiment subsets with 50-1445 microarrays each, eliminated genes without large expression changes, and then ran cMonkey with both the BSCM and non-BSCM based p-values. We applied this scoring metric and found improvement in 21 of 29 experiments using the new BSCM p-value (Figure 2, binomial p-value = 0.004).
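The arithmetic behind a result such as 21 improvements out of 29 can be checked with a short calculation. The sketch below assumes a one-sided binomial (sign) test against a null success probability of 0.5, which the text does not spell out; the function name is illustrative only.

```python
from math import comb

def binomial_upper_tail(successes, trials, p=0.5):
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 21 of 29 subsets improved; compare the inclusive and strict upper tails.
at_least_21 = binomial_upper_tail(21, 29)   # P(X >= 21)
more_than_21 = binomial_upper_tail(22, 29)  # P(X > 21)
print(f"P(X >= 21) = {at_least_21:.4f}")
print(f"P(X >  21) = {more_than_21:.4f}")
```

Under these assumptions the strict tail P(X > 21) rounds to 0.004, which appears consistent with the reported value, while the inclusive tail P(X ≥ 21) is about 0.012.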
New BSCM allows more accurate bicluster inclusion
The primary advantage of biclustering over standard clustering is that biclusters include the notion of conditional inclusion. That is to say that the genes in the bicluster are conditionally co-expressed under certain experimental conditions, but not under others. The original cMonkey implementation assumed (via a prior probability) that approximately half of all experiments should be included in a cluster, and half should be excluded. However, as shown in the left panel of Figure 3, this did not work well in clusters where the genes are co-regulated under all conditions (such as was the case for ribosomal biclusters), or in clusters where the genes are co-regulated under a very small subset of conditions. By contrast, the new BSCM provided a natural cutoff for re-splitting biclusters. As shown in Equation 3, r_ik estimates the p-value for each experiment j, given a cluster k. Those experiments where r_ik ≤ 0.05 are included in the cluster; all others are excluded.
These new splits were more visually satisfying (Figure 3, right panel); however, we wanted to determine whether the re-split clusters were also more biologically relevant. To test this, we built a classifier to predict whether yeast would proliferate peroxisomes under given conditions, based on whether experiments performed under those conditions were included in or excluded from biclusters. We assembled a dataset of relevant conditions (see Methods), extracted the features, and tried four common machine learning algorithms (Figure 4). The classifier performed similarly well regardless of the machine learning algorithm, but the patterns were most obvious when using a naïve Bayes classifier. Using this classifier, overall peroxisome proliferation prediction accuracy improved from 41.9% to 76.1% correct when using the BSCM bicluster inclusion rather than the previous method. The classifier accuracy was nearly perfect (>95%) for four of the seven conditions, and was poor only for predictions of glucose. This probably reflects a biological reality: the glucose response pathway is included in the galactose response, but not vice versa. Thus, the information necessary for understanding the galactose response is present when glucose is in the training set. However, when only galactose is present in the training set, a key piece of information needed to inform the classifier is missing.
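The inclusion rule described above amounts to a simple threshold on per-experiment p-values. A minimal sketch, assuming the p-values for one bicluster are already available as a mapping from experiment name to p-value (the function name, variable names, and example values are hypothetical and not taken from cMonkey):

```python
def split_experiments(p_values, alpha=0.05):
    """Partition experiments into included (p <= alpha) and excluded (p > alpha)."""
    included = sorted(j for j, p in p_values.items() if p <= alpha)
    excluded = sorted(j for j, p in p_values.items() if p > alpha)
    return included, excluded

# Hypothetical p-values for four experiments in one bicluster.
example = {"exp_a": 0.001, "exp_b": 0.20, "exp_c": 0.05, "exp_d": 0.73}
included, excluded = split_experiments(example)
print("included:", included)
print("excluded:", excluded)
```

Because no prior on the included fraction is imposed, a bicluster under this rule can retain nearly all experiments or only a handful.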
mRNA expression data are becoming ever more plentiful as microarrays become more commonplace or are replaced by multiplexed RNA-seq technology. The improved Bicluster Sampled Coherence Metric (BSCM) provides a better way to simplify and interpret large amounts of expression data that come from multiple sources. Beyond directly improving biclusters, this algorithm is useful for drawing additional information out of each bicluster and using it to train a classifier. We anticipate that this method will become particularly relevant for the broad bioinformatics community interested in humans, where each cell type may be regarded in the same manner as yeast or bacteria in different environmental conditions. This opens the potential to classify cell types based on mRNA signatures, and to reveal conditions or perturbations that induce a specific cellular response.
Let I represent the set of all genes, J all experiments, and K all biclusters. A bicluster k ∈ K contains genes I_k, where each gene is i ∈ I_k ⊆ I, and includes experiments J_k such that J_k ⊆ J.
Bicluster Sampled Coherence Metric (BSCM) method
The background distribution for bicluster k is summarized by two quantities: the mean variance for the number of genes in bicluster k, as determined by bootstrap sampling, and the standard deviation of the values used to calculate that mean. The background distribution is calculated for each condition and for each number of genes that occurs in a given bicluster k by sampling |k| genes 200 times from experimental condition j and drawing additional samples in sets of 200 until the mean and the standard deviation each change by less than 1%. To determine which genes should be added to or removed from a cluster, we calculate a new r_ik supposing gene i were added or removed. As a practical matter, background distributions are pre-calculated for all cluster sizes less than or equal to the maximum size represented in the initial seed clusters, and additional background distributions are calculated as needed during program execution.
Cluster scoring based on GO terms
where is the enrichment p-value for term g in cluster k.
We tested whether r_ik could be used with a p-value cutoff of 0.05 to predict if experimental conditions would result in peroxisome proliferation ("YES") or not ("NO"). We built 544 yeast biclusters using 233 experiments in seven different experimental conditions with known peroxisome proliferation: thirty glucose ("NO"), twenty early oleate ("YES"), twenty-one late oleate ("YES"), seventy-five galactose ("NO"), eighteen lactate ("YES"), five rho- ("YES"), and sixty-four antimycin ("YES") experiments [8, 9, 13, 17]. For every bicluster, each of the 233 experiments was assigned a value indicating whether genes are "UP"- or "DOWN"-regulated if the experiment is included in a given bicluster, or "EXCLUDED" otherwise. Many experiments were replicates, so standard n-fold cross-validation was inappropriate. Therefore, each of the seven growth conditions was treated as a splitting boundary. Thus when the classifier predicted proliferation in antimycin, antimycin was absent from the training set. During each split we downsampled, thus providing stochasticity. Predictions were made using decision trees, logistic regression, support vector machines (SVMs), and naïve Bayes [42, 43]. (See supplemental code and data for implementation.)
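The condition-wise splitting can be sketched as a leave-one-condition-out loop. This is an illustration under stated assumptions, not the supplemental code: the text does not say what was downsampled, so the sketch assumes the training set is downsampled to equal numbers of "YES" and "NO" experiments, and all function names are hypothetical. The experiment counts and labels are those listed above.

```python
import random
from collections import defaultdict

def leave_one_condition_out(conditions):
    """Yield (held_out, train_idx, test_idx), holding out each condition once."""
    for held_out in sorted(set(conditions)):
        train = [i for i, c in enumerate(conditions) if c != held_out]
        test = [i for i, c in enumerate(conditions) if c == held_out]
        yield held_out, train, test

def downsample_balanced(indices, labels, rng):
    """Randomly keep an equal number of training indices per class label."""
    by_label = defaultdict(list)
    for i in indices:
        by_label[labels[i]].append(i)
    n = min(len(members) for members in by_label.values())
    picked = []
    for members in by_label.values():
        picked.extend(rng.sample(members, n))
    return sorted(picked)

# Experiment counts and proliferation labels as listed in the text.
design = [("glucose", 30, "NO"), ("early oleate", 20, "YES"),
          ("late oleate", 21, "YES"), ("galactose", 75, "NO"),
          ("lactate", 18, "YES"), ("rho-", 5, "YES"),
          ("antimycin", 64, "YES")]
conditions = [name for name, count, _ in design for _ in range(count)]
labels = [label for _, count, label in design for _ in range(count)]

rng = random.Random(0)
splits = {}
for held_out, train, test in leave_one_condition_out(conditions):
    splits[held_out] = (downsample_balanced(train, labels, rng), test)
    print(held_out, "train:", len(splits[held_out][0]), "test:", len(test))
```

Holding out a whole condition keeps replicate experiments from appearing on both sides of a split, which is the concern raised about standard n-fold cross-validation.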
This file contains code and data necessary to run the experiments presented in this paper. Available at http://AitchisonLab.com/BSCM/TestData.BSCM.tar.gz (156 MB)
Publication of this article has been funded by grants from the National Institutes of Health (P41GM109824, P50 GM076547, R01 GM075152, 1R01 GM077398, and U54GM103511) and the National Science Foundation (DBI-0640950).
This article has been published as part of BMC Systems Biology Volume 9 Supplement 2, 2015: Selected articles from the IX International Conference on the Bioinformatics of Genome Regulation and Structure\Systems Biology (BGRS\SB-2014): Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/9/S2.
- Tanay A, Sharan R, Shamir R: Biclustering Algorithms: A Survey. Handb Comput Mol Biol Ed Aluru Chapman Hall/CRC Comput Inf Sci Ser. 2005.
- Danziger SA, Ratushny AV, Smith JJ, Saleem RA, Wan Y, Arens CE, Armstrong AM, Sitko K, Chen W-M, Chiang JH, Reiss DJ, Baliga NS, Aitchison JD: Molecular mechanisms of system responses to novel stimuli are predictable from public data. Nucleic Acids Res. 2014, 42: 1442-1460. 10.1093/nar/gkt938.
- Reiss D, Baliga N, Bonneau R: Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks. BMC Bioinformatics. 2006, 7: 280. 10.1186/1471-2105-7-280.
- Bonneau R, Facciotti MT, Reiss DJ, Schmid AK, Pan M, Kaur A, Thorsson V, Shannon P, Johnson MH, Bare JC, Longabaugh W, Vuthoori M, Whitehead K, Madar A, Suzuki L, Mori T, Chang D-E, DiRuggiero J, Johnson CH, Hood L, Baliga NS: A Predictive Model for Transcriptional Control of Physiology in a Free Living Cell. Cell. 2007, 131: 1354-1365. 10.1016/j.cell.2007.10.053.
- Wang YK, Print CG, Crampin EJ: Biclustering reveals breast cancer tumour subgroups with common clinical features and improves prediction of disease recurrence. BMC Genomics. 2013, 14: 102. 10.1186/1471-2164-14-102.
- Güell M, Noort VV, Yus E, Chen W-H, Leigh-Bell J, Michalodimitrakis K, Yamada T, Arumugam M, Doerks T, Kühner S, Rode M, Suyama M, Schmidt S, Gavin AC, Bork P, Serrano L: Transcriptome Complexity in a Genome-Reduced Bacterium. Science. 2009, 326: 1268-1271. 10.1126/science.1176951.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
- Lai LC, Kissinger MT, Burke PV, Kwast KE: Comparison of the transcriptomic "stress response" evoked by antimycin A and oxygen deprivation in Saccharomyces cerevisiae. BMC Genomics. 2008, 9: 627. 10.1186/1471-2164-9-627.
- Veatch JR, McMurray MA, Nelson ZW, Gottschling DE: Mitochondrial dysfunction leads to nuclear genome instability via an iron-sulfur cluster defect. Cell. 2009, 137: 1247-1258. 10.1016/j.cell.2009.04.014.
- Lai LC, Kosorukoff AL, Burke PV, Kwast KE: Metabolic-state-dependent remodeling of the transcriptome in response to anoxia and subsequent reoxygenation in Saccharomyces cerevisiae. Eukaryot Cell. 2006, 5: 1468-1489. 10.1128/EC.00107-06.
- Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO: Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 2000, 11: 4241-4257. 10.1091/mbc.11.12.4241.
- Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet. 2003, 34: 166-176. 10.1038/ng1165.
- Abbott DA, Suir E, van Maris AJA, Pronk JT: Physiological and Transcriptional Responses to High Concentrations of Lactic Acid in Anaerobic Chemostat Cultures of Saccharomyces cerevisiae. Appl Environ Microbiol. 2008, 74: 5759-5768. 10.1128/AEM.01030-08.
- Koerkamp MG, Rep M, Bussemaker HJ, Hardy GPMA, Mul A, Piekarska K, Szigyarto CAK, de Mattos JMT, Tabak HF: Dissection of Transient Oxidative Stress Response in Saccharomyces cerevisiae by Using DNA Microarrays. Mol Biol Cell. 2002, 13: 2783-2794. 10.1091/mbc.E02-02-0075.
- Smith JJ, Marelli M, Christmas RH, Vizeacoumar FJ, Dilworth DJ, Ideker T, Galitski T, Dimitrov K, Rachubinski RA, Aitchison JD: Transcriptome profiling to identify genes involved in peroxisome assembly and function. J Cell Biol. 2002, 158: 259-271. 10.1083/jcb.200204059.
- Parish RW: The isolation and characterization of peroxisomes (microbodies) from baker's yeast, Saccharomyces cerevisiae. Arch Microbiol. 1975, 105: 187-192. 10.1007/BF00447136.
- Epstein CB, Waddle JA, Hale W, Dave V, Thornton J, Macatee TL, Garner HR, Butow RA: Genome-wide Responses to Mitochondrial Dysfunction. Mol Biol Cell. 2001, 12: 297-308. 10.1091/mbc.12.2.297.
- Veenhuis M, Mateblowski M, Kunau WH, Harder W: Proliferation of microbodies in Saccharomyces cerevisiae. Yeast. 2004, 3: 77-84.
- Abbott DA, Knijnenburg TA, de Poorter LMI, Reinders MJT, Pronk JT, van Maris AJA: Generic and specific transcriptional responses to different weak organic acids in anaerobic chemostat cultures of Saccharomyces cerevisiae. FEMS Yeast Res. 2007, 7: 819-833. 10.1111/j.1567-1364.2007.00242.x.
- Angell S, Bench BJ, Williams H, Watanabe CMH: Pyocyanin isolated from a marine microbial population: synergistic production between two distinct bacterial species and mode of action. Chem Biol. 2006, 13: 1349-1359. 10.1016/j.chembiol.2006.10.012.
- Boer VM, de Winde JH, Pronk JT, Piper MDW: The genome-wide transcriptional responses of Saccharomyces cerevisiae grown on glucose in aerobic chemostat cultures limited for carbon, nitrogen, phosphorus, or sulfur. J Biol Chem. 2003, 278: 3265-3274. 10.1074/jbc.M209759200.
- Caba E, Dickinson DA, Warnes GR, Aubrecht J: Differentiating mechanisms of toxicity using global gene expression analysis in Saccharomyces cerevisiae. Mutat Res. 2005, 575: 34-46. 10.1016/j.mrfmmm.2005.02.005.
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30: 207-210. 10.1093/nar/30.1.207.
- Guan Q, Zheng W, Tang S, Liu X, Zinkel RA, Tsui K-W, Yandell BS, Culbertson MR: Impact of nonsense-mediated mRNA decay on the global expression profile of budding yeast. PLoS Genet. 2006, 2: e203. 10.1371/journal.pgen.0020203.
- Joseph-Strauss D, Zenvirth D, Simchen G, Barkai N: Spore germination in Saccharomyces cerevisiae: global gene expression patterns and cell cycle landmarks. Genome Biol. 2007, 8: R241. 10.1186/gb-2007-8-11-r241.
- Knijnenburg TA, de Winde JH, Daran J-M, Daran-Lapujade P, Pronk JT, Reinders MJT, Wessels LFA: Exploiting combinatorial cultivation conditions to infer transcriptional regulation. BMC Genomics. 2007, 8: 25. 10.1186/1471-2164-8-25.
- Komili S, Farny NG, Roth FP, Silver PA: Functional specificity among ribosomal proteins regulates gene expression. Cell. 2007, 131: 557-571. 10.1016/j.cell.2007.08.037.
- Kuranda K, Leberre V, Sokol S, Palamarczyk G, François J: Investigating the caffeine effects in the yeast Saccharomyces cerevisiae brings new insights into the connection between TOR, PKC and Ras/cAMP signalling pathways. Mol Microbiol. 2006, 61: 1147-1166. 10.1111/j.1365-2958.2006.05300.x.
- Marks VD, Ho Sui SJ, Erasmus D, van der Merwe GK, Brumm J, Wasserman WW, Bryan J, van Vuuren HJJ: Dynamics of the yeast transcriptome during wine fermentation reveals a novel fermentation stress response. FEMS Yeast Res. 2008, 8: 35-52. 10.1111/j.1567-1364.2007.00338.x.
- Nag R, Kyriss M, Smerdon JW, Wyrick JJ, Smerdon MJ: A cassette of N-terminal amino acids of histone H2B are required for efficient cell survival, DNA repair and Swi/Snf binding in UV irradiated yeast. Nucleic Acids Res. 2010, 38: 1450-1460. 10.1093/nar/gkp1074.
- Pan Z, Agarwal AK, Xu T, Feng Q, Baerson SR, Duke SO, Rimando AM: Identification of molecular pathways affected by pterostilbene, a natural dimethylether analog of resveratrol. BMC Med Genomics. 2008, 1: 7. 10.1186/1755-8794-1-7.
- Parra MA, Kerr D, Fahy D, Pouchnik DJ, Wyrick JJ: Deciphering the roles of the histone H2B N-terminal domain in genome-wide transcription. Mol Cell Biol. 2006, 26: 3842-3852. 10.1128/MCB.26.10.3842-3852.2006.
- Prinz S, Avila-Campillo I, Aldridge C, Srinivasan A, Dimitrov K, Siegel AF, Galitski T: Control of yeast filamentous-form growth by modules in an integrated molecular network. Genome Res. 2004, 14: 380-390. 10.1101/gr.2020604.
- Reinke A, Chen JC-Y, Aronova S, Powers T: Caffeine targets TOR complex I and provides evidence for a regulatory link between the FRB and kinase domains of Tor1p. J Biol Chem. 2006, 281: 31616-31626. 10.1074/jbc.M603107200.
- Ronen M, Botstein D: Transcriptional response of steady-state yeast cultures to transient perturbations in carbon source. Proc Natl Acad Sci USA. 2006, 103: 389-394. 10.1073/pnas.0509978103.
- Sapra AK, Arava Y, Khandelia P, Vijayraghavan U: Genome-wide analysis of pre-mRNA splicing: intron features govern the requirement for the second-step factor, Prp17 in Saccharomyces cerevisiae and Schizosaccharomyces pombe. J Biol Chem. 2004, 279: 52437-52446. 10.1074/jbc.M408815200.
- Sheehan KB, McInnerney K, Purevdorj-Gage B, Altenburg SD, Hyman LE: Yeast genomic expression patterns in response to low-shear modeled microgravity. BMC Genomics. 2007, 8: 3. 10.1186/1471-2164-8-3.
- Singh J, Kumar D, Ramakrishnan N, Singhal V, Jervis J, Garst JF, Slaughter SM, DeSantis AM, Potts M, Helm RF: Transcriptional response of Saccharomyces cerevisiae to desiccation and rehydration. Appl Environ Microbiol. 2005, 71: 8752-8763. 10.1128/AEM.71.12.8752-8763.2005.
- Tai SL, Boer VM, Daran-Lapujade P, Walsh MC, de Winde JH, Daran JM, Pronk JT: Two-dimensional transcriptome analysis in chemostat cultures. Combinatorial effects of oxygen availability and macronutrient limitation in Saccharomyces cerevisiae. J Biol Chem. 2005, 280: 437-447. 10.1074/jbc.M410573200.
- Tu BP, Kudlicki A, Rowicka M, McKnight SL: Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science. 2005, 310: 1152-1158. 10.1126/science.1120499.
- Van Wageningen S, Kemmeren P, Lijnzaad P, Margaritis T, Benschop JJ, de Castro IJ, van Leenen D, Groot Koerkamp MJA, Ko CW, Miles AJ, Brabers N, Brok MO, Lenstra TL, Fiedler D, Fokkens L, Aldecoa R, Apweiler E, Taliadouros V, Sameith K, van de Pasch LAL, van Hooff SR, Bakker LV, Krogan NJ, Snel B, Holstege FCP: Functional overlap and regulatory links shape genetic interactions between signaling pathways. Cell. 2010, 143: 991-1004. 10.1016/j.cell.2010.11.021.
- Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH: The WEKA data mining software: an update. ACM SIGKDD Explor Newsl. 2009, 11: 10-18. 10.1145/1656274.1656278.
- Hornik K, Buchta C, Zeileis A: Open-source machine learning: R meets Weka. Comput Stat. 2009, 24: 225-232. 10.1007/s00180-008-0119-7.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.