- Research article
- Open Access
The effects of protein interactions, gene essentiality and regulatory regions on expression variation
© Zhou et al; licensee BioMed Central Ltd. 2008
- Received: 05 February 2008
- Accepted: 26 June 2008
- Published: 26 June 2008
Identifying factors affecting gene expression variation is a challenging problem in genetics. Previous studies have shown that the presence of a TATA box, the number of cis-regulatory elements, gene essentiality, and protein interactions significantly affect gene expression variation. Nonetheless, a more complete understanding of these factors, and of how their interactions influence gene expression variation, is still lacking. The growth rates of yeast cells under several DNA-damaging conditions have been studied, and a gene's toxicity degree is defined as the number of such conditions under which the growth rate of the yeast deletion strain is significantly affected. Since toxicity degree reflects a gene's importance to cell survival under DNA-damaging conditions, we expect it to be negatively associated with gene expression variation. Mutations in both cis-regulatory elements and the transcription factors (TFs) regulating a gene affect the gene's expression; we therefore study the relationship between gene expression variation and the number of TFs regulating a gene. Most importantly, we study how these factors interact with each other in influencing gene expression variation.
Using yeast as a model system, we evaluated the effects of four separate factors and their interactions on gene expression variation: protein interaction degree, toxicity degree, number of TFs, and the presence of TATA box. Results showed that 1) gene expression variation is negatively correlated with the protein interaction degree in the protein interaction network, 2) essential genes tend to have less expression variation than non-essential genes and gene expression variation decreases with toxicity degree, and 3) the number of TFs regulating a gene is the most important factor influencing gene expression variation (R2 = 8–14%). In addition, the number of TFs regulating a gene was found to be an important factor influencing gene expression variation for both TATA-containing and non-TATA-containing genes, but with different association strength. Moreover, gene expression variation was significantly negatively correlated with toxicity degree only for TATA-containing genes.
The finding that distinct mechanisms may influence gene expression variation in TATA-containing and non-TATA-containing genes provides new insights into the mechanisms that underlie the evolution of gene expression.
- Expression Variation
- Essential Gene
- Protein Interaction Network
- Gene Expression Dataset
- Gene Expression Variation
Gene expression variation has been studied on three different levels: single cells across a common environment, within one species across a variety of different environments [2, 3], and across different species/strains, which is often referred to as evolutionary variation [4–8]. In this paper, we study genetic factors affecting gene expression variation within one species across many different environmental conditions. Broadly, the genetic factors affecting gene expression include the binding of regulatory proteins to cis-elements upstream of the gene, as well as physical and genetic interactions with other genes. With the availability of many gene expression profiles, protein interaction networks, and gene regulatory networks, it is now possible to study how gene expression variation is associated with both network features and genomic factors. In the case of protein interaction networks, interaction degree, i.e., the number of interacting partners of a given protein, is one of many such features. The presence or absence of a TATA box and the number of transcription factors (TFs) regulating a gene are examples of genomic factors influencing gene expression variation.
Many studies have focused on individual factors affecting gene expression variation. For instance, Newman et al. developed an experimental technique to study protein expression noise in single cells and showed that chromosomal distance to other genes and mRNA half-life are associated with expression noise. However, they did not find a relationship between protein expression noise and protein-protein interactions. Recently, using a more complete interaction dataset, Batada et al. found that protein expression variation is negatively correlated with interaction degree when protein abundance was controlled using the data of Newman et al. This relationship continues to hold within the viable genes. Several groups investigated mRNA expression variation within species. For example, Nelson et al. studied the relationship between gene spacing and expression variation, measured as the number of tissues or body parts in which a gene is expressed, in C. elegans and D. melanogaster. They found that gene expression variation increases with the intergenic distance between genes. Walther et al. found a positive correlation between the frequency of a gene's differential expression and the number of cis-regulatory elements of that gene in A. thaliana. Furthermore, several groups have studied gene expression variation across different species/strains, also known as evolutionary variation. Using gene expression data from several yeast species, as well as from different strains derived from mutation-accumulation experiments, it was found, for instance, that the interspecies/interstrain variation of gene expression is significantly correlated with the presence/absence of the TATA box in the promoter region. Lemos et al. [4, 5] studied the effect of protein-protein interactions and protein length on evolutionary variation (variation among strains in a species). They found that evolutionary variation is negatively correlated with protein-protein interactions in Saccharomyces cerevisiae and Drosophila melanogaster, and negatively correlated with protein length in Drosophila melanogaster. These studies highlighted the importance of protein interactions and gene regulatory regions for gene expression variation.
Only a few studies, however, have integrated such different data sources in a way that collectively identifies and interprets the key factors affecting gene expression variation. Therefore, we conducted studies of proteomic and genomic factors marginally and collectively influencing gene expression variation across different perturbation conditions within one species: yeast.
Protein interactions play an important role in gene expression variation. Protein-protein interactions are key biological events in a living cell, and proteins in a cell interact with each other to perform certain functions. High throughput technologies, including yeast two-hybrid systems and mass spectrometry, have generated a large amount of protein interactions in yeast. Computational methods have also been developed to study the reliability of the observed interactions [10, 11] and to build reliable protein interaction networks. These efforts have resulted in the development of several protein interaction databases, albeit with differing degrees of reliability, including MIPS , DIP  and BioGrid . From an evolutionary point of view, the expression profiles of neighboring genes of a target gene in a protein interaction network may put some constraints on the target gene's expression. Thus, in a protein interaction network, the interacting partners of a specific protein can affect the corresponding gene's expression. Therefore, protein physical interaction degree, i.e., the number of interacting partners of a given protein, can significantly affect gene expression variation. In the present study, we show that gene expression variation decreases with protein interaction degree and that protein interaction degree accounts for 1–2% of the expression variation in model organism yeast, a result consistent with previous studies [4, 5].
Another key factor affecting gene expression variation is gene essentiality. Genes can be classified into essential and non-essential genes based on the fitness phenotype of the yeast cell when the gene is deleted under normal growth conditions. Essential genes are those that, when deleted, render the yeast cell non-viable. Non-essential genes can be further classified into no-phenotype and toxicity-modulating genes based on the fitness phenotype of the yeast cell when the gene is deleted under four DNA-damaging treatments. Specifically, we define a gene's toxicity modulation degree (toxicity degree for short) as the number of DNA-damaging treatments that significantly affect the deletion strain's fitness, so that it takes the values 0 (no phenotype), 1, 2, 3, and 4. The higher the toxicity modulation degree, the more important the gene is to cell survival. Toxicity degree therefore gives a quantitative measure of a gene's importance to yeast cell survival. We measure a gene's functional importance to cell survival by essentiality for the essential genes and by the toxicity modulation degree for the non-essential genes. Since the expression of genes important for cell survival is generally stable under many different stimuli and cannot fluctuate extensively, we hypothesize and show that the expression variation of essential genes is lower than that of non-essential genes and decreases with toxicity degree within non-essential genes.
The number of cis-elements has been shown to be positively associated with gene expression variation. The number of cis-elements is usually approximated using computational approaches and may contain false positive and false negative predictions. Theoretically, a given gene's expression pattern can become increasingly complex as the number of transcription factors that regulate the gene, either directly or indirectly, increases. In this study, we hypothesize that the number of TFs is a significant predictor of expression variation and show that the number of TFs regulating a gene (hereinafter referred to as 'number of TFs') accounts for 8–14% of its expression variation, much higher than the fraction that can be explained by the number of cis-elements (0.3–1.7%). This implies the importance of indirect trans-effects on expression variation.
The TATA box is a conserved element in the eukaryotic promoter region and is usually bound by TATA-binding proteins. The presence of a TATA box has been shown to be one of the most important factors contributing to gene expression variation [6, 7]. Further analysis of the individual genomic and proteomic factors affecting gene expression indicates that there might be two distinct mechanisms that specifically influence the gene expression variation of TATA-containing and non-TATA-containing genes. Most importantly, we show that a significant negative correlation between expression variation and toxicity degree is present only for TATA-containing genes, where toxicity degree accounts for 1.3–2.6% of the expression variation. In contrast, the relationship between expression variation and toxicity degree is absent for non-TATA-containing genes. The fact that TATA-containing genes are enriched in stress-related genes may explain this difference. Although the number of TFs is significantly positively correlated with expression variation for both TATA- and non-TATA-containing genes, the association is stronger for non-TATA-containing genes than for TATA-containing genes. These results imply that the mechanism influencing the expression variation of TATA-containing genes is much more complicated than that of non-TATA-containing genes. For example, TATA-containing genes were found to be more likely to be epigenetically regulated [17, 18]. Thus, this study gives a more complete analysis of the factors affecting gene expression variation, and of their interactions, than previous studies.
We present our results based on the MIPS protein physical interaction data and the yeast gene expression profiles under 40 Ca and Na exposure conditions. The results based on three interaction datasets (MIPS, DIP, and BioGrid) and four other gene expression datasets (chemostat (nutritional stress), environmental stress, oxidative stress, and a combined gene expression dataset over more than 1,500 conditions) are given in Additional Files 1 and 2. We study the expression datasets individually in order to minimize the variation among different laboratories. Doing so also lets us check whether the results based on different gene expression data are consistent, and consistency across a variety of datasets adds confidence to the conclusions. In this manuscript, we use the terms genes and proteins interchangeably. Because this is an exploratory study of factors affecting gene expression variation, we declare statistical significance if a p-value is less than 0.05 and, as in many epidemiological studies, do not adjust p-values for multiple comparisons. Therefore, some of our findings need to be further tested in other datasets.
Gene expression variation versus protein interaction degree
Accordingly, we then used linear regression to fit the expression variation for proteins with a maximal physical interaction degree of 20:
v = α + βd
where v is the gene expression variation and d is the interaction degree. α and β are parameters. The fitted line and the corresponding bar-plot for the expression variation are shown in Figure 1B. The gene expression variation is significantly negatively correlated with the protein interaction degree (≤ 20) (R2 = 1.41%, β = -0.0302, p-value = 9.704e-14). The negative correlation between expression variation and interaction degree implies that proteins with high interaction degrees do not tolerate extensive expression variation and that such proteins need more precise control of gene expression for an organism to function normally.
Gene expression variation versus essentiality, toxicity modulation, and interaction degrees
We observed a positive correlation between interaction degree and toxicity degree (data not shown) and, therefore, asked whether the observed negative correlation between gene expression variation and interaction degree is instead caused by the positive correlation between interaction degree and toxicity degree. We consequently studied the relationship between gene expression variation and protein interaction degree within gene groups stratified according to their toxicity degrees (Figure 2B). Using the Ca and Na exposure gene expression data, we found a significant decreasing trend of gene expression variation with respect to interaction degree in all the strata except for the one with toxicity degree 3. The corresponding (R2, β, p-value) values are (0.31%, -0.017, 0.03), (0.75%, -0.023, 0.014), (2.75%, -0.030, 0.001), (0.8%, -0.016, 0.2248), and (4.41%, -0.046, 0.001) for toxicity degrees 0, 1, 2, 3 and 4, respectively. With the exception of toxicity degree 3, the fraction of expression variation explained by the protein interaction degree seems to increase as the toxicity degree increases.
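The stratified analysis can be illustrated with a minimal sketch that groups genes by toxicity degree and fits a separate least-squares slope of expression variation on interaction degree within each stratum. The function names and the toy records below are hypothetical and are not taken from our data; the sketch only returns the slope β, not the R2 or p-value.

```python
from collections import defaultdict

def slope(xs, ys):
    """Least-squares slope of y on x (requires at least two distinct x values)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def slopes_by_stratum(records):
    """records: (toxicity_degree, interaction_degree, expression_variation) tuples.

    Returns {toxicity_degree: fitted slope within that stratum}.
    """
    strata = defaultdict(list)
    for tox, degree, variation in records:
        strata[tox].append((degree, variation))
    result = {}
    for tox, pairs in sorted(strata.items()):
        xs, ys = zip(*pairs)
        result[tox] = slope(xs, ys)
    return result

# Hypothetical toy records, for illustration only.
records = [
    (0, 1, 2.0), (0, 2, 1.5), (0, 3, 1.0),
    (1, 1, 3.0), (1, 3, 1.0),
]
stratum_slopes = slopes_by_stratum(records)
print(stratum_slopes)
```

A negative slope in a stratum corresponds to expression variation decreasing with interaction degree among genes of that toxicity degree.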
In our analyses, both toxicity and protein interaction degrees are negatively associated with gene expression variation. Hence, the more important a gene is to the survival of the yeast cell, the less variation there is in its expression levels across many different conditions. Similarly, the higher the interaction degree of a gene, the more stable its expression levels are. Biologically, a gene is important to the cell's survival because it participates in many important biological processes. Any perturbation of this gene's expression is likely to have a deleterious effect on the corresponding biological processes and may thus render the cell non-viable. An evolutionary consequence of this hypothesis is that genes important to cell survival appear to have robust expression levels.
Expression variation versus gene regulatory regions: TATA box, number of TFs, and toxicity degree
The relationship between gene expression variation and the number of cis-elements. Results are reported per gene expression dataset for three definitions of cis-elements: binding p < 0.0001 with conservation in at least 2 other yeast species, binding p < 0.0001 with conservation in at least 1 other yeast species, and binding p < 0.0001 with no conservation criterion. [Table values not available.]
Overall analysis of factors affecting gene expression variation
Analysis of four factors and their interactions affecting expression variation using stepwise selection with AIC.
The effect of two factors on expression variation stratified by the presence/absence of TATA box.
The effect of toxicity degree on expression variation stratified by the set of environmental stress response (ESR).
We also did the same analysis for the average gene expression variation across the four expression datasets (Ca and Na exposure , chemostat , environmental stress , and oxidative stress ) and the combined gene expression data of Landry et al. , and the results are presented as Additional File 2. The same conclusions can be obtained indicating the robustness of our results. Previous studies showed that TATA- and non-TATA-containing genes might recruit different coactivator complexes for gene expression . TATA-containing genes were also found to be subject to greater nucleosomal regulation than non-TATA-containing genes . Basehoar et al.  suggested that two distinct regulatory mechanisms may be present at TATA- and TATA-less promoters. The results in Table 3 support their findings.
The results based on the oxidative stress gene expression dataset  are not consistent with the results based on the other three gene expression datasets. This observation may be due to the relatively small gene expression variation in this data. For example, the range of the variance of the expression levels within the oxidative stress dataset, (0.07, 5.34), is much smaller than the corresponding ranges, (0.02, 10.59), (0.17, 9.18), and (0.09, 11.07), for the Ca and Na exposure , chemostat , and environmental stress conditions , respectively.
We also studied the contributing factors for gene expression variation using the DIP and BioGrid interactions, and the results are given in Additional File 1 and File 2. The conclusions were similar to those based on the MIPS interaction data. The consistency of the results across different combinations of protein interaction datasets and gene expression profiles shows the robustness of our conclusions. However, the fraction of gene expression variation explained by all factors is less than 25%. One possible explanation is that the measurements of gene expression changes and of other factors, including the toxicity degree and interaction degree, are still very noisy. We expect that the true R2 would be higher than that observed in this study.
We carried out a system-wide analysis of proteomic and genomic factors affecting gene expression variation. Among four different factors (protein interaction degree, toxicity degree, TATA box, and the number of TFs), the TATA box and the number of TFs are the most important factors influencing gene expression variation. The influence of the TATA box on evolutionary gene expression variation has been extensively studied both computationally and experimentally [6, 8], and our results are consistent with those findings. Although it is intuitive that the number of TFs regulating a gene should have a significant effect on the gene's expression variation, the magnitude of its influence has not, to the best of our knowledge, been studied in large-scale expression datasets. Our findings demonstrate that gene regulation is a major factor affecting gene expression variation. Protein interaction degree and toxicity degree account for less variation than the number of TFs and the TATA box.
In our overall analysis, we also found that the interactions of the TATA box with toxicity degree and with the number of TFs influence expression variation. Further analysis stratified by the TATA box indicated that TATA-containing genes and non-TATA-containing genes behave differently in relation to the toxicity degree and the number of TFs. The effect of the number of TFs on expression variation is lower within the TATA-containing genes than within the non-TATA-containing genes. On the other hand, toxicity degree is associated with expression variation within the TATA-containing genes only. These findings suggest that the regulatory mechanism might be more complicated for TATA-containing genes than for non-TATA-containing genes.
In order to study factors affecting gene expression variation, we collected data on gene expression profiles, protein physical interactions, gene regulatory networks, essentiality and toxicity resistance. Details of these data are given below.
Gene expression profiles
A large number of gene expression studies are available. In this study, we chose gene expression studies containing at least 40 conditions. These datasets include yeast gene expression profiles under 40 Ca and Na exposure conditions , chemostat (i.e., nutritional stress) at 100 conditions , environmental stress at 156 conditions  and oxidative stress at 70 conditions . These data were analyzed separately to ensure that between-laboratory variation was minimized. A combined gene expression profile under more than 1,500 conditions was collected by . The responsiveness for each gene across more than 1,500 conditions calculated by  was used in our analysis as expression variation.
Protein interaction data
We downloaded yeast protein interaction data from three different data sources. The MIPS (Munich Information Center for Protein Sequences)  dataset (version: PPI_18052006.tab) contains 11,124 protein physical interactions involving 4,404 proteins. The DIP core interaction dataset  (version: ScereCR20070107) contains 5,738 protein interactions involving 2,161 proteins. The DIP core interactions were assessed by a number of quality tests and are supposed to be highly reliable . The BioGrid  dataset (version 2.0.34) contains 59,317 protein physical interactions involving 5,054 proteins.
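As a minimal sketch of how an interaction degree can be derived from such data, the following counts the distinct interacting partners of each protein in a list of pairwise interactions. The input format, the function name, and the protein labels are hypothetical; ignoring self-interactions and duplicate pairs is an assumption of this sketch, not a documented property of the MIPS, DIP, or BioGrid files.

```python
from collections import defaultdict

def interaction_degrees(edges):
    """Return {protein: number of distinct interacting partners}.

    edges: iterable of (protein_a, protein_b) pairs.
    Assumption: self-interactions are skipped and duplicate or
    reversed pairs are counted once.
    """
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions (assumption)
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}

# Hypothetical toy interaction list, for illustration only.
edges = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"), ("P4", "P4")]
degrees = interaction_degrees(edges)
print(degrees)
```

Proteins that appear only in self-interactions do not receive a degree in this sketch; whether they should be assigned degree 0 depends on the analysis.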
Essential and toxicity modulating genes
Large scale gene deletion studies have identified about 17–20% of the genes essential for yeast cell survival  under normal conditions. Even within the class of non-essential genes, a gene's importance in relation to cell survival is not the same. Further studies classified the non-essential genes based on the cell's fitness phenotypes under four different DNA damage perturbations when a gene is knocked out . The toxicity modulating genes were defined as those significantly affecting the cell's fitness phenotype when knocked out. We defined the toxicity degree of a gene as the number of perturbations that significantly affected the deletion strain's fitness. Essential genes were downloaded from the SGD website , and the toxicity degrees of non-essential genes were calculated from .
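The definition of toxicity degree can be sketched as a simple count. The treatment labels, gene names, and input layout below are hypothetical placeholders; the sketch assumes that, for each of the four perturbations, the set of deletion strains with significantly affected fitness is already known.

```python
def toxicity_degree(gene, sensitive_by_treatment):
    """Number of perturbations under which deleting `gene` significantly
    affects fitness.

    sensitive_by_treatment: {treatment label: set of affected genes}.
    """
    return sum(gene in genes for genes in sensitive_by_treatment.values())

# Hypothetical placeholders for the four DNA-damaging treatments.
sensitive_by_treatment = {
    "treatment_A": {"geneX", "geneY"},
    "treatment_B": {"geneX"},
    "treatment_C": {"geneX", "geneZ"},
    "treatment_D": set(),
}

for gene in ("geneX", "geneY", "geneW"):
    print(gene, toxicity_degree(gene, sensitive_by_treatment))
```

With four treatments the count ranges from 0 (no phenotype) to 4.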
Gene Regulatory Network
Studies have shown that gene expression variation is positively correlated with the number of cis-regulatory elements and the length of intergenic region in several organisms. Since cis-elements control the expression of genes through interaction with the TFs, it is interesting to study if the number of TFs regulating a gene has an effect on gene expression variation. The mapping of cis-elements to genes was obtained using motif discovery algorithms, PhyloCon and Converge, with binding p-value less than 0.001 and conservation in at least 0, 1 or 2 other yeast species . The mapping of the TFs to genes is obtained from Hu et al. .
A TATA box is a DNA sequence (cis-element) found in the promoter region of many eukaryotic genes. The TATA consensus sequence was identified as TATA(A/T)A(A/T)(A/G). The TATA box has been identified as a very important factor for gene expression variation. The relationship between yeast genes and the TATA box was downloaded from . Of 6278 genes, 1090 were predicted to have a TATA box. Our analysis used these 1090 genes as TATA-containing genes and the other genes as non-TATA-containing genes. (We note that 607 genes are not classified in , and the results are essentially the same when these genes are not considered (data not shown).)
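To make the consensus notation concrete, TATA(A/T)A(A/T)(A/G) can be written as a regular expression. The sketch below only illustrates how to read the consensus; our analysis used the downloaded classification rather than a scan like this. The function name and example sequences are hypothetical, and the sketch searches the given strand only.

```python
import re

# TATA(A/T)A(A/T)(A/G) written as a regular expression.
TATA_CONSENSUS = re.compile(r"TATA[AT]A[AT][AG]")

def has_tata_consensus(sequence):
    """True if the sequence contains a match to the consensus.

    Assumption: only the given strand is searched; the reverse
    complement is not considered.
    """
    return TATA_CONSENSUS.search(sequence.upper()) is not None

# Hypothetical sequences, for illustration only.
print(has_tata_consensus("GGCTATAAAAGCC"))
print(has_tata_consensus("GGCTATACAAGCC"))
```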
Gene expression variation was measured by the logarithm of the variance of the gene expression levels under various conditions. We used the logarithm because the distribution of the raw variance was not normal and because the standard deviations of the variance conditional on the independent variables (protein physical interaction degree, toxicity degree, TATA box, number of TFs) differed widely, which would make a linear model for the variance itself invalid. The distributions of the log-transformed variance appear to satisfy the conditions of the linear model. Hence, in our study, we used a linear model to study the relationship between the expression variation and each factor. In the study of the relationship between the gene expression variation and interaction degrees, we first used the LOWESS function in R to fit the data. An approximately linear relationship between gene expression variation and interaction degree was observed when the interaction degree was less than 20. We then proceeded to use linear regression to fit the data up to interaction degree 20.
v = α + βd
where v is the gene expression variation and d is the interaction degree. α and β are parameters. We tested the statistical significance for the relationship between gene expression variation and interaction degree based on the linear regression model.
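The two steps above, computing the log-variance and fitting v = α + βd for genes with degree at most 20, can be sketched as follows. This is an illustration under stated assumptions, not our analysis code (which used R): the natural logarithm and the sample variance with an n - 1 denominator are assumed, and the function names and toy profiles are hypothetical. The sketch reports α, β and R2 but not the p-value.

```python
import math

def log_variance(levels):
    """Log of the sample variance of one gene's expression levels.

    Assumptions: natural logarithm, n - 1 denominator.
    """
    n = len(levels)
    mean = sum(levels) / n
    variance = sum((x - mean) ** 2 for x in levels) / (n - 1)
    return math.log(variance)

def fit_line(d, v):
    """Ordinary least squares for v = alpha + beta * d.

    Returns (alpha, beta, r_squared).
    """
    n = len(d)
    md, mv = sum(d) / n, sum(v) / n
    sxx = sum((x - md) ** 2 for x in d)
    sxy = sum((x - md) * (y - mv) for x, y in zip(d, v))
    beta = sxy / sxx
    alpha = mv - beta * md
    sse = sum((y - alpha - beta * x) ** 2 for x, y in zip(d, v))
    sst = sum((y - mv) ** 2 for y in v)
    return alpha, beta, 1.0 - sse / sst

# Hypothetical toy data: (interaction degree, expression levels).
genes = [
    (2, [0.1, 1.9, -2.0, 0.4]),
    (5, [0.5, -0.7, 0.9, 0.1]),
    (12, [0.2, -0.1, 0.3, 0.0]),
    (35, [0.1, 0.0, -0.1, 0.05]),  # excluded: degree above 20
]
kept = [(deg, log_variance(levels)) for deg, levels in genes if deg <= 20]
alpha, beta, r_squared = fit_line([k[0] for k in kept], [k[1] for k in kept])
print(alpha, beta, r_squared)
```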
Before carrying out the joint analysis of expression variation with respect to the four factors (protein interaction degree, toxicity degree, number of TFs, and TATA box), we tested whether the four factors are highly correlated. We calculated the correlation matrix between them; it is given in Additional File 2 (Supplementary Table 9). All the correlation coefficients are smaller than 0.3, indicating that the factors are not highly correlated. Although it might be more computationally reasonable to first find the principal components of these factors and then analyze the data using linear regression, the final result would be difficult to interpret. Since these factors are not highly correlated, we treat them as independent factors in our joint analysis.
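A check of this kind can be sketched as follows. The use of the Pearson coefficient is an assumption of the sketch (the type of correlation coefficient is not specified above), and the factor values are hypothetical toy numbers rather than our data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: {factor name: list of values}. Returns {(a, b): r}."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical toy values for three factors, for illustration only.
factors = {
    "interaction_degree": [1, 4, 2, 8, 3],
    "toxicity_degree": [0, 1, 0, 2, 4],
    "n_tfs": [5, 2, 7, 3, 1],
}
matrix = correlation_matrix(factors)

# Flag pairs of distinct factors whose |r| reaches the 0.3 threshold.
flagged = [(a, b) for (a, b), r in matrix.items() if a < b and abs(r) >= 0.3]
print(flagged)
```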
In the overall analysis, we first used stepwise selection to find a model that gives the smallest AIC (Akaike information criterion) = 2*K+n*ln(SSE/n), where K is the number of parameters in the model; n is the number of observations; and SSE is the residual sum of squares . We then used linear regression to analyze the relationship between gene expression variation and the retained factors and interactions. The corresponding p-values and the R2 values are reported in Table 2.
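The criterion and the selection loop can be illustrated with a small sketch. It is a simplified forward-only search over main effects, which is an assumption of the sketch: the analysis reported here also considered interaction terms, and the search direction is not specified above. K is taken to be the number of regression coefficients including the intercept, also an assumption. All names and toy values are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sse_of_fit(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    coef = solve(XtX, Xty)
    return sum((yi - sum(c * x for c, x in zip(coef, r))) ** 2
               for r, yi in zip(X, y))

def aic(sse, n, k):
    """AIC = 2*K + n*ln(SSE/n), as defined in the text."""
    return 2 * k + n * math.log(sse / n)

def model_aic(columns, names, y):
    """AIC of the linear model with an intercept and the named predictors."""
    n = len(y)
    X = [[1.0] + [columns[name][i] for name in names] for i in range(n)]
    return aic(sse_of_fit(X, y), n, len(names) + 1)

def forward_stepwise(columns, y):
    """Add the predictor that lowers AIC most until no addition helps."""
    selected, remaining = [], list(columns)
    best = model_aic(columns, selected, y)
    while remaining:
        scores = {name: model_aic(columns, selected + [name], y)
                  for name in remaining}
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = scores[candidate]
    return selected, best

# Hypothetical toy data: y depends mostly on n_tfs.
columns = {
    "n_tfs": [1, 2, 3, 4, 5, 6, 7, 8],
    "degree": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.1]
selected, best_aic = forward_stepwise(columns, y)
print(selected, best_aic)
```

The sketch fails if a model fits the data exactly (SSE = 0), because the logarithm is then undefined.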
We thank Mr. David Martin for carefully reading the manuscript and for suggestions that significantly improved the presentation of the paper. This work is partly supported by the NIH/NSF Joint Mathematical Biology Initiative DMS-0241102 and NIH P50 HG 002790. We sincerely thank the anonymous reviewers for pointing out several important references that were missed in the original version. We also thank the reviewers for suggestions that significantly improved the presentation of the paper.
- Newman JR, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, Weissman JS: Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006, 441: 840-6. 10.1038/nature04785
- Nelson CE, Hersh BM, Carroll SB: The regulatory content of intergenic DNA shapes genome architecture. Genome Biol. 2004, 5: R25. 10.1186/gb-2004-5-4-r25
- Walther D, Brunnemann R, Selbig J: The regulatory code for transcriptional response diversity and its relation to genome structural properties in A. thaliana. PLoS Genet. 2007, 3: e11. 10.1371/journal.pgen.0030011
- Lemos B, Meiklejohn CD, Hartl DL: Regulatory evolution across the protein interaction network. Nat Genet. 2004, 36: 1059-60. 10.1038/ng1427
- Lemos B, Bettencourt BR, Meiklejohn CD, Hartl DL: Evolution of proteins and gene expression levels are coupled in Drosophila and are independently associated with mRNA abundance, protein length, and number of protein-protein interactions. Mol Biol Evol. 2005, 22: 1345-54. 10.1093/molbev/msi122
- Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL: Genetic properties influencing the evolvability of gene expression. Science. 2007, 317: 118-21. 10.1126/science.1140247
- Tirosh I, Weinberger A, Carmi M, Barkai N: A genetic signature of interspecies variations in gene expression. Nat Genet. 2006, 38: 830-4. 10.1038/ng1819
- Tirosh I, Barkai N: Evolution of gene sequence and gene expression are not correlated in yeast. Trends Genet. 2008, 24: 109-13. 10.1016/j.tig.2007.12.004
- Batada NN, Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hurst LD, Tyers M: Stratus not altocumulus: a new view of the yeast protein interaction network. PLoS Biol. 2006, 4 (10): e317. 10.1371/journal.pbio.0040317
- Deng MH, Sun FZ, Chen T: Assessment of the reliability of protein-protein interactions and protein function prediction. Pac Symp Biocomput. 2003, 140-151.
- Deane CM, Salwinski L, Xenarios I, Eisenberg D: Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics. 2002, 1: 349-56. 10.1074/mcp.M100037-MCP200
- Mewes HW, Amid C, Arnold R, Frishman D, Güldener U, Mannhaupt G, Münsterkötter M, Pagel P, Strack N, Stümpflen V, Warfsmann J, Ruepp A: MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res. 2004, 32 (Database issue): D41-44.
- Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The Database of Interacting Proteins: 2004 update. Nucleic Acids Res. 2004, 32 (Database issue): D449-451.
- Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006, 34 (Database issue): D535-9. 10.1093/nar/gkj109
- Giaever G, Chu AM, Ni L, Connelly C, Riles L, Véronneau S, Dow S, Lucau-Danila A, Anderson K, André B, Arkin AP, Astromoff A, El-Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian KD, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Güldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kötter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang CY, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418: 387-91. 10.1038/nature00935
- Said MR, Begley TJ, Oppenheim AV, Lauffenburger DA, Samson LD: Global network analysis of phenotypic effects: protein networks and toxicity modulation in Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 2004, 101: 18006-11. 10.1073/pnas.0405996101
- Basehoar AD, Zanton SJ, Pugh BF: Identification and distinct regulation of yeast TATA box-containing genes. Cell. 2004, 116: 699-709. 10.1016/S0092-8674(04)00205-3
- Choi JK, Kim YJ: Epigenetic regulation and the variability of gene expression. Nat Genet. 2008, 40: 141-7. 10.1038/ng.2007.58
- Yoshimoto H, Saltsman K, Gasch AP, Li HX, Ogawa N, Botstein D, Brown PO, Cyert MS: Genome-wide analysis of gene expression regulated by the calcineurin/Crz1p signaling pathway in Saccharomyces cerevisiae. J Biol Chem. 2002, 277: 31079-88. 10.1074/jbc.M202718200
- Saldanha AJ, Brauer MJ, Botstein D: Nutritional homeostasis in batch and steady-state culture of yeast. Mol Biol Cell. 2004, 15: 4089-104. 10.1091/mbc.E04-04-0306
- Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO: Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 2000, 11: 4241-57.
- Shapira M, Segal E, Botstein D: Disruption of yeast forkhead-associated cell cycle transcription by oxidative stress. Mol Biol Cell. 2004, 15: 5659-69. 10.1091/mbc.E04-04-0340
- The R Project for Statistical Computing., http://www.r-project.org/
- Bader JS, Chaudhuri A, Rothberg JM, Chant J: Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol. 2004, 22: 78-85. 10.1038/nbt924
- Choi JK, Kim SC, Seo J, Kim S, Bhak J: Impact of transcriptional properties on essentiality and evolutionary rate. Genetics. 2007, 175: 199-206. 10.1534/genetics.106.066027
- MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E: An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics. 2006, 7: 113. 10.1186/1471-2105-7-113
- Hu Z, Killion PJ, Iyer VR: Genetic reconstruction of a functional transcriptional regulatory network. Nat Genet. 2007, 39 (5): 683-7. 10.1038/ng2012
- Akaike H: Information theory and an extension of the maximum likelihood principle. Proceedings of Second International Symposium on Information Theory. Edited by: Petrov BN, Csaki F. 1973, 267-281. Akademiai Kiado, Budapest
- Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, Chu AM, Connelly C, Davis K, Dietrich F, Dow SW, El Bakkoury M, Foury F, Friend SH, Gentalen E, Giaever G, Hegemann JH, Jones T, Laub M, Liao H, Liebundguth N, Lockhart DJ, Lucau-Danila A, Lussier M, M'Rabet N, Menard P, Mittmann M, Pai C, Rebischung C, Revuelta JL, Riles L, Roberts CJ, Ross-MacDonald P, Scherens B, Snyder M, Sookhai-Mahadeo S, Storms RK, Véronneau S, Voet M, Volckaert G, Ward TR, Wysocki R, Yen GS, Yu K, Zimmermann K, Philippsen P, Johnston M, Davis RW: Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999, 285: 901-6. 10.1126/science.285.5429.901
- Burnham KP, Anderson DR: Model selection and multimodel inference: a practical information-theoretic approach. 2002, Springer-Verlag, 2nd edition
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.