A simple principle concerning the robustness of protein complex activity to changes in gene expression
© Semple et al; licensee BioMed Central Ltd. 2008
Received: 03 September 2007
Accepted: 02 January 2008
Published: 02 January 2008
The functions of a eukaryotic cell are largely performed by multi-subunit protein complexes that act as molecular machines or information processing modules in cellular networks. An important problem in systems biology is to understand how, in general, these molecular machines respond to perturbations.
In yeast, genes that inhibit growth when their expression is reduced are strongly enriched amongst the subunits of multi-subunit protein complexes. This applies to both the core and peripheral subunits of protein complexes, and the subunits of each complex normally have the same loss-of-function phenotypes. In contrast, genes that inhibit growth when their expression is increased are not enriched amongst the core or peripheral subunits of protein complexes, and the behaviour of one subunit of a complex is not predictive for the other subunits with respect to over-expression phenotypes.
We propose the principle that the overall activity of a protein complex is in general robust to an increase, but not to a decrease in the expression of its subunits. This means that whereas phenotypes resulting from a decrease in gene expression can be predicted because they cluster on networks of protein complexes, over-expression phenotypes cannot be predicted in this way. We discuss the implications of these findings for understanding how cells are regulated, how they evolve, and how genetic perturbations connect to disease in humans.
The proteome of a eukaryotic cell is largely organized as a collection of multi-subunit protein complexes [1–4]. These complexes are defined empirically by the stable association of their subunits during biochemical purification [3, 4] and act as molecular machines or information processing modules in cellular networks. For example, the many integrated complexes required for gene expression include the RNA polymerase complexes, chromatin remodeling complexes, RNA processing complexes such as the spliceosome, exosome and decapping complex, the ribosome, and the proteasome.
In this paper we address the question of whether there are any general principles concerning how the activity of protein complexes responds to changes in the expression of their subunits. Available global data in yeast show that reducing the expression of any subunit of a protein complex normally produces the same change in phenotype. However, we show here that this is not true for changes in phenotype resulting from increases in the expression of subunits, and this applies to both core and peripheral subunits of complexes. We propose the principle that the overall activity of a protein complex is normally robust to an increase, but not to a decrease, in the expression of its subunits. We highlight some of the implications of this principle for understanding the regulation and evolution of biological systems.
Genes that reduce fitness when under- but not over-expressed are enriched amongst protein complexes
Table: Protein complexes with essential functions are not enriched for subunits with over-expression phenotypes.
Column: Percentage of genes with over-expression phenotype (total number of genes)
Rows: All protein complex subunits; Complex with no essential subunits; Complex with at least one essential subunit; Complex with >= 25% essential subunits; Complex with >= 50% essential subunits; Essential subunits of protein complex
Genes with under- but not over-expression phenotypes cluster into individual protein complexes
Even if over-expressing a subunit of a protein complex does not in general disrupt the overall activity of the entire complex, it is still possible that a subset of protein complexes may be particularly sensitive to the over-expression of their subunits. To test this we investigated the distribution of genes with under- or over-expression phenotypes amongst complexes. For each phenotype we divided the protein complexes into ten evenly spaced bins according to the fraction of subunits associated with the phenotype. We then compared this distribution of phenotypes to that seen when the subunits are randomized amongst complexes.
To further confirm this conclusion we asked whether any individual protein complexes contain more subunits with over-expression phenotypes than expected by chance. To do this we randomised the assignment of subunits to protein complexes and, for each complex, counted the number of times it had the same or more subunits with an over-expression phenotype than seen with the real data. There are 9 complexes with more genes with over-expression phenotypes than are seen in 5% of randomisations, but none of these is significantly enriched for over-expression phenotypes after adjusting for multiple hypothesis testing (see Supplementary table 1 in Additional file 1, Benjamini-Hochberg false discovery rate, FDR = 5%). In contrast, there are 41 complexes with more essential genes than are seen in 5% of randomisations, and 17 of these complexes are still significantly enriched after adjusting for multiple hypothesis testing (see Supplementary table 2 in Additional file 1, FDR = 5%). Indeed, the complex most enriched for genes with over-expression phenotypes is the nucleosome complex, and here the over-expression phenotype may be related more to the disruption of the precise temporal regulation of histone expression during the cell cycle than to disruption of protein complex formation per se. Consistent with this, there is an overall enrichment for genes with over-expression phenotypes amongst cell cycle regulated genes (p = 0.037, Fisher's exact test).
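As an illustration of the kind of enrichment test quoted above, the following is a minimal Python sketch of a one-sided Fisher's exact test on a 2x2 table, computed from the hypergeometric distribution. The text does not state whether the reported test was one- or two-sided, so the one-sided form is an assumption; the function name and the toy counts are hypothetical and are not the data behind p = 0.037.

```python
from math import comb  # requires Python 3.8+


def fisher_exact_enrichment(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    a: genes in the set (e.g. cell cycle regulated) with the phenotype
    b: genes in the set without the phenotype
    c: genes outside the set with the phenotype
    d: genes outside the set without the phenotype
    Returns P(overlap >= a) under the hypergeometric null.
    """
    n_total = a + b + c + d
    n_set = a + b
    n_pheno = a + c
    denom = comb(n_total, n_pheno)
    max_overlap = min(n_set, n_pheno)
    p = 0.0
    # Sum the probabilities of all tables at least as enriched as the observed one.
    for k in range(a, max_overlap + 1):
        p += comb(n_set, k) * comb(n_total - n_set, n_pheno - k) / denom
    return p


# Toy counts (hypothetical): 3 of 4 set genes have the phenotype,
# against 1 of 4 genes outside the set.
print(fisher_exact_enrichment(3, 1, 1, 3))
```

A small p-value from this function would indicate that genes with the phenotype are over-represented in the set relative to a random draw of the same size.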
Thus we conclude that for protein complexes performing essential functions, inhibiting the expression of any subunit of a protein complex is likely to reduce the overall activity of that complex. In contrast, over-expressing any individual subunit of a protein complex does not normally inhibit the overall activity of that protein complex. This conclusion most likely applies to the vast majority of protein complexes in a eukaryotic cell.
Neither core nor peripheral subunits of protein complexes are enriched for genes with over-expression phenotypes
We conclude that complexes are often sensitive to a reduction in the expression of a subunit from any part of the complex, and that they are particularly sensitive to a partial reduction in the expression of isoform-specific subunits. These isoform-specific subunits are likely to be regulatory subunits (i.e. limiting the overall activity of a complex), which may explain this sensitivity. In contrast, there is no evidence that complexes are sensitive to the over-expression of any particular structural subclass of subunit. Our findings also do not support the previous prediction that the core subunits of protein complexes will be particularly sensitive to over-expression [13, 14].
A simple principle for the robustness of protein complex function and its implications for systems biology
In summary, we have shown that, in yeast, reducing the expression of any individual subunit of a protein complex that performs an essential function under laboratory conditions is likely to disrupt the function of that complex. In contrast, increasing the expression of any subunit generally has no effect on the overall activity of a complex. Both of these findings apply equally to core and isoform-specific subunits of protein complexes. Although the over-expression of some complex subunits does result in reduced growth, these phenotypes do not seem to be related to the disruption of the complex, with the possible exception of a very small number of complexes (~3).
This principle contrasts with previous predictions [13–15] and has several important implications for understanding the design principles and evolution of eukaryotic cells. Here we briefly highlight three implications of the principle: (1) the strategies a cell can use to regulate protein complex function, (2) the trajectories by which eukaryotes can evolve new proteins, and (3) how perturbations of gene expression in human disease can be connected to disease phenotypes.
First, according to the principle, reducing the expression of most subunits of a protein complex will down-regulate the activity of that complex. Therefore there are many alternative strategies available for reducing the activity of a protein complex by altering gene expression. This provides the cell with a very flexible and evolvable framework for regulating protein complex function. In contrast, to up-regulate the activity of a protein complex the cell must coordinately increase the expression levels of all of the subunits, unless the expression of a single subunit is limiting. Thus, in the absence of a limiting subunit, up-regulation of complex activity can be most easily achieved by up-regulating a trans-acting factor that regulates the expression of all of the subunits.
Second, the insensitivity of protein complex activity to the over-expression of subunits may have facilitated the evolution of novel protein complexes by gene duplication. Most protein complex subunits can probably be duplicated with little phenotypic effect, a situation that would not be true if over-expressing subunits more frequently disrupted the activity of complexes. Indeed, such a mechanism of protein complex subunit duplication has been very important in the evolution of new complexes and protein functions.
Finally, the principle also has practical implications for understanding the etiology of genetic disease in humans. The results we present here suggest that if a subunit of a protein complex is over-expressed or duplicated in a human disease, then any connection with the disease phenotype is unlikely to be due to an overall reduction in the activity of that complex. Moreover, the fact that genes with over-expression phenotypes do not cluster into protein complexes means that over-expression phenotypes probably cannot be predicted using a comprehensive map of human protein complexes, as is possible for loss-of-function phenotypes [20–22]. More sophisticated methods therefore need to be developed to predict the consequences of increases in gene expression levels.
769 genes that reduce fitness when they are over-expressed were identified by Sopko et al., who tested the phenotypes of 5280 strains each over-expressing a single yeast gene. 1010 essential genes were downloaded from the MIPS database. 184 haploinsufficient genes were identified in a genome-wide screen of heterozygous mutants grown in rich medium. 614 genes required for normal growth in rich media were identified by Giaever et al. As a high quality set of protein complexes we used the manually annotated set of MIPS protein complexes (downloaded from MIPS on 14 March 2007, removing one redundantly listed complex, complex 510.190.10.20.10). A second set of systematically identified protein complexes was taken from the data of Gavin et al., who classified subunits into cores (1148), modules (393) and attachments (959) of complexes. We used three alternative definitions of an "essential" protein complex: a complex for which at least one, at least 25%, or at least 50% of subunits have a nonviable deletion phenotype. Cell cycle regulated genes were identified by Spellman et al.
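The three alternative definitions of an "essential" complex can be illustrated with a minimal Python sketch. Only the thresholds (at least one, at least 25%, at least 50% of subunits essential) come from the text; the function name, the gene identifiers and the toy complex are hypothetical.

```python
def essential_classes(subunits, essential_genes):
    """Report which of the three 'essential complex' definitions a complex meets."""
    if not subunits:
        raise ValueError("a complex needs at least one subunit")
    n_essential = sum(1 for gene in subunits if gene in essential_genes)
    fraction = n_essential / len(subunits)
    return {
        "at_least_one": n_essential >= 1,
        "at_least_25_percent": fraction >= 0.25,
        "at_least_50_percent": fraction >= 0.50,
    }


# Toy example (hypothetical identifiers, not real data):
# one of the five subunits is essential, i.e. 20% of the complex.
essential = {"geneA", "geneB"}
complex_x = ["geneA", "geneC", "geneD", "geneE", "geneF"]
print(essential_classes(complex_x, essential))
```

Under these toy inputs the complex would count as essential only under the loosest definition, which shows why the three thresholds can give different sets of complexes.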
To compare the distribution of phenotypes amongst protein complexes to that expected by chance we divided the set of protein complexes into ten evenly spaced bins according to the percentage of tested subunits that shared each phenotype. We then randomized the assignment of subunits to protein complexes 100,000 times (keeping the distribution of complex sizes the same) to calculate the expected frequency of complexes in each bin. To identify bins significantly over- or under-represented for phenotypes we counted the number of randomizations in which the frequency of complexes in each bin was at least as extreme as that observed with the real data.
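The binning and randomization procedure described here can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used for the analysis: all function names and the toy data are hypothetical, a fraction of exactly 100% is placed in the top bin, subunit memberships are pooled and shuffled with multiplicity, only over-representation is tested, and the default number of randomizations is reduced from the 100,000 used in the paper to keep the example fast.

```python
import random

N_BINS = 10  # ten evenly spaced bins, as in the text


def bin_counts(complexes, phenotype_genes):
    """Count complexes per bin of the fraction of subunits with the phenotype."""
    counts = [0] * N_BINS
    for subunits in complexes:
        k = sum(1 for gene in subunits if gene in phenotype_genes)
        # Integer arithmetic avoids floating-point edge effects;
        # a fraction of exactly 1.0 goes into the last bin (assumption).
        idx = min(N_BINS * k // len(subunits), N_BINS - 1)
        counts[idx] += 1
    return counts


def randomize(complexes, rng):
    """Shuffle subunits amongst complexes, keeping every complex size fixed."""
    pool = [gene for subunits in complexes for gene in subunits]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for subunits in complexes:
        shuffled.append(pool[start:start + len(subunits)])
        start += len(subunits)
    return shuffled


def bin_enrichment(complexes, phenotype_genes, n_iter=1000, seed=0):
    """Return observed counts, expected counts and over-representation p-values."""
    rng = random.Random(seed)
    observed = bin_counts(complexes, phenotype_genes)
    totals = [0] * N_BINS
    at_least = [0] * N_BINS
    for _ in range(n_iter):
        counts = bin_counts(randomize(complexes, rng), phenotype_genes)
        for i in range(N_BINS):
            totals[i] += counts[i]
            if counts[i] >= observed[i]:
                at_least[i] += 1
    expected = [t / n_iter for t in totals]
    p_over = [a / n_iter for a in at_least]
    return observed, expected, p_over


# Toy data (hypothetical): the phenotype is concentrated in the first complex.
toy_complexes = [["a", "b", "c", "d"], ["e", "f", "g", "h"], ["i", "j"]]
toy_phenotype = {"a", "b", "c", "d"}
observed, expected, p_over = bin_enrichment(toy_complexes, toy_phenotype)
print(observed)
print(expected)
print(p_over)
```

An under-representation p-value could be obtained in the same way by counting randomizations in which the bin count is less than or equal to the observed count.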
To identify individual complexes significantly enriched for each phenotype we compared the number of subunits of each complex that share a phenotype to the frequencies seen in randomised complexes. To correct for multiple hypothesis testing we used the Benjamini-Hochberg method to identify those complexes enriched at a 5% false-discovery rate (FDR). When testing the association between over-expression phenotypes and protein complex subunits, we only considered complexes for which at least two subunits had been tested for over-expression phenotypes. Hence in this case the total number of complexes considered was 183 rather than 217. The percentages of genes with over-expression phenotypes represent the percentage of tested genes.
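The multiple-testing correction can be illustrated with a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure. The function name and the example p-values are hypothetical; in the analysis described above the inputs would be the per-complex p-values obtained from the randomizations.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking hypotheses rejected at the given FDR."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[i] = True
    return significant


# Hypothetical p-values for four complexes (not real data).
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.2]))
```

Because the procedure is step-up, a p-value that fails its own threshold can still be rejected when a larger p-value further down the ranking passes; this is why the cutoff rank is found first and applied afterwards.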
We thank Richelle Sopko for providing a complete list of genes represented on the yeast over-expression arrays. This work was funded by the EMBL-CRG Systems Biology Program, which is supported by a grant from the Spanish Ministry of Science and Education (Ministerio de Educación y Ciencia, MEC), and by the Institució Catalana de Recerca i Estudis Avançats (ICREA).
- Aloy P, Bottcher B, Ceulemans H, Leutwein C, Mellwig C, Fischer S, Gavin AC, Bork P, Superti-Furga G, Serrano L, Russell RB: Structure-based assembly of protein complexes in yeast. Science. 2004, 303 (5666): 2026-2029. 10.1126/science.1092645
- Bork P, Serrano L: Towards cellular systems in 4D. Cell. 2005, 121 (4): 507-509. 10.1016/j.cell.2005.05.001
- Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532
- Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrin-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MH, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670
- Gunsalus KC, Ge H, Schetter AJ, Goldberg DS, Han JD, Hao T, Berriz GF, Bertin N, Huang J, Chuang LS, Li N, Mani R, Hyman AA, Sonnichsen B, Echeverri CJ, Roth FP, Vidal M, Piano F: Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature. 2005, 436 (7052): 861-865. 10.1038/nature03876
- Bray D: Protein molecules as computational elements in living cells. Nature. 1995, 376 (6538): 307-312. 10.1038/376307a0
- Maciag K, Altschuler SJ, Slack MD, Krogan NJ, Emili A, Greenblatt JF, Maniatis T, Wu LF: Systems-level analyses identify extensive coupling among gene expression machines. Mol Syst Biol. 2006, 2: 2006 0003- 10.1038/msb4100045
- Hart GT, Lee I, Marcotte ER: A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality. BMC Bioinformatics. 2007, 8: 236- 10.1186/1471-2105-8-236
- Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G: Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 2005, 169 (4): 1915-1925. 10.1534/genetics.104.036871
- Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, Arkin AP, Astromoff A, El-Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian KD, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Guldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kotter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang CY, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418 (6896): 387-391. 10.1038/nature00935
- Sopko R, Huang D, Preston N, Chua G, Papp B, Kafadar K, Snyder M, Oliver SG, Cyert M, Hughes TR, Boone C, Andrews B: Mapping pathways and phenotypes by systematic gene overexpression. Mol Cell. 2006, 21 (3): 319-330. 10.1016/j.molcel.2005.12.011
- Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 3273-3297.
- Bray D, Lay S: Computer-based analysis of the binding steps in protein complex formation. Proc Natl Acad Sci U S A. 1997, 94 (25): 13493-13498. 10.1073/pnas.94.25.13493
- Veitia RA: Exploring the etiology of haploinsufficiency. Bioessays. 2002, 24 (2): 175-184. 10.1002/bies.10023
- Papp B, Pal C, Hurst LD: Dosage sensitivity and the evolution of gene families in yeast. Nature. 2003, 424 (6945): 194-197. 10.1038/nature01771
- de Lichtenberg U, Jensen LJ, Brunak S, Bork P: Dynamic complex formation during the yeast cell cycle. Science. 2005, 307 (5710): 724-727. 10.1126/science.1105103
- Pereira-Leal JB, Levy ED, Kamp C, Teichmann SA: Evolution of protein complexes by duplication of homomeric interactions. Genome Biol. 2007, 8 (4): R51- 10.1186/gb-2007-8-4-r51
- Stranger BE, Dermitzakis ET: From DNA to RNA to disease and back: the 'central dogma' of regulatory disease variation. Hum Genomics. 2006, 2 (6): 383-390.
- Beckmann JS, Estivill X, Antonarakis SE: Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nat Rev Genet. 2007, 8 (8): 639-646. 10.1038/nrg2149
- Lehner B, Fraser AG: A first-draft human protein-interaction map. Genome Biol. 2004, 5 (9): R63- 10.1186/gb-2004-5-9-r63
- Franke L, Bakel H, Fokkens L, de Jong ED, Egmont-Petersen M, Wijmenga C: Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes. Am J Hum Genet. 2006, 78 (6): 1011-1025. 10.1086/504300
- Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tumer Z, Pociot F, Tommerup N, Moreau Y, Brunak S: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol. 2007, 25 (3): 309-316. 10.1038/nbt1295
- Mewes HW, Frishman D, Mayer KF, Munsterkotter M, Noubibou O, Pagel P, Rattei T, Oesterheld M, Ruepp A, Stumpflen V: MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Res. 2006, 34 (Database issue): D169-72. 10.1093/nar/gkj148
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological). 1995, 57 (1): 289-300.