- Research article
- Open Access
A simple principle concerning the robustness of protein complex activity to changes in gene expression
BMC Systems Biology volume 2, Article number: 1 (2008)
The functions of a eukaryotic cell are largely performed by multi-subunit protein complexes that act as molecular machines or information processing modules in cellular networks. An important problem in systems biology is to understand how, in general, these molecular machines respond to perturbations.
In yeast, genes that inhibit growth when their expression is reduced are strongly enriched amongst the subunits of multi-subunit protein complexes. This applies to both the core and peripheral subunits of protein complexes, and the subunits of each complex normally have the same loss-of-function phenotypes. In contrast, genes that inhibit growth when their expression is increased are not enriched amongst the core or peripheral subunits of protein complexes, and the behaviour of one subunit of a complex is not predictive for the other subunits with respect to over-expression phenotypes.
We propose the principle that the overall activity of a protein complex is in general robust to an increase, but not to a decrease in the expression of its subunits. This means that whereas phenotypes resulting from a decrease in gene expression can be predicted because they cluster on networks of protein complexes, over-expression phenotypes cannot be predicted in this way. We discuss the implications of these findings for understanding how cells are regulated, how they evolve, and how genetic perturbations connect to disease in humans.
The proteome of a eukaryotic cell is largely organized as a collection of multi-subunit protein complexes [1–4]. These complexes are defined empirically by the stable association of their subunits during biochemical purification [3, 4] and act as molecular machines or information processing modules in cellular networks. For example, the many integrated complexes required for gene expression include the RNA polymerase complexes, chromatin remodeling complexes, RNA processing complexes such as the spliceosome, exosome and decapping complex, the ribosome, and the proteasome.
In this paper we address the question of whether there are any general principles concerning how the activity of protein complexes responds to changes in the expression of their subunits. Available global data in yeast show that reducing the expression of any subunit of a protein complex normally produces the same change in phenotype. However, we show here that this is not true for changes in phenotype resulting from increases in the expression of subunits, and this applies to both core and peripheral subunits of complexes. We propose the principle that the overall activity of a protein complex is normally robust to an increase, but not to a decrease, in the expression of its subunits. We highlight some of the implications of this principle for understanding the regulation and evolution of biological systems.
Genes that reduce fitness when under- but not over-expressed are enriched amongst protein complexes
Most essential functions of the eukaryotic cell are performed by multi-subunit protein complexes. As previously shown, genes with essential functions are enriched amongst the subunits of multi-protein complexes (Figure 1). This is also true for haploinsufficient genes (i.e. genes that reduce fitness when their dosage is reduced by half in heterozygotes) and for genes that cause slow growth when they are deleted (Figure 1). Thus inhibiting the expression of a subunit of a protein complex is very likely to disrupt the function of that complex. However, genes that slow growth when they are over-expressed (referred to here as genes with over-expression phenotypes) are not enriched amongst the subunits of protein complexes (Figure 1). This lack of enrichment could reflect the fact that many protein complexes are not essential for normal growth, so that perturbing their function will not result in a visible phenotype. However, we find that genes that reduce fitness when they are over-expressed are also not enriched amongst protein complexes that perform essential functions (Table 1), nor are they enriched amongst the subunits of protein complexes that are essential when deleted (Table 1). Thus over-expressing a subunit of an essential protein complex does not normally disturb its function.
Genes with under- but not over-expression phenotypes cluster into individual protein complexes
Even if over-expressing a subunit of a protein complex does not in general disrupt the overall activity of the entire complex, it is still possible that a subset of protein complexes may be particularly sensitive to the over-expression of their subunits. To test this we investigated the distribution of genes with under- or over-expression phenotypes amongst complexes. For each phenotype we divided the protein complexes into ten evenly spaced bins according to the fraction of subunits associated with the phenotype. We then compared this distribution of phenotypes to that seen when the subunits are randomized amongst complexes.
As shown in Figure 2, genes with under-expression phenotypes (essential genes, haploinsufficient genes and genes required for normal growth) cluster into particular protein complexes. For example, 44 complexes have >90% essential subunits compared to 13 expected by chance, and for all phenotypes arising from decreased gene expression there are many more complexes with no genes having that phenotype than expected by chance. In contrast, for genes that reduce fitness when they are over-expressed, only two bins contain more complexes than expected by chance: one complex has 80–90% of tested subunits with an over-expression phenotype (compared to 0.01 expected, p = 0.006) and 5 complexes have >90% of tested subunits with an over-expression phenotype (1.54 expected, p = 0.02). Thus only a few complexes (~3/183) contain more subunits that are toxic when over-expressed than expected by chance. For the vast majority of complexes the distribution of genes with over-expression phenotypes does not differ from that expected by chance.
To further confirm this conclusion we asked whether any individual protein complexes contain more subunits with over-expression phenotypes than expected by chance. To do this we randomised the assignment of subunits to protein complexes and for each complex counted the number of times it had the same or more subunits with an over-expression phenotype than seen with the real data. There are 9 complexes with more genes with over-expression phenotypes than in 5% of randomisations, but none of these are significantly enriched for over-expression phenotypes after adjusting for multiple hypothesis testing (see Supplementary table 1 in Additional file 1, Benjamini-Hochberg false discovery rate, FDR = 5%). In contrast, there are 41 complexes with more essential genes than are seen in 5% of randomisations, and 17 of these complexes are still significantly enriched after adjusting for multiple hypothesis testing (see Supplementary table 2 in Additional file 1, FDR = 5%). Indeed, the complex most enriched for genes with over-expression phenotypes is the nucleosome complex, and here the over-expression phenotype may be related more to the disruption of the precise temporal regulation of histone expression during the cell cycle than to disruption of protein complex formation per se. Consistent with this, there is an overall enrichment for genes with over-expression phenotypes amongst cell cycle regulated genes (p = 0.037, Fisher's exact test).
Thus we conclude that for protein complexes performing essential functions, inhibiting the expression of any subunit of a protein complex is likely to reduce the overall activity of that complex. In contrast, over-expressing any individual subunit of a protein complex does not normally inhibit the overall activity of that protein complex. This conclusion most likely applies to the vast majority of protein complexes in a eukaryotic cell.
Neither core nor peripheral subunits of protein complexes are enriched for genes with over-expression phenotypes
Previously it has been suggested that subunits that form the structural core of a protein complex might be particularly sensitive to alterations in expression level [13, 14]. Therefore we tested whether subunits with under- or over-expression phenotypes are enriched amongst the core or peripheral/isoform-specific subunits of protein complexes. In a genome-wide study of protein complexes identified by tandem affinity purification, Gavin et al. identified a total of 491 complexes and classified their subunits as "core" – those present in most complex isoforms, "attachment" – those present only in some isoforms, and "modules" – two or more attachment proteins that tended to occur together in different complexes. As shown in Figure 3, there is no difference between the percentage of genes with over-expression phenotypes in cores, modules, or attachments when compared with yeast genes in general. In contrast, subunits with essential or haploinsufficient phenotypes are significantly enriched among all three types of subunit (p < 0.0001, Fisher's exact test). The same result is seen when only considering genes that fall exclusively within each classification, except that haploinsufficient genes are only enriched amongst attachments (Figure 3).
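The enrichment tests referred to above can be illustrated with a minimal sketch of a one-sided Fisher's exact test, written here with the Python standard library only. The function name and its arguments are hypothetical, and the choice of a one-sided test is an assumption, since the text does not state whether the reported p-values are one- or two-sided.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for enrichment (hypergeometric upper tail).

    N: total number of genes considered
    K: genes in that total with the phenotype (e.g. essential)
    n: genes in the class of interest (e.g. core subunits)
    k: genes in the class that also have the phenotype
    Returns P(X >= k) under random assignment of the phenotype.
    """
    upper = min(n, K)
    # Sum the probabilities of all overlaps at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

For example, `fisher_enrichment_p(k, n, K, N)` would be called once per subunit class (cores, modules, attachments) with the counts for the phenotype being tested; the actual counts would have to come from the data sets described in the Methods.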
We conclude that complexes are often sensitive to reduction of a subunit from any part of the complex, and that isoform-specific subunits are particularly sensitive to a partial reduction in the expression of a subunit. These isoform-specific subunits are likely to be regulatory subunits (i.e. limiting the overall activity of a complex) and so may be particularly sensitive to a reduction in expression. In contrast there is no evidence that complexes are sensitive to the over-expression of any particular structural subclass of subunit. Our findings also do not support the previous prediction that the core subunits of protein complexes will be particularly sensitive to over-expression [13, 14].
A simple principle for the robustness of protein complex function and its implications for systems biology
In summary, we have shown that, in yeast, reducing the expression of any individual subunit of a protein complex that performs an essential function under laboratory conditions is likely to disrupt the function of that complex. In contrast, increasing the expression of any subunit generally has no effect on the overall activity of a complex. Both of these findings apply equally to core and isoform-specific subunits of protein complexes. Although the over-expression of some complex subunits does result in reduced growth, these phenotypes do not seem to be related to the disruption of the complex, with the possible exception of a very small number of complexes (~3).
Therefore we propose the following principle concerning the robustness of protein complex function to alterations in gene expression (Figure 4): protein complex activity in eukaryotic cells is in general robust to an increase, but not to a decrease in the expression levels of individual subunits. This may reflect either an overall insensitivity of protein complex assembly and activity to the over-expression of subunits or that the cell encodes active mechanisms for degrading subunits produced in excess.
This principle contrasts with previous predictions [13–15] and has several important implications for understanding the design principles and evolution of eukaryotic cells. Here we briefly highlight three implications of the principle: (1) the strategies a cell can use to regulate protein complex function, (2) the trajectories by which eukaryotes can evolve new proteins, and (3) how perturbations of gene expression in human disease can be connected to disease phenotypes.
First, according to the principle, reducing the expression of most subunits of a protein complex will down-regulate the activity of that complex. Therefore there are many alternative strategies available for reducing the activity of a protein complex by altering gene expression. This provides the cell with a very flexible and evolvable framework for regulating protein complex function. In contrast, to up-regulate the activity of a protein complex the cell must coordinately increase the expression levels of all of the subunits, unless the expression of a single subunit is limiting. Thus, in the absence of a limiting subunit, up-regulation of complex activity can be most easily achieved by up-regulating a trans-acting factor that regulates the expression of all of the subunits.
Second, the insensitivity of protein complex activity to the over-expression of subunits may have facilitated the evolution of novel protein complexes by gene duplication. Most protein complex subunits can probably be duplicated with little phenotypic effect, a situation that would not be true if over-expressing subunits more frequently disrupted the activity of complexes. Indeed such a mechanism of protein complex subunit duplication has been very important in the evolution of new complexes and protein functions.
Finally, the principle also has practical implications for understanding the etiology of genetic disease in humans. The results we present here suggest that if a subunit of a protein complex is over-expressed or duplicated in a human disease, then any connection with the disease phenotype is unlikely to be due to an overall reduction in the activity of that complex. Moreover, the fact that genes with over-expression phenotypes do not cluster into protein complexes means that over-expression phenotypes probably cannot be predicted using a comprehensive map of human protein complexes as is possible for loss-of-function phenotypes [20–22]. More sophisticated methods therefore need to be developed to predict the consequences of increases in gene expression levels.
769 genes that reduce fitness when they are over-expressed were identified by Sopko et al., who tested the phenotypes of 5280 strains each over-expressing a single yeast gene. 1010 essential genes were downloaded from the MIPS database. 184 haploinsufficient genes were identified in a genome-wide screen of heterozygous mutants grown in rich medium. 614 genes required for normal growth in rich media were identified by Giaever et al. As a high quality set of protein complexes we used the manually annotated set of MIPS protein complexes (downloaded from MIPS on 14 March 2007, removing one redundantly listed complex, complex 510.190.10.20.10). A second set of systematically identified protein complexes was taken from the data of Gavin et al., who classified subunits into cores (1148), modules (393) and attachments (959) of complexes. We used three alternative definitions of an "essential" protein complex – a complex for which at least one, or at least 25% or 50% of subunits have a nonviable deletion phenotype. Cell cycle regulated genes were identified by Spellman et al.
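The three alternative definitions of an "essential" complex can be written down as a short sketch. The function name, the dictionary keys and the gene identifiers in the usage note are hypothetical; only the three thresholds (at least one subunit, at least 25%, at least 50%) come from the text.

```python
def classify_essential_complex(subunits, essential_genes):
    """Apply the three alternative 'essential complex' definitions.

    subunits: iterable of gene identifiers belonging to one complex
    essential_genes: set of genes with a nonviable deletion phenotype
    """
    members = set(subunits)
    hits = len(members & set(essential_genes))
    size = len(members)
    return {
        "at_least_one": hits >= 1,
        # Integer comparisons avoid floating-point issues at the thresholds.
        "at_least_25_percent": size > 0 and 4 * hits >= size,
        "at_least_50_percent": size > 0 and 2 * hits >= size,
    }
```

For a hypothetical four-subunit complex with a single essential subunit, `classify_essential_complex(["g1", "g2", "g3", "g4"], {"g1"})` satisfies the first two definitions but not the third.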
To compare the distribution of phenotypes amongst protein complexes to that expected by chance we divided the set of protein complexes into ten evenly spaced bins according to the percentage of tested subunits that shared each phenotype. We then randomized the assignment of subunits to protein complexes 100,000 times (but keeping the distribution of complex sizes the same) to calculate the expected frequency of complexes in each bin. To identify bins significantly over- or under-represented for phenotypes we counted the number of times the real enrichments for each bin were seen in the randomizations.
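The binning and randomization procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: all function names are hypothetical, a complex with exactly 90% of subunits is placed in the top bin (the text does not specify how bin edges are handled), and subunits are reshuffled from a single pooled list, so a gene that belongs to several complexes may occasionally be drawn twice into the same randomized complex.

```python
import random

def bin_counts(complexes, phenotype_genes, n_bins=10):
    """Count complexes per bin of 'fraction of subunits with the phenotype'."""
    counts = [0] * n_bins
    for subunits in complexes:
        hits = sum(gene in phenotype_genes for gene in subunits)
        # Integer arithmetic keeps bin edges exact; 100% goes in the top bin.
        index = min(hits * n_bins // len(subunits), n_bins - 1)
        counts[index] += 1
    return counts

def shuffle_subunits(complexes, rng):
    """Reassign subunits at random while keeping every complex size unchanged."""
    pool = [gene for subunits in complexes for gene in subunits]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for subunits in complexes:
        shuffled.append(pool[start:start + len(subunits)])
        start += len(subunits)
    return shuffled

def expected_bin_counts(complexes, phenotype_genes, n_iter=1000, n_bins=10, seed=0):
    """Return observed counts, mean randomized counts, and the fraction of
    randomizations with at least as many complexes in each bin as observed."""
    rng = random.Random(seed)
    observed = bin_counts(complexes, phenotype_genes, n_bins)
    totals = [0] * n_bins
    at_least = [0] * n_bins
    for _ in range(n_iter):
        simulated = bin_counts(shuffle_subunits(complexes, rng), phenotype_genes, n_bins)
        for b in range(n_bins):
            totals[b] += simulated[b]
            at_least[b] += simulated[b] >= observed[b]
    expected = [t / n_iter for t in totals]
    p_over = [a / n_iter for a in at_least]
    return observed, expected, p_over
```

The default of 1000 iterations is chosen only to keep the sketch fast; the analysis described in the text used 100,000 randomizations.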
To identify individual complexes significantly enriched for each phenotype we compared the number of subunits of each complex that share a phenotype to the frequencies seen in randomised complexes. To correct for multiple hypothesis testing we used the Benjamini-Hochberg method to identify those complexes enriched at a 5% false-discovery rate (FDR). When testing the association between over-expression phenotypes and protein complex subunits, we only considered complexes for which at least two subunits had been tested for over-expression phenotypes. Hence in this case the total number of complexes considered was 183 rather than 217. The percentages of genes with over-expression phenotypes represent the percentage of tested genes.
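The Benjamini-Hochberg step-up procedure used for the multiple testing correction can be sketched in a few lines. The function name is hypothetical and the input is assumed to be one empirical p-value per complex, obtained from the randomizations; the p-values in the usage note are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a hypothesis is rejected at the given FDR."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is below its rank-scaled threshold.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            max_rank = rank
    # Step-up rule: reject every hypothesis ranked at or below that rank.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected
```

With the illustrative input `[0.001, 0.03, 0.035, 0.9]`, the first three hypotheses are rejected: the second p-value exceeds its own threshold, but it is still rejected because a higher-ranked p-value passes, which is the defining feature of the step-up rule.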
Aloy P, Bottcher B, Ceulemans H, Leutwein C, Mellwig C, Fischer S, Gavin AC, Bork P, Superti-Furga G, Serrano L, Russell RB: Structure-based assembly of protein complexes in yeast. Science. 2004, 303 (5666): 2026-2029. 10.1126/science.1092645
Bork P, Serrano L: Towards cellular systems in 4D. Cell. 2005, 121 (4): 507-509. 10.1016/j.cell.2005.05.001
Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532
Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrin-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MH, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670
Gunsalus KC, Ge H, Schetter AJ, Goldberg DS, Han JD, Hao T, Berriz GF, Bertin N, Huang J, Chuang LS, Li N, Mani R, Hyman AA, Sonnichsen B, Echeverri CJ, Roth FP, Vidal M, Piano F: Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature. 2005, 436 (7052): 861-865. 10.1038/nature03876
Bray D: Protein molecules as computational elements in living cells. Nature. 1995, 376 (6538): 307-312. 10.1038/376307a0
Maciag K, Altschuler SJ, Slack MD, Krogan NJ, Emili A, Greenblatt JF, Maniatis T, Wu LF: Systems-level analyses identify extensive coupling among gene expression machines. Mol Syst Biol. 2006, 2: 2006 0003- 10.1038/msb4100045
Hart GT, Lee I, Marcotte ER: A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality. BMC Bioinformatics. 2007, 8: 236- 10.1186/1471-2105-8-236
Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, Davis RW, Nislow C, Giaever G: Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 2005, 169 (4): 1915-1925. 10.1534/genetics.104.036871
Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, Arkin AP, Astromoff A, El-Bakkoury M, Bangham R, Benito R, Brachat S, Campanaro S, Curtiss M, Davis K, Deutschbauer A, Entian KD, Flaherty P, Foury F, Garfinkel DJ, Gerstein M, Gotte D, Guldener U, Hegemann JH, Hempel S, Herman Z, Jaramillo DF, Kelly DE, Kelly SL, Kotter P, LaBonte D, Lamb DC, Lan N, Liang H, Liao H, Liu L, Luo C, Lussier M, Mao R, Menard P, Ooi SL, Revuelta JL, Roberts CJ, Rose M, Ross-Macdonald P, Scherens B, Schimmack G, Shafer B, Shoemaker DD, Sookhai-Mahadeo S, Storms RK, Strathern JN, Valle G, Voet M, Volckaert G, Wang CY, Ward TR, Wilhelmy J, Winzeler EA, Yang Y, Yen G, Youngman E, Yu K, Bussey H, Boeke JD, Snyder M, Philippsen P, Davis RW, Johnston M: Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002, 418 (6896): 387-391. 10.1038/nature00935
Sopko R, Huang D, Preston N, Chua G, Papp B, Kafadar K, Snyder M, Oliver SG, Cyert M, Hughes TR, Boone C, Andrews B: Mapping pathways and phenotypes by systematic gene overexpression. Mol Cell. 2006, 21 (3): 319-330. 10.1016/j.molcel.2005.12.011
Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 3273-3297.
Bray D, Lay S: Computer-based analysis of the binding steps in protein complex formation. Proc Natl Acad Sci U S A. 1997, 94 (25): 13493-13498. 10.1073/pnas.94.25.13493
Veitia RA: Exploring the etiology of haploinsufficiency. Bioessays. 2002, 24 (2): 175-184. 10.1002/bies.10023
Papp B, Pal C, Hurst LD: Dosage sensitivity and the evolution of gene families in yeast. Nature. 2003, 424 (6945): 194-197. 10.1038/nature01771
de Lichtenberg U, Jensen LJ, Brunak S, Bork P: Dynamic complex formation during the yeast cell cycle. Science. 2005, 307 (5710): 724-727. 10.1126/science.1105103
Pereira-Leal JB, Levy ED, Kamp C, Teichmann SA: Evolution of protein complexes by duplication of homomeric interactions. Genome Biol. 2007, 8 (4): R51- 10.1186/gb-2007-8-4-r51
Stranger BE, Dermitzakis ET: From DNA to RNA to disease and back: the 'central dogma' of regulatory disease variation. Hum Genomics. 2006, 2 (6): 383-390.
Beckmann JS, Estivill X, Antonarakis SE: Copy number variants and genetic traits: closer to the resolution of phenotypic to genotypic variability. Nat Rev Genet. 2007, 8 (8): 639-646. 10.1038/nrg2149
Lehner B, Fraser AG: A first-draft human protein-interaction map. Genome Biol. 2004, 5 (9): R63- 10.1186/gb-2004-5-9-r63
Franke L, Bakel H, Fokkens L, de Jong ED, Egmont-Petersen M, Wijmenga C: Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes. Am J Hum Genet. 2006, 78 (6): 1011-1025. 10.1086/504300
Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tumer Z, Pociot F, Tommerup N, Moreau Y, Brunak S: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol. 2007, 25 (3): 309-316. 10.1038/nbt1295
Mewes HW, Frishman D, Mayer KF, Munsterkotter M, Noubibou O, Pagel P, Rattei T, Oesterheld M, Ruepp A, Stumpflen V: MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Res. 2006, 34 (Database issue): D169-72. 10.1093/nar/gkj148
Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological). 1995, 57 (1): 289-300.
We thank Richelle Sopko for providing a complete list of genes represented on the yeast over-expression arrays. This work was funded by the EMBL-CRG Systems Biology Program, which is supported by a grant from the Spanish Ministry of Science and Education (Ministerio de Educación y Ciencia, MEC), and by the Institució Catalana de Recerca i Estudis Avançats (ICREA).
JIS, TV and BL analyzed the data and wrote the paper. BL conceived the study. All authors read and approved the manuscript.
Jennifer I Semple and Tanya Vavouri contributed equally to this work.