Optimality in regulated gene expression
Following a recent study [4], we describe the growth rate of a population of E. coli cells as a function of the expression of metabolic genes and the carbon source in the environment, in terms of the cost and benefit of gene expression:
$$ g = g_0 - \eta(Z) + B(Z, L) \qquad (1) $$
where g_0 is the basal growth rate, set by compounds other than lactose in the environment. η(Z) is the decrease of growth rate due to the metabolic burden of producing lac operon gene products LacZ, LacY, and LacA [11, 12]. B(Z, L) is the growth advantage due to lactose metabolism, which depends on both the expression level, Z, of the lac gene products (in particular LacZ), and the concentration of lactose in the environment, L. As both η(Z) and B(Z, L) are increasing functions of the expression level Z, equation (1) predicts that for each concentration of lactose in the medium there will be an optimum expression level Z = Zopt(L) where benefit minus cost is maximal. Which expression level is optimal depends on the properties of the regulated proteins, such as their Michaelis-Menten kinetics and transport properties.
As we focus here on the adaptation of the regulated lac operon expression, we explicitly incorporate the dependency of the expression level on the lactose concentration by substituting Z = Zreg(L), which yields
$$ g(L) = g_0 - \eta(Z_{\mathrm{reg}}(L)) + B(Z_{\mathrm{reg}}(L), L) \qquad (2) $$
where Zreg describes the system's regulatory properties and is referred to as the regulatory response or induction profile. Now a gene regulatory system can be said to be optimally adapted if it satisfies
$$ Z_{\mathrm{reg}}(L) = Z_{\mathrm{opt}}(L) \qquad (3) $$
which implies that the regulatory system establishes a connection between the inductive properties and the catabolic payoff of lactose. At low levels of lactose, where the cost term will dominate the benefit term, the optimal expression level will be low or zero. Conversely, at high lactose concentrations the optimal expression level will be high. It is important to note that this criterion for regulatory optimality only concerns the relation between expression levels and catabolite concentrations. The regulatory system may also be subject to optimization for other properties, such as response times [14], structural architecture [15], or robustness to mutation [16] or to protein number fluctuations [17, 18].
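The existence of an optimum expression level can be illustrated numerically. The following is a minimal sketch, not the model fitted in this work: the functional forms of the cost and benefit (a convex cost that diverges at a maximal expression level, and a benefit that is linear in Z and saturates in L) and all parameter values are assumptions made only for illustration.

```python
import numpy as np

# All parameter values and functional forms below are hypothetical,
# chosen for illustration only; they are not fitted to any data.
G0 = 1.0       # basal growth rate g_0 (arbitrary units)
ETA0 = 0.02    # cost per unit expression at low expression
Z_CAP = 2.0    # expression level at which the assumed cost diverges
DELTA = 0.17   # maximal benefit per unit expression
K_L = 0.4      # lactose concentration giving half-maximal benefit

Z_GRID = np.linspace(0.0, 1.9, 1901)  # candidate expression levels, below Z_CAP


def cost(z):
    """Assumed burden eta(Z): increasing and convex in Z."""
    return ETA0 * z / (1.0 - z / Z_CAP)


def benefit(z, lactose):
    """Assumed benefit B(Z, L): linear in Z, saturating in L."""
    return DELTA * z * lactose / (K_L + lactose)


def growth(z, lactose):
    """Equation (1): basal growth minus cost plus benefit."""
    return G0 - cost(z) + benefit(z, lactose)


def optimal_expression(lactose):
    """Grid search for the expression level that maximizes growth at a given L."""
    return float(Z_GRID[np.argmax(growth(Z_GRID, lactose))])


if __name__ == "__main__":
    for lactose in (0.0, 0.05, 0.5, 5.0):
        print(lactose, optimal_expression(lactose))
```

Under these assumed forms the optimum stays at zero expression until the marginal benefit exceeds the marginal cost, and then rises with the lactose concentration, which is the qualitative behavior described for Zopt(L). A regulatory response satisfying relation (3) would track this curve.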
Optimality for decoupled lac regulation
When organisms are challenged by a new environment, they may perform sub-optimally and undergo selection towards a new phenotypic optimum. One method to establish such directional selection in controlled experiments would be to genetically modify the regulatory response or the downstream regulated genes. Another approach, which we introduce here, is to decouple inducer and carbon source and allow the regulatory system to adapt to a new relation between the two. An advantage of this approach is that a selective pressure can be applied to the wild-type lac operon, without requiring modification of the regulatory system.
For the lac system, a large number of artificial compounds have been synthesized [13] that interact with the gene products in a different way than lactose. The decoupling between lac signaling and metabolism can be achieved by using isopropyl-β-D-thiogalactopyranoside (IPTG) and phenyl-β-D-galactoside (Pgal). IPTG is a gratuitous inducer; it binds to the lac repressor and relieves repression, but cannot be hydrolyzed by β-galactosidase (LacZ). Pgal, on the other hand, does not act as an inducer, but is hydrolyzed by LacZ, releasing galactose (for further metabolism) and phenol. Now the optimality relation reads
$$ Z_{\mathrm{reg}}(I) = Z_{\mathrm{opt}}(P) \qquad (4) $$
where I and P (the IPTG and Pgal concentrations in the environment) are independent variables. Relation (4) states that for each Pgal concentration P there is an IPTG concentration I that achieves an optimum expression level. In the same vein, IPTG concentrations exist that yield suboptimal expression levels. In the latter case, the original regulatory response may evolve to one that does achieve the optimal expression level. Note that Zopt(P) may also incur evolutionary changes, which would correspond to β-galactosidase optimizing to Pgal metabolism. With inducer and carbon source decoupled, the equation for the growth rate (2) now reads:
$$ g(I, P) = g_0 - \eta(Z_{\mathrm{reg}}(I)) + B(Z_{\mathrm{reg}}(I), P) \qquad (5) $$
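To make the decoupled relation concrete, the sketch below evaluates a growth surface g(I, P) and locates the growth-maximizing IPTG concentration for a given Pgal concentration. Only the basal growth rate of 1.09 generations per hour is taken from the text; the Hill-type induction curve, the cost and benefit forms, and every other parameter value are hypothetical and serve only to illustrate the structure of the calculation.

```python
import numpy as np

G0 = 1.09  # basal growth rate in generations per hour (value quoted in the text)

IPTG_GRID = np.linspace(0.0, 1000.0, 2001)  # candidate IPTG concentrations (uM)


def z_reg(iptg_um, z_min=0.01, z_max=1.0, k_half=20.0, hill=2.0):
    """Hypothetical Hill-type induction profile Z_reg(I), in relative units."""
    x = (iptg_um / k_half) ** hill
    return z_min + (z_max - z_min) * x / (1.0 + x)


def growth(iptg_um, pgal_um, eta0=0.1, z_cap=1.5, delta=1.0, k_p=300.0):
    """Decoupled growth rate: cost depends on I only, benefit on I and P."""
    z = z_reg(iptg_um)
    cost = eta0 * z / (1.0 - z / z_cap)                # assumed convex cost
    benefit = delta * z * pgal_um / (k_p + pgal_um)    # assumed saturating benefit
    return G0 - cost + benefit


def best_iptg(pgal_um):
    """IPTG concentration on the grid that maximizes growth at a given Pgal level."""
    return float(IPTG_GRID[np.argmax(growth(IPTG_GRID, pgal_um))])


if __name__ == "__main__":
    for pgal in (0.0, 100.0, 500.0):
        print(pgal, best_iptg(pgal))
```

With these assumed forms, no induction is optimal in the absence of Pgal, and the optimal inducer concentration increases with the Pgal concentration. Any IPTG concentration other than the optimum for a given Pgal level corresponds to a suboptimal expression level in the sense of relation (4).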
We determined growth rates g(I, P) of Escherichia coli MG1655 cells (termed 'wild-type' hereafter) [19] carrying the lac operon, as a function of IPTG and Pgal (Figure 1b and 2). These two compounds are added to a casamino acids minimal medium, which confers a basal growth rate g_0 of 1.09 generations per hour. A wild-type induction profile, measured using the fluorogenic substrate FDG (see Materials and Methods), is shown in Figure 1c. Note that we observed expression decreases at higher concentrations of Pgal, which can be explained by the competitive binding of Pgal to the inducer binding site of the repressor (see Additional file 1). The absence of this anti-induction effect at lower Pgal concentrations is consistent with the known affinities, which are much higher for IPTG than for Pgal (a K_D of 1 × 10^-6 M versus 1 × 10^-3 M [20]). All experiments presented hereafter were performed in this low Pgal regime.
The growth data (Figure 1b and 2) shows that in the absence of Pgal (where basic growth is supported by casamino acids), induction results in a decrease of the growth rate. This suggests that expression of the operon withdraws resources that could otherwise be used for cell growth. At full induction ([IPTG] > 200 μM), this cost to lac operon expression results in a reduction in growth rate of about 0.2 doublings per hour. The addition of Pgal has the opposite effect on growth rate. Increasing concentrations of Pgal increase the growth rate, initially compensating for the protein expression costs, and eventually resulting in an overall growth rate increase of up to 1.6 doublings per hour. These growth rate increases indicate a benefit of lac operon expression originating from Pgal metabolism.
The total fitness or growth rate, or the expression benefits minus the costs, achieves a maximum value at a certain optimal inducer concentration, as can be seen directly in the measured data (Figure 1b). In the absence of Pgal, it is optimal to have no induction. At 0.10 mM and 0.24 mM Pgal, the growth rate is maximized for IPTG concentrations near 5 μM and 30 μM respectively. For higher Pgal concentrations the maximum observed growth rates are at inducer levels of 200 μM or higher.
We determined the optimal expression levels, Zopt(P), using the smoothed growth data (Figure 2) and the induction profiles that were separately measured for different concentrations of Pgal (Additional file 1). This optimality relation is given in Figure 3 (black curve), together with the optimal Pgal concentrations (solid circles) as obtained directly from the growth data in Figure 1. Although the optimal expression level shows a sharp Pgal dependence, this does not necessarily imply that selection on expression is strong. On the contrary, the inflection point of Figure 3 lies at a Pgal concentration of ~150 μM, and at this Pgal concentration the fitness landscape in fact shows a weak dependence on expression (Figure 2). This suggests that at these Pgal concentrations, suboptimal expression will result only in weak selective pressures.
The cost and benefit in our system were modeled by equation (5). Because in our system induction and metabolic properties are separated, we adjusted the model to include IPTG induction and anti-induction for high concentrations of Pgal (Additional file 1), using independent measurements of the expression levels of LacZ. At high IPTG concentrations (Figure 4, for 220 μM IPTG), the model and data show a quantitative agreement, with the model accurately predicting that cost and benefit balance at a Pgal concentration of 120 μM. However, the model does not describe the data quantitatively for lower IPTG concentrations up to 5 μM (Additional file 1): this regime shows only a marginal rise in expression levels (Figure 1b), and hence only a marginal increase of cost and benefit terms is predicted by the model, which contrasts with the measured cost and benefit that show significant increases (Figure 1a). The observed discrepancies may indicate that cost and benefit exhibit a steeper dependence on operon expression than assumed in current models.
Evolution in constant environments
We performed serial dilution experiments in a number of constant environments with different concentrations of IPTG and Pgal, as indicated schematically in Figure 5. For each condition, a 10 ml culture was grown and diluted twice daily 300-500 fold for a total of ~800 generations. Every week a sample of each culture was stored at -80°C to preserve snapshots of its evolutionary history. The LacZ activity of the adapting populations was determined for different time points during the experiment.
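The relation between the dilution regime and the number of generations can be checked with a short calculation. The sketch below assumes that each culture regrows to its pre-dilution density between transfers, so that a D-fold dilution corresponds to log2(D) generations; this assumption and the function names are ours, not part of the protocol.

```python
import math


def generations_per_transfer(dilution_factor):
    """Doublings needed to regrow after a dilution, assuming full regrowth."""
    return math.log2(dilution_factor)


def days_needed(total_generations, dilution_factor, transfers_per_day=2):
    """Days of serial transfer required to accumulate a number of generations."""
    per_day = transfers_per_day * generations_per_transfer(dilution_factor)
    return total_generations / per_day


if __name__ == "__main__":
    for factor in (300, 500):
        print(factor, generations_per_transfer(factor), days_needed(800, factor))
```

Under this assumption, a 300-500 fold dilution gives roughly 8.2 to 9.0 generations per transfer, so about 800 generations would be reached after roughly 45 to 49 days of twice-daily transfers.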
The induced and uninduced operon expression levels during the adaptation experiments are displayed in Figure 6. We first consider the environments with a high carbon source concentration (350 μM Pgal) and low induction (0 and 2 μM IPTG, Figure 6a and 6e). The uninduced expression for both experiments evolved to high levels that agree with predictions based on the optimality curve Zopt(P) (Figure 3). The induced expression state, which did not undergo selection, did not change and was maintained at the wild-type levels.
The two experiments (Figure 6a and 6e) showed differences in the evolutionary dynamics. The population grown without IPTG reached its optimal expression level in ~200 generations (Figure 6a). Notably, a second replicate experiment performed at this condition (squares and triangles in Figure 6a) is indistinguishable in its dynamics. The population grown at 2 μM IPTG (Figure 6e) reaches its final expression level only after more than 600 generations. If both traces are fitted with a simple competition model (assuming a single mutant fixation and a sufficiently high mutation rate to be able to neglect stochasticity due to bottlenecking the population, see Additional file 1), we find that the selection coefficient of the population growing without IPTG is more than 4 times larger than that of the population at 2 μM IPTG (s = 0.055 versus 0.013). Although one expects the selection coefficient to decrease for increasing concentrations of IPTG, the observed large difference between 0 and 2 μM IPTG is remarkable given the small wild-type expression differences for these IPTG concentrations (Figure 1b). However, Figure 1a shows that wild-type cells grown at a Pgal concentration of 0.5 mM already realize more than half of their expression benefit at 5 μM IPTG. Consequently, the additional selective advantage of abolishing repression is lowered correspondingly.
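How a selection coefficient translates into takeover dynamics can be sketched with a standard two-type competition model, in which the mutant frequency follows a logistic curve. This is an assumption for illustration and not necessarily the model of Additional file 1; the initial mutant frequency `x0` is hypothetical, and s is taken to be expressed per generation.

```python
import math


def mutant_frequency(generation, s, x0):
    """Logistic takeover: frequency of a mutant with selective advantage s."""
    grown = x0 * math.exp(s * generation)
    return grown / (1.0 - x0 + grown)


def generations_to_reach(x_target, s, x0):
    """Invert the logistic curve: generations needed to reach frequency x_target."""
    return math.log(x_target * (1.0 - x0) / (x0 * (1.0 - x_target))) / s


if __name__ == "__main__":
    x0 = 1e-4  # hypothetical initial mutant frequency
    for s in (0.055, 0.013):  # selection coefficients quoted in the text
        print(s, generations_to_reach(0.5, s, x0))
```

For a fixed initial frequency, the time to reach any given mutant frequency scales as 1/s in this model, so the two quoted selection coefficients imply takeover times that differ by a factor of 0.055/0.013, or about 4.2, independent of the assumed `x0`.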
Figure 6b shows the evolutionary trace of a culture grown at a high carbon source concentration (350 μM Pgal) and high induction (220 μM IPTG). No significant adaptation was observed, which is consistent with the measured costs and benefits that predict near optimal growth rates for these conditions (Figure 2). When fully induced, the regulatory system is in principle free to lose regulation by neutral drift: mutations that deactivate the repressor do not affect the growth rate. Since mutations that restore repressor function are in general much less likely to occur, in the long run repressor null mutants may fix in the population. However, the expected time for this to occur is on the order of 1/μ generations [21], where μ is the mutation rate towards lacI⁻ mutants, which is ~1 × 10^-6 per cell per generation [22]. If repressor deactivation is neutral, fixation would therefore only be expected after ~1 × 10^6 generations. Interestingly, a null mutation in the promoter controlling the transcription of the repressor may actually be selectively favored, since it should reduce the cost associated with the production of repressor protein. However, given the low amount of repressor protein (10-20 according to [23]) compared to the other lac gene products (induced LacZ expression is on the order of 1 × 10^4 per cell [24, 25]), we expect that the selection coefficients associated with the loss of repressor production are too low to be observable within the time course of the experiments.
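The separation of time scales in this argument can be made explicit with a small calculation. The sketch below simply takes the neutral takeover time to be 1/μ generations, as stated above, and compares it with the ~800 generations of the experiments; the function name is ours.

```python
def neutral_takeover_generations(mutation_rate):
    """Order-of-magnitude time for neutral null mutants to take over: 1/mu."""
    return 1.0 / mutation_rate


if __name__ == "__main__":
    mu = 1e-6                    # lacI null mutation rate per cell per generation [22]
    experiment_generations = 800  # approximate duration of the adaptation experiments
    expected = neutral_takeover_generations(mu)
    print(expected, experiment_generations / expected)
```

The experiments thus cover less than 0.1% of the time over which neutral loss of the repressor would be expected, which is consistent with the absence of change in Figure 6b.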
In the medium containing no IPTG and no Pgal, we find that the regulation remains unchanged (Figure 6c). This outcome is consistent with the predicted low selective pressures (Figure 2), as the expression of the lac operon products is tightly repressed under these conditions. We do find that expression is significantly reduced during growth on 200 μM IPTG and 0 μM Pgal (Figure 6d). Indeed, under these conditions the costs of this spurious operon expression are predicted to be significant, yielding a growth rate reduction of ~0.2 doublings per hour (Figure 1b and 2). The rate at which the expression decreases in the population suggests a selection coefficient of around 0.067. These values are comparable to those for the 0 μM IPTG and 350 μM Pgal medium, which yielded an expression increase (Figure 6a) with an associated selection coefficient of ~0.055 and a predicted potential growth rate increase of ~0.2 doublings per hour (Figure 1b and 2). In contrast, however, fixation of the decreased expression phenotype occurs at later generations, suggesting that it arises less frequently than the increased expression phenotypes. This asymmetry might be understood by considering the mutational targets for obtaining increased and decreased expression. Increases in expression could be achieved by a diverse array of mutations in the repressor or the operator that lower the binding affinity, whereas decreases in expression would require rarer mutations that increase affinity or mutations in the lacZ promoter.
Figure 6f shows the evolutionary history of a population growing without Pgal, but with 2 μM IPTG. As in the case of 200 μM IPTG and no Pgal (Figure 6d), operon expression evolves to lower values. This observation is in line with the measured cost of spurious operon expression, which was shown to be significant even for these low inducer concentrations (Figure 1b). The measured costs are however lower than for 200 μM IPTG (Figure 6d), which predicts a lower selection coefficient. Indeed we observe a selection coefficient that is roughly half.
For the environment with 15 μM IPTG and 350 μM Pgal the evolved phenotype exhibited an altered induction profile (Figure 7). However, the profile changed in a way that the expression level at 15 μM IPTG in fact remained the same when taking into account Pgal anti-induction. From the growth data (Figure 2) we indeed expected a low selective pressure, as it indicates only a marginal difference in fitness between the expression level at 15 μM IPTG and the optimum that lies nearby at somewhat higher IPTG levels. Interestingly however, in this population a mutant had become fixed, thus suggesting a deviation from the predicted low level of selective pressure.
From eight clonal isolates after the serial dilution experiment we sequenced the chromosomal region consisting of the lac repressor, the lac promoter (upstream of lacZ), until 420 base pairs into the lacZ coding sequence. Compared to the reference GenBank nucleotide sequence of the lac operon (accession number J01636.1), all isolates contain a lacI polymorphism (C857T) that does not affect LacI function, and a synonymous mutation in the coding sequence of lacZ. From earlier work we know that C857T pre-existed in the MG1655 strain, and we assume that the synonymous lacZ mutation did also. Apart from these pre-existing mutations, three clones isolated from the population that adapted to 350 μM Pgal, 0 μM IPTG all showed a known hotspot frame shift deletion of four base pairs from a triply repeated TGGC (nucleotides 593-604 of the lacI coding sequence) [22]. This frame shift has been reported to lead to inactivation of the repressor [22], which is in line with our observation. One clone sequenced from adaptation on 350 μM Pgal, 220 μM IPTG and another from 0 μM Pgal, 0 μM IPTG, which retained wild-type induction characteristics, did not reveal any mutations. Remarkably, three clones sequenced from the population that adapted to 220 μM IPTG, 0 μM Pgal, also showed the hotspot frame shift. These isolates do not show a constitutive expression, but instead a greatly reduced expression, which means that they must carry another unidentified mutation. However, since these isolates did not contain mutations in the promoter controlling lacZ expression, no cause for the observed loss of LacZ activity (which originated from selection against expression cost, not against activity) can be identified at present.
Optimality and evolution in alternating environments
Variable environments may confront an organism with a trade-off: the possibility to improve in one environment, but at the expense of deteriorations in another. Here we can explicitly quantify such tradeoffs, which have been introduced conceptually by Levins [26], using the expression-growth relations (Figure 1b). We plotted the growth rate for a high Pgal concentration (500 μM) versus the growth rate in an environment with a low Pgal concentration (39 μM), for a range of IPTG levels (Figure 8). This graph indicates the growth rate combinations that are possible for phenotypes with one constant expression level in both environments (constituting a so-called Pareto-optimal front for the fitness in each environment), and can thus be used to determine the optimal unregulated phenotype. For instance, when the environment alternates between high and low Pgal for equal periods of time, this analysis predicts that the optimal constant expression level is achieved by inducing the WT system with 5 to 30 μM IPTG. Thus, at that expression level, the benefits minus the costs averaged over two environments are optimal. Importantly, the trade-off curve appears to have a concave shape, bulging out towards the cross in the upper right corner where growth in both environments is maximal. As a result, constant expression phenotypes can achieve near-maximal growth rates in each of the environments. This suggests that the superior responsive phenotype, which may achieve that maximum by regulating its expression to optimal values for both environments, has a limited selective advantage over optimal unregulated phenotypes. More generally, the analysis indicates that the selective advantages achieved by regulation are suppressed by the concave trade-off relation of this system.
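The comparison between the best constant-expression phenotype and an ideally regulated phenotype can be sketched as follows. The two Pgal concentrations (500 μM and 39 μM) and the basal growth rate are taken from the text; the cost and benefit forms and all other parameters are hypothetical, so the numbers produced are illustrative only and do not reproduce Figure 8.

```python
import numpy as np

G0 = 1.09                       # basal growth rate (generations per hour, from the text)
PGAL_HIGH, PGAL_LOW = 500.0, 39.0  # Pgal concentrations of the two environments (uM)

Z_GRID = np.linspace(0.0, 1.2, 1201)  # candidate constant expression levels


def growth(z, pgal_um, eta0=0.1, z_cap=1.5, delta=1.0, k_p=300.0):
    """Hypothetical cost-benefit growth rate at expression z and Pgal level."""
    cost = eta0 * z / (1.0 - z / z_cap)
    benefit = delta * z * pgal_um / (k_p + pgal_um)
    return G0 - cost + benefit


# Trade-off curve: growth in each environment for every constant expression level.
g_high = growth(Z_GRID, PGAL_HIGH)
g_low = growth(Z_GRID, PGAL_LOW)

# Equal time in both environments: compare time-averaged growth rates.
mean_constant = 0.5 * (g_high + g_low)
best_constant_z = float(Z_GRID[np.argmax(mean_constant)])
best_constant = float(mean_constant.max())
best_regulated = 0.5 * float(g_high.max() + g_low.max())
regulation_advantage = best_regulated - best_constant

if __name__ == "__main__":
    print(best_constant_z, best_constant, best_regulated, regulation_advantage)
```

The pairs `(g_low, g_high)` trace the trade-off curve, and `regulation_advantage` is the gain in time-averaged growth that perfect regulation offers over the best constant phenotype. The more the curve bulges towards the point where growth in both environments is maximal, the smaller this advantage becomes.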
We performed a number of serial dilution experiments in which the environment was alternating between two states (Figure 9). A change of environment was realized once or twice daily (see Materials and Methods). For four out of six experiments (marked with grey arrows in Figure 9) we found no significant change of the induction profile. This evolutionary stasis can be explained using the measured expression-growth relation g(I, P) (Figure 1b and 2) and the results from the constant environments adaptation experiments. For example, at 2 μM IPTG and 0 μM Pgal there is a moderate selective pressure to decrease the low but spurious expression (Figure 1b and 2). On the other hand, at high Pgal (350 μM) with moderately high IPTG (15 and 30 μM), the induced expression is strongly favored to be maintained (Figure 1b and 2). This dominant selective force may explain why no overall decreases in expression were observed when alternating between these two environments. These conditions do produce a small selective advantage for phenotypes with an altered induction response that provide a decreased expression at 2 μM IPTG while maintaining the induced expression at 30 μM. Such adaptive change, however, was not observed. This may indicate limited genetic variation for such a phenotypic change, or reflect the limited benefit associated with such a change. A limited genetic variation is suggested by the adaptation experiment in the constant environment with 2 μM IPTG and no Pgal (Figure 6f), in which a decreased expression phenotype emerges only at the end of the experiment (after ~800 generations). This suggests that the genetic changes underlying expression decreases are rarer than those underlying expression increases.
An additional rationale for the absence of evolutionary change might be found in bi-stability of the lac operon. Intermediate inducer concentrations have been shown to give rise to a bimodal phenotypic distribution for the WT genotype, in which the lac operon is either repressed or fully expressed [27]. In the media with 350 μM Pgal and 15 or 30 μM IPTG, a spontaneous fully expressed WT phenotype would have a high fitness and rapidly rise in number. Consequently, any fitness increase of an evolved regulatory mutant would be limited, and thus promote evolutionary stasis. However, it is unclear whether the growth conditions used here lead to bistability.
In two alternating environments the induction profile did change. First, alternating between 2 μM IPTG with 350 μM Pgal and 220 μM IPTG with 39 μM Pgal resulted in a high constitutive expression (Figure 10a). These conditions would in fact favor that expression increases without IPTG, and decreases with IPTG. The fact that only the former demand was met indicates a barrier to decreasing expression, which is consistent with results in constant environments. Moreover, one might expect less genetic variation for phenotypes that meet demands in two environments rather than one, which could explain the observed stasis in evolving the induction response. More prolonged adaptation experiments might clarify whether these constraints can be broken.
In the environment where no IPTG and no Pgal alternates with 2 μM IPTG and 350 μM Pgal, we also observe evolution towards a constitutive expression (Figure 10b). Here, the optimal regulated phenotype would have the inflection point of the induction curve shifted to lower IPTG concentrations, which could result from a higher affinity of the repressor for IPTG. The absence of these changes in our experiments, despite the significant selective pressures, suggests that there is limited genetic variation for such phenotypes. The adaptation that occurred here maximizes growth in the environmental state with Pgal.