### Model overview

The model is relatively simple and resolves only those mechanisms necessary for exploring the hypothesis and comparison to the relevant data. Yeast cells take up glucose (*G*, g L^{-1}) and convert it to biomass (Figure 1). Three forms of biomass are considered, including structural (*m*
_{
X
}, g dry cell^{-1}), damaged (*m*
_{
D
}, g dry cell^{-1}) and trehalose (*m*
_{
T
}, g dry cell^{-1}). The total biomass (*m*, g dry cell^{-1}) is the sum of these components (*m* = *m*
_{
X
} + *m*
_{
D
} + *m*
_{
T
}). Structural biomass becomes damaged. A fraction of biomass is synthesized as trehalose. The model tracks the age or number of divisions in terms of bud scars (*n*
_{
B
}). A population of individual cells is simulated using an agent-based approach.

### Biomass growth

A number of metabolism models for *S. cervisiae* have been developed ranging from simple Monod-type growth equations to more detailed kinetic models that resolve intracellular mechanisms up to dynamic/kinetic implementations genome-scale network reconstructions [24–28].

Here, a relatively simple approach is used, with the mass balances for the three components:

\frac{\mathit{d}{\mathit{m}}_{\mathit{X}}}{\mathit{dt}}=\left(1-{\mathit{f}}_{\mathit{s}}\right)\phantom{\rule{0.12em}{0ex}}\mathit{\mu}\phantom{\rule{0.12em}{0ex}}{\mathit{m}}_{\mathit{X}}-{\mathit{k}}_{\mathit{d}}\phantom{\rule{0.12em}{0ex}}{\mathit{m}}_{\mathit{X}}

(1a)

\frac{\mathit{d}{\mathit{m}}_{\mathit{D}}}{\mathit{dt}}={\mathit{k}}_{\mathit{d}}\phantom{\rule{0.12em}{0ex}}{\mathit{m}}_{\mathit{X}}

(1b)

\frac{\mathit{d}{\mathit{m}}_{\mathit{T}}}{\mathit{dt}}={\mathit{f}}_{\mathit{s}}\phantom{\rule{0.12em}{0ex}}\mathit{\mu}\phantom{\rule{0.12em}{0ex}}{\mathit{m}}_{\mathit{X}}

(1c)

where *μ* (day^{-1}) is the growth rate, *f*
_{
s
} is the trehalose synthesis fraction and *k*
_{
d
} (day^{-1}) is the damage rate. The growth rate increases with glucose and decreases with damage:

\mathit{\mu}={\mathit{\mu}}_{\mathit{m},\mathit{g}}\frac{\mathit{G}}{{\mathit{K}}_{\mathit{g}}+\mathit{G}}\phantom{\rule{0.12em}{0ex}}\frac{{\mathit{K}}_{\mathit{d}}^{{\mathit{n}}_{\mathit{d}}}}{{\mathit{K}}_{\mathit{d}}^{{\mathit{n}}_{\mathit{d}}}+{\left({\mathit{m}}_{\mathit{D}}/\mathit{m}\right)}^{{\mathit{n}}_{\mathit{d}}}}

(2)

where *μ*
_{
m,g
} (day^{-1}) is the maximum growth rate, *K*
_{
g
} (g L^{-1}) is the half-saturation constant for glucose, *K*
_{
d
} is the half-saturation constant and *n*
_{
d
} is the exponent for damage. The model dynamically simulates the extracellular glucose concentration considering inflow and washout (for continuous culture simulations) and uptake by the cells (see Additional file 1: SI text for details).

### Replication

Cell division in *S. cervisiae* is via the asymmetrical budding process, where a larger mother cell gives birth to a smaller daughter cell. With subsequent births, the mother’s size increases and it accumulates bud scars and damage (see below). A number of cell cycle and replication models for *S. cervisiae* have been developed [29, 30].

Here, the model of Vanoni et al. [30] is adopted. Briefly, two cell cycle phases are simulated, unbudded and budding. Budding starts at a threshold budding size (*m*
_{
b
}), which increases from a specified daughter cell value (*m*
_{
b,0
}) by a factor ({\mathit{a}}_{\mathit{m}}\phantom{\rule{0.12em}{0ex}}{{\mathit{b}}_{\mathit{m}}}^{{\mathit{n}}_{\mathit{B}}-1}) with each generation. Division occurs at a threshold replication size (*m*
_{
r
}), which is proportional to the budding size (*m*
_{
r
} = *f*
_{
m,r
}
*m*
_{
b
}). The daughter gets the mass synthesized during the budding phase. The mother gains a bud scar and preferentially retains the damage (see below). Individual phenotypic heterogeneity is introduced by randomizing the budding size. At birth, the daughter’s budding size is drawn from a global truncated normal distribution with specified mean and coefficient of variation (CV) [31]. A truncated distribution is used to prevent unrealistic values (e.g., *m*
_{
b,0
} < 0). This process introduces non-heritable phenotypic heterogeneity, so values are drawn from a global distribution (vs. one with the mean based on the mother).

### Aging and damage accumulation

Aging in *S. cervisiae* is due to a number of mechanisms, including accumulation of extrachromosomal DNA circles (ERCs) and oxidative damage (e.g., carbonylation) to proteins [32]. At division, this damage is preferentially retained by the mother cell, although the ability to do so diminishes with replicative age [33, 34]. In addition, the mother cell has higher reactive oxygen species (ROS) and protein damage rates and lower damaged protein degradation rates [35, 36]. Trehalose protects against ROS damage to proteins [37]. Several generic models of aging have been presented [18, 34]. Specifically for *S. cervisiae*, Hirsch [38] developed a model where cells accumulate a senescence factor at a constant rate and partitions it asymmetrically at division. The growth rate decreases with increasing amount of this senescence factor. More mechanistic models that explicitly represent ROS, the damage reaction with a protein (citrate synthase) and repair reaction with a heat shock protein (Hsp90) have been presented [39].

The present model considers the production of damaged mass (*m*
_{
D
}) from structural mass (*m*
_{
X
}) in a first-order manner, at a damage rate that increases with age ({\mathit{k}}_{\mathit{d}}={\mathit{a}}_{\mathit{d}}\phantom{\rule{0.12em}{0ex}}{{\mathit{n}}_{\mathit{B}}}^{{\mathit{b}}_{\mathit{d}}}). At division, the damage mass is preferentially retained by the mother, based on a split fraction (*s*
_{
d
}).

### Trehalose synthesis and Tsl1/Tps3 expression

Trehalose serves as storage carbohydrate and stress protectant [40–42]. Synthesis is highest during the stationary phase and in response to stress (incl. heat), but trehalose also accumulates under normal, non-stressed conditions [42–44]. Trehalose is synthesized by a trimeric protein complex made up of Tps1 and Tps2, and interchangeable Tps3 or Tsl1 [45, 46]. Genes involved in trehalose synthesis are induced by heat shock [46, 47]. In addition, the expression is negatively correlated with growth rate and has a stochastic component [15, 47]. Tsl1 and Tps3 promoters share a common regulatory element (stress-responsive element, STRE), but their expression can differ [9, 10, 46]. Trehalose (or more generally carbohydrate storage) has been included in metabolic models of *S. cervisiae*[24, 25, 48].

In the present model, the trehalose synthesis complex is represented by Tsl1 and Tps3, which are assumed to limit the trehalose synthesis rate (i.e., Tps1 and Tps2 are assumed to be present in excess). Tsl1 and Tps3 are considered separately (vs. the complete synthase complex or Tps1), because their expression differs and to allow for direct comparison to observations [15, 46]. The trehalose synthesis fraction (*f*
_{
s
}) is a function of the total trehalose synthesis enzyme concentration (*e*
_{
s
} = *e*
_{
Tsl1
} + *e*
_{
Tps3
}):

{\mathit{f}}_{\mathit{s}}={\mathit{f}}_{\mathit{m},\mathit{s}}\frac{{\mathit{e}}_{\mathit{s}}^{{\mathit{n}}_{\mathit{s}}}}{{\mathit{K}}_{\mathit{s}}^{{\mathit{n}}_{\mathit{s}}}+{\mathit{e}}_{\mathit{s}}^{{\mathit{n}}_{\mathit{s}}}}

(3)

where *f*
_{
m,s
} is the maximum fraction, *K*
_{
s
} (mmol L^{-1}) is the half-saturation constant and *n*
_{
s
} is the exponent for trehalose synthesis.

Tsl1 and Tps3 are expressed using a set of constant, age-dependent and stochastic terms. In the model applications, cells are grown in constant, glucose-replete conditions, so the effect of growth condition (i.e., stationary phase) on expression is not included. Cells are subjected to heat shock, but since the observed resistance was not due to an induced heat shock response [15], induction of resistance by heat shock is not included. The Tsl1 enzyme concentration (*e*
_{
Tsl1
}, mol L^{-1}) is assumed to adjust rapidly to the damage according to:

{\mathit{e}}_{\mathit{Tsl}1}=\left[{\mathit{e}}_{\mathit{c},\mathit{Tsl}1}+{\mathit{e}}_{\mathit{a},\mathit{Tsl}1}\phantom{\rule{0.12em}{0ex}}\frac{{\mathit{m}}_{\mathit{D}}/\mathit{m}}{{\mathit{K}}_{\mathit{Tsl}1}+{\mathit{m}}_{\mathit{D}}/\mathit{m}}\right]\phantom{\rule{0.12em}{0ex}}{\mathit{f}}_{\mathit{r},\mathit{Tsl}1}

(4)

where *e*
_{
c,Tsl1
} (mol L^{-1}) is the magnitude of constant expression, *e*
_{
a,Tsl1
} (mol L^{-1}) is the magnitude of damage or age-dependent expression, *K*
_{
Tsl1
} is a half-saturation constant and *f*
_{
r,Tsl1
} is a randomization factor. Note that expression is a function of the combined effect of constant, age-dependent and stochastic terms, with their relative contribution depending on the assigned parameter values. The randomization factor (*f*
_{
r,Tsl1
}) is varied by drawing from a global truncated normal distribution with mean of 1.0 and specified CV, following the same approach used for *m*
_{
b,0
} (see above). An equivalent formulation is used for Tps3.

### Heat shock tolerance and death

Heat causes denaturation of proteins and there are a number of mechanisms that can prevent this. Trehalose stabilizes proteins during heat shock [40]. Other factors include various heat shock proteins, whose intrinsic (i.e. without heat shock) expression is also heterogeneous and correlated with heat shock resistance [11, 14].

Mortality by heat shock is simulated using a deterministic approach [8]. Specifically, the applied heat shock severity (*H*
_{
a
}, arbitrary units) is compared to the tolerance of the cell (*H*
_{
t
}), which increases with the trehalose mass fraction (*m*
_{
T
} / *m*):

{\mathit{H}}_{\mathit{t}}=\frac{{\mathit{m}}_{\mathit{T}}/\mathit{m}}{{\mathit{K}}_{\mathit{h}}+{\mathit{m}}_{\mathit{T}}/\mathit{m}}

(5)

where *K*
_{
h
} is the half-saturation constant for heat shock tolerance. *K*
_{
h
} is the fraction of trehalose required to achieve a tolerance of 0.5. When a heat shock is applied, all cells with *H*
_{
t
} < *H*
_{
a
} die. *H*
_{
a
} can be adjusted to reflect different experimental conditions.

### Additional mechanisms of heterogeneity

The model includes a number of deterministic sources of heterogeneity, like the uneven split of damage among mother and daughter at replication. Also, the budding mass threshold (*m*
_{
b,0
}) and Tsl1 and Tps3 expression factors (*f*
_{
r,Tsl1
} and *f*
_{
r,Tps3
}) are varied stochastically, as described above. However, in reality there are numerous other mechanisms (e.g., stochastic expression of all genes) that contribute to heterogeneity in cellular processes [1, 7–12]. To account for this, the maximum growth rate (*μ*
_{
m,g
}) and damage exponent (*n*
_{
d
}) parameters are also randomized (following the same approach used for *m*
_{
b,0
}).

### Agent-based modeling of population

The model simulates individual yeast cells using an agent-based approach [18–22]. Each agent stores the cell state variables (e.g., *m*
_{
X
}, see Figure 1) and those parameters that are varied at the individual level (e.g., *m*
_{
b,0
}). For continuous culture simulations, the model includes stochastic washout of cells from the reactor (see Additional file 1: SI text for details). Differential equations (e.g., Eq. 1) are solved using an explicit numerical integration method. The model is implemented in the IAM framework [19, 20], and the source code is available from the corresponding author.

For some experiments the model explicitly simulates each individual cell. This includes the microcolony experiments that have ~10^{5} colonies of up to ~100 cells for a total of 10^{7} cells (e.g. Additional file 1: Figure S1B). However, for liquid cultures with larger populations, including the competition experiments, this is not feasible. For example, a 300-mL culture with a cell density of 2.6 × 10^{8} cells mL^{-1} contains 7.9 × 10^{10} cells. Using the present model, simulating that many cells for 20 days would take approximately 15 years of CPU time and 18 PB of RAM memory. As is common in microbe ABMs, for liquid culture, the model simulates super-individuals, which are representative of a number of real individuals [21]. Minimum/maximum numbers of agents are specified, and when the number of agents drop/rises below/above this, agents are split/combined (see Additional file 1: SI text for details). The number of agents, or the computational resolution, is set sufficiently high so that the model produces robust and reproducible results over multiple runs with different seed values for the random number generator.

This application is especially challenging from a computational perspective, because of the focus on small fractions of the population. For example, in one experiment by Levy et al. [15], 0.1% of the population was sorted out using flow cytometry and the growth rate distribution of that fraction was computed (Additional file 1: Figure S1I). In order for the model to adequately resolve the heterogeneity of such a small fraction of the population, it needs to have a very large number of agents.

### Simulations performed

The model was constrained by calibration and comparison to relevant observations from the literature. Several parameters were calibrated within the available literature range with the help of an automated optimization routine (see Additional file 1: SI text, Table S1, Figures S1&2). Model simulations followed the actual experimental protocols as described in the respective literature references, which could be quite involved. For example, one experiment by Levy et al. [15] included growing cells in liquid suspension, sub-sampling based on Tsl1 expression, growing again in liquid suspension, randomly sub-sampling, growing as microcolonies, and estimating the growth rate of each microcolony based on the change in colony area (Additional file 1: Figure S1J). The resulting calibrated model is then used without any further changes to explore the underlying mechanisms and fitness effect of heterogeneous, age-correlated heat shock resistance.

To understand how heterogeneity is produced and how it propagates through the population, we developed a heterogeneity network (Figure 2F). The nodes in the network represent individual state variables (e.g. *m*
_{
D
}), calculated variables (e.g. *μ*) and processes (e.g. unequal split of damage, node DAM), and the links represent causal relationships. For example, DAM causes heterogeneity in *m*
_{
D
}, which in turn causes heterogeneity in *m* (via mass summation, node m), *μ* (via Eq. 2), and *e*
_{
Tsl1
} and *e*
_{
Tps3
} (via Eq. 4). By turning off the heterogeneity at a node or link in the network and examining the resulting reduction in heterogeneity at a downstream node, the heterogeneity can be mapped onto the network (see Results section).

To explore the role of the age-correlated resistance trait on the fitness of the yeast, numerical (i.e. simulations) competition experiments were performed. Cells were grown in a glucose-limited chemostat with constant dilution rate (*D* = 0.15 h^{-1}), a set-up similar to the one used in a previous experimental study that examined the effect of mild heat shock (28 > 36°C) on *S. cervisiae* growth rate and gene expression [47]. The culture was subjected to heat shocks at specified heat shock severity (*H*
_{
a
}) and frequency (*F*
_{
h
}, h^{-1}). A number of model strains with different Tsl1/Tps3 expression parameters were developed. For Tsl1, *e*
_{
c,Tsl1
} controls constant expression, *e*
_{
a,Tsl1
} controls age-dependent expression, and *f*
_{
r,Tsl1,CV
} controls stochasticity. Tsl1 and Tps3 are considered separately in the model to allow for comparison to data (Figure 2), but their effect on trehalose synthesis is identical (Eq. 3) and a strain with a high Tsl1 parameter (e.g., *e*
_{
c,Tsl1
}) behaves the same as a strain with an equivalently high Tps3 parameter (i.e., *e*
_{
c,Tps3
}). Therefore, the Tsl1 and Tps3 parameters were varied together (e.g., the ratio *e*
_{
c,Tps3
} / *e*
_{
c,Tsl1
} is held constant). We created 1,000 model strains, with 10 variants for each Tsl1 parameter. Doing a single simulation with all strains was not feasible. Therefore, a tournament-style competition was used. In each round, 10 strains were randomly selected and placed in competition. The winner of each match advanced to the next round and the process was repeated until one strain was left. For each condition (*H*
_{
a
}, *F*
_{
h
}), the optimal bet hedging strategy/parameters emerged as the winner. The experiment is implicitly constrained by the metabolic cost of trehalose synthesis (reduced synthesis of structural biomass, Eq. 1a), which is traded off against the benefit of higher heat shock survival.