A multiscale approximation in a heat shock response model of E. coli
© Kang; licensee BioMed Central Ltd. 2012
Received: 17 November 2011
Accepted: 7 November 2012
Published: 21 November 2012
A heat shock response model of Escherichia coli developed by Srivastava, Peterson, and Bentley (2001) has a multiscale nature, because its species numbers and reaction rate constants vary over wide ranges. Applying the method of separation of time-scales and model reduction for stochastic reaction networks extended by Kang and Kurtz (2012), we approximate the chemical network in the heat shock response model.
Scaling the species numbers and the rate constants by powers of the scaling parameter, we embed the model into a one-parameter family of models, each of which is a continuous-time Markov chain. Choosing an appropriate set of scaling exponents for the species numbers and for the rate constants satisfying balance conditions, the behavior of the full network in the time scales of interest is approximated by limiting models in three time scales. Due to the subset of species whose numbers are either approximated as constants or are averaged in terms of other species numbers, the limiting models are located on lower dimensional spaces than the full model and have a simpler structure than the full model does.
The goal of this paper is to illustrate how to apply the multiscale approximation method to a biological model of significant complexity. We applied the method to the heat shock response model involving 9 species and 18 reactions and derived simplified models in three time scales which capture the dynamics of the full model. Convergence of the scaled species numbers to their limit is obtained, and errors between the scaled species numbers and their limit are estimated using the central limit theorem.
Stochasticity may play an important role in biochemical systems. For example, stochasticity may be beneficial in giving variability in gene expression, in producing population heterogeneity, and in adjusting or responding to fluctuations in the environment. We are interested in local dynamics of biochemical networks involving some species with a small number of molecules, so that the system is assumed to be well-mixed and relative fluctuations of small species numbers may play a role in the system dynamics.
The conventional stochastic model for the well-stirred biochemical network is based on the chemical master equation. The chemical master equation governs the evolution of the probability density of species numbers and is expressed as the balanced equation between influx and outflux of the probability density. When the biochemical network involves many species or bimolecular reactions, it is rarely possible to obtain an exact solution of the master equation in a closed form. Instead of searching for the solution of the master equation, stochastic simulation algorithms are used to obtain the temporal evolution of the species numbers. For example, Gillespie’s Stochastic Simulation Algorithm (SSA, or the direct method) is well known [2, 3] and provides a realization of the exact trajectory of the sample path for the species numbers. As the biochemical network has more species and reactions, SSA becomes computationally expensive and more efficient algorithms were suggested by many authors [4–6]. The detailed review of stochastic simulation methods, stochastic approximations, and hybrid simulation methods is given in . For models with well-separated time scales, numerous authors suggested stochastic simulation algorithms for biochemical reaction networks by assuming that “fast” subnetworks have reached a “partial equilibrium”  or a “quasi-steady state” . Using these assumptions, the approximate stochastic simulation algorithms involve a reduced number of species or reactions.
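As an illustration of the direct method mentioned above, the following is a minimal sketch in Python. The function name `gillespie_direct`, its arguments, and the toy birth-death network are assumptions made for this example; they are not taken from the heat shock response model.

```python
import math
import random

def gillespie_direct(x0, stoich, propensity, t_end, rng):
    """Simulate one sample path with the direct method.

    x0         -- initial species numbers
    stoich     -- stoich[k] is the net change vector of reaction k
    propensity -- function mapping a state to the list of reaction propensities
    Returns the jump times and the states visited.
    """
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        rates = propensity(x)
        total = sum(rates)
        if total <= 0.0:          # no reaction can fire any more
            break
        # waiting time to the next reaction is exponential with rate `total`
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # pick reaction k with probability rates[k] / total
        u, k, acc = rng.random() * total, 0, rates[0]
        while acc <= u and k < len(rates) - 1:
            k += 1
            acc += rates[k]
        x = [xi + d for xi, d in zip(x, stoich[k])]
        times.append(t)
        states.append(tuple(x))
    return times, states

# Hypothetical toy network: 0 -> S with rate 5, S -> 0 with rate 0.1 * x
stoich = [(1,), (-1,)]
propensity = lambda x: [5.0, 0.1 * x[0]]
times, states = gillespie_direct([0], stoich, propensity, 50.0, random.Random(1))
```

Every reaction event is simulated individually, which is why the cost grows with the number of fast reactions and why reduced models become attractive.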
On the other hand, Ball et al. described the state of the biochemical reaction network in the well-stirred system directly using stochastic equations for species numbers, and suggested an approximation of the reaction network via limiting models derived using different scalings for the species numbers and for the reaction rate constants. Kang and Kurtz extended this multiscale approximation method and gave a systematic way to obtain limiting models in the time scales of interest. Conditions are given to help identify appropriate values for a set of scaling exponents which determine the time scale of each species and reaction. Using this method, nonstationary behavior of biochemical systems can be analyzed. Moreover, application of the method is flexible in the sense that the method does not require the exact parameter values but gives approximations valid for a range of parameter values. More recently, Crudu et al. also proposed a reduction method that derives simplified models preserving stochastic properties and key parameters, using averaging and hybrid simplification.
The multiscale approximation method requires consideration of the magnitudes of both the species numbers and the rate constants of the reactions involving the corresponding species. When a moderately fast reaction involves two species, one with a small number of molecules and the other with a large number of molecules, the effects of this reaction on these species are different. Net molecule changes of species with large numbers due to the reaction are less noticeable than those of species with small numbers. Therefore, though the same reaction governs these species, their time scales may be different from each other. Letting N0 be a fixed constant and choosing a large value for N0, for example N0=100, we express magnitudes of species numbers and reaction rate constants in terms of powers of N0 with different scaling exponents. For instance, 1 to 10 molecules are expressed as $1 \times N_0^{0}$ to $10 \times N_0^{0}$, 500 to 800 molecules are rewritten as $5 \times N_0$ to $8 \times N_0$ molecules, and 0.0002 sec$^{-1}$ becomes $2 \times N_0^{-2}$ sec$^{-1}$. Assuming N0 is large, we replace N0 by a large parameter N, and stochastic equations for species numbers are expressed in terms of N. Then, N is an analogue of 1/ε, where ε is a small parameter in perturbation theory.
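The bookkeeping behind this rescaling can be sketched in a few lines of Python. The helper `scale` below is hypothetical; it writes a value as an order-one coefficient times a power of N0, and it also accepts a user-chosen exponent, since the decomposition is not unique.

```python
import math

def scale(value, n0=100.0, exponent=None):
    """Write value = c * n0**exponent and return (c, exponent).

    If no exponent is supplied, pick the integer exponent with 1 <= c < n0.
    """
    if exponent is None:
        exponent = math.floor(math.log(value, n0))
    return value / n0 ** exponent, exponent

print(scale(500))                  # coefficient 5 with exponent 1
print(scale(0.0002))               # coefficient close to 2 with exponent -2
print(scale(0.005, exponent=-1))   # same value with a different, equally valid exponent
print(scale(0.005, exponent=-2))
```

Which exponent is preferable for a given constant is decided later by the balance conditions, not by the magnitude of the coefficient alone.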
A specific time scale of interest is expressed in terms of a power of N, and its exponent contributes to reaction rates due to change of variables in time. For each species (or linear combination of species), we compare a power of N for the species number and those for reaction rates involving this species. Consider a case when the power for the species number is larger than those for the rates of all reactions where the species is involved. Then net molecule changes due to the reactions are not large enough to be noticeable in this time scale, and the species number is approximated as constant. Next, consider a case when the power for the species number is smaller than those for some reaction rates involving the species. In this case, the species number fluctuates very rapidly due to the fast reactions in this time scale, and the averaged behavior of the species number can be described in terms of other species numbers. The method of averaging is similar to approximation of one variable in terms of others using a quasi-steady state assumption. Last, when the power for the species number is equal to those for the rates of reactions where the species is involved, the scaled species number is approximated by a nondegenerate limit describing nonstationary behavior of the species number in the specific time scale of interest. The limit could be described in various kinds of variables: a continuous time Markov chain, a deterministic model given by a system of ordinary differential equations, or a hybrid model with both discrete and continuous variables. Since some of the scaled species numbers are approximated as constants or the averaged behavior of some species numbers is expressed in terms of other variables, dimension of species in the approximation of the biochemical network is reduced.
In the multiscale approximation method, scaling exponents for species numbers and for reaction rate constants are not uniquely determined, since the choice of values for the exponents is flexible. For example, 0.005 sec$^{-1}$ can be expressed as $0.5 \times N_0^{-1}$ sec$^{-1}$ or as $50 \times N_0^{-2}$ sec$^{-1}$ when N0=100. The goal in this method is to find an appropriate set of scaling exponents to obtain a nondegenerate limit of the scaled species numbers. Orders of magnitude of species numbers in the propensities affect reaction rates, and reaction rates contribute to determining rates of net molecule changes of the species involved in the reactions. Since species numbers and reaction rates interact, it is not easy to determine scaling exponents for all species numbers and reaction rate constants so that the limits of the scaled species numbers become balanced.
Kang and Kurtz  introduced balance conditions for the scaling exponents, which help to determine values for a set of exponents. The key idea in these conditions is that for each species (or linear combination of species) the maximum of scaling exponents in the rates of the reactions where this species is produced should be the same as that in the rates of the reactions where this species is consumed, i.e. maximal production and consumption rates of the species should be balanced in the order of magnitude. In case the maximums of scaling exponents for productions and consumptions are not balanced for some species, an increase or decrease of the scaled species number can be described by its limit during a certain time period. However after this time period, the scaled species number will either become zero or blow up to infinity. Therefore, if some of the scaled species numbers are not balanced due to a difference between orders of magnitude of production and consumption rates, the chosen scaling is valid up to a certain time scale. After this time scale, we need to choose different values for scaling exponents. In each time scale of interest we derive a limiting model including a subset of species and reactions, which is used to approximate the state of the full reaction network. The multiscale approximation method is applicable in case some of reaction rates are not known accurately, since the chosen scaling is applicable in some ranges of the parameters. Therefore, based on the behavior of the limiting models, we may be able to estimate behavior for a range of parameter values without performing a huge number of stochastic simulations.
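The key idea of the balance condition can be illustrated with a small Python sketch. The function `balance_check` and the exponent values are hypothetical and serve only to show the comparison of the largest production exponent with the largest consumption exponent for one species.

```python
def balance_check(production_exponents, consumption_exponents):
    """Compare the maximal scaling exponents of production and consumption rates.

    Returns (balanced, max_production, max_consumption).
    """
    p = max(production_exponents)
    c = max(consumption_exponents)
    return p == c, p, c

# Illustrative exponents only: the fastest producing and consuming reactions
# have rates of the same order of magnitude, so the species is balanced.
print(balance_check([1, 0], [1, -1]))   # (True, 1, 1)

# Consumption dominates production: the scaled species number is eventually
# driven to zero, so this scaling can only be valid up to a certain time scale.
print(balance_check([0], [2]))          # (False, 0, 2)
```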
The reduced network in the early stage has a very simple structure without any bimolecular reactions, and all reactions involved are either production from a source or conversion. Moreover, the reduced network separates into two parts, because S8 is independent of S2 and S3.
where a species over the arrow accelerates or inhibits the corresponding reaction. The reaction does not change this species number, but the propensity of the corresponding reaction is a function of this species number. In this time scale, conversion between S2 and S3 occurs very frequently, and S2 and S3 act as a single “virtual” species rather than as separate species. The species numbers of S23 and S8 are described as two independent birth processes, and the species number of S7 is governed by conversion. In this time scale, the species number of S8 is normalized and treated as a continuous variable. Interestingly, the behavior of the species S8, which increases rapidly in time, is well approximated in both the first and second time scales.
As we see in Figure 1, the full network involves reactions with more than two reactants or products. However, all reactions in the reduced network at times of order 10,000 sec consist of either production or degradation of each species, even though most of the species (6 out of 9) are involved in the reduced model. As in the medium stage, S2 and S3 act as a single species. In the early and medium stages, propensities follow the law of mass action, while in the late stage the propensity for degradation of S23 is a nonlinear function of the species numbers, similar to the reaction rate appearing in the Michaelis-Menten approximation for an enzyme reaction. The nonlinear function involves the species numbers of S23, S8, and S9, and comes from averaging the species numbers of S2 and S6, which fluctuate rapidly in the third time scale. Similarly, the propensity of catalytic degradation of S8 is not proportional to the number of molecules of S8.
In the late stage, at times of order 10,000 sec, we study the error between the scaled species numbers and their limit analytically using a central limit theorem and show that the error is of order $10^{-1}$.
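- 1. Write a chemical reaction network involving $s_0$ species and $r_0$ reactions in the form of
$$\sum_{i=1}^{s_0} \nu_{ik} S_i \;\xrightarrow{\;\kappa'_k\;}\; \sum_{i=1}^{s_0} \nu'_{ik} S_i, \qquad k=1,\dots,r_0,$$
where $\nu_{ik}$ and $\nu'_{ik}$ are nonnegative integers and $\kappa'_k$ is the rate constant of the $k$th reaction. Rearrange the reactions so that the reaction rate constants decrease monotonically as $k$ increases.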
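- 2. Derive a system of stochastic equations for species numbers.
- (a) Letting $X_i(t)$ be the number of molecules of species $S_i$ at time $t$, the corresponding stochastic equation is
$$X_i(t) = X_i(0) + \sum_{k=1}^{r_0} (\nu'_{ik}-\nu_{ik})\, Y_k\!\left(\int_0^t \lambda_k(X(s))\,ds\right),$$
where the $Y_k$ are independent unit Poisson processes and $Y_k\!\left(\int_0^t \lambda_k(X(s))\,ds\right)$ counts the number of times that the $k$th reaction occurs up to time $t$.
- (b) $\lambda_k(x)$ is determined by a stochastic version of mass action kinetics, and is expressed as a product of the rate constant and the numbers of molecules of reactants. If the $k$th reaction is second-order ($S_i + S_j \to \ast$) with different types of reactants, $\lambda_k(x) = \kappa'_k x_i x_j$. When the reactants are two molecules of the same species, $\lambda_k(x) = \kappa'_k x_i (x_i - 1)$.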
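- 3. Derive a system of stochastic equations for the normalized species numbers after a time change, $Z^{N,\gamma}(t)$.
- (a) In the equation for $X_i(t)$ obtained in Step 2 (a), replace $X_i$ by $Z_i^{N,\gamma}$ and divide reaction terms by $N^{\alpha_i}$. In the $k$th reaction term, put $N^{\gamma+\rho_k}$ in the propensity and replace $\lambda_k(X)$ by $\lambda_k^N(Z^{N,\gamma})$. Then, we have
$$Z_i^{N,\gamma}(t) = Z_i^{N,\gamma}(0) + \sum_{k=1}^{r_0} N^{-\alpha_i}(\nu'_{ik}-\nu_{ik})\, Y_k\!\left(\int_0^t N^{\gamma+\rho_k}\lambda_k^N(Z^{N,\gamma}(s))\,ds\right).$$
- (b) In the equation in Step 3 (a), $\rho_k = \beta_k + \sum_{i} \nu_{ik}\alpha_i$.
- (c) In most reactions, $\lambda_k^N$ is obtained by replacing $\kappa'_k$ by $\kappa_k$ in $\lambda_k$. In case the $k$th reaction is second-order with reactants of the same species, $z_i(z_i-1)$ is replaced by $z_i(z_i-N^{-\alpha_i})$.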
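- 4. Write a set of species balance equations and their time-scale constraints.
- (a) Define $\Gamma_i^{+}$ and $\Gamma_i^{-}$ as subsets of reactions where the species number of $S_i$ increases or decreases every time the reaction occurs. Comparing the $\rho_k$'s for $k\in\Gamma_i^{+}$ and those for $k\in\Gamma_i^{-}$, set the balance equations as
$$\max_{k\in\Gamma_i^{+}} \rho_k = \max_{k\in\Gamma_i^{-}} \rho_k.$$
- (b) Time-scale constraints are given as
$$\gamma \le \alpha_i - \max_{k\in\Gamma_i^{+}\cup\Gamma_i^{-}} \rho_k.$$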
- 5. Find a minimum set of linear combinations of species whose maximal collective production (or consumption) rate may differ from that of any single species. We construct this set by selecting a linear combination of species whenever some reaction term involving the species in the combination cancels in the equation for the linear combination.
- 6. For each selected linear combination of species, write a collective species balance equation and its time-scale constraint. They are obtained similarly to the ones in Step 4, using the subsets of reactions where the number of molecules of the linear combination of species either increases or decreases instead of using $\Gamma_i^{+}$ and $\Gamma_i^{-}$.
- 7. Select a large value for $N_0$ and choose an appropriate set of $\alpha_i$'s and $\beta_k$'s so that
- (a) the species number $X_i$ and the reaction rate constant $\kappa'_k$ are approximately of orders $N_0^{\alpha_i}$ and $N_0^{\beta_k}$;
- (b) the normalized species number $N_0^{-\alpha_i}X_i$ and the scaled reaction rate constant $\kappa_k = N_0^{-\beta_k}\kappa'_k$ are of order 1;
- (c) most of the balance equations obtained in Steps 4 and 6 are satisfied;
- (d) the $\beta_k$'s are monotone decreasing within each class of reactions which have the same number of molecules of reactants.
- 8. Plugging the chosen values for the $\alpha_i$'s and $\beta_k$'s into the time-scale constraints obtained in Steps 4 and 6, compute an upper bound (denoted as $\gamma_0$) for a time-scale exponent. Then, the chosen set of exponents $\alpha_i$'s and $\beta_k$'s can be used for $\gamma$ satisfying $\gamma\le\gamma_0$. For $\gamma>\gamma_0$, select another set of exponents $\alpha_i$'s and $\beta_k$'s using Steps 7 and 8.
- 9. Using each set of values for the $\alpha_i$'s and $\beta_k$'s, identify a natural time scale exponent of each species (denoted as $\gamma_i$ for species $S_i$) so that $\gamma_i$ satisfies
$$\max_{k\in\Gamma_i^{+}\cup\Gamma_i^{-}} (\gamma_i + \rho_k) = \alpha_i.$$
We collect the $\gamma_i$'s with the same values; the corresponding species are on the same time scale in the approximation.
- 10. Modify the $\alpha_i$'s and $\beta_k$'s so that the conditions in Step 7 are satisfied and the $\gamma_i$'s take an appropriate number of distinct values, which determines the time scales, $N^{\gamma}=N^{\gamma_i}$, we are interested in.
- 11. For each chosen $\gamma$, derive a limiting equation for each species $S_i$ with $\gamma_i=\gamma$. Using the stochastic equation obtained in Step 3 (a), we let $N$ go to infinity.
- (a) For $k\in\Gamma_i^{+}\cup\Gamma_i^{-}$, the $k$th reaction term converges to zero if $\alpha_i>\gamma+\rho_k$.
- (b) If $\alpha_i=\gamma+\rho_k$, the $k$th reaction term appears as a limit in the limiting equation. The limit of the $k$th reaction term is discrete if $\alpha_i=0$, while it is a continuous variable with the limit of its propensity if $\alpha_i>0$.
- (c) There is no $k$ satisfying $\alpha_i<\gamma+\rho_k$ in the equation for species $S_i$ with $\gamma=\gamma_i$, due to the definition of $\gamma_i$ given in Step 9.
- 12. In the limiting equation for each species $S_i$ with $\gamma_i=\gamma$, we approximate propensities in the reaction terms. Suppose that the normalized species number for $S_j$ appears in the propensities.
- (a) If $\gamma_j>\gamma$, the limit of the normalized species number for $S_j$ is its initial value.
- (b) If $\gamma_j=\gamma$, the limit of the normalized species number for $S_j$ appears as a variable in the propensities in the limiting equation.
- (c) If $\gamma_j<\gamma$, the limit of the normalized species number for $S_j$ is expressed as a function of the limits of the normalized species numbers for $S_i$ with $\gamma_i=\gamma$. The function for $S_j$ is obtained by dividing the equation for $S_j$ by $N^{\gamma-\gamma_j}$ and letting $N$ go to infinity.
- 13. If a limiting model is not closed, consider limiting equations for some linear combinations of species selected in Step 5 whose natural time scale exponents are equal to the chosen $\gamma$.
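The exponent bookkeeping in Steps 3 to 8 can be sketched as follows. This is a minimal illustration under stated assumptions: it takes $\rho_k$ to be $\beta_k$ plus the sum of $\nu_{ik}\alpha_i$ over the reactants, and it takes the time-scale constraint for species $S_i$ to be $\gamma \le \alpha_i - \max_k \rho_k$ over the reactions changing $S_i$. The function names and the three-species toy network are hypothetical and are not the heat shock response model.

```python
def rate_exponents(alpha, beta, reactants):
    """rho_k = beta_k + sum_i nu_ik * alpha_i, where reactants[k] maps species to nu_ik."""
    return [b + sum(nu * alpha[s] for s, nu in r.items())
            for b, r in zip(beta, reactants)]

def species_conditions(alpha, rho, net_change):
    """For each species return (balanced, u), where u bounds the time-scale exponent."""
    table = {}
    for s in alpha:
        plus = [rho[k] for k, d in enumerate(net_change) if d.get(s, 0) > 0]
        minus = [rho[k] for k, d in enumerate(net_change) if d.get(s, 0) < 0]
        balanced = bool(plus) and bool(minus) and max(plus) == max(minus)
        changing = plus + minus
        u = alpha[s] - max(changing) if changing else float("inf")
        table[s] = (balanced, u)
    return table

def gamma_upper_bound(table):
    """Largest gamma for which the exponents remain valid (Step 8)."""
    bounds = [u for balanced, u in table.values() if not balanced]
    return min(bounds) if bounds else float("inf")

# Hypothetical toy network: 0 -> A, A + B -> C, C -> B
alpha = {"A": 0, "B": 1, "C": 1}
beta = [1, 0, -1]
reactants = [{}, {"A": 1, "B": 1}, {"C": 1}]
net_change = [{"A": 1}, {"A": -1, "B": -1, "C": 1}, {"C": -1, "B": 1}]

rho = rate_exponents(alpha, beta, reactants)
table = species_conditions(alpha, rho, net_change)
gamma0 = gamma_upper_bound(table)
print(rho, table, gamma0)
```

In this toy example A is balanced while B and C are not, so the chosen exponents would be usable only for $\gamma\le 0$; a different choice of $\alpha_i$'s would be needed for later times.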
The method for multiscale approximation described above can be applied to general chemical reaction networks containing different scales in species numbers and reaction rate constants. It applies when the rates of chemical reactions are determined by the law of mass action and when no species number is zero or infinite at all times. For example, in the reaction network involving ∅→S1, ∅→S2, ∅→S3, S1 + S2→∅, and S1 + S3→∅, convergence of the scaled species numbers to a limit may not be guaranteed at some time scales. Suppose that the production rate of S1 is larger than that of S2 but of the same order of magnitude, and that the production rate of S3 is much smaller than those of S1 and S2. Then, X1(t) may blow up to infinity and X2(t) may go to zero at some time scales. In this case, the method is not applicable.
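The behavior of this counterexample can be made visible with a deterministic caricature. The sketch below integrates the mass-action rate equations of the five-reaction network with forward Euler; it is not the stochastic model itself, and the rate values, the initial state, and the function name `mean_field_path` are assumptions chosen only so that the production rate of S1 exceeds the combined production rates of S2 and S3.

```python
def mean_field_path(k1, k2, k3, c, x0, t_end, dt):
    """Forward-Euler integration of the mass-action rate equations.

    0 -> S1 (k1), 0 -> S2 (k2), 0 -> S3 (k3),
    S1 + S2 -> 0 and S1 + S3 -> 0 (both with rate constant c).
    """
    x1, x2, x3 = x0
    for _ in range(int(round(t_end / dt))):
        r12 = c * x1 * x2   # rate of S1 + S2 -> 0
        r13 = c * x1 * x3   # rate of S1 + S3 -> 0
        x1, x2, x3 = (x1 + dt * (k1 - r12 - r13),
                      x2 + dt * (k2 - r12),
                      x3 + dt * (k3 - r13))
    return x1, x2, x3

# Illustrative values: k1 and k2 of the same order, k3 much smaller
x1, x2, x3 = mean_field_path(k1=10.0, k2=8.0, k3=0.1, c=0.01,
                             x0=(10.0, 10.0, 10.0), t_end=1000.0, dt=0.01)
print(x1, x2, x3)
```

Because S1 can only be removed together with S2 or S3, the difference x1 - x2 - x3 grows linearly at rate k1 - k2 - k3 in this caricature, so x1 keeps growing while x2 is pushed toward zero, and no single choice of scaling exponents stays balanced over long times.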
Results and discussion
We analyze a heat shock response model of E. coli developed by Srivastava, Peterson, and Bentley. The heat shock response model gives a simplified mechanism by which E. coli responds to high temperature. Heat causes unfolding, misfolding, or aggregation of proteins, and cells overcome the heat stress by producing heat shock proteins, which refold or degrade denatured proteins. In E. coli, σ32 factors play an important role in recovery from the stress under high temperature. σ32 factors catalyze production of heat shock proteins such as chaperon proteins and other proteases. In this model, J denotes a chaperon complex, FtsH represents a σ32-regulated stress protein, and GroEL is a σ32-mediated stress response protein.
σ32 factors are in three different forms: free σ32 protein, σ32 combined with RNA polymerase (Eσ32), and σ32 combined with a chaperon complex (σ32-J). Under the normal situation without stress, most of the σ32 factors combine with chaperon complexes and form σ32-J. A chaperon complex J keeps σ32 factors in an inactive form, and σ32 factors can directly respond to the stress by changing into different forms. When there exist σ32 factors combined with chaperon complexes, FtsH catalyzes degradation of σ32 factors. Thus, if enough σ32-regulated stress proteins are produced, σ32 factors are degraded.
Table 1 Species in the heat shock response model of E. coli and their initial values
Species numbers: # of S1, # of S2, # of S3, # of S4, # of S5, # of S6, # of S7, # of S8, # of S9. Species name: Eσ32.
Reactions in the heat shock response model of E. coli
S7→S2 + S6
S2 + S6→S7
S6 + S8→S9 (recombinant protein-J association)
S9→S6 + S8 (recombinant protein-J disassociation)
Other reactions named: recombinant protein synthesis, recombinant protein degradation, σ32 mRNA decay
Stochastic reaction rate constants in the heat shock response model of E. coli
approximates the network at times of order 10,000 sec. A detailed derivation is given in the later sections. Note that it is possible to identify different numbers of time scales depending on the scaling of the species numbers and reaction rate constants. In the heat shock response model of E. coli, it is possible to obtain approximate models with two or four time scales. However, if there are too many time scales, the limiting model in each time scale may involve only one species and a few reactions, and such a model may not be interesting to consider.
Derivation of the scaled models
so that and .
For each reaction, ρ k is given in terms of α i and β k in the Additional file 1: Table S1.
We are interested in the dynamics of species numbers in various stages of the time period. In the early stage, the normalized species numbers of S2 and S3 are very close to their scaled initial values, since these species numbers have not changed yet. In the medium stage, the normalized species numbers of S2 and S3 are asymptotically equal to non-constant limits. In the late stage, the normalized species numbers of S2 and S3 fluctuate very rapidly, and their averaged behavior is captured in terms of some function of other species numbers.
Then, $Z_i^{N,\gamma}(t)$ gives a normalized species number at times of order $N^{\gamma}$. A natural time scale of $S_i$ is the time scale on which $Z_i^{N,\gamma}$ has a nonzero finite limit which is not constant and is of order 1.
where $N^{\gamma}$ in each propensity comes from the change of the time variable. Here, the initial values may depend on $\gamma$, since we can choose different values for $\alpha_i$ for each $\gamma$ due to changes in the order of magnitude of species numbers in time. The stochastic equations after scaling and a time change for all species are given in the Additional file 1: Section 1.
Inequalities in (14) mean that if maximal production and consumption rates are not balanced for either S2 or S3, the chosen set of values for the scaling exponents can be used to approximate the dynamics of the full network up to times of order $N^{u_2}$ or $N^{u_3}$. For times later than those of order $N^{u_2}$ or $N^{u_3}$, we need to choose another set of values for the scaling exponents based on the balance equations. We refer to the balance equation together with the time-scale constraint for each species as the species balance condition. If either (12a) or (??) is satisfied, we say that the species balance condition for S2 is satisfied.
Similarly to the time-scale constraint in the species balance condition, (18) implies that if maximal collective production and consumption rates for S23 are not balanced, our choice of values for the scaling exponents is valid up to times of order $N^{u_{23}}$.
Table 4 Balance equations and time-scale constraints for each species and for each collective species chosen
Collective species: S2 + S3 + S7; S2 + S3; S2 + S7; S6 + S7 + S9; S6 + S7; S6 + S9; S8 + S9.
Based on the species and collective species balance equations in Table 4, we choose appropriate values for the $\alpha_i$'s and $\beta_k$'s so that most of the balance equations are satisfied. If some of the balance equations are not satisfied, the corresponding time-scale constraints give a range of $\gamma$ where the chosen $\alpha_i$'s and $\beta_k$'s are valid. The time-scale constraint, $\gamma\le\gamma_0$, implies that the chosen set of scaling exponents is appropriate only up to times whose order of magnitude is $N^{\gamma_0}$. For times larger than $O(N^{\gamma_0})$, we need to choose a different set of values for the scaling exponents $\alpha_i$. Assuming that reaction rate constants do not change in time and that the species numbers vary in time, we in general use one set of $\beta_k$'s for all time scales and may use several sets of $\alpha_i$'s. A large change of the species numbers in time requires different $\alpha_i$'s in different time scales. For the heat shock model we identify three different time scales, as we will see in the section on limiting models in three time scales, and α1, α2, α3, α8, and α9 may depend on the time scale. α4, α5, α6, and α7 are the same for all time scales.
Then, the first set of scaling exponents with α1=1 and α2=α3=0 is valid only when γ≤0. Next, based on the fact that X2(t)≈O(10) and X3(t)≈O(10) in the medium stage of time period, we choose α2=α3=0 for γ>0. At this stage of time period, we set with α1=0. Then, (12a) and (12b) are satisfied but not (16). The condition (18) gives γ≤1, and the second set of scaling exponents with α1=α2=α3=0 is valid when γ≤1. Finally, we set α1=0 and α2=α3=1 for γ>1 based on the fact that the numbers of molecules of S2and S3 grow in time and are of order 100. Then, (12a), (12b), and (16) are all satisfied, and the third set of scaling exponents with α1=0 and α2=α3=1 can be used for γ>1.
The three sets of values for the scaling exponents chosen are given in the Additional file 1: Table S4. With chosen values for the scaling exponents, we check whether each balance equation is satisfied and give a time-scale constraint in the Additional file 1: Table S6 in case the balance equation is not satisfied. Different choices of α i ’s and β k ’s from the ones in the Additional file 1: Table S4 give different limiting models. As long as the chosen values for α i ’s and β k ’s satisfy balance conditions, the limiting model will describe nontrivial behavior of the species numbers which are nonzero and finite in the specific time of interest.
Limiting models in three time scales
In the heat shock response model of E. coli, we identify a time scale of interest using the chosen set of scaling exponents and derive a limiting model which approximates dynamics of the full chemical reaction network. Each limiting model involves a subset of species and reactions, and gives features of the full network during the time interval of interest.
where $\Gamma_i^{+}$ denotes the collection of reactions where the species number of $S_i$ increases every time the reaction occurs. Similarly, $\Gamma_i^{-}$ is the subset of reactions where the species number of $S_i$ decreases every time the reaction occurs. In (19), the left-side term is the maximal order of magnitude of rates of reactions involving $S_i$ and the right-side term is the order of magnitude of the species number for $S_i$. If times are earlier than those of order $N^{\gamma_i}$ ($\gamma<\gamma_i$), fluctuations of the species number of $S_i$ due to the reactions involving $S_i$ are not noticeable compared to the magnitude of the species number of $S_i$. Then, the species number of $S_i$ is approximated by its initial value. At times of order $N^{\gamma_i}$ ($\gamma=\gamma_i$), changes of the species number of $S_i$ due to the reactions and the species number of $S_i$ are similar in magnitude, and the behavior of the species number of $S_i$ is described by its nondegenerate limit. If times are later than those of order $N^{\gamma_i}$ ($\gamma>\gamma_i$), the species number of $S_i$ fluctuates very rapidly due to the reactions involving $S_i$ compared to the magnitude of the species number of $S_i$. Then, the averaged behavior of the species number of $S_i$ is approximated by some function of other species numbers. Note that $\gamma_i$ depends on the $\alpha_i$'s and $\beta_k$'s, and the time scale of the $i$th species may change if we use several sets of $\alpha_i$'s.
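The three regimes described above can be summarized in a short Python sketch. It assumes that (19) has the form $\max_k(\gamma_i+\rho_k)=\alpha_i$ over the reactions changing $S_i$, which is what the description of its two sides suggests; the function names and the exponent values in the example are illustrative only.

```python
def natural_time_scale(alpha_i, rhos):
    """Solve max_k (gamma_i + rho_k) = alpha_i for gamma_i."""
    return alpha_i - max(rhos)

def regime(gamma, gamma_i):
    """Classify how a species is treated on the time scale N**gamma."""
    if gamma < gamma_i:
        return "constant"        # approximated by its initial value
    if gamma == gamma_i:
        return "nondegenerate"   # described by its own limiting equation
    return "averaged"            # expressed through other species numbers

# Illustrative exponents: a species with alpha_i = 1 whose fastest reaction
# has rho_k = -1 has natural time scale exponent 2.
g = natural_time_scale(1, [-1, -2])
print(g, [regime(gamma, g) for gamma in (0, 1, 2, 3)])
```

A species therefore moves through the three regimes in this order as the time scale of interest increases, unless a new set of $\alpha_i$'s changes its natural time scale exponent.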
All values of α i ’s and ρ k ’s for three scalings which are used to derive limiting models are given in the Additional file 1: Table S4. The equations for normalized species numbers and the equation for which are used later in this section are given in the Additional file 1: Section 1 and Section 2, respectively. When we derive limiting models in three time scales, boundedness of the normalized species numbers is required. For first two time scales, we define stopping times so that the normalized species numbers are bounded up to those times. For the last time scale, we proved stochastic boundedness of some normalized species numbers in a finite time interval. For more details, see Additional file 1: Section 5.
and we get γ2=0. Similarly, we get γ3=γ8=0.
and we get γ1=2. Similarly, we get $\gamma_i>0$ for i=4,5,6,7,9. Among all natural time scale exponents of species, we choose the smallest one, γ=0, and set $t\sim O(N^{0})=O(1)$ as the first time scale we are interested in. Since γ1>0, the normalized species number of S1 converges to its initial value as N→∞. Similarly, the normalized species numbers of $S_i$ for i=4,5,6,7,9 converge to their initial values as N→∞. To sum up, in this time scale with γ=0, the species numbers of $S_i$ for i=1,4,5,6,7,9 change more slowly than the other species numbers, and the species numbers with slow time scales are approximated as constant.
Similarly, we get a limiting model for the normalized species numbers of S2, S3, and S8 for γ=0 as given in (3).
and we get γ6=1. Similarly, we get γ7=γ8=1, $\gamma_i<1$ for i=2,3, and $\gamma_i>1$ for i=1,4,5,9. We already have the temporal behavior of the species numbers of S2, S3, and S8 through the limiting model when γ=0. Thus, we set $t\sim O(N^{1})$ as the second time scale we are interested in, and derive a limiting model for S6, S7, and S8 when γ=1. Note that species S8 is involved in the limiting models for both γ=0 and γ=1, since we use different sets of scaling exponents in these models. For i=1,4,5,9, the normalized species number of $S_i$ converges to its initial value as N→∞, since $\gamma_i>1$. Thus, in the 12th and 15th reaction terms in (24), the normalized species numbers of these slow species converge to their initial values as N→∞. Since the propensities of the 8th, 9th, and 17th reaction terms in (24) are of order $N^{\gamma-2}=N^{-1}$ for γ=1 and the species number of S6 is of order 1, these reaction terms go to zero as N→∞. In the 10th and 15th reaction terms in (24), the normalized species numbers of S6, S7, and S8 are asymptotically O(1) and converge to their limits as N→∞, since γ6=γ7=γ8=1.
In (30), note that since X9(0)=0 as given in Table 1. Limiting equations for and can be derived similarly, and a limiting model with , , , and for γ=1 is given in (4).
uniformly as N→∞.