Gene expression is inherently stochastic, and most RNA molecules exist in very low copy numbers in Escherichia coli. The phenotype of these cells depends strongly on how many RNA molecules of each gene are produced, when they are produced, and how their numbers fluctuate in time, especially because protein numbers generally follow the RNA numbers [3, 4]. This suggests that, for the phenotype to be robust and thus predictable, bacteria may need to control fluctuations in the numbers of some RNAs, especially those of weakly expressed genes.
RNA numbers depend on the kinetics of RNA production and degradation. A genome-wide study of degradation rates of RNA molecules in E. coli concluded that, while there is a wide range of degradation rates, it is the transcription rate that determines mRNA steady-state levels. Differences in RNA half-lives may have other roles, such as the regulation of transient changes in abundance in response to environmental stress or the cell cycle. Further, while several sequence-dependent events can take place in elongation that affect the mean and fluctuations in RNA numbers, apart from premature terminations, they only have tangible consequences if multiple RNA polymerases are on the template simultaneously. This only occurs for strongly expressed genes, and thus the dynamics of transcription initiation should be the key determinant of the dynamics of RNA numbers for weakly expressed genes.
The mean rate of transcription of a gene is mostly determined by the promoter sequence as well as by the present concentrations of possible activator and repressor molecules. In bacteria, the process of transcription initiation at the promoter region includes diffusion of the RNA polymerase (RNAp) along the template until reaching a transcription start site (TSS), DNA bending and loading in the active site of the RNAp, DNA unwinding and positioning in the TSS, loading of the NT strand, and assembly of the clamp/jaw on downstream DNA. After this sequence of events, the RNAp can elongate along the DNA and assemble the RNA strand. At the termination sequence, the RNAp and a single-stranded RNA are released.
The durations of the rate-limiting steps in initiation vary widely between promoters, even when the sequences only differ slightly, as well as with temperature and with the concentrations of Mg2+ and other metabolites. In vitro studies of the kinetics of the lac-UV5 promoter in E. coli suggest that its initiation involves up to three rate-limiting steps: formation of a closed complex (RPc), isomerization (forming the RPi complex), and formation of the open complex, RPo [8, 10, 11]. Isomerization is only rate-limiting for temperatures below 20 °C.
The initiation mechanism is dynamically complex as it involves, e.g., uni-dimensional diffusion of the RNAp on the DNA template and conformational changes of the RNAp and template [12]. So far, no measurements exist of the distribution of the duration of these events, and the existing information on the kinetics derives solely from in vitro estimations of mean durations. A detailed model [11] of the likely common sequence of events is shown in (1). R stands for RNAp, P stands for promoter DNA, RP stands for the complex of R bound to P, while RPc and RPo stand for the closed and open complexes, respectively. I1 are intermediates of the isomerization step. The last step in (1) competes with abortive initiation [14]. Also shown in (1) are the expected speeds of the steps (in the forward direction) given results from in vitro measurements on a few promoters [11].
All steps in (1), except for the last one, are reversible [13]. In vitro studies suggest that the unwinding of promoter DNA, which occurs early in the open complex formation [15], is a slow process compared with the time for the RNAp to diffuse along the template and find a TSS [12]. A simplified model of (1) is shown in (2), showing only the rate-limiting steps [12], by packing the fast steps into the three steps known to be slow in some promoters (reversibility not represented):
Let t(RPc) be the duration of the closed complex formation (first step in (2)), which includes the time for the RNAp to find the TSS. Also, let t(RPo) be the duration of the open complex formation (second step in (2)), and let t(RPcl) be the time for RNA chain elongation initiation and promoter clearance (third step in (2)). Finally, let tpt be the time to start a productive transcription, equal to the sum of t(RPc), t(RPo) and t(RPcl). In vitro measurements of the kinetics of the lac promoter and variants, such as lar, indicate that tpt is of the order of 10-1000 seconds, depending on the concentrations of inducers and environmental factors such as temperature.
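The definition of tpt as a sum of three step durations can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that each step duration is exponentially distributed and independent of the others; the function name `sample_tpt` and the three mean durations (300 s, 150 s, 50 s) are hypothetical placeholders, not measured values.

```python
import numpy as np

def sample_tpt(mean_rpc, mean_rpo, mean_rpcl, n, seed=0):
    """Draw n values of tpt = t(RPc) + t(RPo) + t(RPcl).

    Assumption: each step duration is an independent exponential
    random variable with the given mean (in seconds).
    """
    rng = np.random.default_rng(seed)
    means = np.array([mean_rpc, mean_rpo, mean_rpcl], dtype=float)
    # One column per step; `scale` is the mean of each exponential.
    steps = rng.exponential(scale=means, size=(n, 3))
    return steps.sum(axis=1)

# Hypothetical mean durations, chosen inside the 10-1000 s range.
tpt = sample_tpt(300.0, 150.0, 50.0, n=100_000)
mean_tpt = tpt.mean()
cv2_tpt = tpt.var() / mean_tpt**2  # squared coefficient of variation

print(f"mean tpt = {mean_tpt:.1f} s")
print(f"CV^2 of tpt = {cv2_tpt:.3f}")
```

Under these assumptions the mean of tpt is the sum of the three step means, while its squared coefficient of variation falls below 1, the value expected for a single exponential step.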
The in vivo kinetics of the steps in (2), as well as the distribution of durations of intervals between initiation events, has not been characterized for any promoter. This distribution is likely a determining factor of the strength of fluctuations in RNA numbers. A recent study using a delayed stochastic model of gene expression suggests that, by regulating the kinetics of the closed and open complex formations, it is possible to regulate both the mean and the fluctuations in RNA numbers independently. This is relevant since the kinetics of these steps varies with sequence, environmental factors such as temperature, and concentrations of repressor and activator molecules. In general, the binding of a repressor to the promoter significantly increases the duration of the closed complex formation, usually by reducing the probability that an RNAp will find the TSS (e.g. by blocking diffusion on the template) [7, 12]. Activators tend to have more complex effects, affecting the mean duration of both closed and open complex formations [7, 12].
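The idea that the mean and the fluctuations can be tuned separately can be seen in a much simpler setting than the delayed stochastic model mentioned above. The sketch below is not that model: it assumes only two sequential, independent, exponentially distributed steps (closed and open complex formation) and computes the mean and squared coefficient of variation (CV^2) of the interval between initiations. The function name `interval_stats` and all durations are hypothetical.

```python
def interval_stats(t_closed, t_open):
    """Mean and CV^2 of an interval made of two sequential exponential steps.

    Assumption: the steps are independent and exponentially distributed
    with mean durations t_closed and t_open (same time unit).
    """
    mean = t_closed + t_open
    # The variance of an exponential equals its mean squared; variances add.
    cv2 = (t_closed**2 + t_open**2) / mean**2
    return mean, cv2

# Keep the mean interval at 1000 s (hypothetical) and shift the balance
# between the two steps.
for frac_closed in (0.5, 0.75, 0.95):
    mean, cv2 = interval_stats(frac_closed * 1000.0, (1.0 - frac_closed) * 1000.0)
    print(f"closed-step share {frac_closed:.2f}: mean = {mean:.0f} s, CV^2 = {cv2:.3f}")
```

In this toy setting the mean interval stays fixed while CV^2 moves between 0.5 (two equal steps) and values close to 1 (one dominant step). How this maps onto fluctuations in RNA numbers also depends on RNA degradation, which the sketch does not include.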
Recently, a method was developed in E. coli to tag mRNA molecules in vivo with MS2d-GFP proteins that allows their detection shortly after being produced (Golding et al., 2005). Expression of the target RNA is controlled by the lar promoter (also named lac/ara). Individual transcription events are detectable and the behaviour is similar to that of the unlabeled system [18, 19]. Using this method, we measured intervals between consecutive productions of RNA molecules under the control of lar, under weak and medium induction, which have not been previously measured.
The kinetics of transcription initiation of the lar promoter, as well as of several variants, have been studied in vitro. The sequence of the lar promoter and its differences from the original lac promoter are described in detail in [7, 20]. Its expression is activated by arabinose and IPTG. In vitro, the time between productions of consecutive RNA molecules is approximately 6000 s when not induced, 2500 s when induced by IPTG alone, 800 s when pre-incubated with arabinose alone, and 50 s when induced with both IPTG and arabinose. Recent in vivo measurements suggest that the kinetics of transcription differs from that under in vitro conditions. For maximum induction, in vivo, only 4 RNAs are produced on average in 1 hour.
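To make the in vitro and in vivo figures directly comparable, the short calculation below converts the quoted mean intervals into production rates per hour and the in vivo rate into a mean interval. It uses only the numbers given above and assumes that a mean rate and a mean interval are simple reciprocals; variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600

# Approximate in vitro intervals between consecutive RNAs (seconds).
in_vitro_interval_s = {
    "not induced": 6000,
    "IPTG alone": 2500,
    "arabinose alone": 800,
    "IPTG + arabinose": 50,
}

# Mean number of RNAs per hour implied by each in vitro interval.
in_vitro_rnas_per_hour = {
    condition: SECONDS_PER_HOUR / interval
    for condition, interval in in_vitro_interval_s.items()
}
for condition, rate in in_vitro_rnas_per_hour.items():
    print(f"{condition}: {rate:.1f} RNAs per hour")

# In vivo, maximum induction: about 4 RNAs per hour on average.
in_vivo_rnas_per_hour = 4
in_vivo_interval_s = SECONDS_PER_HOUR / in_vivo_rnas_per_hour
fold_slower = in_vivo_interval_s / in_vitro_interval_s["IPTG + arabinose"]

print(f"in vivo mean interval: {in_vivo_interval_s:.0f} s")
print(f"in vivo is {fold_slower:.0f}x slower than in vitro at full induction")
```

With these numbers, the fully induced promoter produces an RNA every 50 s in vitro but only about every 900 s in vivo.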
Here, we report in vivo measurements of intervals between RNA production events, in the regimes of weak and medium induction. From the distributions of intervals, we derive the number of steps, and their durations, needed to describe the measured distributions, assuming that the duration of each step follows an exponential distribution. The method proposed here is applicable to the study of the kinetics of initiation of a wide range of promoters in E. coli and, as such, may provide new genome-wide knowledge on the dynamics of transcription initiation in prokaryotes.
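As a rough illustration of how a number of steps can be inferred from a set of intervals, the sketch below compares candidate models by maximum likelihood. It is not the procedure used in this work: it makes the stronger assumption that all steps have the same mean duration (an Erlang distribution), which keeps the fit to a closed-form expression. The function names and the synthetic data are hypothetical.

```python
import math
import numpy as np

def erlang_loglik(intervals, n_steps):
    """Maximized log-likelihood for n_steps sequential exponential steps.

    Assumption: all steps share the same rate (Erlang distribution).
    For a fixed n_steps, the maximum-likelihood rate is n_steps / mean.
    Intervals must be strictly positive.
    """
    x = np.asarray(intervals, dtype=float)
    rate = n_steps / x.mean()
    terms = (n_steps * math.log(rate)
             + (n_steps - 1) * np.log(x)
             - rate * x
             - math.lgamma(n_steps))
    return float(terms.sum())

def best_step_count(intervals, max_steps=5):
    """Return the step count with the highest likelihood and its step duration."""
    scores = {n: erlang_loglik(intervals, n) for n in range(1, max_steps + 1)}
    best = max(scores, key=scores.get)
    step_duration = float(np.mean(intervals)) / best
    return best, step_duration, scores

# Synthetic intervals: two exponential steps of 400 s each (made-up values).
rng = np.random.default_rng(1)
synthetic = rng.exponential(scale=400.0, size=(2000, 2)).sum(axis=1)

n_steps, step_duration, scores = best_step_count(synthetic)
print(f"most likely number of steps: {n_steps}")
print(f"estimated duration per step: {step_duration:.0f} s")
```

Every candidate here has a single fitted parameter, so the likelihoods can be compared directly. Allowing steps of unequal duration would need a numerical fit and a penalty for the extra parameters.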