SParSE++: improved eventbased stochastic parameter search
 Min K. Roh^{1}Email authorView ORCID ID profile and
 Bernie J. DaigleJr.^{2}
DOI: 10.1186/s129180160367z
© The Author(s) 2016
Received: 9 June 2016
Accepted: 6 November 2016
Published: 25 November 2016
Abstract
Background
Despite the increasing availability of high performance computing capabilities, analysis and characterization of stochastic biochemical systems remain a computational challenge. To address this challenge, the Stochastic Parameter Search for Events (SParSE) was developed to automatically identify reaction rates that yield a probabilistic userspecified event. SParSE consists of three main components: the multilevel crossentropy method, which identifies biasing parameters to push the system toward the event of interest, the related inverse biasing method, and an optional interpolation of identified parameters. While effective for many examples, SParSE depends on the existence of a sufficient amount of intrinsic stochasticity in the system of interest. In the absence of this stochasticity, SParSE can either converge slowly or not at all.
Results
We have developed SParSE++, a substantially improved algorithm for characterizing target events in terms of system parameters. SParSE++ makes use of a series of novel parameter leaping methods that accelerate the convergence rate to the target event, particularly in low stochasticity cases. In addition, the interpolation stage is modified to compute multiple interpolants and to choose the optimal one in a statistically rigorous manner. We demonstrate the performance of SParSE++ on four example systems: a birthdeath process, a reversible isomerization model, SIRS disease dynamics, and a yeast polarization model. In all four cases, SParSE++ shows significantly improved computational efficiency over SParSE, with the largest improvements resulting from analyses with the strictest error tolerances.
Conclusions
As researchers continue to model realistic biochemical systems, the need for efficient methods to characterize target events will grow. The algorithmic advancements provided by SParSE++ fulfill this need, enabling characterization of computationally intensive biochemical events that are currently resistant to analysis.
Keywords
Stochastic simulation Parameter estimation Rare event Optimization Stochastic event Stochastic mass action kineticsBackground
The everincreasing computational capacity of modern computer architectures [1–5] has enabled the simulation of realistic biochemical models with thousands of reactions [6–8]. Despite this capability, computational analysis of stochastic biochemical systems remains a challenge, as techniques like parameter estimation and sensitivity analysis typically require the simulation of multiple ensembles of hundreds to thousands of system trajectories. Most existing parameter estimation algorithms for biochemical systems identify one or more sets of reaction rate parameters giving rise to trajectories that closely mimic observed data [9–15]. In contrast, Stochastic Parameter Search for Events (SParSE) [16] was developed to efficiently sample biochemical reaction rate parameter values that confer a userspecified target event with a given probability and error tolerance. Its conception was inspired by acknowledging the usefulness of such an eventbased approach to parameter estimation. As executing SParSE with different initial conditions will identify nonoverlapping sets of parameter values that satisfy the target event equally well (the “solution hypersurface”), these results can be used to evaluate various cost functions for scientific and economic purposes. For example, many intervention strategies exist in malaria control: mass drug administration, mass screen and treat, focal mass drug administration, and snowball reactive case detection [17]. Knowing all combinations of system parameters in an epidemiological model that result in eradication of malaria is extremely beneficial for making the most costeffective policy decisions. Similarly, learning different parameter combinations that result in cell polarization in a mechanistic model of the yeast Saccharomyces cerevisiae can aid in our understanding of cell polarization in other organisms as well as contribute new insights to yeast polarization [18]. In general, SParSE outputs from the solution hypersurface may be filtered using userdefined cost functions or constraints to further refine the event characterization.
SParSE is comprised of three main components: the multilevel crossentropy (CE) method, exponential parameter interpolation, and the inverse biasing method (see below for details) [16]. As introduced in [19], performance of the multilevel CE method depends on the existence of a sufficient amount of intrinsic stochasticity in the system of interest. For systems with low levels of intrinsic stochasticity (even for a subset of reactions), the multilevel CE method can exhibit slow convergence properties, especially when initial parameter values are far from the solution hypersurface. Furthermore, experiments conducted in [16] demonstrated that the accuracy of exponential interpolation significantly decreases when parameter estimates rapidly pass through the solution hypersurface. Taken together, these two limitations greatly hamper our ability to characterize target events in important classes of stochastic systems.
To overcome these limitations, we developed SParSE++, a substantially improved algorithm for characterizing target events in terms of system parameters. SParSE++ makes use of several algorithmic improvements to SParSE that lead to faster and more accurate performance, particularly in the presence of low intrinsic system stochasticity. SParSE++ utilizes a novel method called crossentropy leaping (CE leaping) that accelerates the convergence rate of the multilevel CE method. Upon detecting slow convergence, CE leaping uses past parameter estimates to intelligently “leap" forward in parameter space rather than continue using the standard CE method. Furthermore, we have defined special leaping cases to improve accuracy in situations when parameter estimates rapidly pass through the solution hypersurface. Finally, SParSE++ features a more robust parameter interpolation method that further accelerates the algorithm convergence rate. To demonstrate superior performance of SParSE++, we apply the method to the three example systems featured in [16]: a birthdeath process, a reversible isomerization model, and a system exhibiting SIRS disease dynamics. In addition, we include an eightreaction model of yeast polarization featured in [13, 20]. In each example, SParSE++ shows substantially improved computational efficiency over SParSE, with the largest efficiency improvements resulting from analysis of events with the strictest error tolerances.
Methods
The algorithms developed in this work make use of Gillespie’s stochastic simulation algorithm (SSA) [21]—a Monte Carlo simulation method that produces exact trajectories of a wellstirred system obeying the chemical master equation (CME). Such systems can be described in the following manner. Given a biochemical system consisting of N molecular species {S _{1},⋯,S _{ N }} and M reaction channels {R _{1},⋯,R _{ M }}, let X _{ i }(t) denote the population of S _{ i } at time t, x(t)≡(X _{1}(t),⋯,X _{ N }(t)) the state vector at time t, and x _{0}≡x(t _{0}) the population at initial time t _{0}. The time evolution of x(t) in a fixed volume at constant temperature is governed by sequences of two random variables: τ, the time elapsing between the current and next reaction firings, and j ^{′}, the index of the next reaction firing at time t+τ. After each selection of τ and j ^{′}, x(t) advances by \(\mathbf {x}(t+\tau) = \mathbf {x}(t) + \boldsymbol {\nu }_{j^{\prime }}\), where ν _{ j }≡[ν _{1j },⋯,ν _{ Nj }], and j∈{1,⋯,M} is the state change (stoichiometry) vector. Each component in the state change vector, ν _{ ij }, denotes the change in population X _{ i } induced by single firing of reaction R _{ j }.
Sampling τ and j ^{′} requires computation of reaction propensity functions, a _{ j }(x,k), where k represents the system reaction rates (k _{1},⋯,k _{ M }), defined such that a _{ j }(x,k)dt is the probability that one R _{ j } reaction occurs in the next infinitesimal time interval [t, t+dt). Denoting the propensity sum as \(a_{0}(\mathbf {x}, \mathbf {k}) \equiv \sum _{j=1}^{M} a_{j}(\mathbf {x}, \mathbf {k})\), each time to the next reaction τ is exponentially distributed with mean 1/a _{0}(x,k), and each index of the next reaction j ^{′} is categorically distributed with probability a _{ j }(x,k)/a _{0}(x,k),j∈{1,⋯,M}.
where f(x _{ i }(T _{ i }k)) is the value of the event function f(·) evaluated on the i ^{th} trajectory at times \(\boldsymbol T_{i} \equiv \{t_{0}, t_{i1}, \cdots, t_{iN_{\mathcal {T}_{i}1}}, \mathcal {T}_{i}\}\) (t _{ ik } is the firing time of the k ^{th} reaction and \(N_{\mathcal {T}_{i}}\) is the number of reaction firings occurring in the i ^{th} trajectory) simulated with reaction rates k. Two requirements for f(·) are that it takes the system state as an input and can be used to evaluate the distance between this state and \(\mathcal {E}\). The indicator function \(I_{\{f(\mathbf {x}_{i}(\boldsymbol {T}_{i}  \boldsymbol {k})) \cap \mathcal {E}\}}\) thus returns a value of 1 if the distance between f(x _{ i }(T _{ i }k)) and \(\mathcal {E}\) is zero and 0 otherwise.
SParSE—stochastic parameter search for events
where n _{ ij } is the total number of times reaction j fires in the i ^{th} trajectory, \(\sum _{i}'\) iterates only over the subset of N _{ S } trajectories that return 1 for \(I_{\{f(\mathbf {x}_{i}(\boldsymbol {T}_{i}  \boldsymbol {k})) \cap \mathcal {E}\}}\), k indexes the \(N_{\mathcal {T}_{i}}\) reaction firings occurring in the i ^{th} trajectory, and t _{ ik } and τ _{ ik } represent the absolute time and time elapsed since the last firing for the k ^{th} reaction firing in the i ^{th} trajectory. Computation for γ ^{(l)} and k ^{(l)} terminates when Eq. (2) is satisfied, or when \(l \geq \mathcal {L}\) for some \(\mathcal {L} \in \mathbb {N}\) (10 by default).
Besides the multilevel CE method, SParSE is comprised of two other components: exponential parameter interpolation and the inverse biasing method. Both the multilevel CE and the inverse biasing methods proceed by picking a set of intermediate events ξ that are close to \(\mathcal {E}\) and reachable with the current reaction rates. SParSE chooses ξ at each iteration by selecting the top ⌈ρ N _{ S }⌉ simulated trajectories that evolve farthest in the direction of \(\mathcal {E}\). The values of ρ are chosen based on the distance between the current estimate and \(\mathcal {P}_{\mathcal {E}}\). Denoting the distance as \(\delta (\boldsymbol {k}) \equiv \mathcal {P}_{{E}}  \hat {p}(\mathbf {x}_{0}, \boldsymbol {k}^{(l1)}, \mathcal {E}; t_{f})\), ρ is chosen by
where ϕ _{type} is 1 if \(f(\mathbf {x}_{0}) \leq \mathcal {E}\) and 1 otherwise. In the above two cases, sgn(δ(k))≠ϕ _{type} corresponds to the case where the current reaction rates overperturb the system with respect to \(\mathcal {P}_{\mathcal {E}}\). The conventional multilevel CE method cannot be used here, as the top trajectories evolving in the direction of \(\mathcal {E}\) surpass the target event more than the desired amount. These trajectories, however, can be used to reverse the direction of bias (thus the term inverse biasing). Instead of terminating a simulated trajectory when \(\mathcal {E}\) is reached (as is done in the multilevel CE method), we run all N _{ S } trajectories until t _{ f } and record the maximum values reached in the direction of the target event. These values are then used in Eq. (6) to determine intermediate target events. As with Eq. (5), the smaller the distance, the less extreme intermediate events are chosen to avoid excessive biasing. The inverse of the biasing parameters corresponding to these events are then multiplied by the current reaction rates to compute the next estimates \(\hat {p}\). While inverse biasing effectively reverses the direction of bias in the case of overperturbation, estimates computed in this way may not be accurately characterized by exponential interpolation. Thus, we choose more conservative ρ values for inverse biasing than for the multilevel CE method.
where γ ^{(0,·)} are the values of the past multilevel CE method biasing parameters normalized with respect to γ ^{0}. Denoting k ^{(l)} as the reaction rates for the current iteration of SParSE, each \( \gamma ^{(0, l)}_{j}\) can be expressed as \(k^{(l)}_{j} / {k^{0}_{j}}\). The resulting estimate from employing γ ^{(0,l)} is denoted \(\hat {p}^{(l)}\). Since interpolation is initiated only after both under and overperturbing estimates are obtained, γ ^{(0,·)} is guaranteed to have at least two entries. When there are more than five entries, SParSE picks the five estimates closest to \(\mathcal {P}_{\mathcal {E}}\) while requiring that both under and overperturbed values are present. Once the optimal exponential curve in Eq. (7) is found, SParSE returns up to seven sets of candidate biasing parameters. Three of the seven correspond to estimates that are slightly less than \(\mathcal {P}_{\mathcal {E}} \times N_{S}\), one corresponds to the exact value, and the rest to estimates slightly greater than \(\mathcal {P}_{\mathcal {E}} \times N_{S}\). Interpolation starts with the candidate set from the exact target value, and it shifts to over or underperturbing parameters depending on the resulting estimate. For example, if the resulting estimate is greater than \(\mathcal {P}_{\mathcal {E}} \times N_{S} + \epsilon _{\mathcal {P}_{\mathcal {E}}}\), SParSE chooses the next underperturbing biasing parameters to compute the next estimate. SParSE assumes failure in interpolation and exits if it is unable to find k ^{∗} in a userdefined number of iterations, \(\mathcal {I}\), which is set to 3 by default.
Accelerating convergence with crossentropy leaping
We now describe the enhancements made to create SParSE++, a substantially improved algorithm for identifying k ^{ ∗ } for a given event and target probability. The first enhancement introduces crossentropy (CE) leaping, an algorithmic technique enabling accelerated convergence.
Given a set of initial reaction rates, the SParSE convergence rate depends on the intrinsic stochasticity of the system with respect to the target event, which we simply refer to as ‘stochasticity’ in the remainder of this section. Denoting \(\xi _{i}^{\text {max}}\) as the value of f(x _{ i }(tk ^{(l)})) closest to \(\mathcal {E}\) reached by the i ^{th} trajectory, the next intermediate events ξ are chosen as the closest ⌈ρ N _{ S }⌉ values of \(\xi _{i}^{\text {max}}\) to \(\mathcal {E}\), where ρ is chosen by (5) or (6) and i∈{1,⋯,N _{ S }}. For systems and target events exhibiting low stochasticity, the variance among the \(\xi _{i}^{\text {max}}\) values is small. As a result, even small values of ρ will generate subsequent intermediate events that are very close to current ones, causing SParSE to converge slowly, if at all.
 1.
Inequality (2) is satisfied. In this case the objective of SParSE++ is met, and the algorithm exits,
 2.
\( \left \{\begin {array}{lr} \hat {p}\cdot N_{S} \leq \mathcal {P}_{\mathcal {E}}\cdot N_{S} \cdot 0.01, & \text {for} \mathcal {P}_{\mathcal {E}} \leq 0.5\\ \hat {p}\cdot N_{S} \leq (1  \mathcal {P}_{\mathcal {E}})\cdot N_{S} \cdot 0.01, & \text {for} 0.5 < \mathcal {P}_{\mathcal {E}} \end {array}\right \} \)
Condition 2 is enforced to ensure the signal from k ^{(l)} is reliable; at least 1% of the fraction of trajectories equal to \(\mathcal {P}_{\mathcal {E}}\) (or \(1\mathcal {P}_{\mathcal {E}}\)) are required in order to qualify for leaping. SParSE++ repeatedly clears the memory of past estimates until two qualifying probability estimates are observed consecutively, at which point Algorithm 1 is executed to determine leaping eligibility and magnitude.
Here, input variables \(\hat {\boldsymbol p}\) and γ denote consecutive probability estimates and their corresponding biasing parameters, respectively. CE leaping utilizes the mean rate of convergence calculated from \(\hat {\boldsymbol p}\) to determine how far in biasing parameter space to leap forward (Line 13). To compute the convergence rate of the first probability estimate, we include the estimate computed immediately before the first eligible \(\hat {p}\) when possible. The only instance when this cannot be done is when two eligible values from \(\hat {\boldsymbol p}\) are the very first two estimates computed for a given k ^{0}.
The states from which CE leaping are triggered can vary greatly. For example, the distance to the target probability \(\mathcal {P}_{\mathcal {E}}\), rate of change in \(\hat {\boldsymbol p}\), and the magnitude of \(\mathcal {P}_{\mathcal {E}}\) can all differ substantially, even for the same systems using different values of k ^{0}. The amount of leaping used should thus depend on all of these factors, in order to avoid grossly overperturbing the system. To handle different rates of convergence, two predefined variables leap_{projs} and leap_{factors} are used to adjust the amount of leaping based on the average change in \(\hat {\boldsymbol p}\!\cdot \!N_{S}\) (mean_{step}) and the minimum distance to the target event probability (dist_{min}). The largest leaping multiplier (7 from Line 4) is chosen when the estimated number of steps to reach \(\mathcal {P}_{\mathcal {E}}\) (step_{proj}; Line 13), is greater than 50 (Line 3). In contrast, when step_{proj} is less than 5 the standard multilevel CE method is used instead of CE leaping. The value for step_{proj} is computed assuming linear convergence with rate mean_{step}. As this assumption may not be valid for certain systems and target event functions f(x), leaping multipliers (leap_{factors} in Line 4) are chosen conservatively to prevent overperturbation.
When CE leaping is triggered, SParSE++ skips the computation of intermediate target events and their associated biasing parameters, as γ ^{(l)} is set to γ ^{CE}.
Special leaping cases
The CE leaping algorithm is designed to accelerate the rate of convergence of the multilevel CE method when it is stuck on a “plateau” in parameter space. The opposite scenario can also pose a problem to SParSE—in a “steep” region of parameter space, the multilevel CE method may pass through the solution hypersurface too quickly. This can lead to erroneous interpolants or even solution divergence when computed estimates do not meet the thresholds required for interpolation and leaping. Three cases of this scenario are identified and handled in SParSE++. Starting from the least severe instance and moving to the most severe, we explain each case in detail. As with CE leaping, new biasing parameters are computed from past estimates; thus, description of the second stage of the multilevel CE method (i.e., γ computation based on intermediate target events) is omitted below.
Last leaping prior to interpolation
In a quicklychanging parameter region, SParSE may enter the interpolation stage with as few as two estimates, one on either side of \(\mathcal {P}_{\mathcal {E}}\). Even when more than two estimates exist, their values may be far from \(\mathcal {P}_{\mathcal {E}}\) if only two iterations of the multilevel CE method are run prior to interpolation. Lack of \(\hat {p}\) values near \(\mathcal {P}_{\mathcal {E}}\) can significantly degrade interpolation quality. This case can be avoided by first obtaining another estimate near \(\mathcal {P}_{\mathcal {E}}\). To do this, SParSE++ computes an additional estimate prior to entering the interpolation stage using γ ^{WA}, the values of which are generated using Algorithm 2.
Here N _{ S } and \(\mathcal {P}_{\mathcal {E}}\) denote the total number of simulated trajectories and the userdefined target probability, respectively. The least under and overperturbing biasing parameters (γ ^{ u } and γ ^{ o }) are guaranteed to exist, since Algorithm 2 is run immediately prior to the interpolation stage, which is only triggered when both under and overperturbing estimates have been computed. The weights w ^{ u } and w ^{ o } for γ ^{ u } and γ ^{ o } reflect how close \(\hat {p}^{u}\) and \(\hat {p}^{o}\) are to \(\mathcal {P}_{\mathcal {E}}\). Supposing \(\hat {p}^{u}\) is closer to \(\mathcal {P}_{\mathcal {E}}\) than \(\hat {p}^{o}\), then δ ^{ u } (the distance between \(\mathcal {P}_{\mathcal {E}}\) and \(\hat {p}^{u}\)) will be smaller than δ ^{ o }. Since w ^{ u } is defined as δ ^{ o } normalized by the total distance, w ^{ u } will be greater than w ^{ o }. Thus, the weighted averaging method gives more weight to biasing parameters from the better estimate. When characterizing target events with a larger error tolerance, the final leaping performed with γ ^{WA} may satisfy Eq. (2) and thus eliminate the need to run interpolation.
Leaping on lowsignal region
 1.
At least three previous estimates exist
 2.
At least one estimate is located on either end of \(\mathcal {P}_{\mathcal {E}}\)
 3.
At least one estimate on one side of \(\mathcal {P}_{\mathcal {E}}\) is eligible for interpolation, and every estimate on the other side does not qualify
The last condition corresponds to the case where estimates on one side of \(\mathcal {P}_{\mathcal {E}}\) contain a sufficient signal for interpolation, while estimates on the other side do not. We note that although Algorithm 2 is executed both here and in the previous special case, the conditions that trigger the algorithm as well as its purpose are very different. In the previous section, Algorithm 2 is used to compute an estimate close to the target probability before beginning the interpolation stage. Here, the same method is used to escape a lowsignal region in an efficient manner.
Bisection to obtain sufficient signal
The final case occurs when all past estimates exhibit insufficient signal for interpolation. Although expected to occur rarely, the multilevel CE method can produce estimates that either reach the target event too few or too many times. If the corresponding under and overperturbation are severe, these estimates will not be considered for CE leaping or interpolation. When all recorded estimates do not meet the interpolation threshold yet exist on both sides of the target event probability \(\mathcal {P}_{\mathcal {E}}\), SParSE++ executes bisection on previous biasing parameters in an attempt to move the system closer to \(\mathcal {P}_{\mathcal {E}}\). Using Algorithm 3, we compute the biasing parameters with bisection, γ ^{BS}.
Unlike Algorithm 2, we do not measure distances of \(\hat {p}^{u}\) and \(\hat {p}^{o}\) with respect to \(\mathcal {P}_{\mathcal {E}}\). Since these estimates exhibit insufficient signal, their distances to the target event probability do not contain useful information for computing weights of γ ^{ u } and γ ^{ o }.
We note that this final case was not observed in any of the examples evaluated in the next section. However, the event functions used with the four example systems are simply the states of a single species. For more complicated event functions, we expect this case to occur more frequently, and Algorithm 3 will thus reduce the incidence of solution divergence.
Improved interpolation
Exponential interpolation in SParSE plays an integral role when the multilevel CE method alone is unable to deliver the target event probability with acceptable precision. Once both under and overperturbed estimates are observed that do not satisfy the desired probability range (\(\mathcal {P}_{\mathcal {E}} \pm \epsilon _{\mathcal {P}_{\mathcal {E}}}\)), SParSE employs exponential interpolation on previous estimates to compute candidate biasing parameters. Acknowledging that the computed interpolant may still not produce an estimate within the required accuracy, SParSE returns up to seven sets of biasing parameters that correspond to slightly perturbed target event probabilities. Using these candidate parameter sets, SParSE computes the first new estimate using the biasing parameters corresponding to the exact target event probability \(\mathcal {P}_{\mathcal {E}}\). If the estimate is too high (\(\mathcal {P}_{\mathcal {E}} + \epsilon _{\mathcal {P}_{\mathcal {E}}} < \hat {p}\)), SParSE picks a set of candidate biasing parameters corresponding to the least underperturbing probability to compute the next estimate. If the estimate is too low (\(\hat {p} < \mathcal {P}_{\mathcal {E}}  \epsilon _{\mathcal {P}_{\mathcal {E}}}\)), SParSE picks the least overperturbing probability instead. This process continues until k ^{∗} is found or no more candidate biasing parameters remain, whichever occurs first. The default limit on the number of interpolation rounds is set to 3 (i.e., up to 21 candidate biasing parameter sets are computed). If the algorithm does not find γ ^{∗} at the end of the third round of interpolation, SParSE assumes failure to converge and exits.
The motivation behind working with multiple candidate biasing parameter sets in the SParSE interpolation stage is that the candidate set corresponding to \(\mathcal {P}_{\mathcal {E}}\) may not produce a sufficiently accurate estimate, whereas a set corresponding to interpolant values near \(\mathcal {P}_{\mathcal {E}}\) might. Thus, SParSE has as many as six alternate biasing parameter sets to be chosen should the first set fail to satisfy Eq. (2). We note that this approach is only helpful if γ ^{∗} falls within the range of candidate biasing parameter values. In the worst case when this is not true, SParSE must run four additional SSA ensembles to produce an estimate before computing a new interpolant.
SParSE++ greatly improves the efficiency in this worst case scenario by modifying the process of exponential interpolation. First, it computes up to three different exponential interpolants: one as in SParSE, one without the farthest underperturbing γ (when more than two underperturbed estimates exist), and one without the farthest overperturbing γ (when more than two overperturbed estimates exist). For each reaction, SParSE++ then chooses the interpolant with the highest R ^{2} statistic. R ^{2} statistics are commonly used to assess goodness of fit of statistical models to observed data [22]. For our purposes, the R ^{2} value, which is between 0 and 1, indicates the fraction of the total variance of output (\(\hat {p}\)) that is explained by variation in input (γ). Computing the three interpolants and their corresponding R ^{2} statistics incurs a negligible computational cost, as no additional SSA simulations are required. Pseudocode for computing the next reaction rates using SParSE++ exponential interpolation is listed in Algorithm 4.
Unlike in SParSE, the chosen interpolant in SParSE++ returns only a single set of biasing parameters corresponding to the exact value of \(\mathcal {P}_{\mathcal {E}}\), and new interpolants are only computed if the corresponding estimate does not satisfy Eq. (2). The computational cost of repeatedly generating new interpolants and R ^{2} scores in each stage is trivial compared to the cost of simulating a single SSA trajectory for most systems. If the projected biasing parameters \(\bar {\boldsymbol {\gamma }}\) from Algorithm 4 are out of range for any reaction R _{ j }, a weighted average is used to replace \(\bar {\gamma }_{j}\), where the weights are normalized distances between \(\mathcal {P}_{\mathcal {E}}\) and estimates corresponding to the least under and overperturbing biasing parameter sets for R _{ j }.
Results and discussion
In this section we compare the performance of SParSE++ to that of SParSE using the same three models—a birthdeath process, a reversible isomerization model, and a susceptibleinfectiousrecoveredsusceptible (SIRS) disease transmission system—described in [16], as well as an additional eightreaction system modeling yeast polarization [20]. For the first three models, all possible combinations of \(\mathcal {P}_{\mathcal {E}} \in \{0.40, 0.60, 0.80\}\) and \(\epsilon _{\mathcal {P}_{\mathcal {E}}} \in \{0.01, 0.05, 0.10\}\) are analyzed by simulating ensembles of N _{ S }=5×10^{4} trajectories. For the birthdeath process, we also simulated an ensemble of N _{ S }=2×10^{5} for \(\mathcal {P}_{\mathcal {E}} = 0.010\) and \(\epsilon _{\mathcal {P}_{\mathcal {E}}} = 0.001\) in order to illustrate the robustness of SParSE++ on a low probability target event. Similarly, we explored the high probability target event \(\mathcal {P}_{\mathcal {E}} = 0.95\) and \(\epsilon _{\mathcal {P}_{\mathcal {E}}} = 0.005\) with the reversible isomerization model using ensemble size N _{ S }=10^{5}. Lastly, the yeast polarization model is studied with \(\mathcal {P}_{\mathcal {E}} = 0.60\), \(\epsilon _{\mathcal {P}_{\mathcal {E}}} = 0.01\), and N _{ S }=5×10^{4}.
In order to minimize output differences resulting from stochasticity (as opposed to methodological differences), we used the same random number generator seeds and initial reaction rates for SParSE and SParSE++. For fairness of comparison, we treat each of the following computations as a single iteration: estimation using the multilevel CE method, computing biasing parameters for intermediate events, estimation using interpolation, and estimation using any type of leaping (CE, weighted average, and bisection). Although the exact costs of these computations differ depending on the values of reaction rates and the type of simulations (e.g., computation of intermediate events without overperturbation is often cheaper than with overperturbation), the complexity in terms of the number of trajectories simulated is the same, i.e., \(\mathcal {O}(N_{S})\). Overall, this procedure ensures that the net computational gain or loss in terms of the total number of trajectories generated is properly quantified. Using this measure we define \(\text {gain}_{i} (\%):= \frac {\text {No.\ Iterations SParSE}\text {No.\ Iterations SParSE++}}{\text {No.\ Iterations SParSE}} \times 100\) to assess the performance of SParSE++ compared to SParSE for a specific combination of \(\mathcal {P}_{\mathcal {E}}\) and \(\epsilon _{\mathcal {P}_{\mathcal {E}}}\) values. The numerator is the difference between the number of iterations employed by the two methods, while the denominator is the total number of iterations SParSE required. This fraction is multiplied by 100 to create a percentage. Similarly, we define \(\text {gain}_{t} (\%) := \frac {\text {SparSE runtime}\text {SparSE++ runtime}}{\text {SparSE runtime}} \times 100\) for comparison of absolute runtime (in seconds). Both variables gain_{ i } and gain_{ t } measure the relative computational ‘gain’ from using SParSE++ over SParSE. All simulations were run on Intel^{®;} Xeon^{®;} CPU E52620 v2 at 2.10 GHz workstation with 16 GB RAM, 64bit Windows 10 Enterprise OS, using Matlab and its Parallel Computing Toolbox™.
Lastly, we note that SParSE++ achieved 100% success on all examples tested and therefore omit explicitly listing the success rate in any of the tables. For examples where SParSE observed failure [16], we examine the role of new features in SParSE++ that enabled the algorithm to successfully converge to the solution hypersurface.
Birthdeath process
Results of SParSE and SParSE++ applied to the birthdeath process
\(\mathcal {P}_{\mathcal {E}}\)  \(\epsilon _{\mathcal {P}_{\mathcal {E}}}\)  Tot. Iter.  Tot. Iter.  Time (s)  Time (s)  No. Gain  No. Loss 

SParSE  SParse++  SParSE  SParSE++  
0.40  0.01  293  175  1139.0  684.9  119  1 
0.05  189  139  685.1  517.4  53  3  
0.10  139  121  476.2  457.6  21  3  
0.60  0.01  332  171  1128.7  602.2  161  0 
0.05  175  141  539.5  500.1  38  4  
0.10  129  120  378.3  421.6  9  0  
0.80  0.01  242  186  728.8  629.8  60  4 
0.05  139  137  449.9  480.8  3  1  
0.10  110  110  342.2  382.7  2  2  
0.010  0.001  279  244  3943.5  3656.6  44  9 
Reversible isomerization model
with x _{0}=[100 0], i.e., all molecules are initially in the A form. The target event \(\mathcal {E}\) is set to the population of isomer B reaching 30 before t _{ f }=10.
Results of SParSE and SParSE++ applied to the reversible isomerization model
\(\mathcal {P}_{\mathcal {E}}\)  \(\epsilon _{\mathcal {P}_{\mathcal {E}}}\)  Tot. Iter.  Tot. Iter.  Time (s)  Time (s)  No. Gain  No. Loss 

SParSE  SParse++  SParSE  SParSE++  
0.40  0.01  299  208  3133.8  1269.3  93  2 
0.05  210  172  1758.9  1144.3  40  2  
0.10  175  164  1549.3  1042.6  15  4  
0.60  0.01  349  228  2634.2  1266.6  125  4 
0.05  233  187  1762.5  1055.7  46  0  
0.10  188  175  1510.8  941.4  17  4  
0.80  0.01  319  247  2199.7  1384.6  74  2 
0.05  227  206  1415.7  1054.3  24  3  
0.10  190  188  1330.9  979.2  7  5  
0.95  0.005  316  306  3539.8  2678.5  25  15 
SIRS disease transmission system
with x _{0}=[100 1 0], where x=[S I R]. This model describes an epidemiological compartment where members of S become infected by members of I, who recover from the infection at rate γ and transition to R. Once recovered, members of R lose immunity at rate ω, and this transition from recovered to susceptible replenishes the population of S. The target event for this system is set to the population of I reaching 50 before t _{ f }=30. Unlike the two previous examples, this model contains a nonlinear reaction R _{2}, and there is no closedform solution for computing k ^{∗}. Therefore we use the same numerical solution obtained using the SSA in [16] to evaluate accuracy of SParSE and SParSE++ estimates.
Results of SParSE and SParSE++ applied to SIRS disease dynamics
\(\mathcal {P}_{\mathcal {E}}\)  \(\epsilon _{\mathcal {P}_{\mathcal {E}}}\)  Tot. Iter.  Tot. Iter.  Time (s)  Time (s)  No. Gain  No. Loss 

SParSE  SParse++  SParSE  SParSE++  
0.40  0.01  532  326  4252.9  3687.7  208  2 
0.05  424  294  3602.8  3310.3  136  6  
0.10  386  247  3247.4  2863.8  141  2  
0.60  0.01  328  212  826.6  589.8  117  1 
0.05  201  180  518.3  512.3  26  5  
0.10  178  156  477.2  469.0  22  0  
0.80  0.01  309  306  608.9  598.1  16  13 
0.05  231  221  490.9  510.9  10  0  
0.10  183  180  399.2  436.5  3  0 
Detailed results of SParSE and SParSE++ applied to SIRS disease dynamics on eight initial reaction rates (column 1) that exhibited slowest SParSE convergence
SParSE  SParSE++  

k ^{0}  \(\hat {p}^{0}\)  IE  γ  OP  Interp  Tot  IE  γ  OP  Leap  Interp  Tot  Gain 
[0.079 1.59 2.42]  0.80  20  17  17  1  55  4  3  3  3  3  16  39 
[0.081 1.07 2.28]  0.87  18  16  16  0  50  4  3  3  5  0  15  35 
[0.067 3.96 2.37]  0.00  17  6  0  5  38  12  11  0  2  0  25  13 
[0.064 3.31 1.71]  0.00  13  12  0  5  30  10  9  0  3  0  22  8 
[0.035 1.22 2.38]  0.64  11  8  8  2  29  5  4  4  2  0  15  14 
[0.040 3.59 1.52]  0.00  13  12  0  3  28  11  10  0  1  0  22  6 
[0.080 2.08 2.22]  0.74  11  8  8  1  28  3  2  2  1  0  8  20 
[0.031 3.16 1.84]  0.00  11  10  0  6  27  9  8  0  1  0  18  9 
From Table 4 and Fig. 4, we see that some initial reaction rates are not far from \(\mathcal {P}_{\mathcal {E}} = 0.40\) in absolute distance (e.g., \(\hat {p}^{0} = 0.64\) and \(\hat {p}^{0}=0.74\)). This illustrates that the initial distance from \(\mathcal {P}_{\mathcal {E}}\) cannot be reliably used to predict the speed of SParSE convergence to the solution hypersurface; if the initial estimate lies in a lowvariance parameter region, the multilevel CE method may take many iterations to reach k ^{∗}. We also note that the occurrence of overperturbation alone is not highly correlated with the speed of convergence in SParSE. Three of the eight sets did not show any overperturbation yet converged slowly to the solution hypersurface. The same holds for the number of interpolations; half of the eight reaction rates employed either 0 to 1 interpolation iterations to find k ^{∗} (Table 4). These varying behaviors demonstrate that no single modification to the SParSE algorithm would have significantly accelerated the convergence rates for all eight worstcase examples; rather, a collection of enhancements like those implemented in SParSE++ is required.
Detailed breakdown of SParSE++ applied to SIRS disease dynamics leaping usage on eight initial reaction rates that exhibited slowest SParSE convergence
SParSE++ Leaping Usage  

k ^{0}  CE  WA_{Interp}  WA_{LowSig}  Bisect  Tot  Gain 
[0.079 1.59 2.42]  2  1  0  0  3  39 
[0.081 1.07 2.28]  2  1  2  0  5  35 
[0.067 3.96 2.37]  2  0  0  0  2  13 
[0.064 3.31 1.71]  2  0  0  0  2  8 
[0.035 1.22 2.38]  1  1  0  0  2  14 
[0.040 3.59 1.52]  1  0  0  0  1  6 
[0.080 2.08 2.22]  1  0  0  0  1  20 
[0.031 3.16 1.84]  1  0  0  0  1  9 
Yeast polarization
with x _{0}=[70 4 0 100 0 0 0]. In this yeast polarization process, the subunit G _{ β γ } is thought to play an important role of signaling for the downstream Cdc42 cycle. Here we aim to discover reaction rates that yield target event \(\mathcal {E}\) of X(G _{ bg }) reaching 80 by t _{ f }=10 with probability \(\mathcal {P}_{\mathcal {E}} = 0.60\) and error tolerance \(\epsilon _{\mathcal {P}_{\mathcal {E}}} = 0.01\).
Results of SParSE and SParSE++ applied to the yeast polarization system
\(\mathcal {P}_{\mathcal {E}}\)  \(\epsilon _{\mathcal {P}_{\mathcal {E}}}\)  Tot. Iter.  Tot. Iter.  Time (s)  Time (s)  No. Gain  No. Loss 

SParSE  SParse++  SParSE  SParSE++  
0.60  0.01  343  275  13219.6  10350.1  69  1 
Results from applying correlation analysis on SParSE++ output of the yeast polarization model
R3R5  

r  Lower Bound  Upper Bound  pvalue 
0.76  0.88  0.55  1.0e6 
R1R8  
r  Lower Bound  Upper Bound  pvalue 
0.71  0.48  0.85  9.4e6 
Conclusions
We have developed SParSE++, a substantially more computationally efficient enhancement of SParSE for identifying parameter configurations that confer a userdefined probabilistic event. SParSE++ features novel parameter leaping methods for accelerating convergence as well as a more principled interpolation approach. Each class of leaping methods in SParSE++ has a set of prerequisite conditions. When these conditions are met, the algorithm “leaps" through parameter space, resulting in a marked reduction of the number of iterations required for convergence. This crossentropy leaping approach, based on exponential extrapolation, permits the algorithm to converge much more rapidly for low stochasticity problems than the traditional multilevel CE method employed by SParSE. In addition, by computing a weighted average of previous estimates, SParSE++ improves the accuracy of interpolation. We note that all the merits of SParSE—high parallelizability, robustness of \(\mathcal {P}_{\mathcal {E}}\) values, and concurrent updates on all reaction parameters—are retained in SParSE++.
The four examples featured in this paper demonstrate that performance gains are largest for problems requiring high accuracy. In terms of total number of iterations required, SParSE++ outperformed SParSE in 29 out of 30 test problems. For the birthdeath process with \(\mathcal {P}_{\mathcal {E}} = 0.80\) and \(\epsilon _{\mathcal {P}_{\mathcal {E}}} = 0.10\), SParSE and SParSE++ performed equally well. This is not surprising, as most (29 out of 30) of the initial reaction rates did not require any interpolation in SParSE and converged rapidly to the solution hypersurface. For this problem configuration, each set of initial reaction rates required only 3.7 iterations on average to converge to the solution hypersurface. Similarly, SParSE++ outperformed SParSE on 25 out of 30 test problems when comparing the total runtime. For the remaining five problems, the differences in runtime are negligible (less than a minute). We also note that three sets of reaction rates that failed to converge using SParSE (two from the birthdeath process and one from the reversible isomerization model) successfully reached the solution hypersurface with SParSE++.
As computational researchers continue to model events of interest in realistic biochemical systems, the need for efficient methods to identify compatible reaction rate parameters will grow. We expect that the algorithmic advancements provided by SParSE++ will fulfill this need and enable characterization of increasingly more computationally intensive biochemical events in the future.
Abbreviations
 CE:

cross entropy
 OP:

overperturbation
 SParSE:

stochastic parameter search for events
 SSA:

stochastic simulation algorithm
 WA:

weighted average
Declarations
Acknowledgments
The authors thank Bill and Melinda Gates for their active support of this work and their sponsorship through the Global Good Fund.
Funding
This was has been funded by the Global Good Fund.
Availability of data and materials
SParSE++ was coded using Matlab R2015a and simulated using Parallel Computing Toolbox. Matlab files that run SParSE++ on the three examples included in the paper are available on for download at https://github.com/InstituteforDiseaseModeling/PaperRepository/tree/master/SParSEpublication. Psuedocode for all components of SParSE++—multilevel CE method, inverse biasing method, CE and WA leaping methods, and interpolation process—is included in the additional file along with the manuscript.
Authors’ contributions
MR developed the method, coded the algorithm, and prepared figures and tables. BD participated in the design of the methods and edited the manuscript. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Komarov I, D’Souza RM. Accelerating the gillespie exact stochastic simulation algorithm using hybrid parallel execution on graphics processing units. PLoS ONE. 2012; 7(11):46693.View ArticleGoogle Scholar
 Li H, Petzold LR. Efficient parallelization of the stochastic simulation algorithm for chemically reacting systems on the graphics processing unit. Int J High Perform Comput Appl. 2010; 24(2):107–16.View ArticleGoogle Scholar
 Klingbeil G, Erban R, Giles M, Maini PK. Fat versus thin threading approach on gpus: Application to stochastic simulation of chemical reactions. Parallel Distributed Syst IEEE Trans. 2012; 23(2):280–7.View ArticleGoogle Scholar
 Dematté L, Prandi D. Gpu computing for systems biology. Brief Bioinform. 2010; 11(3):323–33.View ArticlePubMedGoogle Scholar
 Aldinucci M, Torquati M, Spampinato C, Drocco M, Misale C, Calcagno C, Coppo M. Parallel stochastic systems biology in the cloud. Brief Bioinform. 2013; 15(5):798.View ArticlePubMedGoogle Scholar
 Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival B. Jr, AssadGarcia N, Glass JI, Covert MW. A wholecell computational model predicts phenotype from genotype. Cell. 2012; 150(2):389–401. doi:10.1016/j.cell.2012.05.044.View ArticlePubMedPubMed CentralGoogle Scholar
 Donovan RM, Sedgewick AJ, Faeder JR, Zuckerman DM. Efficient stochastic simulation of chemical kinetics networks using a weighted ensemble of trajectories. J Chem Phys. 2013; 139(11):115105. doi:10.1063/1.4821167.View ArticlePubMedPubMed CentralGoogle Scholar
 Zwier MC, Adelman JL, Kaus JW, Pratt AJ, Wong KF, Rego NB, Suárez E, Lettieri S, Wang DW, Grabe M, Zuckerman DM, Chong LT. Westpa: An interoperable, highly scalable software package for weighted ensemble simulation and analysis. J Chem Theory Comput. 2015; 11(2):800–9. doi:10.1021/ct5010615.View ArticlePubMedPubMed CentralGoogle Scholar
 Reinker S, Altman R, Timmer J, et al. Parameter estimation in stochastic biochemical reactions. Syst Biol. 2006; 153(4):168.View ArticleGoogle Scholar
 Boys RJ, Wilkinson DJ, Kirkwood TBL. Bayesian inference for a discretely observed stochastic kinetic model. Stat Comput. 2008; 18(2):125–35. doi:10.1007/s112220079043x.View ArticleGoogle Scholar
 Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH. Approximate bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface. 2009; 6(31):187–202.View ArticlePubMedGoogle Scholar
 Wang Y, Christley S, Mjolsness E, Xie X. Parameter inference for discretely observed stochastic kinetic models using stochastic gradient descent. BMC Syst Biol. 2010; 4:99. doi:10.1186/17520509499.View ArticlePubMedPubMed CentralGoogle Scholar
 Daigle Jr. BJ, Roh MK, Petzold LR, Niemi J. Accelerated maximum likelihood parameter estimation for stochastic biochemical systems. BMC Bioinformatics. 2012; 13(1):68. doi:10.1186/147121051368.View ArticleGoogle Scholar
 Lillacci G, Khammash M. The signal within the noise: Efficient inference of stochastic gene regulation models using fluorescence histograms and stochastic simulations. Bioinformatics. 2013. doi:10.1093/bioinformatics/btt380.
 Liao S, Vejchodský T, Erban R. Tensor methods for parameter estimation and bifurcation analysis of stochastic reaction networks. J R Soc Interface. 2015; 12(108). doi:10.1098/rsif.2015.0233. http://rsif.royalsocietypublishing.org/content/12/108/20150233.full.pdf.
 Roh MK, Eckhoff P. Stochastic parameter search for events. BMC Syst Biol. 2014; 8(1):126. doi:10.1186/s129180140126y.View ArticlePubMedPubMed CentralGoogle Scholar
 Gerardin J, Bever CA, Hamainza B, Miller JM, Eckhoff PA, Wenger EA. Optimal populationlevel infection detection strategies for malaria control and elimination in a spatial model of malaria transmission. PLoS Comput Biol. 2016; 12(1):1–19. doi:10.1371/journal.pcbi.1004707.View ArticleGoogle Scholar
 Klünder B, Freisinger T, WedlichSöldner R, Frey E. Gdimediated cell polarization in yeast provides precise spatial and temporal control of cdc42 signaling. PLoS Comput Biol. 2013; 9(12):1003396.View ArticleGoogle Scholar
 Daigle Jr BJ, Roh MK, Gillespie DT, Petzold LR. Automated estimation of rare event probabilities in biochemical systems. J Chem Phys. 2011; 134(4):044110. doi:10.1063/1.3522769.View ArticleGoogle Scholar
 Drawert B, Lawson MJ, Petzold L, Khammash M. The diffusive finite state projection algorithm for efficient simulation of the stochastic reactiondiffusion master equation. J Chem Phys. 2010; 132(7):074101. doi:10.1063/1.3310809.View ArticlePubMedPubMed CentralGoogle Scholar
 Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977; 81(25):2340–61.View ArticleGoogle Scholar
 Heinisch O, Steel RGD, Torrie JH. Principles and procedures of statistics. (With special reference to the biological sciences.) Mcgrawhill Book Company, New York, Toronto, London 1960, 481 s., 15 abb.; 81 s 6 d. Biometrische Zeitschrift. 1962; 4(3):207–8. doi:10.1002/bimj.19620040313.View ArticleGoogle Scholar