A new efficient approach to fit stochastic models on the basis of highthroughput experimental data using a model of IRF7 gene expression as case study
 Luis U. Aguilera^{1},
 Christoph Zimmer^{2} and
 Ursula Kummer^{1}Email author
DOI: 10.1186/s1291801704064
© The Author(s) 2017
Received: 27 August 2016
Accepted: 2 February 2017
Published: 20 February 2017
Abstract
Background
Mathematical models are used to gain an integrative understanding of biochemical processes and networks. Commonly the models are based on deterministic ordinary differential equations. When molecular counts are low, stochastic formalisms like Monte Carlo simulations are more appropriate and well established. However, compared to the wealth of computational methods used to fit and analyze deterministic models, there is only little available to quantify the exactness of the fit of stochastic models compared to experimental data or to analyze different aspects of the modeling results.
Results
Here, we developed a method to fit stochastic simulations to experimental highthroughput data, meaning data that exhibits distributions. The method uses a comparison of the probability density functions that are computed based on Monte Carlo simulations and the experimental data. Multiple parameter values are iteratively evaluated using optimization routines. The method improves its performance by selecting parameters values after comparing the similitude between the deterministic stability of the system and the modes in the experimental data distribution. As a case study we fitted a model of the IRF7 gene expression circuit to timecourse experimental data obtained by flow cytometry. IRF7 shows bimodal dynamics upon IFN stimulation. This dynamics occurs due to the switching between active and basal states of the IRF7 promoter. However, the exact molecular mechanisms responsible for the bimodality of IRF7 is not fully understood.
Conclusions
Our results allow us to conclude that the activation of the IRF7 promoter by the combination of IRF7 and ISGF3 is sufficient to explain the observed bimodal dynamics.
Keywords
IRF7 IFN Stochastic models Parameter estimationBackground
Computer models contribute to the integrative understanding of complex molecular processes in the cell. The most commonly used approach is deterministic modeling based on ordinary differential equations (ODEs). When the studied system comprises species with a low molecular count, stochastic formalisms, e.g. Gillespie’s algorithm that simulates trajectories and uses discrete molecule numbers are more appropriate [1]. In recent years, the use of stochastic models has substantially increased [2].
Considering stochasticity in biological systems has changed the quantitative and qualitative understanding obtained by previous deterministic models [3]. Examples of biological phenomena discovered by stochastic modeling include gene expression in burstlike patterns [4], productive or latent cell decision after HIVinfection [5], and the presence of oscillatory behavior induced by noise [6]. However, in contrast to the plethora of methods used to fit and analyze deterministic models [7–10], there are only very limited sets of methods available to do the same with stochastic models.
Thus, only recently, methods for parameter fitting of stochastic models have been developed and are still under development. Parameter estimation methods for stochastic models have been so far designed for single time course data [11–16]. For experiments leading to data distributions, they include the moment closure [17–19], and the comparison of experimental and simulation distributions [20–22]. Despite the current efforts, there are still major problems that limit the full applicability of those methods. The main drawback is the high computational execution time. To solve this problem novel efficient strategies to fit stochastic models are needed.
In the following we introduce a new approach that is a variant of existing methods based on the comparison of distributions. Differently to the existing methods we use the experimental data to define a condition that the model must fulfill in the deterministic regime. This condition tests, if the deterministic steady states are in the close proximity to the modes of the experimental distribution. Only if the model fulfills this condition, the parameter set is evaluated in the stochastic regime. In this way, the algorithm directs the evaluation of parameter sets towards regions in parameter space with high potential to reproduce the experimental data. Additionally, this method applies to nonlinear models and complex experimental data distributions.
To test the potential of this new method, we selected an open scientific question and real experimental data. We investigated the molecular mechanisms responsible for the experimentally observed bimodal dynamics of IRF7 expression after Interferon (IFN) stimulation.
It is well documented that in isogenic cell populations not all cells produce an antiviral response when infected by a virus [23]. The mechanisms behind this heterogeneity have been related to stochastic events in the signaling pathways responsible to elicit the antiviral response [24–26]. IFN is a cytokine that activates the JAKSTAT signaling pathway in virusinfected cells. The JAKSTAT signaling pathway induces the translocation of ISGF3 (a transcription factor) into the nucleus to directly activate the transcription of a set of several hundred IFNstimulated genes (ISGs) [27]. IRF7 is an ISG with a central role in the immune response [28, 29]. Recently, it was observed that after IFN stimulation murine fibroblasts show a switchlike pattern of IRF7 expression, which is reflected at the population level as bimodality [30].
We developed a model to describe the observed bimodal dynamics of IRF7 expression after IFN stimulation. We hypothesize that the binding of IRF7 and ISGF3 to the IRF7 promoter is the mechanism responsible for this bimodal behavior. Our simulation results reproduced IRF7 switchlike dynamics at single cell level, and bimodality was achieved at the population level. Hereby, we used the newly developed method to fit the model and correctly reproduce the experimental data. Our results allow us to conclude that the IRF7 promoter activation by the combination of IRF7 dimers and ISGF3 is sufficient to quantitatively explain the observed IRF7 bimodal dynamics.
Methods
Experimental data
Published experimental data were produced by Rand et al. [30]. The data described the expression of IRF7 after IFN β stimulation in a population of murine NIH3T3 fibroblasts. Rand’s experiments were done in the following way: First, cells were transfected with a BAC (Bacterial Artificial Chromosome) containing IRF7 and reporter mCherry genes fused, subsequently cultures were treated with different concentrations of murine IFN β. For illustrative purposes we selected the treatment with 150U of IFN β, a concentration where bimodality is prevalent. Then, fluorescence in the cultures was monitored using flow cytometry at different time points during 48 hours.
Numerical methods
In the deterministic regime the model was simulated using symbolic methods in Matlab [31] and/or using the LSODA algorithm in COPASI 4.11 [32]. Stability was calculated by making the righthand side of the differential equations equal to zero and subsequently by finding the roots of the system using function solve in Matlab. Eigenvalues were calculated using function eig in Matlab. For solving the model under stochastic dynamics we used the Gibson and Bruck algorithm [33] coded in COPASI. The random search and genetic algorithm were coded in Matlab. Raw experimental data was analyzed using the function FCS data reader coded in Matlab [34]. The modes in the distributions were calculated using the function PeakFinder coded in Matlab [35]. The source code of the project can be accessed via: https://sourceforge.net/projects/irf7bimodaldynamics/.
Results
New algorithm to fit stochastic biological models
Comparing experimental and simulation distributions
at each measurement time point t _{ i }, i=1,…,n.
Algorithm implementation
As optimization usually requires plenty of function evaluations and each evaluation of F requires ns stochastic simulations, a direct optimization strategy is computationally unfeasible even for simple models. Therefore, we introduce a new strategy that selects good candidate parameters and only performs the stochastic simulations for these parameters.
In large systems where fluctuations can be discarded, the stochastic system can be reduced to the deterministic one [36, 37]. For this reason, in most cases the deterministic dynamics can be associated with a measure of central tendency in the PDFs obtained after the stochastic simulations. Using this reasoning we will introduce the strategy to efficiently estimate parameters for stochastic models making use of deterministic dynamics as an initial indicator.
Assuming that the experimental data is in equilibrium at the last measurement point t _{ n }, we used the modes from P _{ e }(m(t _{ n }),b _{ l }) as a central measure of tendency, denoting the modes by α(t _{ n })=(α _{1}(t _{ n }),…,α _{ q }(t _{ n })), with q being the total number of modes.
Then, using the ODE version of the model (\(\dot {\boldsymbol {X}}(\theta,t)\)) we determine its stability. Here we tested whether the system has stable steady states and denote them with \(\boldsymbol {X}^{*}(\theta) = \left (X^{*}_{1}(\theta),\ldots,X^{*}_{ss}(\theta) \right)\), being s s the number of stable steady states. If the system has no stable steady state, it holds s s=0 and the vector X ^{∗}(θ) is empty. The calculation of steady states can either be performed by solving \(\dot {\boldsymbol {Y}}(\theta) = 0\) where \(\dot {\boldsymbol {Y}}\) is the right hand side of the ODE system with X(θ,t) replaced by Y(θ). Solving this equation can be impossible for large systems. Alternatively, one can numerically solve the ODE systems until the flux is zero for all components. Varying the initial conditions leads to the different steady states. The stability of the system was calculated after determining the sign of the eigenvalues (λ<0), for stable steady states [38].
Our extended objective function with the deterministic check reads as:
Parameter estimation methods
The deterministic precondition was first implemented using a random search algorithm. Random search is a global optimization strategy that tests random combinations of parameter values. The successful output in this method is dependent on the total number of evaluated parameter sets [39]. The more evaluated parameter values, the higher the probability to find the global minimum. The pseudocode for the random search is given in Algorithm 1.
The second implementation of the deterministic precondition was using the more directed Genetic Algorithm. Genetic Algorithm mimics evolution and is based on reproduction and selection. This algorithm is made of a population of individuals (parameter sets), and each contains a genome that is defined by the number of parameters to optimize. The individuals are ranked after solving the objective function, and a population of parental individuals is selected according to an elitism rate (ε). New individuals (offspring) are generated by pairing and recombining the parental genomes (crossover). Variability is introduced in the population by adding mutations in the new individuals according to a given mutation rate (μ). By the continuous process of selecting the best parameters after each generation, the algorithm evolves towards the regions in the parameter space that reduce the values in the objective function. The pseudocode for the Genetic Algorithm is given in Additional file 1.
Identifying parameters for a constitutive gene expression circuit with insilico data
To estimate the parameter values for this model we carried out the following procedure: First, we generated the insilico data by selecting the following true parameter values θ ^{ o }=(5,0.03,0.1,0.03) and running 10000 independent stochastic simulations from which PDFs were built. Four observable time points were defined at 50 min, 100 min, 150 min and 200 min (see Fig. 3 b). From the PDFs a mode was obtained at α _{1}(t _{ n }) = [543] a.u.(arbitrary units), where t _{ n }=200 min. Then, we defined the deterministic precondition using Eq. (4), and assigning minimum and maximum acceptance ranges by setting \(\beta _{1}^{low}\) = 0.95 and \(\beta _{1}^{up}\) = 1.05, respectively. A deeper analysis of the stability of the constitutive gene expression circuit is given in Additional file 2. To build simulated PDFs we used 1000 repetitions of the stochastic model, this number was empirically calculated according to Additional file 2. For illustrative purposes, we assumed that the values for the parameters responsible for the mRNA transcription and protein translation (θ _{1}, θ _{3}) were unknown, and the new algorithm was used to estimate those parameters. The deterministic precondition was applied using the RS strategy obtaining that the algorithm only evaluates stochastic dynamics in 3.1% of the tested parameter values, reducing in this way the total simulation time. Additionally, the complete algorithm was repeated 1000 times and histograms of the estimated parameter were computed to determine whether they are close to θ ^{(0)}. As can be observed in Fig. 3c the deterministic precondition reduces the evaluation of different parameters under stochastic dynamics by selecting only those parameters that are in a welldefined region in the proximity of θ ^{(0)}. For this model, the main benefits of using the deterministic precondition was the reduction of the number of parameters evaluated under stochastic dynamics. This rejection of parameters was made in an area outside the true parameter values, and hence no difference in accuracy is expected in comparison with a method without using the deterministic precondition. A complete description of the analysis of the performance, accuracy and error for this example is given in Additional file 2. The model for the constitutive gene expression circuit can be obtained from BioModels database under reference MODEL1608100000.
Mathematical model for IRF7 expression dynamics
Chemical reactions
Reaction rates considered in the model
Name  Definition 

f _{ IFN }  \(V_{IFN}\left (\frac {IFN^{n}}{k^{n}_{IFN}+IFN^{n}}\right)\) 
f _{ dIFN }  k _{ dIFN }·I F N 
f _{ I S G F3}  k _{ I S G F3}·I F N 
f _{ d I S G F3}  k _{ d I S G F3}·I S G F3 
\(f_{P_{a}}\)  \(k_{on}\left (\frac {ISGF3 \cdot IRF7dimer}{k_{aI3} \cdot k_{aI7} + k_{aI3} \cdot ISGF3 + k_{aI7}\cdot IRF7dimer + ISGF3 \cdot IRF7dimer}\right)(1P_{a})\) 
\(f_{dP_{a}}\)  k _{ off }·P _{ a } 
\(f_{mRNA_{A}}\)  k _{ Active }·P _{ a } 
\(f_{mRNA_{B}}\)  k _{ Basal }(1−P _{ a }) 
f _{ dmRNA }  k _{ dmRNA }·m R N A 
f _{ I R F7}  k _{ I R F7}·m R N A 
f _{ I R F7p h o s p }  k _{ d I R F7}·I R F7 
f _{ I R F7d i m e r }  k _{ I R F7d i m e r }·I R F7p h o s p·I R F7p h o s p 
f _{ d I R F7d i m e r }  k _{ d I R F7d i m e r }·I R F7d i m e r 
Mathematical equations
The model for the IRF7 circuit can be obtained from BioModels database under reference MODEL1608100001.
Fitting the stochastic IRF7 gene expression model to experimental data
where φ is a scaling factor.
Description of the parameter values
Name  Description  Range  Nominal  Units 

V _{ IFN }  Maximum activation rate of the IFN pathway  [ 2.8,11.2]  6.135  (Molecules/Cell) ^{∗}min 
n  Hill coefficient    2  Dimensionless 
k _{ IFN }  Saturation constant for the IFN pathway  [ 0.0022,0.0088]  0.0055  Molecules/Cell 
k _{ dIFN }  Decay rate of the IFN pathway  [ 0.0232,0.0926]  0.0492  min^{1} 
k _{ I S G F3}  Constant for ISGF3 production  [ 0.00012,0.00048]  0.0003  min^{1} 
k _{ d I S G F3}  Decay rate of the ISGF3  [ 0.00068,0.00272]  0.0017  min^{1} 
k _{ on }  Promoter activation  [ 184.55,738.2]  522.59  min^{1} 
k _{ a I3}  Constant of promoter activation by IFN  [ 7681.6,30727]  22687.02  Molecules/Cell 
k _{ a I7}  Constant of promoter activation by IRF7  [ 13399,53597]  35281.99  Molecules/Cell 
k _{ off }  Promoter inactivation  [ 0.00044,0.00176]  0.0013  min^{1} 
k _{ Active }  IRF7 transcription rate by active promoter  [ 0.5402,2.161]  1.144  min^{1} 
^{ a } k _{ Basal }  IRF7 basal transcription rate  [ 0.0312,0.125]  0.0861  min^{1} 
^{ b } k _{ dmRNA }  Decay rate of mRNA  [ 0.029,0.116]  0.0715  min^{1} 
^{ c } k _{ I R F7}  Translation rate of IRF7  [ 14,56]  43.867  min^{1} 
k _{ d I R F7}  Rate of IRF7 phosphorylation  [ 1.540,6.160]  3.877  min^{1} 
k _{ I R F7d i m e r }  Rate of IRF7 dimerization  [ 0.235,0.94]  0.602  (Cell/Molecules)/min 
k _{ d I R F7d i m e r }  Decay rate of IRF7 dimers  [ 0.209,0.836]  0.439  min^{1} 
φ  Scaling factor    1  Cell/Molecules 
Using the random search strategy, we tested 10000 parameter sets from which less than the 1% passed the deterministic precondition and were stochastically evaluated. The reduction of parameter values allowed us to efficiently find a set of parameter values that reproduced the experimental data (the fitting for the random search algorithm is given in Additional file 3). A complete analysis of the performance of the use of the deterministic precondition with the random search algorithm is given in Additional file 4. To test the reproducibility of the estimated parameter values, the method was repeated 100 times obtaining welldefined parameter distributions that show the predominant values, see Additional file 5.
IRF7 temporal dynamics
System’s initial conditions
Variable  Initial condition (molecules/cell) 

IFN  150 
I S G F3  1 
Pa  0 
mRNA  1 
I R F7  1 
I R F7p h o s p  0 
I R F7d i m e r  0 
System’s steady states
Variable  1^{st} sss (eigenvalue)  2^{nd} sss (eigenvalue) 

IFN  0 (15.95)  124.70 (0.0017) 
I S G F3  0 (3.87)  22.00 (0.049) 
Pa  0 (0.44)  0.99 (0.072) 
mRNA  1.20 (0.071)  15.93 (0.32) 
I R F7  13.62 (0.049)  180.32 (0.44) 
I R F7p h o s p  6.62 (0.0017)  24.09 (3.87) 
I R F7d i m e r  60.19 (0.0013)  796.69 (58.03) 
I R F7t o t a l  80.43  954.01 
Discussion
The promise of systems biology is to achieve a quantitative understanding of the molecular processes in the cell with the aid of computational models. However, a bottleneck is the availability of reliable parameter values needed in those models. Often, it is very difficult or even impossible to measure all of these parameters. For deterministic models, this problem has been well tackled by the development of efficient methods for parameter estimation. Contrarily, for stochastic dynamics the landscape of methods for parameter estimation is poorer and still under development. For this reason, innovative and efficient algorithms are needed to fit and validate realistic stochastic models.
During the last years important advances have been achieved in the field of parameter inference for stochastic models. On one hand, strategies that involve mathematical procedures of moment estimation have been suggested [17–19]. On the other, Bayesian methods that test multiple parameters values to find approximated solutions that represent the data have been successfully implemented [20]. Currently, the main problems observed in most methods include the computational cost, the accuracy of the obtained solution, and its potential to be implemented in large and nonlinear systems. Different strategies have been suggested to improve the accuracy and alleviate the computational performance [21, 22].
In our method, we tackled the computational cost by introducing a deterministic precondition that works as a bypass in the algorithm avoiding large amounts of unnecessary stochastic simulations. The concept of reducing the parameter space by introducing a precondition defined by the experimental data has been suggested previously by Hori et al. [21]. In their method, a small order linear model is used to optimize the experimental data by finding the root of a Lyapunov equation. In contrast, we use a deterministic precondition. Both approaches have advantages and disadvantages. Thus, Hori’s method is constrained by the need to find the Lyapunov equation. Our method is based on steady state calculation and Monte Carlo simulations that are standard and wellknown methodologies used in systems biology.
Recently a new algorithm developed by Lillacci et al., has been shown to significantly reduce the computational cost and achieve a high accuracy fit for stochastic models and flow cytometry data [22]. This method uses the Kolmogorov distance as a metric to calculate the difference between model simulation and experimental data. By choosing this metric, it is possible to estimate the minimal number of simulations needed to compare experimental and model simulations under a certain tolerance value. This estimated value decreases as the number of experimental data increases. In typical flow cytometry experiments the number of measured cells is in the order of tens of thousands and this number in Lillacci’s Algorithm is translated in a reduction of at least one order of magnitude in the number of required stochastic simulations. This algorithm has been applied to a model with 18 reactions and 20 free parameters obtaining a good fit to flow cytometry experimental data that reproduces bimodal dynamics. Comparing our method with Lillacci’s algorithm it is important to point out that both methods tackle the computational cost issue in two fundamentally different ways. Lillacci’s algorithm minimizes the number of needed stochastic simulations, whereas our algorithm minimizes the number of parameters sets evaluated with stochastic dynamics. A powerful new algorithm may be the result of combining both methodologies.
It is well known from nonlinear dynamics that the model architecture and parameter values determine the behavior of the system. Hence, more complex stable dynamics such as limit cycles, and highermultistability (a system with more than two stable steady states) can be obtained. Unstable systems are usually not of interest in the biological context. Our method was designed and tested for monostable and bistable systems, all other cases being rejected by the deterministic precondition. The case of limit cycles can be better approached by existing parameter estimation methods for single timecourse data [12–16]. The case of systems with highermultistability can in principle be treated by our method, but the application is limited by the costly need to find multiple stable steadystates in the system. Finally, it is relevant to mention that comparing stable states of the ODE model with modes of the measured flow cytometry distribution in some models can present some inherent problems. Bimodal fluorescence distributions may have no relation to bistability, e.g. when the systems has large differences in transitions rates for the promoter states [44].
Our method was implemented using two wellestablished optimization strategies, random search and genetic algorithms. Using a random search algorithm we observed that less than 1% of the population of parameters are subject to stochastic evaluations, this reduction in the tested parameters allowed us to explore a larger proportion of the parameter space, which is especially relevant for systems with multiple unknown parameters and no initial parameter guesses. On the other hand, using genetic algorithms we achieved a convergence to a minimal value in the optimization function after a few generations (see Fig. 5). Additionally, during each generation in the genetic algorithm a large percentage of the total evaluated parameters is rejected by the deterministic precondition. This percentage is dependent on the parameter of the algorithm (rate of elitism and mutation rate). In both strategies the introduction of the deterministic precondition significantly improved the parameter estimation process. Moreover, the implementation of deterministic precondition is not restricted to random search and genetic algorithms, it can also be implemented in other optimization algorithms, e.g. particle swarm or simulated annealing. A complete analysis of the performance of using the deterministic condition is given in Additional file 4.
The accuracy of the method is given by the agreement between experimental data and simulations. As can be observed in Fig. 6 a good fit was found, albeit not a perfect one. We observed that even increasing the number of tested parameter values during the random search and/or increasing the number of generations during the genetic algorithm did not improve the fit. For this reason, we consider that the differences between the model and the experimental data might be explained by the fact that our system is a highly simplified model that was built by lumping some steps in the biological system. Our team is working to fit more complex models in a future publication. To test the reproducibility of the obtained parameter values, the method was repeated 100 times and distributions of the obtained parameters were built. As can be observed in Additional file 5, when a parameter is identifiable this is reflected in a narrow distribution, on the other hand, when a parameter is nonidentifiable this is reflected in a wide distribution. In the example given by reactions (11) to (23) we observed that most parameters are not identifiable. Thus, nonidentifiability is expected in the parameters contained in reactions (11) and (20). Those reactions have opposite effects on the IFN dynamics. For this reason, multiple combinations of values in parameters V _{ IFN }, k _{ IFN } and k _{ dIFN } have similar effects on the overall system dynamics. Contrarily, the parameters involved in the production of the mRNA (k _{ Active } and k _{ Basal }) are better confined.
Rand et al. described bimodal gene expression of IRF7 after IFN stimulation. This phenomenon was explained at the cell population level by the effects of the IFN paracrine response [30]. Here we observed that bimodality also can be explained in absence of paracrine response and only taking into account the molecular mechanism responsible for the IRF7 promoter dynamics. Bistability in geneexpression circuits is commonly associated with the switching between active/basal states in the gene promoter. In most cases, DNA cooperativity in the promoter is the basis of promoterstate switching [45]. However, for typeI IFN responses a promoter activation in a cooperative manner has recently been discarded [46]. Taking the recent literature into account, we developed a model on the basis of a positive feedback loop circuit comprising the independent activation of ISRE and IRFE elements by one ISGF3 molecule and one IRF7 dimer, respectively (see Fig. 4). This model has the needed and sufficient elements to sustain a bistable system (that is a positive feedback loop and nonlinear dynamics in the reaction rates) [47, 48]. Additionally, multiple model structures testing different biological scenarios including: additional feedback loops, additional intermediate elements, and promoter cooperative activation were tested, obtaining that the presented model reproduces the experimental data best. Our simulation results agree with the experimental data obtained from flow cytometry [30]. Additional characteristics observed in the experimental data are also reproduced by the model, such as the basal expression of IRF7 without IFN stimulation [29], see Additional file 7. The number of molecules in the active state of the system (given by the obtained 2 ^{ n d } steady state, see Table 4) agrees with the order of magnitude reported for mammal mRNAs (average = 17 Molecules/Cell with a range between 1 to 200 Molecules/Cell) and the range of protein concentration (average = 50,000 Molecules/Cell with a range between 100 to 10^{8} Molecules/Cell) [49].
Conclusion
Here, we present a method to fit stochastic models to experimental data. The method is based on the comparison of distributions. The central idea of the method is to use a deterministic precondition that is defined by the experimental data as a filter avoiding large amounts of costly stochastic simulations. Using this idea, the number of parameters evaluated under stochastic dynamics is reduced, resulting in a significant improvement in the performance of the algorithm. As a case study, we used a model of IRF7 gene expression investigating the origin of bimodality in its dynamics upon IFN stimulation. Our results allowed us to conclude that a circuit with IRF7 promoter activation by one IRF7 dimer and one ISGF3 molecule is sufficient to explain the observed bimodal dynamics.
Abbreviations
 au:

Arbitrary units
 BAC:

Bacterial artificial chromosome
 DP:

Deterministic precondition
 GA:

Genetic algorithm
 IFN:

Interferon
 ISG:

Interferon stimulated genes
 ODE:

Ordinary differential equation
 PDF:

Probability density function
 RS:

Random search
 sss:

Stable steady state
Declarations
Acknowledgements
We gratefully thank Dr. Frank T. Bergmann for providing a modified version of Copasi that is used as part of this project. In the same way, we express our sincere thanks to Dr. Sven Sahle for his helpful comments helping us to increase the algorithm performance by suggesting changes in the data acquisition and analysis. We sincerely thank Dr. Ulfert Rand and Dr. Mario Köster for providing the experimental data.
Funding
This research was supported by Consejo Nacional de Ciencia y Tecnología (CONACYT) under grant 263795 in the program “Estancias postdoctorales al extranjero para la consolidación de grupos de investigación convocatoria 2015” to LUA. CZ was supported by BIOMS. The project was also supported by BMBF (Immunoquant).
Availability of data and materials
The data sets supporting the result of the article are included within the article and in additional files. The source code of the project can be accessed via: https://sourceforge.net/projects/irf7bimodaldynamics/. The models can be accessed via: https://www.ebi.ac.uk/biomodelsmain/
Authors’ contributions
LUA, UK and CZ conceived and designed the research. LUA wrote the manuscript, designed and performed the simulations. UK suggested extensions and modifications of the research. UK and CZ participated in discussions and revised the manuscript critically. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Gillespie DT. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys. 1976; 22(4):403–34. doi:10.1016/00219991(76)900413.
 Adrews SS, Dinh T, Arkin AP. Stochastic models of biological processes. Encycl Complex Syst Sci. 2009:8730–49. doi:10.1007/9780387304403_524.
 Raj A, van Oudenaarden A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell. 2008; 135(2):216–26. doi:10.1016/j.cell.2008.09.050.
 Golding I, Paulsson J, Zawilski SM, Cox EC. Realtime kinetics of gene activity in individual bacteria. Cell. 2005; 123(6):1025–36. doi:10.1016/j.cell.2005.09.031.
 Weinberger LS, Burnett JC, Toettcher JE, Arkin AP, Schaffer DV. Stochastic gene expression in a lentiviral positivefeedback loop: HIV1 Tat fluctuations drive phenotypic diversity,. Cell. 2005; 122(2):169–82. doi:10.1016/j.cell.2005.06.006.
 Samoilov M, Plyasunov S, Arkin AP. Stochastic amplification and signaling in enzymatic futile cycles through noiseinduced bistability with oscillations,. Proc Natl Acad Sci U S A. 2005; 102(7):2310–5. doi:10.1073/pnas.0406841102.
 Moles CG, Mendes P, Banga JR. Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 2003:2467–74. doi:10.1101/gr.1262503..
 RodriguezFernandez M, Egea JA, Banga JR. Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinformatics. 2006:483. doi:10.1186/147121057483.
 Bock HG, Kostina E, Schlöder JP. Numerical Methods for Parameter Estimation in Nonlinear Differential Algebraic Equations. GAMMMitteilungen. 2007; 30(2):376–408. doi:10.1002/gamm.200790024.
 Marquardt DW. An algorithm for leastsquares estimation of nonlinear parameters. J Soc Ind Appl Math. 1963; 11(2):431–41. doi:10.1137/0111030.
 Zimmer C. Reconstructing the hidden states in time course data of stochastic models. Math Biosci. 2015; 269:117–29. doi:10.1016/j.mbs.2015.08.015.
 Zimmer C, Sahle S. Deterministic inference for stochastic systems using multiple shooting and a linear noise approximation for the transition probabilities. IET Syst Biol. 2015:1–12. doi:10.1049/ietsyb.2014.0020.
 Komorowski M, Costa MJ, Rand DA, Stumpf MPH. Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proc Natl Acad Sci. 2011; 108(21):8645–50. doi:10.1073/pnas.1015814108.
 Golightly A, Wilkinson DJ. Bayesian inference for stochastic kinetic models using a diffusion approximation. Biometrics. 2005; 61(3):781–8. doi:10.1111/j.15410420.2005.00345.x.
 Tian T, Xu S, Gao J, Burrage K. Simulated maximum likelihood method for estimating kinetic rates in gene expression. Bioinformatics. 2007; 23(1):84–91. doi:10.1093/bioinformatics/btl552.
 Boys RJ, Wilkinson DJ, Kirkwood TBL. Bayesian inference for a discretely observed stochastic kinetic model. Stat Comput. 2008; 18(2):125–35. doi:10.1007/s112220079043x.
 Zechner C, Ruess J, Krenn P, Pelet S, Peter M, Lygeros J, Koeppl H. Momentbased inference predicts bimodality in transient gene expression. Proc Natl Acad Sci. 2012; 109(21):8340–345. doi:10.1073/pnas.1200161109.
 Kügler P. Moment fitting for parameter inference in repeatedly and partially observed stochastic biological models. PLoS ONE. 2012; 7(8):1–15. doi:10.1371/journal.pone.0043001.
 MillerJensen K, Skupsky R, Shah PS, Arkin AP, Schaffer DV. Genetic selection for contextdependent stochastic phenotypes: Sp1 and TATA mutations increase phenotypic noise in HIV1 gene expression. PLoS Comput Biol. 2013; 9(7). doi:10.1371/journal.pcbi.1003135.
 Poovathingal SK, Gunawan R. Global parameter estimation methods for stochastic biochemical systems,. BMC Bioinformatics. 2010; 11:414. doi:10.1186/1471210511414.
 Hori Y, Khammash MH, Hara S. Efficient parameter identification for stochastic biochemical networks using a reducedorder realization. In: Proceedings of the European Control Conference 2013. Zurich: European Control Conference, ECC: 2013. p. 4154–49.Google Scholar
 Lillacci G, Khammash M. The signal within the noise: efficient inference of stochastic gene regulation models using fluorescence histograms and stochastic simulations. Bioinformatics. 2013; 29(18):2311–9. doi:10.1093/bioinformatics/btt380.
 Ivashkiv LB, Donlin LT. Regulation of type I interferon responses,. Nat Rev Immunol. 2014; 14(1):36–49. doi:10.1038/nri3581.
 Hu J, Sealfon SC, Hayot F, Jayaprakash C, Kumar M, Pendleton AC, Ganee A, FernandezSesma A, Moran TM, Wetmur JG. Chromosomespecific and noisy IFNB1 transcription in individual virusinfected human primary dendritic cells. Nucleic acids Res. 2007; 35(15):5232–41. doi:10.1093/nar/gkm557.
 Zhao M, Zhang J, Phatnani H, Scheu S, Maniatis T. Stochastic expression of the interferon β gene. PLoS Biol. 2012; 10(1):1001249. doi:10.1371/journal.pbio.1001249.
 Levin D, Harari D, Schreiber G. Stochastic receptor expression determines cell fate upon interferon treatment. Mol Cell Biol. 2011; 31(16):3252–66. doi:10.1128/MCB.0525111.
 Schneider WM, Chevillotte MD, Rice CM. Interferonstimulated genes: a complex web of host defenses. Annu Rev Immunol. 2014; 32:513–45. doi:10.1146/annurevimmunol032713120231.
 Vasquez K, Sigrist K, Kucherlapati R, Demant P, Dietrich WF, Agoulnik S, Plus S. IRF7 is the master regulator of. Nature. 2005; 434(April):772–7. doi:10.1038/nature03419.1..
 Honda K, Taniguchi T. IRFs: master regulators of signalling by Tolllike receptors and cytosolic patternrecognition receptors. Nat Rev Immunol. 2006; 6(9):644–58. doi:10.1038/nri1900.
 Rand U, Rinas M, Schwerk J, Nöhren G, Linnes M, Kröger A, Flossdorf M, KályKullai K, Hauser H, Höfer T, Köster M. Multilayered stochasticity and paracrine signal propagation shape the typeI interferon response. Mol Syst Biol. 2012; 584:584. doi:10.1038/msb.2012.17.
 MATLAB version (R2015a); 2015.
 Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, Singhal M, Xu L, Mendes P, Kummer U. COPASI–a COmplex PAthway SImulator. Bioinformatics (Oxford, England). 2006; 22(24):3067–74. doi:10.1093/bioinformatics/btl485.
 Gibson MA, Bruck J. Efficient exact stochastic simulation of chemical systems with many species and many channels. J Phys Chem A. 2000; 104(9):1876–89. doi:10.1021/jp993732q.
 Balkay L. FCS data reader. MATLAB Central File Exchange. 2006.
 Yoder N. peakfinder. MATLAB Central File Exchange. 2009.
 Gillespie DT. The chemical langevin equation. J Chem Phys. 2000; 113(1):297–306. doi:10.1063/1.481811.
 Gillespie DT. Stochastic simulation of chemical kinetics. Annu Rev Phys Chem. 2007; 58:35–55. doi:10.1146/annurev.physchem.58.032806.104637.
 Strogatz SH. Nonlinear dynamics and chaos with applications to physics, biology, chemistry, and engineering. New York: Westview Press; 2000. doi:10.1137/1037077.
 Brownlee J. Clever algorithms natureinspired programming recipes. Melbourne: Lulu Enterprises; 2011.Google Scholar
 Legewie S, Herzel H, Westerhoff HV, Blüthgen N. Recurrent design patterns in the feedback regulation of the mammalian signalling network. Mol Syst Biol. 2008; 4(1):190. doi:10.1038/msb.2008.29.
 Ning S, Huye LE, Pagano JS. Regulation of the transcriptional activity of the IRF7 promoter by a pathway independent of interferon signaling. J Biol Chem. 2005; 280(13):12262–70. doi:10.1074/jbc.M404260200.
 Litvak V, Ratushny AV, Lampano AE, Schmitz F, Huang AC, Raman A, Rust AG, Bergthaler A, Aitchison JD, Aderem A. A FOXO3IRF7 gene regulatory circuit limits inflammatory sequelae of antiviral responses. Nature. 2012; 490(7420):421–5. doi:10.1038/nature11428.
 Kaufmann DE, Walker BD. Treatment interruption to boost specific HIV immunity in acute infection. Curr Opin HIV AIDS. 2007; 2(1):21–5.View ArticlePubMedGoogle Scholar
 Dobrzyński M, Nguyen LK, Birtwistle MR, von Kriegsheim A, Fernández AB, Cheong A, Kolch W, Kholodenko BN. Nonlinear signalling networks and celltocell variability transform external signals into broadly distributed or bimodal responses. J R Soc Interface. 2014; 11(98):20140383. doi:10.1098/rsif.2014.0383.
 Kaern M, Elston TC, Blake WJ, Collins JJ. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet. 2005; 6(6):451–64. doi:10.1038/nrg1615.
 Begitt A, Droescher M, Meyer T, Schmid CD, Baker M, Antunes F, Knobeloch KP, Owen MR, Naumann R, Decker T, Vinkemeier U. STAT1cooperative DNA binding distinguishes type 1 from type 2 interferon signaling. Nat Immunol. 2014; 15(2):168–76. doi:10.1038/ni.2794.
 Cherry JL, Adler FR. How to make a biological switch. J Theor Biol. 2000; 203(2):117–33. doi:10.1006/jtbi.2000.1068.
 Craciun G, Tang Y, Feinberg M. Understanding bistability in complex enzymedriven reaction networks. Proc Natl Acad Sci. 2006; 103(23):8697–702. doi:10.1073/pnas.0602767103.
 Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M. Global quantification of mammalian gene expression control. Nature. 2011; 473(7347):337–42. doi:10.1038/nature10098.
 Yang E, van Nimwegen E, Zavolan M, Rajewsky N, Schroeder M, Magnasco M, Darnell JE. Decay rates of human mrnas: correlation with functional characteristics and sequence attributes. Genome Res. 2003; 13(8):1863–72. doi:10.1101/gr.1272403.