Optimal experiment design for model selection in biochemical networks
 Joep Vanlier^{1, 2},
 Christian A Tiemann^{1, 2},
 Peter AJ Hilbers^{1, 2} and
 Natal AW van Riel^{1, 2}
DOI: 10.1186/1752-0509-8-20
© Vanlier et al.; licensee BioMed Central Ltd. 2014
Received: 11 April 2013
Accepted: 13 February 2014
Published: 20 February 2014
Abstract
Background
Mathematical modeling is often used to formalize hypotheses on how a biochemical network operates by discriminating between competing models. Bayesian model selection offers a way to determine the amount of evidence that data provides to support one model over the other while favoring simple models. In practice, the amount of experimental data is often insufficient to make a clear distinction between competing models. Often one would like to perform a new experiment which would discriminate between competing hypotheses.
Results
We developed a novel method to perform Optimal Experiment Design to predict which experiments would most effectively allow model selection. A Bayesian approach is applied to infer model parameter distributions. These distributions are sampled and used to simulate from multivariate predictive densities. The method is based on a k-Nearest Neighbor estimate of the Jensen-Shannon divergence between the multivariate predictive densities of competing models.
Conclusions
We show that the method successfully uses predictive differences to enable model selection by applying it to several test cases. Because the design criterion is based on predictive distributions, which can be computed for a wide range of model quantities, the approach is very flexible. The method reveals specific combinations of experiments which improve discriminability even in cases where data is scarce. The proposed approach can be used in conjunction with existing Bayesian methodologies where (approximate) posteriors have been determined, making use of relations that exist within the inferred posteriors.
Keywords
Model selection, Inference, Bayes factor, Uncertainty
Background
Previously, predictive distributions have been used to perform experiment design targeted at reducing the uncertainty of specific predictions [29–31]. In the field of machine learning, optimal experiment design based on information-theoretic considerations is typically referred to as active learning [32]. In the neurosciences, the Bayesian inversion and selection of nonlinear state space models is known as dynamic causal modelling (DCM). Although DCM is dominated by variational (approximate) Bayesian model inversion, the basic problems and ensuing model selection issues are identical to the issues considered in this work. In DCM, the issue of optimising experimental design focuses on the Laplace-Chernoff risk for model selection and its relationship with classical design optimality criteria. Daunizeau et al. (2011) consider the problem of detecting feedback connections in neuronal networks and how this depends upon the duration of design stimulation [33]. We will consider a similar problem in biochemical networks, in terms of identifying molecular interactions and when to sample data. We present a method to use samples from simulated predictive distributions for selecting experiments useful for model selection. Considering the increased use of Bayesian inference in the field [14, 34–39], this approach is particularly timely since it enables investigators to extract additional information from their inferences.
There are many design parameters that one could optimize. In this paper, we focus on a simple example: namely, which system variable should be measured and at which time point. We argue that by measuring those time points at which the models show the largest difference in their predictive distributions, large improvements in the Bayes factors can be obtained. By applying the methodology to an analytically tractable model, we show that the Jensen-Shannon divergence (JSD) is nearly monotonically related to the predicted change in Bayes factor. Subsequently, the JSD is computed between predictions of a nonlinear biochemical network. Since each model implies different relations between the predictive distributions, certain combinations of predictions lead to more discriminability than others. The method serves as a good predictor for effective experiments when compared to the obtained Bayes factors after the measurements have been performed. The approach can be used to design multiple experiments simultaneously, revealing benefits that arise from combinations of experiments.
Methods
Consider biochemical networks that can be modeled using a system of ordinary differential equations. These models comprise equations $f\left(\overrightarrow{x}(t),\overrightarrow{u}(t),\overrightarrow{p}\right)$ which contain parameters $\overrightarrow{p}$ (constant in time), inputs $\overrightarrow{u}\left(t\right)$ and state variables $\overrightarrow{x}\left(t\right)$. Given a set of parameters, inputs, and initial conditions $\overrightarrow{x}\left(0\right)$, these equations can be simulated. Measurements $\overrightarrow{y}\left(t\right)=g\left(\overrightarrow{x}(t),\overrightarrow{q},\overrightarrow{r}\right)$ are performed on a subset and/or a combination of the total number of state variables in the model. Measurements are hampered by measurement noise $\overrightarrow{\epsilon}$, while many techniques used in biology necessitate the use of scaling and offset parameters $\overrightarrow{r}$ [45]. The vector $\overrightarrow{\theta}$, defined as $\overrightarrow{\theta}=\{\overrightarrow{p},\overrightarrow{r},{\overrightarrow{x}}_{0}\}$, lists all the required quantities to simulate the model. The parameters $\overrightarrow{q}$ determine the experimental design and could include differences in when various responses are measured or the mapping from hidden model states $\overrightarrow{x}$ to observed responses $\overrightarrow{y}$. We will refer to these as ‘design parameters’ that are, crucially, distinguished from model parameters $\overrightarrow{\theta}$. Design parameters are under experimental control and determine the experimental design. In what follows, we try to optimise the discriminability of models in terms of Bayesian model comparison by optimizing an objective function with respect to $\overrightarrow{q}$. In the examples, we will consider $\overrightarrow{q}$ as the timing of extra observations.
where $M_{i}$ indicates a model and $y^{D}$ the observed data. The parameters are given by $\overrightarrow{\theta}$, while ${y}_{k}^{D}\left({t}_{j}\right)$ indicates the value of a data point of state variable k at time point j.
Predictive Distributions
where $p\left({y}^{D}\mid\overrightarrow{\theta}\right)$ is the distribution of the observed data given parameters $\overrightarrow{\theta}$. The parameter prior distributions $p\left(\overrightarrow{\theta}\right)$ typically reflect the prior uncertainty associated with the parameters. To sample from the posterior parameter distribution, one needs to verify that the posterior distribution is proper. This can be checked by profiling the different parameters and verifying that the likelihood times the prior does not flatten out [28, 46]. Once the posterior distribution of parameters is known to be proper, a sample from this distribution can be obtained using Markov Chain Monte Carlo (MCMC) [22, 28]. MCMC can generate samples from probability distributions whose probability densities are known up to a normalizing factor [47] (see Additional file 1: section S1). A sample of the posterior parameter distribution reflects the uncertainty associated with the parameter values and can subsequently be used to simulate different predictions. The predictive distribution can now be sampled by simulating the model for each of the samples in the posterior parameter distribution and adding noise generated by the associated error model. The latter is required as future observations will also be affected by noise.
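The sampling step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the exponential-decay function `simulate` stands in for an ODE model, the posterior sample is generated artificially instead of by MCMC, and the additive Gaussian noise model with standard deviation `sigma` is an assumption made for the example.

```python
import math
import random

def simulate(theta, t):
    """Hypothetical stand-in for an ODE model: y(t) = a * exp(-k * t)."""
    a, k = theta
    return a * math.exp(-k * t)

def sample_ppd(posterior_samples, t, rng):
    """Draw one predictive value per posterior sample at time t.

    Each posterior sample is a tuple (a, k, sigma), where sigma is the
    standard deviation of the assumed additive Gaussian error model.
    """
    draws = []
    for a, k, sigma in posterior_samples:
        noiseless = simulate((a, k), t)                   # model prediction
        draws.append(noiseless + rng.gauss(0.0, sigma))   # add measurement noise
    return draws

rng = random.Random(0)
# Artificial posterior sample; in practice this would come from MCMC.
posterior = [(rng.gauss(1.0, 0.05), rng.gauss(0.3, 0.02), 0.03)
             for _ in range(2000)]
ppd_t2 = sample_ppd(posterior, 2.0, rng)
ppd_mean = sum(ppd_t2) / len(ppd_t2)
```

Repeating the call for several time points or observables, using the same posterior samples, yields the joint (multivariate) predictive sample on which a design criterion can be evaluated.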
Model selection
where $M_{1}$ and $M_{2}$ refer to the different models under consideration. Unnecessarily complex models are implicitly penalized because they result in a lower weighting of the high likelihood region, which results in a lower value for the integrated likelihood. This is illustrated in Figure 2. This means that maximizing the model evidence corresponds to maintaining an accurate explanation for the data while minimizing complexity [50].
Bounds can be defined where the Bayes factor value becomes decisive for one model over the other. Typically, a ratio of 100:1 is considered decisive in terms of model selection [40, 51]. In dynamic causal modelling, variational methods predominate, usually under the Laplace assumption. This assumes that the posterior density has a Gaussian shape, which greatly simplifies both the integration problems and numerics. Note that assuming a Gaussian posterior over the parameters does not necessarily mean that the posterior predictive distribution over the data is Gaussian (see Figure 1). Computing the required marginal likelihoods is challenging for nonlinear problems where such asymptotic approximations to the posterior distribution are not appropriate. Here one must resort to more advanced methods such as thermodynamic integration (see Additional file 1: section S2) [52] or annealed importance sampling [40]. Though the Bayes factor is a useful method of model selection, determining what to measure in order to improve the Bayes factor in favor of the correct model is a nontrivial problem. As such, it provides a means to perform model selection, but not the optimal selection of data features.
Experiment design
Here P and Q are random variables with p and q their associated densities. Because only a sample of the posterior predictive distributions (PPDs) is available, a density estimate suitable for integration is required. Density estimation can be approached in two ways: by Kernel Density Estimation (KDE), or by k-Nearest Neighbor (kNN) density estimation. In KDE, an estimate of the density is made by centering normalized kernels on each sample and computing weighted averages. This results in a density estimate with which computations can be performed. The kernels typically contain a bandwidth parameter which is estimated by means of cross-validation [54, 55].
For well-behaved, low-dimensional distributions, KDE often performs well. Considering the strongly nonlinear nature of both the parameter and predictive distributions, a Gaussian kernel with constant covariance is not appropriate. As the dimensionality and nonuniformity of the problem increases, more and more weights in the KDE become small and estimation accuracy is negatively affected [56]. Additionally, choosing an appropriate bandwidth by means of cross-validation is a computationally expensive procedure to perform for each experimental candidate.
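To make the kNN alternative concrete, the following sketch estimates the JSD between two samples with a plain kNN density estimate, $\hat{p}(x)=k/(N\,V_d\,r_k(x)^d)$, where $r_k(x)$ is the distance to the k-th nearest neighbor and $V_d$ the volume of the unit ball. It is an illustrative plug-in estimator under stated assumptions (equal mixture weights, Euclidean distance, brute-force neighbor search, result in nats) and is not necessarily the exact estimator used in the paper; the function names are hypothetical.

```python
import math
import random

def log_ball_volume(dim, radius):
    """Log volume of a dim-dimensional Euclidean ball of the given radius."""
    return ((dim / 2.0) * math.log(math.pi)
            - math.lgamma(dim / 2.0 + 1.0) + dim * math.log(radius))

def knn_log_density(x, sample, k, skip_self=False):
    """Log kNN density estimate at x from a list of points (tuples)."""
    dists = sorted(math.dist(x, s) for s in sample)
    if skip_self:
        dists = dists[1:]            # x belongs to sample: drop its zero distance
    r_k = max(dists[k - 1], 1e-300)  # guard against log(0) for duplicates
    return math.log(k) - math.log(len(dists)) - log_ball_volume(len(x), r_k)

def jsd_knn(p_sample, q_sample, k=10):
    """Plug-in JSD estimate (nats) with the mixture m = (p + q) / 2."""
    def half(own, other):
        total = 0.0
        for x in own:
            log_own = knn_log_density(x, own, k, skip_self=True)
            log_other = knn_log_density(x, other, k)
            top = max(log_own, log_other)
            log_mix = top + math.log(0.5 * (math.exp(log_own - top)
                                            + math.exp(log_other - top)))
            total += log_own - log_mix   # sample average of log(p / m)
        return total / len(own)
    return 0.5 * half(p_sample, q_sample) + 0.5 * half(q_sample, p_sample)

rng = random.Random(1)
def gauss2d(mean_x, n):
    return [(rng.gauss(mean_x, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

jsd_same = jsd_knn(gauss2d(0.0, 400), gauss2d(0.0, 400))   # same distribution
jsd_far = jsd_knn(gauss2d(0.0, 400), gauss2d(20.0, 400))   # well separated
```

Because the estimate is built from noisy density ratios, it can come out slightly negative when the two predictive distributions coincide, while it is bounded above by ln 2 for well separated distributions.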
Testing the method: numerical experiments
To demonstrate the method, a series of simulation studies are performed. Since we know which model generated the data, it is possible to compare to the Bayes factor pointing to the true model. After generating an initial data set using the true model, PPDs are sampled for each of the competing models. As the design variable, we consider the timing of a new measurement. Hence, we look at differences between the predictive distributions belonging to the different models at different timepoints. We use a sample of simulated observables at specific timepoints to compute JSD estimates between the different models. We thereby test whether the JSD estimate can be used to compare different potential experiments. The new experimental data is subsequently included and the JSD compared to the change in Bayes factor in favor of the correct model. Note that this new Bayes factor depends on the experimental outcome and that this approach results in a distribution of predicted Bayes factors. A large change in Bayes factor indicates a useful experiment.
Analytic models
where $\overrightarrow{\theta}$ represents a parameter vector and B constitutes a design matrix with basis functions $B_{i}(t)$. Given that σ, the standard deviation of the Gaussian observation error ε, is known and the prior distribution over the parameters is a Gaussian with standard deviation ξ, the mean and covariance matrix of the posterior distribution can be computed analytically (see Additional file 1: section S4). Using linear models avoids the difficult numerical integration commonly required to compute the Bayes factor and makes it possible to perform overarching Monte Carlo studies on how these Bayes factors adjust upon including new experimental data. The analytical expressions make it possible to compare the JSD to distributions of the actual Bayes factors for model selection.
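As an illustration of why the linear case is tractable, the sketch below works out the special case of a single basis function, $y=\theta\,b(t)+\epsilon$, with a zero-mean Gaussian prior on $\theta$ (the zero prior mean is an assumption; the text only specifies the standard deviation ξ). The marginal distribution of the data is then Gaussian with covariance $\sigma^{2}I+\xi^{2}bb^{T}$, whose determinant and inverse follow from the matrix determinant lemma and the Sherman-Morrison formula. The data values and the two basis functions are made up for the example.

```python
import math

def posterior(y, b, sigma, xi):
    """Posterior mean and variance of theta for y = theta * b + noise."""
    precision = 1.0 / xi ** 2 + sum(bi * bi for bi in b) / sigma ** 2
    mean = sum(bi * yi for bi, yi in zip(b, y)) / sigma ** 2 / precision
    return mean, 1.0 / precision

def log_evidence(y, b, sigma, xi):
    """Log marginal likelihood: y ~ N(0, sigma^2 I + xi^2 b b^T)."""
    n = len(y)
    s2, x2 = sigma ** 2, xi ** 2
    bb = sum(bi * bi for bi in b)
    by = sum(bi * yi for bi, yi in zip(b, y))
    yy = sum(yi * yi for yi in y)
    log_det = n * math.log(s2) + math.log(1.0 + x2 * bb / s2)
    quad = (yy - x2 * by * by / (s2 + x2 * bb)) / s2
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

# Made-up data, roughly y = 0.8 * t with small deviations.
t = [0.5, 1.0, 1.5, 2.0]
y = [0.41, 0.79, 1.22, 1.58]
sigma, xi = 0.05, 1.0

basis_1 = t                        # hypothetical model 1: y = theta * t
basis_2 = [ti ** 2 for ti in t]    # hypothetical model 2: y = theta * t^2

log_bf_12 = log_evidence(y, basis_1, sigma, xi) - log_evidence(y, basis_2, sigma, xi)
decisive = log_bf_12 > math.log(100.0)   # the 100:1 threshold
post_mean_1, post_var_1 = posterior(y, basis_1, sigma, xi)
```

Since every quantity is available in closed form, the change in the log Bayes factor after adding a simulated measurement can be recomputed cheaply many times, which is what makes Monte Carlo studies of this kind feasible.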
Nonlinear biochemical networks
Each of the artificial models was able to describe the measured data to an acceptable degree. We used a Gamma distribution with α=1 and β=3 for the prior distributions of the parameters. This prior is relatively noninformative (allowing a large range of parameter values), while not being so vague that the simplest model is always preferred (Lindley’s paradox [40]). Data was obtained using $M_{1}$. Observables were Bp, of which three replicates were measured, and Dp, of which two replicates were measured. These were measured at t = [0, 2, 5, 10, 20, 40, 60, 100]. All replicates were simulated by adding Gaussian white noise with a standard deviation of 0.03. The parameter values corresponding to the true system were obtained by running Monte Carlo simulations until a visible overshoot above the noise level was observed. Parameter inference was performed using population MCMC with the noise σ as an inferred parameter. As design variables we consider both the choice of which observable(s) to measure and the time point(s) of the measurement.
Computational implementation
All algorithms were implemented in Matlab (Natick, MA). Numerical integration of the differential equations was performed with compiled MEX files using numerical integrators from the SUNDIALS CVode package (Lawrence Livermore National Laboratory, Livermore, CA). Absolute and relative tolerances were set to $10^{-8}$ and $10^{-9}$ respectively. MCMC was performed using a population MCMC approach using $N_{T}=40$ chains with a temperature schedule given by ${T}_{n}={\left(\frac{{N}_{T}}{n}\right)}^{4}$ [52]. This also permitted computation of the Bayes factors between the nonlinear models by means of thermodynamic integration. The Gaussian proposal distribution for the MCMC was based on an approximation to the Hessian computed using a Jacobian obtained by simulating the sensitivity equations. After convergence, the chain was thinned to 10000 samples. Since the number of experiments designed simultaneously (and therefore the dimensionality of the prediction vectors) was reasonably small ($N_{samples} \gg 2^{k}$), the kNN search was performed using k-d trees [60]. The figures in this paper were determined using k=10.
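The link between the temperature schedule and thermodynamic integration can be sketched as follows. The sketch assumes that each chain n samples the power posterior with inverse temperature $\beta_{n}=1/T_{n}=(n/N_{T})^{4}$, adds $\beta=0$ (the prior) as the lower integration limit, and integrates the expected log-likelihood over β with the trapezoidal rule. These are assumptions for illustration; the exact scheme used in the paper is described in its Additional file 1. To keep the example checkable, the per-chain averages are replaced by the analytic expectation of a conjugate toy model (θ ~ N(0, 1), y | θ ~ N(θ, 1)).

```python
import math

def inverse_temperatures(n_chains):
    """beta_n = 1 / T_n with T_n = (N_T / n)^4, preceded by beta = 0 (prior)."""
    return [0.0] + [(n / n_chains) ** 4 for n in range(1, n_chains + 1)]

def thermodynamic_integration(betas, mean_log_lik):
    """Trapezoidal estimate of log Z = integral_0^1 E_beta[log L] d(beta)."""
    total = 0.0
    for i in range(1, len(betas)):
        width = betas[i] - betas[i - 1]
        total += 0.5 * width * (mean_log_lik[i] + mean_log_lik[i - 1])
    return total

def expected_log_lik(beta, y):
    """Analytic E_beta[log L] for the toy model (replaces MCMC averages)."""
    post_var = 1.0 / (1.0 + beta)           # power-posterior variance
    post_mean = beta * y / (1.0 + beta)     # power-posterior mean
    return (-0.5 * math.log(2.0 * math.pi)
            - 0.5 * ((y - post_mean) ** 2 + post_var))

y_obs = 1.0
betas = inverse_temperatures(40)
log_z = thermodynamic_integration(
    betas, [expected_log_lik(beta, y_obs) for beta in betas])
# Exact log evidence of the toy model: y ~ N(0, 2).
log_z_exact = -0.5 * math.log(2.0 * math.pi * 2.0) - y_obs ** 2 / 4.0
```

With real models, `expected_log_lik` would be replaced by the average log-likelihood over the samples of each tempered chain; the power-law schedule concentrates chains near the prior, where this average typically changes fastest.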
Results and discussion
Analytic models
Nonlinear models

1. Steady state Cp and BpCp concentration

2. Bp and Dp during the peak in the second condition (u=2)

3. Steady state Cp
D_{12}    ΔB_{12}       D_{13}    ΔB_{13}       D_{14}    ΔB_{14}
0.03      0.06±0.19     0.49      0.32±0.39     0.05      −0.07±0.36
0.48      0.26±0.14     0.54      0.72±0.36     0.61      0.43±0.38
−0.06     −0.49±0.73    −0.01     −0.35±0.68    −0.04     0.32±0.54
Conclusions
This paper describes a method applicable to performing experiment design with the aim of differentiating between various hypotheses. We show by means of a simulation study on analytically tractable models that the JSD is approximately monotonically related to the expected change in Bayes factor in favor of the model that generated the data (considering the current uncertainty in its parameters). This monotonic relation is useful, because it implies that the JSD can be used as a predictor of the change in Bayes factor. The applicability to nonlinear models of biochemical reaction networks was demonstrated by applying it to models based on motifs previously observed in signaling networks [58, 59]. Experiments were designed for distinguishing between different feedback mechanisms.
Though forecasting a predictive distribution of Bayes factors has been suggested [61], the implicit penalization of model complexity could have adverse consequences. The experiment design could suggest a measurement where the probability densities of two models overlap. When this happens, both models can describe the data equally well, which leads to an implicit penalization of the more complex model (since it allows for more varied predictions due to its added freedom). This penalization can then be followed by subsequent selection (of the simpler model). Though a decisive selection occurs, such an experiment would not provide additional insight. In [61], this is mitigated by determining the evidence in favor of a more complex model. Moreover, computing the predictive distributions of Bayes factors required for this approach is computationally intractable for nonlinear models that are not nested. By focusing on differences in predictive distributions, both these problems are mitigated, making it possible to pinpoint where the different models predict different behavior. Aside from their usefulness in model selection, such predictive differences could also be attributed to the different mechanisms present in the different models. This allows for follow-up studies to investigate whether these are either artificial or true system behavior.
A complicating factor in this method is the computational burden. The largest challenge to overcome is to obtain a sample from the posterior parameter distribution. Running MCMC on high-dimensional problems can be difficult. Fortunately, recent advances in both MCMC [19, 62] as well as approximate sampling techniques [39, 48, 63, 64] allow sampling parameter distributions of increasingly complex models [14, 34–38]. The bottleneck in computing the JSD resides in searching for the $k^{th}$ nearest neighbor, a subproblem which occurs in many different problems and for which computationally faster solutions exist [65, 66]. An attractive aspect of this methodology is that it is possible to design multiple experiments at once. However, the density estimates typically become less accurate as the number of designed experiments increases (see Additional file 1: section S8). Therefore, we recommend starting with a low number of experiments (two or three) and gradually adding experiments while the JSD is low. Density estimation can also be problematic when the predictions vary greatly in their dispersion. When considering non-negative quantities such as concentrations, log-transforming the predictions may alleviate problems. Finally, the number of potential combinations of experiments increases exponentially with the number of experiments designed. It is clear that this rapidly becomes infeasible for large numbers of experiments. However, it is not necessary to fill the entire experimental matrix and techniques such as Sequential Monte Carlo sampling could be considered as an alternative to more effectively probe this space. We refer the reader to Additional file 1: section S7 for a proof-of-principle implementation of such a sampler.
One additional point of debate is the weighting of each of the models in the mixture distribution used to compute the JSD. It could be argued that it would be more sensible to weight models according to their model probabilities by determining the integrated likelihoods of the data that is already available. The reason for not doing this is twofold. Firstly, the computational burden this adds to the experimental design procedure is significant. More importantly however, the implicit weighting in favor of parsimony could strongly affect the design by removing models which are considered unnecessarily complex at this stage of the analysis. When designing new experiments, the aim is to obtain measurements that allow for optimal discrimination between the predictive distributions under the different hypotheses. Optimal discrimination makes it sensible to consider the models equally probable a priori.
The method has several advantages that are particularly useful for modeling biochemical networks. Because the method is based on sampling from the posterior parameter probability distribution, it is particularly suitable when insufficient data is available to consider Gaussian parameter probability distributions or model linearisations. Additionally, it allows incorporation of prior knowledge in the form of prior parameter probability distributions. This is useful when the available data contains insufficient constraints to result in a well defined posterior parameter distribution. Because the design criterion is based on predictive distributions and such distributions can be computed for a wide range of model quantities, the approach is very flexible. In biochemical research, in vivo measurements are often difficult to perform and practical limitations of the various measurement technologies play an important role. In many cases measurements on separate components cannot be performed and measurements result in indirect derived quantities. Fortunately, in the current framework such measurements can be used directly since distributions of such experiments can be predicted.
Moreover, the impact of specific combinations of experiments can be assessed by including them in the design simultaneously, which reveals specific combinations of measurements that are particularly useful. This way, informative experiments can be distinguished from noninformative ones and the experimental efforts can be targeted to discriminate between competing hypotheses.
Availability
Source code is available at: http://cbio.bmt.tue.nl/sysbio/software/pua.html.
Declarations
Acknowledgements
Funding
This work was funded by the Netherlands Consortium for Systems Biology (NCSB). The authors would like to thank H. W. H. van Roekel, R. M. Foster and C. Çölmekçi Öncü for helpful discussions.
Authors’ Affiliations
References
 Tiemann C, Vanlier J, Hilbers P, van Riel N: Parameter adaptations during phenotype transitions in progressive diseases. BMC Syst Biol. 2011, 5: 17410.1186/175205095174.PubMed CentralView ArticlePubMedGoogle Scholar
 van Riel NA, Tiemann CA, Vanlier J, Hilbers PA: Applications of analysis of dynamic adaptations in parameter trajectories. Interface Focus. 2013, 3 (2): 2012008410.1098/rsfs.2012.0084.PubMed CentralView ArticlePubMedGoogle Scholar
 Schmitz J, Van Riel N, Nicolay K, Hilbers P, Jeneson J: Silencing of glycolysis in muscle: experimental observation and numerical analysis. Exp Physiol. 2010, 95 (2): 380397. 10.1113/expphysiol.2009.049841.PubMed CentralView ArticlePubMedGoogle Scholar
 Schilling M, Maiwald T, Hengl S, Winter D, Kreutz C, Kolch W, Lehmann W, Timmer J, Klingmüller U: Theoretical and experimental analysis links isoformspecific ERK signalling to cell fate decisions. Mol Syst Biol. 2009, 5: 334PubMed CentralView ArticlePubMedGoogle Scholar
 Borisov N, Aksamitiene E, Kiyatkin A, Legewie S, Berkhout J, Maiwald T, Kaimachnikov N, Timmer J, Hoek J, Kholodenko B: Systemslevel interactions between insulin–EGF networks amplify mitogenic signaling. Mol Syst Biol. 2009, 5: 256PubMed CentralView ArticlePubMedGoogle Scholar
 Cedersund G, Roll J, Ulfhielm E, Danielsson A, Tidefelt H, Strålfors P: Modelbased hypothesis testing of key mechanisms in initial phase of insulin signaling. PLoS Comput Biol. 2008, 4 (6): 799806.View ArticleGoogle Scholar
 Koschorreck M, Gilles E: Mathematical modeling and analysis of insulin clearance in vivo. BMC Syst Biol. 2008, 2: 4310.1186/17520509243.PubMed CentralView ArticlePubMedGoogle Scholar
 Schoeberl B, EichlerJonsson C, Gilles E, Müller G: Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat Biotechnol. 2002, 20 (4): 370375. 10.1038/nbt0402370.View ArticlePubMedGoogle Scholar
 Jeneson J, Westerhoff H, Kushmerick M: A metabolic control analysis of kinetic controls in ATP free energy metabolism in contracting skeletal muscle. Am J PhysiolCell Physiol. 2000, 279 (3): C813C832.PubMedGoogle Scholar
 Wu F, Jeneson J, Beard D: Oxidative ATP synthesis in skeletal muscle is controlled by substrate feedback. Am J PhysiolCell Physiol. 2007, 292: C115C124.View ArticlePubMedGoogle Scholar
 Groenendaal W, Schmidt K, von Basum G, van Riel N, Hilbers P: Modeling glucose and water dynamics in human skin. Diabetes Technol Ther. 2008, 10 (4): 283293. 10.1089/dia.2007.0290.View ArticlePubMedGoogle Scholar
 Vanlier J, Wu F, Qi F, Vinnakota K, Han Y, Dash R, Yang F, Beard D: BISEN: Biochemical simulation environment. Bioinformatics. 2009, 25 (6): 836837. 10.1093/bioinformatics/btp069.PubMed CentralView ArticlePubMedGoogle Scholar
 Vanlier J, Tiemann C, Hilbers P, van Riel N: Parameter uncertainty in biochemical models described by ordinary differential equations. Math Biosci. 2013, 246 (2): 305314. 10.1016/j.mbs.2013.03.006.View ArticlePubMedGoogle Scholar
 Klinke D: An empirical Bayesian approach for modelbased inference of cellular signaling networks. BMC Bioinformatics. 2009, 10: 37110.1186/1471210510371.PubMed CentralView ArticlePubMedGoogle Scholar
 Taylor H, Barnes C, Huvet M, Bugeon L, Thorne T, Lamb J, Dallman M, Stumpf M: Calibrating spatiotemporal models of leukocyte dynamics against in vivo liveimaging data using approximate Bayesian computation. Integr Biol. 2012, 4 (3): 335345. 10.1039/c2ib00175f.View ArticleGoogle Scholar
 Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmüller U, Timmer J: Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics. 2009, 25 (15): 192310.1093/bioinformatics/btp358.View ArticlePubMedGoogle Scholar
 Hasenauer J, Waldherr S, Wagner K, Allgower F: Parameter identification, experimental design and model falsification for biological network models using semidefinite programming. Syst Biol IET. 2010, 4 (2): 119130. 10.1049/ietsyb.2009.0030.View ArticleGoogle Scholar
 Brännmark C, Palmér R, Glad S, Cedersund G, Strålfors P: Mass and information feedbacks through receptor endocytosis govern insulin signaling as revealed using a parameterfree modeling framework. J Biol Chem. 2010, 285 (26): 2017110.1074/jbc.M110.106849.PubMed CentralView ArticlePubMedGoogle Scholar
 Girolami M, Calderhead B: Riemann manifold langevin and hamiltonian monte carlo methods. J R Stat Soci: Series B (Stat Methodol). 2011, 73 (2): 123214. 10.1111/j.14679868.2010.00765.x.View ArticleGoogle Scholar
 Cedersund G, Roll J: Systems biology: model based evaluation and comparison of potential explanations for given biological data. FEBS J. 2009, 276 (4): 903922. 10.1111/j.17424658.2008.06845.x.View ArticlePubMedGoogle Scholar
 Müller T, Faller D, Timmer J, Swameye I, Sandra O, Klingmüller U: Tests for cycling in a signalling pathway. J R Stat Soc: Series C (Appl Stat). 2004, 53 (4): 557568. 10.1111/j.14679876.2004.05148.x.View ArticleGoogle Scholar
 Calderhead B, Girolami M: Statistical analysis of nonlinear dynamical systems using differential geometric sampling methods. Interface Focus. 2011, 1 (6): 821835. 10.1098/rsfs.2011.0051.PubMed CentralView ArticlePubMedGoogle Scholar
 Tegnér J, Compte A, Auffray C, An G, Cedersund G, Clermont G, Gutkin B, Oltvai Z, Stephan K, Thomas R: Computational disease modeling–fact or fiction?. BMC Syst Biol. 2009, 3: 5610.1186/17520509356.PubMed CentralView ArticlePubMedGoogle Scholar
 Skanda D, Lebiedz D: An optimal experimental design approach to model discrimination in dynamic biochemical systems. Bioinformatics. 2010, 26 (7): 93910.1093/bioinformatics/btq074.View ArticlePubMedGoogle Scholar
 Steiert B, Raue A, Timmer J, Kreutz C: Experimental Design for Parameter Estimation of Gene Regulatory Networks. PloS one. 2012, 7 (7): e4005210.1371/journal.pone.0040052.PubMed CentralView ArticlePubMedGoogle Scholar
 Casey F, Baird D, Feng Q, Gutenkunst R, Waterfall J, Myers C, Brown K, Cerione R, Sethna J: Optimal experimental design in an epidermal growth factor receptor signalling and downregulation model. Syst Biol IET. 2007, 1 (3): 190202. 10.1049/ietsyb:20060065.View ArticleGoogle Scholar
 Kreutz C, Raue A, Timmer J: Likelihood based observability analysis and confidence intervals for predictions of dynamic models. BMC Syst Biol. 2012, 6: 12010.1186/175205096120.PubMed CentralView ArticlePubMedGoogle Scholar
 Vanlier J, Tiemann C, Hilbers P, van Riel N: An integrated strategy for prediction uncertainty analysis. Bioinformatics. 2012, 28 (8): 11301135. 10.1093/bioinformatics/bts088.PubMed CentralView ArticlePubMedGoogle Scholar
 Liepe J, Filippi S, Komorowski M, Stumpf MP: Maximizing the Information Content of Experiments in Systems Biology. PLOS Comput Biol. 2013, 9: e100288810.1371/journal.pcbi.1002888.PubMed CentralView ArticlePubMedGoogle Scholar
 Vanlier J, Tiemann C, Hilbers P, van Riel N: A Bayesian approach to targeted experiment design. Bioinformatics. 2012, 28 (8): 11361142. 10.1093/bioinformatics/bts092.PubMed CentralView ArticlePubMedGoogle Scholar
 King RD, Whelan KE, Jones FM, Reiser PG, Bryant CH, Muggleton SH, Kell DB, Oliver SG: Functional genomic hypothesis generation and experimentation by a robot scientist. Nature. 2004, 427 (6971): 247252. 10.1038/nature02236.View ArticlePubMedGoogle Scholar
 MacKay DJ: Informationbased objective functions for active data selection. Neural Comput. 1992, 4 (4): 590604. 10.1162/neco.1992.4.4.590.View ArticleGoogle Scholar
 Daunizeau J, Preuschoff K, Friston K, Stephan K: Optimizing experimental design for comparing models of brain function. PLoS Comput Biol. 2011, 7 (11): e100228010.1371/journal.pcbi.1002280.PubMed CentralView ArticlePubMedGoogle Scholar
 Hug S, Raue A, Hasenauer J, Bachmann J, Klingmüller U, Timmer J, Theis F: Highdimensional Bayesian parameter estimation: case study for a model of JAK2/STAT5 signaling. Math Biosci. 2013, 246 (2): 293304. 10.1016/j.mbs.2013.04.002.View ArticlePubMedGoogle Scholar
 Finley S, Gupta D, Cheng N, Klinke D: Inferring relevant control mechanisms for interleukin12 signaling in naïve CD4+; T cells. Immunol Cell Biol. 2010, 89: 100110.PubMed CentralView ArticlePubMedGoogle Scholar
 Konukoglu E, Relan J, Cilingir U, Menze B, Chinchapatnam P, Jadidi A, Cochet H, Hocini M, Delingette H, Jaïs P: Efficient probabilistic model personalization integrating uncertainty on data and parameters: Application to eikonaldiffusion models in cardiac electrophysiology. Prog Biophys Mol Biol. 2011, 107: 134146. 10.1016/j.pbiomolbio.2011.07.002.View ArticlePubMedGoogle Scholar
 Xu T, Vyshemirsky V, Gormand A, von Kriegsheim A, Girolami M, Baillie G, Ketley D, Dunlop A, Milligan G, Houslay M: Inferring signaling pathway topologies from multiple perturbation measurements of specific biochemical species. Sci Signal. 2010, 3 (113): ra20View ArticleGoogle Scholar
 Kalita MK, Sargsyan K, Tian B, PaulucciHolthauzen A, Najm HN, Debusschere BJ, Brasier AR: Sources of celltocell variability in canonical nuclear factorκ B (NFκ B) signaling pathway inferred from single cell dynamic images. J Biol Chem. 2011, 286 (43): 3774137757. 10.1074/jbc.M111.280925.PubMed CentralView ArticlePubMedGoogle Scholar
 Toni T, Stumpf M: Simulationbased model selection for dynamical systems in systems and population biology. Bioinformatics. 2010, 26: 104110. 10.1093/bioinformatics/btp619.PubMed CentralView ArticlePubMedGoogle Scholar
 Vyshemirsky V, Girolami M: Bayesian ranking of biochemical system models. Bioinformatics. 2008, 24 (6): 833839. 10.1093/bioinformatics/btm607.View ArticlePubMedGoogle Scholar
 Schmidl D, Hug S, Li WB, Greiter MB, Theis FJ: Bayesian model selection validates a biokinetic model for zirconium processing in humans. BMC Syst Biol. 2012, 6: 9510.1186/17520509695.PubMed CentralView ArticlePubMedGoogle Scholar
 Mélykúti B, August E, Papachristodoulou A, El-Samad H: Discriminating between rival biochemical network models: three approaches to optimal experiment design. BMC Syst Biol. 2010, 4: 38. 10.1186/1752-0509-4-38.
 Flassig R, Sundmacher K: Optimal design of stimulus experiments for robust discrimination of biochemical reaction networks. Bioinformatics. 2012, 28 (23): 3089-3096. 10.1093/bioinformatics/bts585.
 Endres DM, Schindelin JE: A new metric for probability distributions. Inf Theory IEEE Trans. 2003, 49 (7): 1858-1860. 10.1109/TIT.2003.813506.
 Kreutz C, Rodriguez M, Maiwald T, Seidl M, Blum H, Mohr L, Timmer J: An error model for protein quantification. Bioinformatics. 2007, 23 (20): 2747. 10.1093/bioinformatics/btm397.
 Raue A, Kreutz C, Theis F, Timmer J: Joining forces of Bayesian and frequentist methodology: A study for inference in the presence of non-identifiability. Phil Trans Roy Soc A. 2012, 371 (1984): 20110544. 10.1098/rsta.2011.0544.
 Geyer C: Practical Markov chain Monte Carlo. Stat Sci. 1992, 7 (4): 473-483. 10.1214/ss/1177011137.
 Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf M: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface. 2009, 6 (31): 187-202. 10.1098/rsif.2008.0172.
 Burnham KP, Anderson DR: Model selection and multimodel inference: a practical information-theoretic approach. 2002, New York: Springer.
 Penny WD, Stephan K, Mechelli A, Friston K: Comparing dynamic causal models. NeuroImage. 2004, 22 (3): 1157-1172. 10.1016/j.neuroimage.2004.03.026.
 Good IJ: Weight of evidence: a brief survey. Bayesian Stat. 1985, 2: 249-269.
 Calderhead B, Girolami M: Estimating Bayes factors via thermodynamic integration and population MCMC. Comput Stat Data Anal. 2009, 53 (12): 4028-4045. 10.1016/j.csda.2009.07.025.
 Lin J: Divergence measures based on the Shannon entropy. Inf Theory IEEE Trans. 1991, 37: 145-151. 10.1109/18.61115.
 Härdle W, Werwatz A, Müller M, Sperlich S: Introduction. Nonparametric Semiparametric Models. 2004, New York: Springer.
 Kraskov A, Stögbauer H, Grassberger P: Estimating mutual information. Phys Rev E. 2004, 69 (6): 066138.
 Budka M, Gabrys B, Musial K: On accuracy of PDF divergence estimators and their applicability to representative data sampling. Entropy. 2011, 13 (7): 1229-1266.
 Boltz S, Debreuve E, Barlaud M: High-dimensional statistical distance for region-of-interest tracking: Application to combining a soft geometric constraint with radiometry. IEEE Conference on Computer Vision and Pattern Recognition, 2007. CVPR'07. 2007, Minneapolis, Minnesota, USA: IEEE Computer Society, 1-8.
 Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 1999, 27: 29-34. 10.1093/nar/27.1.29.
 Bevan P: Insulin signalling. J Cell Sci. 2001, 114 (8): 1429-1430.
 Friedman JH, Bentley JL, Finkel RA: An algorithm for finding best matches in logarithmic expected time. ACM Trans Math Softw (TOMS). 1977, 3 (3): 209-226. 10.1145/355744.355745.
 Trotta R: Forecasting the Bayes factor of a future observation. Mon Notices R Astronomical Soc. 2007, 378 (3): 819-824. 10.1111/j.1365-2966.2007.11861.x.
 Calderhead B, Girolami M: Statistical analysis of nonlinear dynamical systems using differential geometric sampling methods. Interface Focus. 2011, 1 (6): 821-835. 10.1098/rsfs.2011.0051.
 Calderhead B, Girolami M, Lawrence N: Accelerating Bayesian inference over nonlinear differential equations with Gaussian processes. Adv Neural Inf Process Syst. 2009, 21: 217-224.
 Liepe J, Barnes C, Cule E, Erguler K, Kirk P, Toni T, Stumpf M: ABC-SysBio: approximate Bayesian computation in Python with GPU support. Bioinformatics. 2010, 26 (14): 1797. 10.1093/bioinformatics/btq278.
 Arefin A, Riveros C, Berretta R, Moscato P: GPU-FS-kNN: A Software Tool for Fast and Scalable kNN Computation Using GPUs. PloS one. 2012, 7 (8): e44000. 10.1371/journal.pone.0044000.
 Garcia V, Debreuve E, Barlaud M: Fast k nearest neighbor search using GPU. CVPR Workshop on Computer Vision on GPU. 2008, Anchorage, Alaska, USA: IEEE Computer Society.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.