Bayesian model selection validates a biokinetic model for zirconium processing in humans

Background In radiation protection, biokinetic models for zirconium processing are of crucial importance in dose estimation and further risk analysis for humans exposed to this radioactive substance. They provide limiting values of detrimental effects and build the basis for applications in internal dosimetry, the prediction for radioactive zirconium retention in various organs as well as retrospective dosimetry. Multi-compartmental models are the tool of choice for simulating the processing of zirconium. Although easily interpretable, determining the exact compartment structure and interaction mechanisms is generally daunting. In the context of observing the dynamics of multiple compartments, Bayesian methods provide efficient tools for model inference and selection. Results We are the first to apply a Markov chain Monte Carlo approach to compute Bayes factors for the evaluation of two competing models for zirconium processing in the human body after ingestion. Based on in vivo measurements of human plasma and urine levels we were able to show that a recently published model is superior to the standard model of the International Commission on Radiological Protection. The Bayes factors were estimated by means of the numerically stable thermodynamic integration in combination with a recently developed copula-based Metropolis-Hastings sampler. Conclusions In contrast to the standard model the novel model predicts lower accretion of zirconium in bones. This results in lower levels of noxious doses for exposed individuals. Moreover, the Bayesian approach allows for retrospective dose assessment, including credible intervals for the initially ingested zirconium, in a significantly more reliable fashion than previously possible. All methods presented here are readily applicable to many modeling tasks in systems biology.


Background
Radioactive zirconium (Zr) isotopes are produced in large quantities in nuclear fission reactors; one of the most common fragments in uranium fission is the radioactive 95 Zr. For example, the estimated released 95 Zr activity of the Fukushima and Chernobyl accidents is considered to have Full list of author information is available at the end of the article detrimental health effects not only via irradiation, but also via the intake of edibles [1,2]. The estimation of radiation doses is indispensable for risk analysis. This is true for occupational exposure [3] and patients undergoing diagnostic and therapeutic nuclear medicine [4] as well as for the public in general [5]. To calculate the radiation dose a mathematical model is required for quantifying the deposition of radioactivity from the incorporated radionuclide inside the human body. In internal dosimetry, this model is called biokinetic model as defined by the International Commission on Radiological Protection (ICRP) in [5]. Also, the ICRP put forward the current standard model, which we will simply denote the ICRP model. The parameters of this model were mostly derived from animal data. In order to obtain more reliable dose estimates for humans, the Helmholtz Zentrum München (HMGU) http://www.biomedcentral.com/1752-0509/6/95 developed a new, physiologically more plausible biokinetic model [6]. It is based on the processing of non-radioactive Zr isotopes in 16 investigations with 12 healthy human subjects. The measurements were taken in vivo in plasma and urine up to 100 days after administration by application of the double tracer technique. Moreover, a global statistical analysis method has been developed and the uncertainty and sensitivity of the HMGU model parameters were analyzed [7,8].
The biokinetic models mentioned above incorporate basic processes in the human physiological system [3,5,9,10]. Mathematically, this is characterized as follows: All major human organs and tissues are simplified in the model structure as separate compartments that represent kinetically homogeneous amounts of radionuclides; the connections between organs and tissues are described via transfer rates, i.e. model parameters that represent the exchange rates between the individual mutually exclusive compartments. These multi-compartmental systems along with their transfer parameters describing the kinetic behavior of radionuclides in the human body are called compartmental models [5,11]. Throughout this paper we use the terms biokinetic model and compartmental model interchangeably. The transfer of substances into and out of compartments is governed by the law of mass balance. Transfer parameters are frequently evaluated on the basis of experimental data obtained from laboratory animals and, to a lesser extent, human beings [10]. Although animal data is not directly comparable to human data, the former can nevertheless be very helpful as prior information.
In this publication, we address the problem of model inference and model selection. A Bayesian approach enables us to cover model and measurement uncertainties for a compartmental model based on human data, while simultaneously taking into account the prior information. The Bayesian framework provides a fully probabilistic approach [12]. It is grounded on the probability distribution of a problem specific parameter space conditioned on the given data. This specifies a measure of belief for all possible parameter values. Similarly -albeit not identical -to confidence intervals, Bayesian analyses provide credible sets for the parameters at stake, holding regions and limits of high parameter probability [13]. However, as they are intrinsically driven by prior informations, some care has to be taken in their interpretation [14].
For an overall assessment of the two competing biokinetic models for Zr, the previous model parameter uncertainty analysis [7,8] is not sufficient, because uncertainties arising from the model structure were not taken into account. This shortcoming was addressed by our Bayesian approach. Considering the models themselves as a random variable allows to compute the probability for each of the models conditioned on given data. The ratio of the marginal likelihoods of two models, i.e. the ratio of the probability for the data to be produced by the specific model, is known as the Bayes factor, a quantity introduced by Jeffreys [15]. Performing model selection using Bayes factors is particularly useful when dealing with a few models only. While classical model selection approaches using statistics such as the AIC or likelihood ratio tests are based on single best parameter estimates [16], the Bayes factor takes into account all possible parameters values and thus deals with model uncertainty and avoids overfitting issues [17,18]. Moreover, in contrast to classical tests, the Bayes factor provides evidence for either one of the (possibly non-nested) models by definition. With the introduction of Markov chain Monte Carlo (MCMC) methods [19][20][21] as tools for sampling from probability distributions, a remarkable increase in Bayesian analyses was noticed. However, due to very complex probability surfaces these approaches often struggle with sampling efficiency [22]. In order to avoid resulting convergence issues of the MCMC approach, we combined a technique called thermodynamic integration with a novel copulabased Metropolis-Hastings sampler [23]. This provides numerically stable results for the inference process.
The HMGU and ICRP models were compared based on in vivo plasma and urine data of 16 investigations of 12 human subjects [6] using Bayes factors. More precisely, the models were evaluated for each investigation individually and for the concatenated data of all investigations. The latter allows to infer transfer rates (including credible intervals) for an average individual. We furthermore provide an analysis based on the (i) plasma data and (ii) urine data individually. Throughout the analysis, the HMGU model turned out to be superior compared to the ICRP model with respect to covering the specific data. This means the HMGU model more precisely explains the given observations and therefore the processing of zirconium in the human body. We then used the HMGU model to analyze the accretion of zirconium in bones: not only did we observe a delayed aggregation but also to lesser extents than predicted by the ICRP model. Lastly, the Bayesian framework yielded credible bounds for retrospective dose assessment of an average individual, this is, based on the concatenated data of all 16 investigations. We provide a user-friendly estimation table for the prediction of initially ingested zirconium concentrations for ex post measurements. This impacts the estimation of intake and radiation dose, especially the bone dose, when aiming for personalized occupational monitoring data.

Biokinetic models for zirconium processing
We infer the biokinetics of zirconium as it is processed in the human body. The currently used compartmental model was recommended by the ICRP in [5,10,24] http://www.biomedcentral.com/1752-0509/6/95 ( Figure 1A). It consists of eleven compartments and 15 transfer rates. Zirconium enters the body via the stomach compartment y 9 and is processed until it reaches any of the two final compartments urine, y 7 , or feces, y 8 . Some of the transfer rates and compartments of the ICRP model are however physiologically questionable: The direct mass transport from the two bone compartments to the urinary bladder contents and upper large intestine compartments or the distinction between trabecular bone surface and cortical bone surface as such. In order to address these shortcomings, we have recently proposed the alternative HMGU model [6] combining the two bone compartments into one single compartment and replacing these mass flows by physiologically more plausible transfer rates ( Figure 1B). Altogether both models share eight transfer rates, which we denoted by x 1 , . . . , x 8 . Transfers present in just one of the models have a unique index to facilitate distinction.
The dynamics of both models are described by a system of coupled linear first-order ordinary differential equations (ODEs): For each compartment y j with timedependent concentration y j (t), the rate of change of y j is given by where A + y j is the set of indices of all transfer rates x α flowing into compartment y j , A − y j the set of indices of all transfer rates flowing out of compartment y j and y [x α ] is the compartment which is connected to y j by the transfer rate x α . For instance A + y 5 = {8, 10}, y [x 8 ] = y 10 and y [x 10 ] = y 1 in the HMGU model. We have y 9 (0) = 100% and therefore y j =9 (0) = 0%, this is, the complete amount of zirconium is initially contained in the stomach compartment. The explicit model specific ODE systems can be found in the Additional file 1 sections 1.1 and 1.2.

Experimental data
The human biokinetic data was measured in a stable tracer study executed at the Helmholtz Zentrum München (HMGU) [6,25]. It includes 16 investigations of 12 healthy humans with ingestion of a investigationspecific amount of isotopically enriched stable zirconium. The administered amount was based on the individuals weight, aiming at a dose of 0.09mg stable tracer per kg body weight. Tracer concentrations were determined in blood plasma and urine. For the plasma data, samples were taken several times during the first day in increasing intervals, and more scarcely later on. Urine was collected completely in 12-24h periods on several days. The last samples were taken at 100d after tracer administration. Tracer concentrations were normalized to the respective  19 . B: HMGU model. The model consists of ten compartments y 1 ,. . . ,y 10 and twelve transfer rates x 1 ,. . . ,x 12 . In both models zirconium enters the body in the stomach compartment y 9 and diffuses through the system until it reaches either one of the two final compartments urine, y 7 , or feces, y 8 . The gray compartments y 1 and y 7 are directly related to the datasets measured. http://www.biomedcentral.com/1752-0509/6/95 tracer amount ingested in each investigation, such that the total ingested amount corresponds to 100% at t = 0 in the stomach compartment y 9 . For model development, the transfer compartment was taken to be identical with blood plasma, and concentrations therein were expressed as % per kg plasma. The plasma concentration was scaled by the total amount of plasma in the body to get absolute concentrations [26]. Urine data was expressed as excretion rate in % per day.

Model likelihood
Technical limitations, such as missing in vivo measurement techniques for all involved compartments as well as noisy data introduce model uncertainties to biological systems [27]. Comparing different kinds of models based on single parameter estimates as done in maximumlikelihood approaches is thus very critical. For statistical model evaluation we here applied a Bayesian approach, taking into account the full parameter distribution: For each investigation i we assume that the data follows the solution c x k (t) of the differential equation given in (1) for any of the two models M k and some corresponding parameter vector x k , where the model index k ∈ {H, I}. Here, M I is the ICRP model and M H the HMGU model. Corresponding to the notation in Figure 1A and 1B, x I = (x 1 , . . . , x 8 , x 13 , . . . , x 19 ) and x H = (x 1 , . . . , x 12 ). While y i,· 1 indicates measurements in plasma, i.e. in the transfer compartment y 1 ,ẏ i,· 7 designates measurements of the excretion rate in the urine compartment y 7 . There are n b i measurements in plasma and n u i measurements in urine for investigation i. Assuming furthermore that all data points contain normally distributed measurement errors, the investigation i and model k specific likelihood function has the form where c b x k (t α ) denotes the solution to Equation (1) for the transfer compartment y 1 at time point t α , corresponding to the measurement at y i,α 1 , for the parameter vector x k . Furthermore, d dt c u x k (t β ) is the derivative of the solution for the urine compartment y 7 at time point t β , corresponding to the measurementẏ i,β 7 , while (·|μ, σ ) is the univariate probability density function of the normal distribution with mean μ and standard deviation σ .
The standard deviations for plasma, σ b i , and for urine, σ u i , are fitted investigation-specifically by simulated annealing [28] before starting the MCMC sampling process. They correspond to the combined strength of all deviations from the ODE, e.g. to the size of the measurement error as well as to the magnitude of the difference between the ODE solution and the data points that is due to natural internal fluctuations not considered by an ODE approach. The biological variability between the individual investigations is accounted for by the fact that we fit different σ b i and σ u i for each investigation i and thus get investigation-specific likelihoods. This leads to individual credible intervals for each parameter in each investigation in the MCMC sampling procedure later on.
The complete data (i.e. concatenated data) likelihood is given by (1) has to be solved based on x k . Since (1) is of first order, it can be written as where y x k (t) is the vector of all the compartments of model k and the time independent matrix A(x k ) represents all the interactions between these compartments, depending on the transfer rate values x k . The analytical solution is then given by The matrix exponential e A(x k )t was computed by eigenvalue decomposition using MATLAB's eig function (see Additional file 1 section 1.3).

Bayes factors
Specifying model specific, investigation independent prior distributions p(x k |k) based on combined human/animal data yields the posterior distributions of the parameters for model k according to Bayes' theorem: where p(D i |k) denotes the marginal density obtained by integrating over the according parameter space X k , i.e.
The quantity p(D i |k) is called data evidence and is the basis for the model selection process. For comparing two http://www.biomedcentral.com/1752-0509/6/95 models k and k , we can view the model index as a random variable, apply Bayes' theorem again and take ratios to cancel the denominator, yielding The ratio of the two likelihoods in Equation (6) is known as the Bayes factor We had no initial preference for any of the models and thus chose a uniform model prior. The Bayes factor in Equation (7) then coincides with the posterior odds ratio between the models k and k conditioned on the data D i [18,29]. The Bayes factor compares the posterior probability p(k|D i ) that the data D i have arisen according to model k versus the probability p(k |D i ) = 1 − p(k|D i ) that D i have arisen according to model k , in contrast to single parameter measures such as the AIC or the likelihood ratio test [16]. The models may be non-nested. Due to the evaluation of each model on its whole parameter space X k (cf. Equation (5)), the Bayes factor naturally penalizes model complexity and thus prevents overfitting issues [30][31][32]. An interpretation of the likelihood ratio in the Bayes factor was given by Jeffreys [15], which can also be found in the Additional file 1 section 3. Most importantly, a Bayes factor B i k,k > 3 substantially favors model k, while B i k,k > 100 decisively favors model k. Clearly, for B i k,k < 1, model k is favored with evidence B i k ,k = 1/B i k,k .

Prior information
Since the problem of radiation protection has been extensively researched, quite a large number of animal studies have been conducted. These yielded excessive prior knowledge for the Bayesian modeling approach. As the ICRP model is the recommended model used for dose estimation, there naturally exists information on the distribution types of the parameters involved in the model together with confidence intervals [7]. Four parameters have a lognormal distribution, five a triangular distribution and six have a normal distribution (see Additional file 1 section 2.3 for details). From the confidence intervals, the parameters of the normal and lognormal distributions were calculated. For the HMGU model detailed prior information is also available from previous studies [7,8]. Here, eight parameters have a lognormal distribution and four a triangular one (see Additional file 1 section 2.3 for details). It is noteworthy that of the eight parameters shared in both models, x 8 is the only one having different distributions in the ICRP and HMGU model. Due to a lack of prior knowledge of the dependency structure between the parameters, the multivariate prior distribution p(x k |k) of model k was taken to be the product of the univariate prior distributions p(x k r |k) for each parameter x k r , i.e. p(x k |k) = r p(x k r |k). Each univariate prior distribution was truncated at zero. While Bayes factors were computed inter alia for each investigation separately (see Results and discussion), the same prior information was applied throughout all investigations. This is reasonable since the priors contain information from a huge variety of preceding experiments.

Thermodynamic integration
Computing the marginal of Equation (5) by numerical integration generally turns out to be a daunting task. There exist a variety of approaches to tackle this problem. Popular choices include Chib's method, which is based on a Gibbs sampling scheme and therefore not always easily applicable [33] or the Reversible Jump MCMC algorithm proposed by Green [34], based on an across model search. Since the latter involves sampling of an additional model parameter, the sampling generally takes longer than sampling within the different models only. We therefore applied a within model sampling technique called thermodynamic integration (TI). It was proposed by Lartillot and Philippe [35] and Friel and Pettitt [36] and successfully applied e.g. in Xu et al. [37]. The method splits apart the computation into several intermediate steps by introducing an auxiliary "temperature" parameter τ ∈[ 0, 1] that governs the influence of the parameter likelihood. The basis of this approach is the power posterior, which is the usual posterior modified such that the likelihood is exponentiated by the temperature parameter and then normalized accordingly to obtain a probability density: More precisely, the quantity of interest is the normalization constant since it yields a way to compute the terms of the Bayes factor (cf. Equation (7)) by differentiating its logarithm and then integrating both sides with respect to τ according to Calderhead and Girolami [38]. This means that the natural logarithm of the marginal likelihood can be computed as the integral over the expectation of the logarithmized data likelihood within the model with respect to the power posterior. The parameter τ governs the flatness of the power posterior surface and, much as in the concept of path sampling [39], stabilizes the computation of Equation (5) [36]: choosing a discretization 0 = τ 1 < τ 2 < . . . < τ N−1 < τ N = 1, we can compute the natural logarithm of the marginal likelihood p(D i |k) by numerically approximating the integral in Equation (11) by Also, the expectation with respect to the power posterior in Equation (12) is approximated in the usual way by substituting it with the Monte Carlo estimate where x k (m) denotes a sample drawn from p τ n (D i |k). For all our applications we chose a temperature schedule with N = 30 steps according to τ n = n−1 N−1 5 , n = 1, . . . , N to estimate log(p(D i |k)) for each k and i as suggested by Calderhead and Girolami [38].

Copula-based Monte Carlo sampling
The model, investigation, and temperature specific underlying Markov chain Monte Carlo (MCMC) samples were drawn using the recently introduced copula-based Metropolis-Hastings (MH) algorithm [23]. Copulas are constructs from probability theory for assessing and sampling from multivariate distributions. They are widely used in finance and ecology [40,41]. For any absolutely continuous multivariate cumulative distribution function where c (u 1 , . . . , u d ) is a unique copula density function defined on [ 0, 1] d with uniformly distributed marginals on [ 0, 1]. This copula function covers the full dependency structure of the variables. In other words, every joint distribution function can be decomposed into the marginal behavior of its individual variables and a function covering its dependency structure [42]. The MH proposal function then generates problem specific proposals with an according dependence structure drawn from a pair copula distribution. Fitting the copula distribution was done in preruns containing 1,000,000 unthinned samples each. They were generated for each investigation and model separately. For back-and-forth conversion of the prerun samples and proposals [23], we naturally applied the according prior distributions of the models at hand. Choosing different conversion functions is possible, but affects the sampling performance. Before starting the final MCMC sampling procedure, the maximum a posteriori parameter estimates were computed by simulated annealing and used as initial MCMC sampling values. This makes a burn-in period dispensable. For thinning the Markov chains, i.e. for drawing approximately independent samples in the MCMC procedure, we applied the autocorrelation-based Effective Sample Size (ESS) proposed by Kass [43]. The ESS holds the number of samples left when the Markov chain is thinned such that two consecutive samples can be considered approximately independent. The copula-based MH approach is able to deal with the dependence structure in the high dimensional sampling space and allows for high proposal acceptance rates at simultaneously high ESS's. Finally, all Bayes factors were computed based on 30,000 proposals of the copula-based MH algorithm at each τ n throughout all applications.

Results and discussion
In this section, we present the results of our analysis. We address the question which model is superiorly fitting the data. First, several general results, such as investigation dependency of the Bayes factor and effects of parameter correlations are shown, before turning to the results of the model selection, and their consequences for the HMGU and ICRP models.

Investigation specificity of transfer rates
In radiation protection the transfer rates for the biokinetics of radionuclides in the human body are derived from data collected in various independent experiments [5]. We measured plasma and urine levels in 16 different investigations. This poses the question whether the models should be compared based on the complete dataset, or whether statistical evaluation should be done for each investigation individually. While the former approach results in one overall Bayes factor, the latter yields 16 investigation specific, not directly comparable Bayes factors. All investigations follow the same pulse-like time courses in the transfer compartment y 1 while the excretion rates in the urine compartment y 7 exhibit an exponential decay behavior ( Figure 2). However, zirconium tracer concentrations showed up to a 50-fold difference between maximal http://www.biomedcentral.com/1752-0509/6/95  To test the hypothesis whether the diversity in concentration also effects transfer rates and therefore the estimated Bayes factors, we pairwise compared the posterior sample marginals of the MCMC run (corresponding to the samples of τ = 1) for parameter x 7 of the ICRP model between all investigations by means of a Kolmogorov-Smirnov test. Here x 7 was chosen as it directly affects the observed plasma levels [8]. Except for one pair, all p-values were < 6 · 10 −8 , meaning that the chance of falsely rejecting the hypothesis of comparable marginals is negligible. Therefore, as the posterior marginal distributions are quite different, it can be deduced that the basis for the Bayes factor, the joint posterior distribution, can differ quite strongly w.r.t. the individuals. This indicated that each investigation should be treated separately. Nevertheless, in order to infer the transfer rates of an average subject, as currently used by the ICRP, the concatenated data had to be used. We therefore compared the HMGU and ICRP model based on both the concatenated data D = {D 1 , . . . , D 16 } and, in order to account for the biological diversity, the individual investigation specific datasets D i (i = 1, . . . , 16). This could also be the basis for further analysis of influence factors such as weight or gender.

MCMC-based parameter estimation
Throughout, the analysis was based on 30,000 proposals for each of the 30 temperature levels in all 17 cases (one for each investigation and one for D). For the HMGU model the average ESS including one standard error, i.e. taken over all temperature levels and investigations, is 5832 ± 405. In case of the ICRP model we obtained in average 5808 ± 252 (approximately independent) samples from the Markov chains. This naturally implied high acceptance rates for both models. The sampling procedure thus captured the power posteriors very well.
From the posterior samples, we could derive new credible intervals for the parameters at hand as well as a new MAP estimate for an average subject which can be used if single parameter values are required (see Additional file 1 section 4.1). As can be seen in Figure 3, the fit of the time courses covered the data, indicating that both models are in principle able to fit the data. This justifies our ODE approach with additive noise. However, from the fits alone it is not obvious which model is superior. Note that the credible intervals in Figure 3 represent only the uncertainty based on the parameters, in contrast to measurement uncertainties accounted for by the σ b i s and σ u i s, which are not shown. Clearly, this uncertainty in the parameters is specific to the individual investigations or the complete data, respectively.

Parameter correlations and model identifiability
The posterior probabilities of both the HMGU and ICRP model showed strong correlation between the parameters x 7 and x 8 throughout all investigations. The estimated Kendall's τ 's based on the preruns wereτ HMGU = 0.8027 ± 0.01 andτ ICRP = 0.3452 ± 0.02. This can be explained as follows: At time point t = 0 the stomach compartment y 9 is the only compartment with non-zero Zr concentration. It is exclusively connected to the small intestines y 10 in all models. Therefore, all Zr compounds have to pass through y 10 , which further on distributes them to the observed transfer compartment y 1 via x 7 or to the upper large intestines y 5 via x 8

Bayesian model comparison
Using the concept of thermodynamic integration we compared the HMGU and the ICRP model based on (i) the concatenated data D = {D 1 , . . . , D 16 } and (ii) the individual investigation specific datasets D i (i = 1, . . . , 16). This results in a total of 17 Bayes factors. We found that all Bayes factors favored the HMGU model; in 14 out of the 17 cases even decisively (cf. Table 1, second column, of this section and section 4 of Additional file 1).
In order to take a closer look at the contribution of the plasma and urine data to the above results, we computed additional Bayes factors based on the likeli-  (Table 1, third column). For the urine data, three investigations slightly favored the ICRP model (Table 1, fourth column). In summary, all decisive Bayes factors are in favor of the HMGU model. While the HMGU model was never decisively rejected, the ICRP model was decisively rejected in the majority of cases. Hence, in depth analysis showed that the HMGU model is superior to the ICRP model with respect to zirconium processing in the human body. This not only holds investigation-specifically, but also based on the complete data. We additionally considered an extension of the HMGU model (see Additional file 1 section 1.2 and 4) which, however, did not show any significant improvements.

Differences in radioactive 95 Zr retention in bone predicted by the HMGU and ICRP models
In internal exposure monitoring, biokinetic models were used to predict the organ retention or daily excretion http://www.biomedcentral.com/1752-0509/6/95 of incorporated radionuclides [44]. With an interval of 120 days the radioactivity of 95 Zr possibly incorporated by occupational workers was routinely monitored by whole body counters. Depending on the intake route, the radiation dose of bone surfaces or colon was taken as regulatory limit for a decision if an individual is requested for person-specific monitoring [45]. In this monitoring procedure, the biokinetic model structure and parameters are used implicitly in the background. The organ retention function is the solution of the model in each compartment; the organ doses are directly related to the integral of radioactivity of 95 Zr in source organs over 50 years. In order to compare the retention of 95 Zr as predicted by the ICRP and HMGU models, the 90% credible intervals for the time courses in the bone compartments were calculated based on the posterior samples. It is found that there is a significant difference between the two models (Figure 4), where for the ICRP model we added the concentrations in the two bone compartments. The time courses were derived for stable isotopes of Zr and thus had to take the radioactive decay of 95 Zr with half-life of 64.032 d [46] into account. The decrease of retention in bone using the HMGU model consequently reduces the radiation dose estimate in bone in comparison to the ICRP bone dose which is currently used in monitoring.

Retrospective dose assessment
Internal doses due to incorporated radionuclides have to be estimated with the help of biokinetic models based on indirect measurements, using for example bioassays for blood or urinary excretion. Normally, bioassay or in vivo data (e.g. radioactivity accumulated in skull or knee detected by a partial body counter) are measured after an accidental intake of radionuclides. Uncertainties of estimated doses are significant and have a large impact on the remediation and thus action costs. In contrast to conventional uncertainty analysis as performed in [7], our Bayesian approach naturally integrates the uncertainties of measured data and parameters simultaneously. This trait of the Bayesian approach is useful as it provides an estimate for the intake and its credible intervals.
For example, if the urinary excretion after accidental exposure is measured, we can compute credible intervals for the initial intake of radionuclide 95 Zr by exploiting the posterior distribution together with the linearity of the HMGU model. In order to be as general as possible we used the posterior samples based on the complete data. Given a posterior sample x H , a measurementẏ t zero for all except the stomach compartment y 9 . The latter reads y 9 (0) =ẏ t 7 · 100%/c 9 x H (t) where c 9 x H (t) denotes the value of c x H (t) in the stomach compartment at time point t. Now, assuming that for arbitrary posterior samples x H the measurementẏ t 7 is contained in the 90% credible interval of the solution c x H (t) with initial condition y 9 (0) as given above, lower and upper bounds for credible regions of the initial amount of zirconium taken in at t 0 = 0 emerge. These are based on the posterior samples. The estimated extrapolation factors for multiplication with a urine measurement (in [μg/d]) after time t (in [h]) are contained in Table 2 and yield the initial amount of zirconium contained in the stomach at t 0 = 0. For example, if an amount ofẏ 2d 7 = 50μg/d was measured after two days, we find from Table 2 that the 90% credible interval for the ingested amount lies between 0.029g and 0.059g. Since the above calculations are based on non-radioactive Zr isotopes, the results have to be corrected for dose assessment with respect to radioactive decay of the radionuclide in question, i.e. in many cases 95 Zr. Shown are the lower bound factor (lbf), median factor (mf), and upper bound factor (ubf) for multiplication with a urine measurement (in [μg/d]) after time t (in [h]) on a 60h grid yielding the initial intake concentration (IC) at t 0 = 0.

Conclusions
We were the first to evaluate two competing biokinetic ODE models for zirconium processing in the human body after ingestion. These models were the current model recommended by the International Commission on Radiological Protection (ICRP) and a model developed by the Helmholtz Zentrum München (HMGU). The analysis was based on a Bayesian approach, i.e. individual Bayes factors for 16 investigations as well as a Bayes factor based on the concatenated dataset. In order to obtain reliable Monte Carlo sampling results, we combined the numerically stable thermodynamic integration with an efficient copula-based Metropolis-Hastings algorithm. In summary, the HMGU model was unequivocally superior with 14 of 17 Bayes factors being even decisive when compared to the well-established ICRP model. Also, when restricting the data on plasma and urine measurements only, we found that the HMGU model was clearly favored. The HMGU model thus best covers human data.
In contrast to the ICRP model, the HMGU model predicted a delayed accumulation of zirconium in bones which might be experimentally tested in animals in the future. Furthermore, we showed that the HMGU model can be applied for retrospective dose assessment, where the initially ingested amount of zirconium can be reconstructed including credible intervals from ex post urine measurements. This provides a simple hands-on tool that facilitates the decision if measures have to be taken in case of accidental exposure. In future applications the superior HMGU model together with its posterior samples can readily be used as the basis for dose estimation in internal dosimetry. The Bayesian framework for the compartmental model analysis developed in the present work is directly applicable to a personalized dose assessment and the uncertainty quantification if a person-specific monitoring is requested. More generally,