Bifurcation analysis informs Bayesian inference in the Hes1 feedback loop

Background Ordinary differential equations (ODEs) are an important tool for describing the dynamics of biological systems. However, for ODE models to be useful, their parameters must first be calibrated. Parameter estimation, that is, finding parameter values given experimental data, is an inference problem that can be treated systematically through a Bayesian framework. A Markov chain Monte Carlo approach can then be used to sample from the appropriate posterior probability distributions, provided that suitable prior distributions can be found for the unknown parameter values. Choosing these priors is therefore a vital first step in the inference process. We study here a negative feedback loop in gene regulation where an ODE incorporating a time delay has been proposed as a realistic model and where experimental data is available. Our aim is to show that a priori mathematical analysis can be exploited in the choice of priors. Results By focussing on the onset of oscillatory behaviour through a Hopf Bifurcation, we derive a range of analytical expressions and constraints that link the model parameters to the observed dynamics of the system. Computational tests on both simulated and experimental data emphasise the usefulness of this analysis. Conclusion Mathematical analysis not only gives insights into the possible dynamical behaviour of gene expression models, but can also be used to inform the choice of priors when parameters are inferred from experimental data in a Bayesian setting.


Aims
Mathematical models can help biologists understand the mechanisms and dynamics behind their experimental observations (Tomlin et al. [1]). The most widely used approach to modelling the dynamics of a genetic network is to employ systems of ordinary differential equations (ODEs) (Voit [2] and de Jong [3]). These models have biological parameters, some of which can be measured experimentally and some of which cannot. Parameter estimation, that is, recovering unknown parameters from experimental data, is an important step towards obtaining a good model that can not only explain observed results but can also be used for prediction and "what if" scenarios.
The inference of parameters, from real biological data, within a Bayesian framework is a relatively new, albeit currently very active area. Bayesian inference and Markov chain Monte Carlo (MCMC) methods have been recently advocated for the estimation of model parameters from biological systems described by ODEs (Rogers et al. [4]).
This paper aims to show that a general technique, using dynamical systems analysis to inform the choice of priors, can add value. The negative feedback loop studied here is a key feature of many complex biological networks and is referred to as a motif by Alon [5]. Hence our work will be of interest whenever larger networks are dealt with by a modular 'divide and conquer' approach. A Bayesian setting links the quantity that we are interested in, the probability that our parameters take certain values given the data, to two quantities that we can assign, the probability that we would have observed the measured data if the parameters took those values and our prior biological knowledge or ignorance about these parameters (Sivia [6]). Whereas traditional parameter estimation methods are deterministic and point valued (for example, COPASI [7]), these new methods use Bayes' Theorem to assign probabilities to parameter values and can handle noise inherently.
Using a Bayesian approach to parameter estimation for nonlinear dynamical systems throws up several challenges, and these typically increase when time delays are included. Key issues are • dimensionality: models may involve several undetermined parameters, • identification: different parameter combinations may produce similar dynamics, for example, increasing a production rate may be almost equivalent to decreasing a decay rate, • local maxima: the likelihood function may have many locally optimal values. MCMC methods [8] can go some way to addressing these difficult issues. In this work we show that a priori mathematical analysis can also be an effective tool. Our focus is on oscillatory time series, and we use the standard and widely applicable tools of Hopf bifurcation analysis and Lindstedt's method [9] in order to inform the choice of priors, thereby simplifying and focussing the parameter estimation process in order to find a particular parameter regime. More precisely, we follow the approach of Verdugo and Rand [10] in studying the case where oscillations arise via bifurcation through the delay parameter. Verdugo and Rand show that this leads to biologically realistic parameter values. We show here that inference with real data further supports this viewpoint.
The setting for our work is the biologically important instance where the expression of a gene is down regulated by its protein product. This arises, for example, with the p53 tumor suppressor protein whose intracellular activity is regulated through a feedback loop involving its transcriptional target [11]. We focus here on the case of the delayed Hes1 feedback loop featuring hes1 mRNA and Hes1 protein, where both mathematical models and quantitative experimental data are available in [12] and [13].

Biological Data
It has been observed that mRNAs for Notch signalling molecules such as the bHLH factor Hes1 oscillate with 2hour cycles during somite segmentation, although the molecular mechanism of such oscillations remains to be fully determined. Hirata et al. [13] investigated the oscillations of mRNAs for Notch signalling molecules by examining the time course of hes1 mRNA and its Hes1 protein in detail. They observed that a single serum treatment induced a 2-hour cycle oscillation of hes1 mRNA in a variety of cultured cells and that Hes1 protein also oscillated in a 2-hour cycle after the single serum treatment with a delay of about 15 minutes relative to the hes1 mRNA oscillation. The half lives of hes1 mRNA and Hes1 protein were measured and the proteases for Hes1 protein degradation were identified.
Hirata et al. confirmed experimentally that degradation of Hes1 protein is required for hes1 mRNA increase and that de novo production of the protein is required for reduction of hes1 mRNA. These facts together support the theory that Hes1 is an essential component of a two hour cycle clock. They showed that the same mechanism applies to hes1 mRNA oscillation in the presomitic mesoderm.
The Hirata data comprises scaled hes1 mRNA expression levels every 30 minutes over a 12 hour period; see the discrete points in Figure 1.
A system featuring negative feedback and described by only two ODEs (one for each species, in this case hes1 mRNA and Hes1 protein), can be shown never to generate sustained oscillations [14]. Hirata et al. proposed that the observed oscillations could be modelled by introducing a Hes1 interacting factor as a third molecular species, but there is no experimental evidence for such an interacting factor. Monk [12] showed that observed oscillatory behaviour can be explained by incorporating a time delay.

Model for Hes1 Feedback Loop
Monk [12] suggests that the observed oscillations are due to the non-instantaneous nature of transcriptional and translational delays, and proposes a mathematical model where a delay is introduced to account for this feature. Monk's model was able to explain, via numerical simulations, the oscillation of hes1 mRNA and Hes1 protein in cultured cells observed by Hirata et al. Letting m (t) and p (t) denote the concentration of hes1 mRNA and Hes1 protein at time t, respectively, the model proposed by Monk for the Hes1 feedback loop takes the form where μ m and μ p are the rates of degradation of mRNA and protein, respectively, p 0 is the normalised repression threshold, n is a hill coefficient and τ is a constant delay caused by transcription and translation. In this system, the Hes1 gene transcribes hes1 mRNA which passes from the nucleus to the cytoplasm. There it is translated into Hes1 protein. An unusual feature is that the Hes1 protein binds to the gene promoter and represses the transcription of hes1 mRNA. Following Monk, the equations have been rewritten so that the delay appears in the equation describing the regulation of transcription and not in the equation describing protein translation or synthesis.
Throughout the rest of this paper, we will refer to τ as a transcriptional delay. The equations have also been rescaled so that the dynamics of the model are determined by five parameters. Using an ad hoc parameter fitting procedure, Monk showed that this model reproduced the broad features of the oscillation of hes1 mRNA and Hes1 protein in cultured cells observed by Hirata et al. [13].
Thus Monk [12] argues that the observed oscillatory behaviour is best accounted for by the introduction of a delay parameter that acts as a proxy for the many subprocesses that make up transcription and translation, rather than by introducing an unknown third agent [13].
By performing mathematical analysis of the model (1)-(2), we will show that the convincing but ad hoc parameter fitting exercise in [12] can be extended to a fully Bayesian setting.

Mathematical Analysis
Verdugo and Rand [10] gave some mathematical analysis of Monk's model [12] for the Hes1 feedback loop. In particular, they derived closed form approximations for the amplitude and frequency of oscillation, where oscillatory behaviour is assumed to arise through Hopf bifurcation in the delay parameter. The analysis in [10] applies to the case where the decay rates of hes1 mRNA and Hes1 protein, key components of the feedback, are equal, that is, μ m = μ p in (1)- (2). In this work, we study the more realistic case where the decay rates are allowed to be different, also focussing on oscillatory behaviour. By allowing these degradation rates to differ, we expose some interesting features of the nonlinear system under consideration; namely that oscillations of this type only occur when the difference between degradation rates is small. The novelty of this work lies in using bifurcation tools to aid what turns out to be a very difficult inference problem. Details of the mathematical analysis can be found in the methods section. Our results are of interest in their own right as a means to understand how the system dynamics are driven by the model parameters, but our main aim here is to inform the corresponding Bayesian inference problem.

Experiments
In order to illustrate the use of mathematically informed priors when inferring model parameters we conducted two experiments. Experiment 1 uses synthetic data, computer generated from the mathematical equations describing the Monk model with known parameters. Experiment 2 looks at the published data of Hirata et al. Data is noisy and we have noted the error bars in the Hirata data when making assumptions about this type of extrinsic noise. Bayesian methods are very useful when dealing with this type of experimental noise because we can include noise as a hyper parameter, treat it as a nuisance parameter and integrate it out. Intrinsic noise will also come into play at the molecular level [15] and there is recent work on the subject of inferring parameters in stochastic models [16]. However, we do not believe that there is enough data to make it feasible to fit a stochastic model that would take into account intrinsic noise. We therefore use the ODEbased model in line with other authors [12] and [15]. Consequently, we are dealing with extrinsic and experimental noise. To make the synthetic data in Experiment 1 more realistic, Gaussian noise is added, based on the absolute size of the error bars of Hirata with a mean of zero and a standard deviation of 0.2. In our example, we have used larger initial conditions so the relative noise is lower. However, investigations with a higher level of noise to match the relative values from Hirata give broadly similar conclusions. The justification for adding Gaussian noise to make the data more realistic lies with the Central Limit Theorem, which shows that under appropriate conditions the overall effect of extrinsic noise sources will be normal [4]. An advantage of using synthetic data is that the parameters to be inferred are known and so the accuracy of the method to infer them can be evaluated. For Experiment 2, two of the parameters values, μ m and μ p , have been measured independently. So mathematically informed priors are complemented with biological priors where available.
In Experiment 2 (Hirata Data) there is data for both species m and p and so in Experiment 1 (Synthetic Data) we generate data for both species m and p as well. We will show that some aspects of the data for p inform the mathematical analysis, however the actual inference of parameters for both m and p is based on the data for m only. This is because protein expression levels are harder to measure accurately in practice so it is useful to establish what can be inferred from the more accurately measured species alone. With the Hirata data, there is also the added complication that the protein levels measured have been scaled in an undisclosed manner [15].
The location in time of the sampling points is clearly an important issue. For species that oscillate, it may be difficult to know in advance which time points will provide most information. In this work the real data is already available and hence our initial test was chosen to match those details. Generally, for oscillatory behaviour, we would expect that the key constraint on the sample times is that the sampling frequency should be sufficiently large.

Methodology
We have some data and a proposed model. We use a Bayesian framework to infer the parameters underlying the model. In a Bayesian framework, the posterior pdfs are given by the prior pdfs modified by the likelihood of the data given those parameters. More precisely the posterior pdfs are proportional to the product of the likelihood and the priors. The posterior pdf for a particular parameter depends on the values of all the other parameters and are shown for individual parameters by integrating over the other parameter values. Hence the posterior pdfs that we show are marginal distributions for each parameter and not distributions for fixed values of the remaining parameters. The resulting equations for the posterior pdfs are intractable so we use MCMC methods to sample from the posterior distributions. This section spells out further details of this methodology which differs slightly for Experiment 1 and Experiment 2.

Bayesian Inference
Bayesian methods have several advantages over other approaches to parameter estimation [17]: use of back-ground information, the ability to include uncertainity in all parameter values and the ease of making inferences about some parameters irrespective of the values of others. The development of high-speed computing means that the potential of Bayesian methods is being realised, but there are still many computational difficulties. In this section, we summarise the key features of Bayesian inference and highlight the technical decisions which have to be made. For more information about Bayesian analysis we refer readers to [6] and reviews such as [17].
Our initial state of knowledge (or ignorance) about the model parameters is encapsulated in the prior probability. The priors are then modified by the experimental measurements through the likelihood function to give us the posterior probability, which represents our new state of knowledge about the value of the parameters. A likelihood function needs to be chosen and after using a normal distribution for Experiment 1, we chose to base the likelihood function for Experiment 2 on the chi-squared distribution. The chi-squared distribution, χ 2 , has longer tails than the normal distribution and so encourages the inclusion of outliers in the data, which is important for the estimation of model parameters which influence in particular the amplitude of the oscillations. This proved to be an important consideration in Experiment 2, possibly because of experimental noise levels.
For Experiment 1, we therefore suppose that the probability of the kth datum having a value x k is given by N (f k , σ) where f k = f(θ, t k ) is the true value of the function of parameters θ = {p 0 , n, μ m , μ p , τ} of interest, and the variance, σ, accounts for the error in its measurement. In our case, f k is the mRNA level m (p 0 , n, μ m , μ p , τ) at time t k from the delay differential equation model (1)-(2).
For Experiment 2, we suppose that the probability of the kth datum having a value x k is given by χ 2 (f k , v) where the degree of freedom v, accounts for the error in its measurement. Given that the Hirata data has been scaled, it will also be necessary to infer an additional scaling parameter, which we refer to as k s , so θ = {p 0 , n, μ m , μ p , τ, k s }.
For both experiments, we assume that the value of σ and v is known. Our inference about the value of θ is expressed, therefore, by the posterior pdf p (θ | {x k }, I) where I denotes all other background information; to help us calculate it, we use Bayes' Theorem [6] The relation in (3) is expressed using proportionality, because the term p (data|I) has been omitted from the denominator in the right hand side. This is fine for data analysis problems, such as this one, involving parameter estimation, since the missing denominator is simply a nor-malisation constant not depending explicitly on the parameters. However if we were to consider model selection, this term would play a crucial role, and is thus given the special name of evidence.
As in [4], we assume that the data are independent, so that the measurement of one datum does not influence what we can infer about the outcome of another (when given the values of θ). The likelihood function is then given by the product of the probabilities of obtaining the N individual data: We assign a Gaussian pdf as prior for each parameter θ j , with mean μ j and variance σ j , where j = 1:5 in Experiment 1 and j = 1:6 in Experiment 2: According to (3), if we multiply the prior stated in (5) by the likelihood function (4), we obtain the posterior pdf. Our approach here will be to deal with the posterior pdf numerically, using MCMC.

MCMC Method
MCMC techniques are used to sample from distributions and more generally to solve optimisation and integration problems in large dimensional spaces (Andrieu et al. [8]). Here, we use MCMC to sample from the posterior pdf defined in (3). The approach is, essentially, to compute candidate values for our unknown parameters by constructing a Markov chain with an appropriate invariant measure. The candidate values are chosen randomly using a proposal function and are either accepted or rejected depending on an acceptance probability. The frequency at which a particular parameter is successfully sampled is indicative of its posterior probability. Once we have enough samples from the solution space, they can be binned, smoothed and normalised to indicate the relative posterior likelihood of a given parameter. Sampling can start from any particular parameter value. The proposal function is chosen so that samples are drawn widely and performance can be monitored by the acceptance rate. With Markov chains, after a burn in period, the memory of the initial starting point is lost. At this point sampling will be exclusively from the required distribution. We need to establish at which point this has occurred. Gelman et al. [18] propose setting off parallel sampling chains and monitoring their convergence with a statistic . We have followed this convention. More details can be found in the Methods section and Tables 1 (Experiment 1) and 2 (Experiment 2).

Mathematical Analysis
We summarise the main analytical results here. Further details about the derivations of the formulae can be found in the Methods section.
Following [10] we can derive an expression for the equilibrium state of m and p (m* and p*, respectively, see equations (21)-(20)) of model (1)- (2) and show that by increasing the delay the system moves from a stable state to an unstable state. At a critical value τ cr of the time delay a Hopf Bifurcation occurs. For values of delay τ close to τ cr , the nonlinear system is expected to exhibit a periodic solution which can be expressed for both the mRNA and protein in terms of their amplitudes, A and B, respectively, their oscillatory frequency ω (at τ cr ) and their phase difference ϕ. Algebraic manipulation of these expressions results in explicit formulae linking the model's biological parameters which generalise those in [10]. Specifically, at the Hopf bifurcation we have  The result (7) characterises the ratio between mRNA and protein amplitudes, but not their absolute values. To proceed we use Lindstedt's method which is a technique for uniformly approximating periodic solutions to ODEs when regular perturbation approaches fail. We regard the frequency as unknown in advance and solve by demanding that an appropriate series expansion contains no secular terms. The result is closed form approximate expressions for the amplitude and frequency of oscillation expressed in terms of the model parameters. For the derivation of these formulae and more details, see the Methods section. In summary, our approximation to A 2 and the observed frequency -may be written in the form Ω = ω -f Ω (n, p 0 , μ m , μ p ) Δ, ( 1 5 ) and substituting (14) into (7) gives an approximation for B 2 where f A and f Ω are positive functions defined in terms of the model parameters and Δ = ττ cr . As Δ increases we would expect A and B to increase proportionally to and Ω to decrease.
The frequency of oscillation, ω, takes a maximum that can be explicitly defined in terms of the model parameters. We may show that for any μ m μ p = c, a unique maximum value of ω occurs when μ m = μ p = . We may then find the value of c and hence μ m and μ p for which the maximum of ω occurs. We can use this information combined with equation (15) to deduce that ω > Ω. This result allows us to make an informed judgement about ω given Ω, which can be used in the setting of priors.
Biology meaningful solutions must also have ω > 0 and τ cr > 0. Imposing these conditions in equation (9) gives

Experiment 1 (synthetic data)
The synthetic data in Figure 3 comprises 49 mRNA values. Details of the parameter values used with the Monk equations and the added noise can be found in the caption of Figure 3. The corresponding 49 protein values were also generated and feature in Figure 4 where they are considered for the μ p prior. The protein values, as for Experiment 2, are not used in the inference problem. The following section describes how the mathematical analysis is used to find priors for the model parameters.
We can estimate μ p as from equation (70). In computing m ave and p ave , care needs to be taken to discard the transient data and to select data points from a complete number of cycles. In our example, as the oscillatory period appears to be about 2 hours, we take the last 20 data points which form approximately 5 cycles, see  . and (15). This gives p 0 ≈ 90, n ≈ 5 and μ p ≈ 0.026. We also require a prior for τ. Lindstedt's method tells us that the closeness in value of τ and τ cr depends on the difference between ω and the observed frequency Ω. Our observed period is about 2 hours giving an observed frequency of 0.052. Substituting our estimates into (9) and (15) we obtain τ cr ≈ 17.6567 and τ ≈ 19.9508. See Table 3 Figure 5 shows the prior pdfs and the corresponding posterior pdfs for each of the parameters inferred in Experiment 1. The parameter values used in the model to simulate the data are shown on each x-axis with a dot. The peak of each curve indicates which parameter values are most probable, and a sharper peak indicates more certainty. For all parameters in Experiment 1 the peak lies over the input value to within visual accuracy. We are also interested in the extent to which the posterior pdfs, which express the probability of each parameter given the data, modify our prior beliefs. Here we used Gaussian priors with mean based on the point estimates outlined above and small variance. The posterior pdfs for p 0 , n, μ m and μ p provide only slight modifications of the prior pdfs suggesting that the a priori mathematical analysis has allowed us to select these priors very effectively. The fact that these mathematically informed priors, which are based on broad features of the data, are reasonably close to the posterior pdfs suggests that the bifurcation analysis is relevant and consistent with the data. The parameter τ shows the most change from prior to posterior. We can show from (16) that the amplitude of mRNA is sensitive to τ, and this experiment (and Experiment 2 below) suggests that the likelihood function is also strongly affected by τ. In Figure   6 we rerun the same experiment using uniform priors over the point estimates. We see more modification of the posterior pdfs, relative to the prior pdfs but, as shown by the flatness of the pdf curves, there is increased uncertainity in the prediction of parameter values. We ran several more experiments with different priors (not shown here) and conclude that when we apply more general priors we run into trouble with this hard inference problem and get stuck in suboptimal posteriors. This highlights the importance of exploiting whatever information can be gleamed from the structure of the model.

Results
MCMC provides samples and in order to recover the posterior pdfs these samples are binned, smoothed and nor-  malised. We can also simulate the dynamic activity associated with each of these samples using the associated parameter values. The variance of this dynamic activity at each time point indicates the likely range of activity that could have produced the data. The shaded area in Figure  3 shows this range (5-95 percentile) for Experiment 1. The fit of this area in terms of position and width shows how well we have recovered the model parameters from the data.

Experiment 2 (biological data)
Hirata et al. measured the half lives of hes1 mRNA and Hes1 protein and estimated that hes1 mRNA is degraded with a half life of 24.1 ± 1.7 minutes and that Hes1 protein is degraded with a half life of 22.3 ± 3.1 minutes. These half lifes are close but not identical. Expressed as decay rates, the rate for hes1 mRNA is 0.0288 ± 0.002 and the rate for Hes1 protein is 0.0311 ± 0.004. We used these rates to give biologically informed priors for the parameters μ m and μ p . Using Lindstedt's method we obtained an expression for the observed oscillatory frequency Ω from equation (15). As k 2 Δ < 0 we have ω > Ω. In other words,  (8) gives K > 0.036. The value of K in equation (12) can be shown, via numerical computation, to be sensitive to the value of n, whereas the values of m* and p* are more sensitive to p 0 than n. Given m* (or p* as m* = μ p p* from equation (21)) and the estimate for K, we can obtain point estimates from equations (21) and (12) of n ≈ 5 and p 0 ≈ 100. We also require a prior for τ. Substituting our estimates for ω, μ m and μ p into equation (9) gives τ cr < 19.8 and we use this estimate of τ cr , which is close to τ, for choosing our prior for τ. We also need a scaling factor k s because the data has been scaled by a undisclosed constant, and this will also affect the initial condition. We set k s between 2 and 3, the initial condition of p to be 100 and m to be k s . See Table 4 for a summary of the priors. Figure 7 shows the prior pdfs and the corresponding posterior pdfs for each of the parameters inferred in Experiment 2. Once again, the customised priors are very close to the final pdfs. In this experiment, we chose a χ 2 distribution rather than a normal distribution for the likelihood, in order to encourage inclusion of any outlying data points.

Results
The shaded area in Figure 1 shows the fit of the parameter estimates within the 5-95 percentile for Experiment 2.
The early data points are covered by the prediction. Compared to Monk's results, there is broad agreement, however we predict a slightly longer delay (around 18.9 compared to Monk's 18.5) and slightly different values for μ m and μ p (around 0.0285 and 0.0305, respectively compared to Monk's 0.03 for both). We also inferred a different scaling factor which contributed significantly to the improved match to the Hirata et al. data that can be seen in Figure 1.

Conclusion
We have shown that mathematical analysis can give insights into the possible dynamical behaviour of gene expression models. We derived a range of analytical expressions and constraints that link the model parameters to the observed dynamics of the system. The basis of the analysis is the assumption that oscillatory behaviour arises through Hopf bifurcation in the delay parameter. It 2 120 p  is then possible to say for which range of parameters and to what extent this behaviour will occur. For the problem of inferring parameter values from data, we can use this analysis to inform the choice of priors. For these types of problems, where different parameter values can result in similar activity, identification of the parameters through the data can be ambiguous. MCMC methods can go some way towards resolving these difficulties but we showed here that mathematically informed priors can add value too. On the experimental data from Hirata et al., a full Bayesian MCMC computation improved on the parameter fit published by Monk, predicting non-equal values for mRNA and protein decay rates and a longer time delay for transcription.

Methods
We describe here in more detail the mathematical and computational aspects of this work. We generalise the approach of Verdugo and Rand to the more biologically realistic case where the degradation rates of mRNA and protein are not assumed to be equal. The analysis reveals that we would expect to see oscillations when the difference between μ m and μ p is small which justifies, generally, the assumption that the degradation rates are almost equal.

Stability of Equilibrium
Following the approach of Verdugo and Rand [10], equilibrium points for the system (1) and (2) are found by setting = 0 and = 0. After elimination and substitution, we obtain two expressions for p* and m*: To find out whether m* and p* are stable, we linearize about these points and define ζ and η to be deviations from the equilibrium: ζ (t) = m (t) -m*, η (t) = p (t) -p*, and η d = η (t -τ). This results in the linear system: where K and β are given in (12) and (13). Equations (22) and (23) can be written as the second order DDE: We now look for a solution of the form η = e λt , which leads to λ 2 + (μ m + μ p ) λ + μ m μ p + Ke -λτ = 0. (25) First, we focus on the non-delay case, τ = 0. Here we have Examination of equation (26) reveals that since μ m + μ p > 0 and , the real part of λ will always be negative, implying stability. For λ has a non-zero imaginary part as well as a negative real part and the equilibrium point (m*, p*) will be a stable spiral. When μ m = μ p , the inequality (27) becomes K > 0 which always holds. However when μ m ≠ μ p this inequality becomes 4K > (μ m -μ p ) 2 . In other words, we only have a stable spiral when |μ m -μ p | is sufficiently small. For larger values of |μ m -μ p | we will have two negative real roots for λ, resulting in a fixed stable point with no oscillations. We conclude that, overall, the non-delay model always has a linearly stable fixed point.
Following the approach in [10], we now assume, by continuity, that as τ increases the roots λ will cross the imaginary axis (this will not be at the origin because μ m > 0 and μ p > 0 and therefore the real and imaginary parts of λ cannot both be zero) at some critical value τ = τ cr , and for τ > τ cr the steady state (m*, p*) will lose stability giving rise to a Hopf bifurcation. We thus assume that for τ = τ cr the system (22)-(23) will exhibit a pair of pure imaginary eigenvalues ± ω i corresponding to the solution where A and B are the amplitudes of the η (t) and ζ (t) oscillations, and where ϕ is a phase angle. More generally, for values of delay τ close to τ cr , the nonlinear system is expected to relax to a periodic solution which can be written in the approximate form of equations (28) and (29).
Substituting equations (28) and (29) into equations (22) and (23) and matching time dependent terms results in the explicit formulae (6)- (11), which generalise those in [10].  Lindstedt's method Changing the first order system (22)-(23) into a second order DDE results in where the coefficients K, H 2 and H 3 are obtained by Taylor series expansion of the nonlinear term.
Assuming that the true solution is periodic, our goal is to find an analytical approximation that is valid for all t. The key idea is to regard the frequency ω as unknown in advance, and to solve for it by demanding that η contains no secular terms. We introduce a small parameter , and let Δ =ε 2 δ with η =εu, (31) τ = τ cr + Δ = τ cr +ε 2 δ.
We then have where u d = u (σ -Ωτ). Expanding Ω in a power series in , omitting the O () term for convenience, since it turns out to be zero, we have Ω = ω +ε 2 k 2 + ...
Substituting and collecting terms we find We take the solution of the u 0 and u 1 equations to have the form u 1 (σ) = m 1 sin 2σ + m 2 sin 2σ + m 3 , where A = from (29) and (31). Next we substitute (41) and (42)  (67)

Can the data tell us anything about the parameters directly?
Integrating equation (2) gives From the Fundamental Theorem of Calculus, it follows that If p is periodic and T is a multiple of the period then the left-hand side of this equation vanishes and the righthand side may be written m ave -μ p p ave . This gives us the result This allows us, from appropriate data sets, to infer the parameter μ p by averaging data from m and p. This, of course, presupposes that the period T can be estimated accurately and that there is enough data to form good approximations to the averages.
We remark that μ m cannot be approximated this way because of the nonlinearity in (1).

MCMC (Metropolis Hastings) Method
Our approach was to choose candidate values for our