 Research Article
 Open Access
 Published:
A global sensitivity analysis approach for morphogenesis models
BMC Systems Biologyvolume 9, Article number: 85 (2015)
Abstract
Background
Morphogenesis is a developmental process in which cells organize into shapes and patterns. Complex, nonlinear and multifactorial models with images as output are commonly used to study morphogenesis. It is difficult to understand the relation between the uncertainty in the input and the output of such ‘blackbox’ models, giving rise to the need for sensitivity analysis tools. In this paper, we introduce a workflow for a global sensitivity analysis approach to study the impact of single parameters and the interactions between them on the output of morphogenesis models.
Results
To demonstrate the workflow, we used a published, wellstudied model of vascular morphogenesis. The parameters of this cellular Potts model (CPM) represent cell properties and behaviors that drive the mechanisms of angiogenic sprouting. The global sensitivity analysis correctly identified the dominant parameters in the model, consistent with previous studies. Additionally, the analysis provided information on the relative impact of single parameters and of interactions between them. This is very relevant because interactions of parameters impede the experimental verification of the predicted effect of single parameters. The parameter interactions, although of low impact, provided also new insights in the mechanisms of in silico sprouting. Finally, the analysis indicated that the model could be reduced by one parameter.
Conclusions
We propose global sensitivity analysis as an alternative approach to study the mechanisms of morphogenesis. Comparison of the ranking of the impact of the model parameters to knowledge derived from experimental data and from manipulation experiments can help to falsify models and to find the operand mechanisms in morphogenesis. The workflow is applicable to all ‘blackbox’ models, including highthroughput in vitro models in which output measures are affected by a set of experimental perturbations.
Background
Morphogenesis, the organization of multiple cells into shapes and patterns, is a key process in biological development. Computational modeling is commonly used to study mechanistic hypotheses on morphogenesis [1–7] as they allow for simplification and isolation of the process. These computational studies typically involve multiscale, nonlinear and multifactorial models. So far, the behavior of these computational models is studied for one or occasionally two parameters at a time, which can lead to a wrong interpretation for nonlinear models. Studying all parameters collectively with global sensitivity analysis resolves this problem.
In this paper, we introduce a workflow that uses global sensitivity analysis to find the relevant single parameters and parameter interactions in ‘blackbox’ models of morphogenesis, which are strongly nonlinear and multifactorial. Sensitivity analyses of computational models enable us to identify the effects of uncertainties in parameter values on the model output. Local sensitivity analysis investigates the behavior of the model in a small region around the nominal parameter values and is most often used to study model robustness. Global sensitivity analysis (GSA) covers the entire input parameter space, or a specifically selected region hereof. In its most powerful form, it gives information on the impact of individual parameters and combinations thereof on a nonlinear model for arbitrary parameter distributions. This is what e.g. variancebased methods like FAST [8] and the Sobol’ method [9, 10] do. This, of course, is computationally expensive, therefore many methods have been proposed with simplifying assumptions like linearity of the model (MLR [11]); methods that produce less sophisticated results, e.g. partial or no information on interactions (Morris method [12]); [13, 14] are less robust like DGSM ([15, 16]); or that use prior knowledge of the model, like Bayesian DGSM [17]). In this paper we use the Sobol’ method [10], where we have modified the original method for efficiency reasons (for more details see Section Global sensitivity analysis). Moreover, we introduce an approach to determine the sampling size a priori with an a posteriori error check. Thus, it is not likely that the proposed GSA will excel in computational efficiency, but it will excel in predictability of the costs and reliability of the results. In the biological field GSA is mostly applied to models consisting of ordinary differential equations, e.g., in pharmacology [18, 19], neurodynamics [20], or gene expression [21] and biochemical pathways [17] in cells.
GSA can give interesting new insights into models of morphogenesis. Firstly, GSA predicts which parameters can best be tuned to affect the model output. When the model parameters can be associated with biological cell properties, extracellular matrix properties, or gene expression, knowledge of their influence on morphogenesis can give predictions for in vitro perturbation experiments, e.g. genetic knockouts. Secondly, apart from identifying the impact of single parameters, GSA notably identifies parameter interactions. These can give new mechanistic insights in the functioning of the model. Thirdly, GSA is a tool to reduce the number of parameters in the model. When the analysis indicates that parameter variation does not impact the model output, the parameter value can be fixed. Fourthly, GSA can be used to make a selection of models that support biologically plausible hypotheses in a set of contradicting mechanistic hypotheses.
As a case study, we performed GSA on a previously published [22], wellstudied computational model of vascular morphogenesis. In the model, a spheroid of cells develops into a vascular network. Cells secrete a compound to which cells chemotact by migrating towards higher concentrations of the compound. Vascular networks form when chemotaxis is inhibited at cellcell interfaces. Vascular endothelial growth factor (VEGF) is a candidate for the secreted compound and extensions of pseudopods in the direction of cellcell contacts might be locally inhibited by interference of vascular endothelial cadherins with VEGF receptor 2 signaling. Because of the key role of such ‘contactinhibited chemotaxis’ in this model, we will henceforth refer to it as the ‘contact inhibition model’. Numerous alternative hypotheses for vascular morphogenesis have been proposed [2, 22–29], and it is unsure which of these  if any  is correct. Thus the contact inhibition model is here used as an example model for morphogenesis, while the proposed GSA approach can assist in falsifying mechanisms in the future.
Figure 1 shows the workflow of the GSA analysis proposed in this paper. The input (Fig. 1 a), a list of parameter sets, is fed into the cellular Pottsbased contact inhibition model (Fig. 1 b). This model generates images (Fig. 1 c) of the resulting cell configuration as raw output, ranging from spheroids, to networks, to dispersed cells. Subsequently, two quantitative output measures (compactness and lacuna count) are derived from these images (Fig. 1 d). Two types of GSA are performed on the output measures (Fig. 1 e). Firstly, intensity plots show the impact of parameter combinations on the variation in the output measures (Fig. 1 f). This analysis only allows for a twodimensional GSA, in which the value of two parameters are varied simultaneously while keeping all other parameter values fixed. Secondly, a truly multivariate GSA ranks the impact of individual parameters and of parameter combinations on the variance of the output measures (Fig. 1 g). Important aspects we address in this paper are the reliability and the pitfalls of GSA.
Methods
Vascular morphogenesis model
The contact inhibition model [22] is based on the Cellular Potts Model (CPM) [30, 31]. Cells are projected on a regular square lattice (\(\Lambda \subset \mathbb {Z}^{2}\)) as patches of connected lattice sites, \(\vec {x}\). Each lattice site, \(\vec {x}\in \Lambda \), that is part of a cell is marked with that cell’s identifier (\(\sigma (\vec {x})\)) and cellfree lattice sites represent extracellular matrix (ECM) with σ=0. Each cell identifier is associated with a type: τ(σ)∈{ECM,cell,border}. Cells have cell properties and behaviors, such as adhesion, cell size, or chemotaxis. The forces resulting from the biophysical properties of the cells and their active behavior are represented in the Hamiltonian (H) of the system
in which adhesion (J) is restricted to the cell membrane by the Kronecker delta (δ(x,y)={1,x=y;0,x≠y}) and \((\vec {x},\vec {x}\prime)\) represents the set of adjacent lattice site pairs. There are three nonzero types of adhesion: J _{cell,cell}, J _{cell,border} and J _{cell,ECM}. J _{cell,cell} represents the adhesion strength between cells, and J _{cell,ECM} the adhesion strength of cells to the ECM. The lattice is surrounded with a border by which cells are repulsed, by setting J _{cell,border}=100. The second term constrains the volume of cells, with A representing the preferred size of a cell and λ _{ A } the rigidity of the cell. Deviation of the actual size (a) of cells from their preferred value increases the Hamiltonian.
A cell moves by copying the state (σ) of a randomly selected lattice site \(\vec {x}\) into a randomly selected adjacent lattice site \(\vec {x}\prime \). In this manuscript, we use the eight, second order neighbors. These copies represent extensions and retractions of pseudopods at the cell membrane. A copy attempt that diminishes the Hamiltonian represents a move along a force and is always accepted. If a copy increases the Hamiltonian, it will only occur due to active movements of the cell membrane. We assume that such active motions are distributed according to a Boltzmann probability function: \(P_{\text {Boltzmann}}(H)=e^{\frac {\Delta H}{\mu }}\), with μ a measure of the amplitude of random membrane fluctuations; μ is a scaling factor of the weights (λ) of the constraints and we fixed it at μ=50 conform the settings in our previous work [22]. The parameter values can only be qualitatively coupled to biological data.
We assume that cells secrete a chemoattractant at rate α (s ^{−1}), producing a concentration field \(c(\vec {x})\). The chemoattractant diffuses with a diffusion coefficient D (m ^{2}/s) and decays with rate ε (s ^{−1}) in the ECM, resulting in the following partial differential equation (PDE):
such that secretion is located at the cells, where \(\delta (\sigma (\vec {x}),0)=0\), and decay in the ECM. The field of the chemoattractant is initialized as \(c(\vec {x})=0\) and fixed boundary conditions are imposed. Cells can respond to this chemoattractant by migrating towards higher concentrations (chemotaxis). To this end, the change in the Hamiltonian by that copy, ΔH, is augmented with \(\Delta H_{\text {chemotaxis}}=\lambda _{c}(c(\vec {x})c(\vec {x}\prime))\) [32], and contactinhibition is implemented by setting λ _{ c }=0 at cellcell interfaces such that chemotaxis only occurs at the cellECM interface.
During one time step, referred to as a Monte Carlo Step (MCS), as many copy attempts are performed as there are sites in the lattice. The PDE for chemoattractant diffusion and degradation Eq. (2) is discretized on the CPM lattice and we solve it numerically using a finitedifference scheme. We use 15 diffusion steps per MCS. The model is initialized with a centralized spheroid of 256 cells within a lattice of 400 ∗ 400 sites (lattice spacing 2 μm). We run the model for 5000 MCS, each representing 30 seconds, as networks are well formed in this time in the model as well as in vitro [25].
Global sensitivity analysis
The variation in a solution or a measure thereof, like compactness, over the complete parameter space can only be visually inspected by looking at one or at most two parameters at a time while keeping the others fixed (cf. Figs. 2, 4, and 5). Since different measures will produce different multivariate output distributions and therefore also might result in a different outcome of the GSA, it is important to choose a measure or, more likely a number of measures that are significant for the study at hand. If one wants to take the influence of all parameters simultaneously into account some form of a global measure of the multivariate output distribution is required. One such a measure is the variance of the distribution, which will be used in this paper.
We are specifically interested whether parameter interactions have a large impact on the output of this specific CPMbased model. Interactions of the parameters are unpredictable in nonlinear models such as the CPM, but their impact is significant, since a large combined effect of parameters on the output impedes the experimental testing of a predicted effect of a single parameter.
Sobol’ [9, 10] introduced socalled global sensitivity indices that describe the impact of specific parameters or combinations thereof on the uncertainty in the model output and more in particular on the variance of the output distribution, hence the term ‘variancebased’ GSA. In the original method the necessary integrals are computed with Monte Carlo. However, the Sobol’ indices can be computed very efficiently when the distribution of the output measure or response surface is expanded into a series of orthogonal polynomials, the Polynomial Chaos Expansion (PCE) [33, 34]. An overview of this method, using the same notation as in the following, can be found in [35]; here we summarize only the most important definitions for the case that the stochastic input consists of independently uniformly distributed random variables. Note, however, that the method can also be applied for arbitrary, even nonparametric, distributions, allowing for datadriven GSA (see also [36]). The strength of the method described below is that it (i) can efficiently study multiple output measures derived from the output images, (ii) can robustly identify parameter interactions, and (iii) checks the reliability of the result.
Let ξ be the ndimensional vector of the independently uniformly distributed input parameters and ϱ(ξ) its joint probability density function (pdf). The output measure u(ξ), e.g., of the (blackbox) Cellular Potts Model, is expanded into a truncated series of polynomials that are orthogonal with respect to the pdf ϱ, separating the output into a deterministic and a stochastic part
where the nvariate polynomials Φ _{ i }(ξ) are products of n univariate Legendre polynomials. The number of expansion terms N is given by \(N+1 = \frac {(n+\hat {p})!}{n!\hat {p}!}\), with n the number of parameters and the approximation order \(\hat {p}\) the highest order of Φ _{ i }.
To compute the expansion coefficients u _{ i } of Eq. (3) we apply Spectral Projection which has the advantage that it can be used for lackbox models since it projects the solution  and not the model  onto the polynomial space
where Ξ is the support of the joint pdf ϱ(ξ). As the parameter inputs are independent, both Φ _{ i } and ϱ can be written in product form; for the multivariate polynomial Φ _{ i } this results in a product of univariate polynomials
The integrals in Eq. (4) can then be computed by a repeated onedimensional GaussLegendre quadrature rule
with N _{ q } the number of quadrature points and w the associated weights. Note, that for integrals with a known weight function, like e.g. a pdf, Gauss quadrature has the optimal convergence order of 2N _{ q }−1 for N _{ q } quadrature points, where the points and the weights of the quadrature rule are dependent on the weight function.
How to choose N _{ q } and \(\hat {p}\) to obtain reliable Sobol’ indices will be the subject of Section Reliable GSA in practice.
Statistics and polynomial chaos expansion
Using a PC expansion, the only input needed to compute the moments and the Sobol’ indices of the output distribution are the expansion coefficients. E.g., the mean μ=u _{0} and the variance is given by
Note, that the approximation, Var_{PCE}, is a monotonously increasing function of N and thus of \(\hat {p}\). The sum in the variance formula can be directly split into contributions from the various parameters or combinations thereof, the Sobol’ indices (cf. [37]). E.g., for the firstorder Sobol’ index for parameter j only terms contribute if Φ _{ i }(xi) equals a univariate polynomial in ξ _{ j }
where bool(i,j) = (index(i,j) > 0 ∧ index(i,k)=0,∀k≠j). For a combined influence of more than one parameter like S _{13} the Sobol’ index can be computed analogously. The sum of all Sobol’ indices equals one.
Reliable GSA in practice
At first sight the accuracy of the PCE approximation of the response surface  and thus of the statistics  seems to be determined by the number of expansion terms, N, in Eq. (3). But the accuracy of the expansion coefficients u _{ i } also plays an important role. This accuracy is determined by the approximation Eq. (6) of the integral in Eq. (4), which is determined by the number of quadrature points, N _{ q }. Moreover, the higher PCE order \(\hat {p}\) needed to obtain sufficient accuracy, the higher the polynomial order of Φ _{ i }(ξ) becomes, which increases the complexity of the integrand. If one computes the integral of a high order polynomial with a small amount of points, the resulting expansion coefficients are merely noise instead of being informative.
The question we want to answer in this section is how to determine the number of quadrature points, N _{ q }, and the expansion order, \(\hat {p}\), to obtain a sufficiently high accuracy for the coefficients to allow us to trust the Sobol’ indices and more specifically the ranking of the parameters that follows from it. Here, we sketch a method to determine the number of quadrature points. It relies on the fact that the Sobol’ indices are variancebased, i.e., one can not expect to compute Sobol’ indices accurately from a PCE expansion for which the variance Eq. (7) is not a sufficiently accurate approximation of the true value or at least comparable to the GaussLegendre quadrature approximation of the integral. So, let us define
with
For a given choice of N _{ q } one can easily compute PC expansions for various orders \(\hat {p}\). If err _{Var} is small and the required Sobol’ indices have converged, the result can be trusted.
We illustrate this approach with a function for which the values of the statistics are analytically known, viz., the Ishigami function [38, 39]
with \(\xi _{i}\sim \mathcal {U}[\pi,\pi ]\), i={1,2,3}, and a=7 and b=0.1. We compute the PCE approximation of Eq. (10) for an increasing number of PCE terms and an increasing number of quadrature points. Table 1 illustrates the result of using not enough quadrature points (N _{ q }=2 and N _{ q }=5): there is no convergence in the statistics of the PCE approximation and for \(\hat {p}=3\) and 6, respectively, the noise takes over and the results are meaningless. Table 2 shows that, using sufficient quadrature points, for an increasing number of expansion terms the PCE variance converges to the data variance. If both variances are alike also the Sobol’ indices have converged to the true values (bold lines). Still the number of expansion terms should not be taken too large as can be seen for \(\hat {p}>9\) and \(\hat {p}>13\) where again the noise gradually takes over.
Finally, we also used the Saltelli method [40]  an improvement of the original Sobol’ method  to compute the Sobol’ indices for this problem. To reach a similar accuracy approximately 70–80 times as many sampling points are required, thus showing the gain in efficiency using the PCEGauss method to compute the Sobol’ indices.
Software and computational dataset
All software used in this paper is publicly available. The contact inhibition model resides at http://sourceforge.net/projects/tst/. For GSA we provide a repository containing the computational dataset and the analysis software at http://persistentidentifier.org/?identifier=urn:nbn:nl:ui:1823590.
Results
As a case study for the global sensitivity analysis (GSA) approach, we used a wellstudied computational model of vascular morphogenesis: the contact inhibition model [22]. We studied what single parameters and parameter interactions are important in the development of a spheroid of cells into vascular networks. For this purpose, we used the procedure outlined in Fig. 1: 1) select output measures, 2) select input parameters, 3) select a relevant subset of the global parameter space, 4) analyze the raw output, 5) perform GSA.
Selection of output measures
The contact inhibition model [22] produces images of cell configurations as raw output. We chose two measures to quantify the raw output: compactness and lacuna count. Compactness of the network is a suitable measure of network development [22] and is defined as the ratio A _{cells}/A _{hull}, with A _{cells} the number of lattice sites occupied by cells within a convex hull around all cells and A _{hull} the total number of lattice sites within the convex hull. A solid spheroid and a confluent monolayer of cells have a compactness close to one, while networks that contain lacunae have low values for compactness. Lacuna count is the number of lacunae in a network. Lacunae are defined as patches of medium (connected components of σ=0) enclosed by cells and are only counted when they have at least the size of a cell (50 lattice sites ≈200 μm ^{2}).
Selection of input parameters
The contact inhibition model [22] is a stochastic, multifactorial model. We refer to Section Methods for a detailed description of the model. The contact inhibition model has nine parameters: the number of cells (N), the target size of a cell (A), the rigidity of the cell (λ _{ A }), cellcell adhesion (J _{cell,cell}), adhesion between cells and the extracellular matrix (J _{cell,ECM}), the secretion rate of a chemoattractant by cells (α), a diffusion coefficient of the chemoattractant (D), the decay rate of the chemoattractant (ε), and a sensitivity of cells to the chemoattractant at cellmatrix interfaces (λ _{ c }).
In total, there are four model components or mechanisms in the contact inhibition model, namely cell size, adhesion, contactinhibited chemotaxis and the gradient of the chemoattractant. In order to study the impact of each mechanism in the model extensively, we selected one parameter for each, ensuring that it is computationally feasible to generate enough data points for reliable GSA results. We thus selected four parameters: the cell rigidity (λ _{ A }), cellcell adhesion (J _{cell,cell}), the diffusion coefficient of the chemoattractant (D), and a sensitivity of cells to the chemoattractant at cellmatrix interfaces (λ _{ c }). The other parameters that regulate cell size (A), adhesion (J _{cell,ECM}), or the gradient of the chemoattractant (ε and α) will be fixed at the reference values corresponding to the values in [22]. We kept the number of cells (N) in the spheroid constant, because we know from experience that it does not influence sprouting of spheroids in our model.
A GSA with four parameters can give new insights as four parameters are too many to obtain the relative impact of the parameters and their interactions with visual plots or to know their effect solely by logic or intuition, while the number of simulations required for a GSA with four input parameters is computationally very feasible. A GSA with all parameters of the model is not expected to give additional information on the relative balance of the mechanisms and would be very timeconsuming for a computationally intense model like the contact inhibition model. It would require roughly 10^{9} simulations (c.f., Section Reliable GSA in practice) to obtain reliable GSA results with all model parameters.
Selection of a relevant subset of the global parameter space
To select the parameter ranges for which spheroids of cells develop into networks, we studied onedimensional parameter sweeps of the four selected input parameters for the compactness and lacuna count (Fig. 2). The red lines in Fig. 2 represent the compactness and the blue lines the lacuna count. We selected the region in which the morphology of the network, and thus the value of the output measures, is changing and where no model artefacts arise. It is well studied for which parameter ranges artefacts arise in the CPM [6], such as lattice anisotropy and ‘frozen’ motility of cells. The regions shaded in gray indicate the deleted regions from the parameter space. For λ _{ A } the region 0 to 5 is deleted: cells cannot retain their volume here and disappear. This is a model artefact and does not represent a biological plausible situation. The region λ _{ A }>300 is deleted, because cells are so rigid here that they hardly move. In the region λ _{ c }<10 only spheroids form and for λ _{ c }>3000 similar networks are always formed, thus these regions are deleted because the network morphology does not change. The parameters and their selected value ranges are listed in Table 3.
Based on the reliability study for the Ishigami test model (see Section Reliable GSA in practice), we expected that we required 10000 data points to perform a reliable GSA on our model. The points were chosen according to the GaussLegendre quadrature rule (see Section Global sensitivity analysis), resulting in ten values for each parameter. To correct for the stochasticity of the contact inhibition model, each parameter set is replicated twenty times with a different random seed and the output is averaged over them. The size of the standard deviations in Fig. 2 indicate that the variation over different random seeds is very small for compactness, whereas the stochasticity in the model has a larger affect on the lacuna count. Nevertheless, this lacuna count is a reasonable measure for the network morphology.
Analysis of the raw output
The raw output of a model simulation is an image of the cell configuration at the end of a simulation. Figure 3 gives an overview of the raw output for the selected parameter space. Examples of possible morphologies are shown in Fig. 3, ranging from spheroids to small networks with one lacuna, to finemazed networks with many lacunae. This is a visual reassurance that the input parameter space is well chosen. However, it is very difficult to predict from the raw output which parameters have a strong impact on the development of networks from spheroids. A GSA can give us insights into this, as we will show in the next section.
GSA of network development from spheroids
We performed two types of GSA on the distribution of the output measures, compactness and lacuna count, to study the impact of the parameters on vascular network development. The first type of GSA studies the variation of the output measures and the second type studies the decomposition of the variance of the distribution of the output measures.
GSA of the variation of the output measures
The variation in the output measures can be visualized by plotting the intensity of the output measures over twodimensional slices of the parameter space. Figures 4 and 5 show the intensity plots of the lacuna count and compactness, respectively, for each possible pairing of parameters. The parameter values are selected according to the GaussLegendre quadrature rule.
Figure 4 shows that the diffusion coefficient is the main source of variation for the lacuna count: the lacuna count is high for low values of D and low for high values of D, independent of the other parameters. That the lacuna count does not vary significantly over the entire perpendicular axis indicates that the parameter of the perpendicular axis does not have much impact. The dominance of the diffusion coefficient masks the impact of the other parameters. To reveal the impact of the other parameters, Additional file 1: Figure S1 caps the intensity values at a lacuna count of five.
Figure 5 shows a high variation of the compactness in each plot. As a consequence, it is difficult to determine which parameters have a dominant impact on compactness. Interactions between parameters are difficult to deduce from these twodimensional intensity plots. A variancebased GSA is well suited to derive parameter interactions and the ranking of individual parameter effects, as will be outlined in the following subsection.
Variancebased GSA of the output measures
To study the impact of single parameters and of parameter combinations on the development of networks from spheroids, we performed a GSA of the output distribution of compactness and lacunae count using the Sobol’ indices. We refer to Section Global sensitivity analysis for a detailed description of how to obtain the Sobol’ indices that represent the impact of the parameters. The GSA results of both measures are reliable, since the Sobol’ indices have converged for values of \(\hat {p}\) for which err _{Var} Eq. (9) is small (see Additional file 2: Table S1 and Additional file 3: Table S2).
The second column of Table 4 lists the impact of the individual parameters and their combinations on compactness. The sensitivity for the chemoattractant at cellmatrix interfaces (λ _{ c }) has the highest impact on network development (S(λ _{ c }) =0.3188), followed by the diffusion coefficient with S(D) =0.2969, and cellcell adhesion with S(J _{cell,cell}) =0.2048. Elasticity of cells has a low impact of (S(λ _{ A }) =0.0266). Seventeen percent of the variance is caused by interactions of parameters. λ _{ c } and J _{cell,cell} have a combined impact of 0.0559. The impact of all other interactions was lower than S(λ _{ A }), which we will consider as a threshold for relevant impact.
The third column of Table 4 lists the impact of the individual parameters and their combinations on lacuna count. The individual impact of the diffusion coefficient is dominant, with S(D) =0.7130. Cell adhesion also has a small individual impact (S(J _{cell,cell}) =0.0407). In total, twenty four percent of the variance is induced by combinations of parameters. There are five parameter combinations, which all include the diffusion coefficient, with a higher impact than the threshold: S(λ _{ c },D) =0.0570, S(D,λ _{ A }) =0.0347, S(D,J _{cell,cell}) =0.0521, S(λ _{ c },D,J _{cell,cell}) =0.0476, and S(λ _{ c },D,λ _{ A }) =0.0315. The total impact of the diffusion coefficient is 90 percent. When we focus on low values of the lacuna count, by capping the lacuna count at a maximum of five lacunae, the dominance of the diffusion coefficient is slightly reduced and an extra interaction of λ _{ c } and J _{cell,cell} is found (λ _{ c },J _{cell,cell}) =0.0409) (Additional file 4: Table S3).
Interpretation of the GSA results
The GSA results show that three parameters account for over 80 percent of the variance of the compactness distribution. Consistent with previous studies of the contact inhibition model [22], these three parameters are the diffusion coefficient, sensitivity to the chemoattractant at cellmatrix interfaces and cellcell adhesion. For the lacuna count, the GSA identified solely the diffusion coefficient as the dominant parameter. This dominant effect is apparent in a collage of output images (Fig. 6): the number of lacunae varies over the horizontal axis that represents the diffusion coefficient D, whereas there is little variation along the vertical axis that represents the sensitivity to the chemoattractant at cellmatrix interfaces λ _{ c }. The number of lacunae is the largest for small values of the diffusion coefficient (around D=4.3∗10^{−14} m ^{2}/s), whereas no lacunae are formed for large values of the diffusion coefficient. A similar trend is seen when the diffusion coefficient is plotted against cellcell adhesion or cell rigidity (not shown). The distance over which adjacent branches can attract one another is given by the length of the chemoattractant gradient Eq. (2), which is characterized by the diffusion length, \(L_{D}=\sqrt {\epsilon /D}\), the distance over which the secreted chemoattractant drops to 1/e of the concentration at the cells (see, e.g., the discussion in Ref. [22]). If L _{ D } becomes shorter, branches that would fuse for larger values of L _{ D } will not fuse. Hence the pattern will be more finegrained. Also a shorter value of L _{ D } will create sharper gradients and hence increase the inward chemotactic force (as ΔH=λ _{ c }∗gradient) hence “squeezing” the branches more and making them thinner.
In conclusion, the GSA is able to identify the dominant single parameters for compactness and lacuna count. In addition, it gives new information on the relative ranking of the impact of these single parameters.
In contrast to the onedimensional parameter studies performed in [22], GSA provides information on interactions of parameters. Combinations of parameters account for 17 % of the variance in the compactness distribution, and for 24 % of the lacuna count distribution. This indicates that most parameters impact the model output independently. Interestingly, the parameter combination of λ _{ c } and J _{cell,cell} impacted the lacuna count (as seen in Additional file 1: Figure S1, which is capped at a maximum of five lacunae) as well as compactness. How can we explain this interaction? Sprout formation requires a balance between λ _{ c }dependent chemotaxis, creating an inward force perpendicular to the sprout surface, and J _{cell,cell}dependent cellcell adhesion, which is responsible for the surface tension of individual cells. In the limit of zerosurface tension, the cells would behave as a zeroviscosity fluid and the chemotaxis would compress sprouts until they become infinitely thin [41]. The cellular surface tensions resist such compression, thus determining the thickness of sprouts. Altogether, this parameter interdependence highlights a new insight in the mechanisms driving sprouting in our model.
Discussion
Biological morphogenesis is a highly complicated process, involving genetic regulation, pattern formation, the biophysics of collective cell migration, mechanical cellcell interactions, and so forth. As such multiscale mechanisms are practically impossible to understand intuitively, in recent years modeling and simulation has become a key tool in developmental biology (see, e.g., refs. [42–45]). These efforts have led to highly complicated models, where traditional analysis tools in dynamical systems theory, such as bifurcation analysis and phase plane analysis, fall short. The models must then be treated as ‘blackbox’ systems: one or twodimensional parameter sweeps are performed, creating images and movies as output, which can be used to obtain various quantitative output measures. These parameter sweeps must be started from one or a few nominal parameter sets around which ndimensional crossshaped sweeps through the parameter space are performed. However insightful such studies are, a danger is that the impact of some parameters is overlooked: the conclusions may depend on what sets of nominal parameter values were selected. Using a simple, published simulation model of vascular morphogenesis, we have shown in this work how a multivariate GSA helps to get more insight in the relative impact of single parameters and of their interactions. We introduced a workflow for GSA of ‘blackbox’ models of morphogenesis.
We applied the workflow to a vascular morphogenesis model which we refer to as the ‘contact inhibition model’. The output of the contact inhibition model consists of images of the cell configuration in a simulation. To quantify network development, we measured the compactness and the lacuna count of the cell configuration at the end of the simulation. A GSA with four input parameter distributions, that each described one of the four general model components, was performed for both measures. The GSA results of compactness and lacuna count both indicated that variation of the rigidity of the cells (λ _{ A }) has very little impact on the model output. As a result, the model can be reduced by fixing this parameter. For compactness, the sensitivity for the chemoattractant at cellmatrix interfaces (λ _{ c }) has the highest impact on network development, followed by the diffusion coefficient (D) and cellcell adhesion (J _{cell,cell}). In contrast, the GSA showed that the diffusion coefficient alone is dominant for lacuna count. The results for both measures are in line with what has been previously reported [22]. New information from the GSA results is the relative impact of the single parameters. In addition, GSA identified interactions between parameters, which have led to new insights in the mechanism of sprouting in the model. Most notably, the parameter interactions in this specific CPMbased model have very low impact. Since GSA has not been performed for CPMbased models before, it is an important new insight for the CPM community that the most basic mechanisms of the CPM, such as cell size and adhesion here function independently.
Besides the contact inhibition model, there are multiple other computational models of vascular network development [23–28]. These models often share common mechanisms that drive sprouting, but differ by one or a few mechanisms. It is still not known which mechanisms drive sprouting in vivo, or whether a different set of mechanisms is used under different conditions. We propose GSA as an approach to help falsify these models. Firstly, the ranking of the relevance of the mechanisms in the models can be compared with knowledge of the impact of these mechanisms from experimental data to falsify models. A second model falsification method is the validation of the experimental predictions of each model based on the GSA results.
The workflow is designed to take into account some pitfalls of GSA. These arise from the dependency of the outcome on the choices one makes for the output measure, input parameters and their distributions. Different output measures can give different results, as was the case for compactness and lacuna count. This indicates that it is essential to consider carefully whether the selected measure truly describes your goal and if there are other measures for it. A selection of input parameters might be necessary when it is not computationally feasible or methodologically desirable to use all parameters of the model. The importance of the selection of the correct parameter distributions has also been discussed elsewhere [17]. Intuitively, a large range for the parameter values will allow for the largest variation in the output and thus the most interesting result. However, since the analysis is global over the entire parameter space, local though important features might become unnoticed if the distribution is too widespread. For instance, for the contact inhibition model we were interested in the region where the networks developed and where the measures were changing accordingly, and variation in these regions could become unnoticed if we included large regions where for instance spheroids do not sprout. Ideally, the parameter distribution comes from experimental measurements, but in absence hereof we propose to study the variation of the output measures for each parameter individually.
It is crucial to have an estimate of the accuracy of the sensitivity results. One option is to compare the results with the outcome of an analysis with a higher accuracy computed with more quadrature points and a higher PCE order, like advocated in [17]. In this paper we proposed a simpler and cheaper rule: given the number of quadrature points the Sobol’ indices should show convergence for those values of \(\hat {p}\) for which the variance computed with the GaussLegendre quadrature rule is more or less equal to the variance computed from the PCE approximation. If a higher PCE order is required, more model simulations are needed. Since the computation of the PCE statistics is ‘for free’ compared to model simulations this is an efficient way of judging whether the accuracy of the statistics is sufficient for one’s aim. Although Gauss quadrature is optimal, it has the disadvantage that it is not a nested quadrature rule, i.e., if more quadrature points are required, the old model results cannot be reused. An alternative for Gauss quadrature is Monte Carlo (MC) integration. Sampling the PCE integrals by MC is less optimal, so more simulations are needed to obtain reliable GSA results. For the Ishigami test model, MC needs a 100 times more simulations to get comparable results. The benefit of MC is that you can check ‘on the fly’ if there are enough data points generated to get reliable results. Adding simulations on the fly is particular useful when the estimated number of simulations based on the Gauss quadrature rule is computationally unfeasible, but one expects or hopes that the output distribution is relatively smooth and thus can be described by a low order PCE approximation.
Some studies require GSA of a subspace of the output distribution. For instance in our case study, to study not the conditions for network formation per se, but the details of the network morphology (e.g. branch length, branch thickness, and so forth), we must preselect a region of the parameter space where networks actually form. Unfortunately, such a subspace would no longer guarantee that the input distribution is independently random uniform. For such cases, a more complicated method to compute the Sobol’ indices [35] is required.
Besides in computational models, the impact of biological factors on morphogenesis is also studied in vitro. Highthroughput imagebased screenings systematically analyze the impact of genes or potential drugs on cell behavior, such as cell migration [46]. This ‘systems microscopy’ approach is well suited for parallel screening of cellular responses to numerous experimental perturbations [47]. Such highthroughput screens can be performed for the genes, growth factors or ECM concentrations affecting morphogenesis. This is conceptually very similar to parameter studies of in silico ‘blackbox’ models. The perturbed biological factors represent the input parameters and the output is an image from which quantitative data can be derived. Therefore, the GSA workflow proposed in this paper is directly applicable to highthroughput in vitro studies.
Conclusions
Morphogenesis is a complex biological process in which cells organize into shapes and patterns. Computational modeling is used to get insights in the mechanisms of morphogenesis. These models are often multiscale, nonlinear and multifactorial, making it difficult to relate their input to their output. The behavior of such ‘blackbox’ models is mostly studied by visual inspection and analyses of the individual output (e.g. images and movies) and with one or twodimensional parameter sweeps of output measures. However, this does not provide insight in the relative impact of single parameters and of their interactions on the outcome of the model. GSA fulfills this task. GSA results can give insights in the dynamics of the model and help to generate experimental predictions to manipulate morphogenesis. In this paper, we introduced a workflow for GSA of such models and addressed pitfalls and reliability of the analysis. The workflow is applied to the contact inhibition model, a cellbased model of vascular morphogenesis. GSA was able to correctly identify dominant parameters and gave new insights on the magnitude and ranking of their individual impact and importantly, on their interactions. In summary, we propose GSA of ‘blackbox’ models, such as complex computational models or highthroughput in vitro models, as an alternative approach to get insights in the mechanisms of morphogenesis.
Abbreviations
 CPM:

Cellular Potts model
 ECM:

Extracellular matrix
 GSA:

Global sensitivity analysis
 MC:

Monte Carlo
 MCS:

Monte Carlo step
 PCE:

Polynomial chaos expansion
 PDE:

Partial differential equation
 PDF:

Probability density function
 VEGF:

Vascular endothelial growth factor
References
 1
Tanaka S. Simulation Frameworks for Morphogenetic Problems. Computation. 2015; 3:197–221. doi:10.3390/computation3020197.
 2
Merks RMH, Koolwijk P. Modeling Morphogenesis in silico and in vitro: Towards Quantitative, Predictive, Cellbased Modeling. Math Model Nat Phenom. 2009; 4(4):149–71. doi:10.1051/mmnp/20094406.
 3
Iber D, Tanaka S, Fried P, Germann P, Menshykau D. Simulating tissue morphogenesis and signaling In: Nelson CM, editor. Tissue Morphogenesis. Methods in Molecular Biology, vol. 1189. New York: Springer: 2015. p. 323–38, doi:10.1007/9781493911646_21.
 4
Iber D, Menshykau D. The control of branching morphogenesis. Open Biology. 2013; 3(9):130088. doi:10.1098/rsob.130088.
 5
Boehm B, Westerberg H, LesnicarPucko G, Raja S, Rautschka M, Cotterell J, et al. The role of spatially controlled cell proliferation in limb bud morphogenesis. PLoS Biol. 2010; 8(7). doi:10.1371/journal.pbio.1000420.
 6
(Anderson ARA, Chaplain MAJ, Rejniak KA, editors.)2007. Singlecellbased Models in Biology and Medicine. Switzerland: Birkhäuser Verlag Basel.
 7
Herrero MA, KöhnLuque A, PérezPomares JM. Modelling vascular morphogenesis: current views on blood vessel development. 2009. doi:10.1142/S021820250900384X.
 8
Saltelli A, Tarantola S, Chan KPS. A quantitative modelindependent method for global sensitivity analysis of model output. Technometrics. 1999; 41(1):39–56.
 9
Sobol’ IM. Sensitivity estimates for nonlinear mathematical models. Mathematical Modeling and Computational Experiment. 1993; 1(4):407–14.
 10
Sobol’ IM. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul. 2001; 55:271–80. doi:10.1016/S03784754(00)002706.
 11
Ostrom Jr CW. Time Series Analysis, Regression Techniques, 2nd Edition. Newbury Park: Sage Publications; 1990.
 12
Morris MD. Factorial sampling plans for preliminary computational experiments. Technometrics. 1991; 33(2):161–74.
 13
Zheng Y, Rundell A. Comparative study of parameter sensitivity analyses of the TCRactivated erkMAPK signalling pathway. Syst Biol, IEE Proc. 2006; 153(4):201–11. doi:10.1049/ipsyb:20050088.
 14
Cho KH, Shin SY, Kolch W, Wolkenhauer O. Experimental design in systems biology, based on parameter sensitivity analysis using a Monte Carlo method: A case study for the TNF αmediated NF κ B signal transduction pathway. SIMULATION. 2003; 79(12):726–39. doi:10.1177/0037549703040943.
 15
Kucherenko S, RodriquezFernandez M, Pantelides C, Shah N. Monte Carlo evaluation of derivativebased global sensitivity measures. Reliab Eng Syst Saf. 2009; 94:1135–1148. doi:10.1016/j.ress.2008.05.006.
 16
Sobol’ IM, Kucherenko S. Derivative based global sensitivity measures and their link with global sensitivity indices. Math Comput Simul. 2009; 79:3009–17. doi:10.1016/j.matcom.2009.01.023.
 17
RodriguezFernandez M, Banga JR, Doyle III FJ. Novel global sensitivity analysis methodology accounting for the crucial role of the distribution of input parameters: application to system biology models. Int J Robust Nonlinear Control. 2012; 22:1082–102. doi:10.1002/rnc.2797.
 18
Lumen A, McNally K, George N, Fisher JW, Loizou GD. Quantitative global sensitivity analysis of a biologically based doseresponse pregnancy model for the thyroid endocrine system. Front Pharmacol. 2015; 6(107). doi:10.3389/fphar.2015.00107.
 19
Zhang YY, Trame MN, Lesko LJ, Schmidt S. Sobol sensitivity analysis: A tool to guide the development and evaluation of systems pharmacology models. CPT: Pharmacometrics & Systems Pharmacology. 2015; 4(2):69–79. doi:10.1002/psp4.6.
 20
Torres Valderrama A, Witteveen J, Navarro M, Blom J. Uncertainty propagation in nerve impulses through the action potential mechanism. J Math Neurosci. 2015; 5(3). doi:10.1186/2190856753.
 21
Dresch JM, Liu X, Arnosti DN, Ay A. Thermodynamic modeling of transcription: sensitivity analysis differentiates biological mechanism from mathematical modelinduced effects. BMC Syst Biol. 2010; 4(142). doi:10.1186/175205094142.
 22
Merks RMH, Perryn ED, Shirinifard A, Glazier JA. Contactinhibited chemotaxis in de novo and sprouting bloodvessel growth. PLoS Comput Biol. 2008; 4(9):1000163. doi:10.1371/journal.pcbi.1000163.
 23
van Oers RFM, Rens EG, LaValley DJ, ReinhartKing CA, Merks RMH. Mechanical cellmatrix feedback explains pairwise and collective endothelial cell behavior in vitro. PLoS Comput Biol. 2014; 10(8):1003774. doi:10.1371/journal.pcbi.1003774.
 24
Szabó A, Czirók A. The role of cellcell adhesion in the formation of multicellular sprouts. Math Model Nat Phenom. 2010; 5(1):106–22. doi:10.1051/mmnp/20105105.
 25
Merks RMH, Brodsky S, Goligorksy M, Newman S, Glazier JA. Cell elongation is key to in silico replication of in vitro vasculogenesis and subsequent remodeling. Devel Biol. 2006; 289:44–54. doi:10.1016/j.ydbio.2005.10.003.
 26
KöhnLuque A, de Back W, Yamaguchi Y, Yoshimura K, Herrero MA, Miura T. Early embryonic vascular patterning by matrixmediated paracrine signalling: a mathematical model study. Phys Biol. 2013; 11(6):066007. doi:10.1088/14783975/10/6/066007.
 27
Bauer AL, Jackson TL, Jiang Y. A cellbased model exhibiting branching and anastomosis during tumorinduced angiogenesis. Biophys J. 2007; 92(9):3105–21. doi:10.1529/biophysj.106.101501.
 28
Shirinifard A, Gens JS, Zaitlen BL, Popławski NJ, Swat M, Glazier JA. 3D multicell simulation of tumor growth and angiogenesis. PLoS ONE. 2009; 4(10). doi:10.1371/journal.pone.0007190.
 29
Czirok A, Isai DG. Cell resolved, multiparticle model of plastic tissue deformations and morphogenesis. Phys Biol. 2015; 12(1):016005. doi:10.1088/14783975/12/1/016005.
 30
Glazier JA, Graner F. Simulation of the differential adhesion driven rearrangement of biological cells. Phys Rev E. 1993; 47(3):2128–54. doi:10.1103/PhysRevE.47.2128.
 31
Graner F, Glazier JA. Simulation of biological cell sorting using a twodimensional extended Potts model. Phys Rev Lett. 1992; 69:2013–016. doi:10.1103/PhysRevLett.69.2013.
 32
Savill NJ, Hogeweg P. Modelling morphogenesis: from single cells to crawling slugs. J Theor Biol. 1997; 184:229–35. doi:10.1006/jtbi.1996.0237.
 33
Wiener N. The homogeneous chaos. Am J Math. 1938; 60:897–936.
 34
Xiu D. Fast numerical methods for stochastic computations: a review. Communications in Computational Physics. 2009; 5(2–4):242–72.
 35
Navarro M, Witteveen J, Blom J. Polynomial Chaos Expansion for general multivariate distributions with correlated variables. 2014. arXiv:1406.5483[math.NA].
 36
Oladyshkin S, Nowak W. Datadriven uncertainty quantification using the arbitrary polynomial chaos expansion. Reliab Eng Syst Saf. 2012; 106:179–90.
 37
Sudret B. Global sensitivity analysis using polynomial chaos expansions. Reliab Eng Syst Saf. 2008; 93(7):964–79. doi:10.1016/j.ress.2007.04.002.
 38
Ishigami T, Homma T. An importance quantification technique in uncertainty analysis for computer models. In: First International Symposium on Uncertainty Modeling and Analysis (ISUMA’90): 1990. p. 398–403, doi:http://dx.doi.org/10.1109/ISUMA.1990.151285.
 39
Sobol’ IM, Levitan YL. On the use of variance reducing multipliers in Monte Carlo computations of a global sensitivity index. Comput Phys Commun. 1999; 117:52–61. doi:10.1016/S00104655(98)001568.
 40
Saltelli A. Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun. 2002; 145:280–97. doi:10.1016/S00104655(02)002801.
 41
Shirinifard A. Vascular patterning and its application in cancer and choroidal neovascularization. PhD thesis, Indiana University, Department of Physics. 2012.
 42
Sheth R, Marcon L, Bastida MF, Junco M, Quintana L, Dahn R, et al. Hox genes regulate digit patterning by controlling the wavelength of a Turingtype mechanism. Science (New York, NY). 2012; 338(6113):1476–80. doi:10.1126/science.
 43
De Rybel B, Adibi M, Breda AS, Wendrich JR, Smit ME, Novák O, et al. Integration of growth and patterning during vascular tissue formation in Arabidopsis. Science (New York, NY). 2014; 345(6197):1255215. doi:10.1126/science.1255215.
 44
Besnard F, Refahi Y, Morin V, Marteaux B, Brunoud G, Chambrier P, et al. Cytokinin signalling inhibitory fields provide robustness to phyllotaxis. Nature. 2015; 505(7483):417–21. doi:10.1038/nature12791.
 45
Buske P, Przybilla J, Loeffler M, Sachs N, Sato T, Clevers H, et al. On the biomechanics of stem cell niche formation in the gut  modelling growing organoids. FEBS Journal. 2012; 279(18):3475–87. doi:10.1111/j.17424658.2012.08646.x.
 46
Le Dévédec SE, Yan K, de Bont H, Ghotra V, Truong H, Danen EH, et al. Systems microscopy: An emerging strategy for the life sciences. Cell Mol Life Sci. 2010; 67(19):219–3240. doi:0.1007/s0001801004192.
 47
Lock JG, Strömblad S. Systems microscopy: An emerging strategy for the life sciences. Exp Cell Res. 2010; 316(8):1438–1444. doi:10.1016/j.yexcr.2010.04.001. Special Issue Celebrating the 60Year Anniversary of ECR and the 200Year Anniversary of the Karolinska Institute.
Acknowledgements
We thank Indiana University and the Biocomplexity Institute for providing the CompuCell3D modeling environment and SURFsara (www.surfsara.nl) for the support in using the Lisa Compute Cluster.
The investigations were supported by the Division for Earth and Life Sciences (ALW) with financial aid from the Netherlands Organization for Scientific Research (NWO).
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
SB, MNJ, RM and JB conceived of the study. SB performed the computational simulations, participated in the data analysis and drafted the manuscript. MNJ analyzed the data and helped to draft the manuscript. RM participated in the design and coordination of the study and helped to draft the manuscript. JB participated in the design and coordination of the study and in the data analysis, and helped to draft the manuscript. All authors read and approved the final manuscript.
Additional files
Additional file 1
Figure S1. Twodimensional intensity plots of lacuna count with maximum intensity of 5 lacunae. The intensity of the output measure lacuna count, mapped to an interval of 0 to 5 as indicated by the color bars, is plotted for each twoparameter combination of the parameters the cell rigidity (λ _{ A }), cellcell adhesion (J _{cell,cell}), the diffusion coefficient of the chemoattractant (D), and sensitivity of cells to the chemoattractant at cellmatrix interfaces (λ _{ c }). (PNG 568 Kb)
Additional file 2
Table S1. Global sensitivity analysis results for compactness. (PDF 51.7 Kb)
Additional file 3
Table S2. Global sensitivity analysis results for lacuna count. (PDF 51.5 Kb)
Additional file 4
Table S3. Global sensitivity analysis results for lacuna count capped at maximum of 5 lacunae. (PDF 52.3 Kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Received
Accepted
Published
DOI
Keywords
 Morphogenesis
 Vascular network development
 Computational modeling
 Cellular Potts model
 Global sensitivity analysis
 Sobol’ indices
 Polynomial chaos expansion