Identification of parameter correlations
We consider nonlinear model systems described by
$\dot{x}(t) = f\big(x(t), u(t), p\big), \qquad x(t_0) = x_0$   (1)

$y(t) = h\big(x(t), u(t), q\big)$   (2)
where x(t) ∈ R^n is the state vector, u(t) ∈ R^m the control vector, and y(t) ∈ R^r the output vector. In this study, two different sets of parameters are considered: p ∈ R^NP in the state equations and q ∈ R^NQ in the output equations. In most cases the number of parameters in the state equations is much larger than the number in the output equations. Since correlations among the parameters in the output Equation (2) are easier to identify, we concentrate on the analysis and identification of correlations among the parameters in the state Equation (1).
The equation for the state sensitivity matrix can be derived by taking the first-order partial derivative of Eq. (1) with respect to the parameters p:

$\dfrac{dS}{dt} = \dfrac{\partial f}{\partial x}\, S + \dfrac{\partial f}{\partial p}$   (3)

where S = ∂x/∂p is the state sensitivity matrix. By solving this equation (see Additional file 1 for details), the state sensitivity matrix can be written as

$S(t) = \int_{t_0}^{t} V(\tau)\, \dfrac{\partial f}{\partial p}\bigg|_{\tau}\, d\tau$   (4)
where V(τ) is a matrix computed at time τ. This means that S has a linear integral relation with the matrix ∂f/∂p from t0 to t. If ∂f/∂p has the same linearly dependent columns at any time, the corresponding columns in S will also be linearly dependent, i.e. the corresponding parameters are correlated. Therefore, we can identify parameter correlations by checking the linear dependences of the columns of the matrix ∂f/∂p, which is composed of the first-order partial derivatives of the right-hand sides of the ODEs. Based on Eq. (2), the output sensitivity matrices with respect to p and q are, respectively, given by

$\dfrac{\partial y}{\partial p} = \dfrac{\partial h}{\partial x}\, S$   (5)

$\dfrac{\partial y}{\partial q} = \dfrac{\partial h}{\partial q}$   (6)
To ensure unique estimation of the parameters (i.e. for all parameters to be identifiable) based on the measured data of y, it is necessary that the columns of the output sensitivity matrices are linearly independent. From Eq. (6), relations between the columns of ∂h/∂q can easily be detected. The difficulty comes from Eq. (5), since the sensitivity functions in S cannot be expressed analytically. However, from Eq. (5), the output sensitivity matrix ∂y/∂p is a linear transformation of S. Consequently, there will be linearly dependent columns in ∂y/∂p if there are linearly dependent columns in S. This means a necessary condition for unique estimation of p is that, at least, the matrix ∂f/∂p has full rank. Based on Eq. (1), ∂f/∂p is expressed as the vectors of first-order partial derivative functions

$\dfrac{\partial f}{\partial p} = \left[ \dfrac{\partial f}{\partial p_1},\; \dfrac{\partial f}{\partial p_2},\; \cdots,\; \dfrac{\partial f}{\partial p_{NP}} \right]$   (7)
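To make this rank condition concrete, the following sketch (our illustration, not code from the study) builds the matrix of Eq. (7) symbolically for a hypothetical two-state model in which p1 and p2 enter the right-hand side only through their product; SymPy and the model itself are assumptions made for the example.

```python
# Sketch (our illustration): build df/dp of Eq. (7) symbolically and
# check its rank.  The two-state model is hypothetical, chosen so that
# p1 and p2 enter the right-hand side only through their product.
import sympy as sp

x1, x2, u = sp.symbols('x1 x2 u')
p1, p2, p3 = sp.symbols('p1 p2 p3')

# Hypothetical right-hand side f(x, u, p) of Eq. (1)
f = sp.Matrix([
    -p1 * p2 * x1 + p3 * u,   # p1, p2 appear only as the product p1*p2
    p3 * x1 - x2,
])

# The matrix of Eq. (7): one column per parameter
dfdp = f.jacobian([p1, p2, p3])
sp.pprint(dfdp)

# Rank < NP signals linearly dependent columns, i.e. correlated
# parameters: here the p1- and p2-columns are proportional.
print('rank of df/dp:', dfdp.rank())   # 2 < NP = 3
```

Since the p1- and p2-columns are proportional, the symbolic rank drops below NP, flagging the correlation before any data are collected.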
Now we analyse relations between the partial derivative functions in Eq. (7). If there is no correlation among the parameters, the columns in Eq. (7) will be linearly independent, i.e. if

$\alpha_1 \dfrac{\partial f}{\partial p_1} + \alpha_2 \dfrac{\partial f}{\partial p_2} + \cdots + \alpha_{NP} \dfrac{\partial f}{\partial p_{NP}} = 0$   (8)

there must be α_i = 0, i = 1, ⋯, NP. Otherwise, there will be some groups of vectors in ∂f/∂p which lead to the following cases of linear dependences due to parameter correlations. Let us consider a subset of k correlated parameters denoted as p_sub = [p_{s+1}, p_{s+2}, ⋯, p_{s+k}]^T with s + k ≤ NP.
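As a numerical counterpart to condition (8), one can evaluate ∂f/∂p at a number of sampled states, stack the blocks, and inspect the null space: any nonzero null vector is a set of coefficients α violating the independence condition. The sketch below is our own illustration, using the same hypothetical model as above.

```python
# Sketch (our illustration, same hypothetical model as above): stack
# df/dp evaluated at sampled states, then read correlated parameter
# groups off the null space of the stacked matrix.
import numpy as np

def dfdp(x1, u, p1, p2, p3):
    # rows: state equations; columns: parameters p1, p2, p3
    return np.array([[-p2 * x1, -p1 * x1, u],
                     [0.0,      0.0,      x1]])

rng = np.random.default_rng(1)
p = (2.0, 0.5, 1.0)
blocks = [dfdp(rng.uniform(0.1, 1.0), 0.7, *p) for _ in range(20)]
A = np.vstack(blocks)                  # stacked over sampled states

# A right-singular vector with ~zero singular value is a nonzero
# alpha solving Eq. (8), i.e. a correlated parameter group.
U, s, Vt = np.linalg.svd(A)
print('smallest singular value:', s[-1])   # ~0
print('alpha:', Vt[-1])        # proportional to (p1, -p2, 0) up to sign
```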
Case 1:
$\alpha_1 \dfrac{\partial f}{\partial p_{s+1}} = \alpha_2 \dfrac{\partial f}{\partial p_{s+2}} = \cdots = \alpha_k \dfrac{\partial f}{\partial p_{s+k}}$   (9)
where α_i ≠ 0, i = 1, ⋯, k. Notice that the coefficients α_i may be functions of the parameters (i.e. α_i(p)) and/or of the control inputs (i.e. α_i(u(t), p)). It should also be noted that the control inputs u(t) are treated as constants in these coefficients, since they will be specified in the experimental design. The linear dependences described by Eq. (9) lead to pairwise correlations among the k parameters, i.e. any pair of the parameters in p_sub is correlated. Moreover, these correlations imply a functional relationship between the parameters, i.e. the relationship between the parameters can be expressed by an algebraic equation
$\phi_{sub}\big(\gamma(p_{s+1}, p_{s+2}, \cdots, p_{s+k})\big) = 0$   (10)
where γ(p_{s+1}, p_{s+2}, ⋯, p_{s+k}) denotes a function of the parameters with one set of pairwise correlated parameters. The parameters in this function compensate each other through an algebraic relationship, e.g. γ(p_{s+1} + p_{s+2} + ⋯ + p_{s+k}) or γ(p_{s+1}p_{s+2}⋯p_{s+k}). Eq. (10) describes the functional relationship between the parameters, e.g. ϕ_sub(γ(⋅)) = 1 + γ(⋅) − (γ(⋅))² = 0. Due to the complexity of biological models, an explicit expression of this equation is not available in most cases.
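For the hypothetical product-parameterized model used above, Case 1 can be verified symbolically: Eq. (9) holds with α_1 = p_1 and α_2 = p_2, and the correlation equation (10) is p_1p_2 − c = 0 with γ(⋅) = p_1p_2. A minimal sketch:

```python
# Sketch (our illustration): verify Eq. (9) for the hypothetical model
# with alpha_1 = p1, alpha_2 = p2, and exhibit gamma(.) = p1*p2.
import sympy as sp

x1, p1, p2, c = sp.symbols('x1 p1 p2 c')
df_dp1 = sp.Matrix([-p2 * x1, 0])     # column of df/dp for p1
df_dp2 = sp.Matrix([-p1 * x1, 0])     # column of df/dp for p2

# Eq. (9): alpha_1 * df/dp1 = alpha_2 * df/dp2 with alpha_i = alpha_i(p)
diff = (p1 * df_dp1 - p2 * df_dp2).applyfunc(sp.simplify)
assert diff == sp.zeros(2, 1)

# Eq. (10): any (p1, p2) on the curve p1*p2 = c fits equally well
print(sp.solve(sp.Eq(p1 * p2, c), p2))        # [c/p1]
```

Because the α_i here depend only on the parameters, this pair falls under the structural non-identifiability discussed next.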
If the coefficients in Eq. (9) are functions of the parameters only, i.e. α_i(p), the parameters are structurally non-identifiable. In this case, the correlation relations in Eq. (9) remain unchanged for any specified values of the control inputs. This means the non-identifiability cannot be remedied through experimental design.
If the coefficients in Eq. (9) are functions of both the parameters and the control inputs, i.e. α_i(u, p), the parameters are practically non-identifiable. Different values of u can be specified which lead to different α_i(u, p), such that Eq. (9) no longer holds and the parameter correlations are therefore relieved. Since k parameters are correlated, k different values of the control inputs u^{(j)}, j = 1, ⋯, k, are required, such that the matrix
$\left[ \begin{array}{ccc} \dfrac{\partial f}{\partial p_{s+1}}(u^{(1)}) & \cdots & \dfrac{\partial f}{\partial p_{s+k}}(u^{(1)}) \\ \vdots & & \vdots \\ \dfrac{\partial f}{\partial p_{s+1}}(u^{(k)}) & \cdots & \dfrac{\partial f}{\partial p_{s+k}}(u^{(k)}) \end{array} \right]$   (11)
has full rank. Notice that the columns in Eq. (11) are linearly dependent only within the block belonging to the same input; the columns of the whole stacked matrix are linearly independent. In this way, the non-identifiability is remedied. Moreover, this provides a suggestion for experimental design: specify u^{(j)}, j = 1, ⋯, k, to obtain k distinct data sets which will then be used for parameter estimation.
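A sketch of this remedy, using a hypothetical scalar model f = −(p_1 + p_2u)x whose coefficients in Eq. (9) depend on u (so the correlation is practical, not structural): for any single input the two columns of ∂f/∂p are proportional, but stacking the blocks for k = 2 distinct inputs, as in Eq. (11), restores full rank.

```python
# Sketch (our illustration): the hypothetical scalar model
# f = -(p1 + p2*u)*x has df/dp1 = -x and df/dp2 = -u*x, proportional
# for any single input u.  Stacking blocks for two distinct inputs,
# as in Eq. (11), removes the linear dependence.
import numpy as np

def dfdp_block(x_samples, u):
    # one block of Eq. (11): columns df/dp1, df/dp2 at a fixed input u
    return np.column_stack([-x_samples, -u * x_samples])

x_samples = np.linspace(0.2, 1.0, 5)

single = dfdp_block(x_samples, u=0.5)
stacked = np.vstack([dfdp_block(x_samples, u=0.5),
                     dfdp_block(x_samples, u=2.0)])

print(np.linalg.matrix_rank(single))    # 1: correlated for one input
print(np.linalg.matrix_rank(stacked))   # 2: correlation relieved
```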
If all state variables are measurable, according to Eq. (4), this subset of parameters can be uniquely estimated based on the k data sets. If the outputs y are measured and used for the parameter estimation, it can be concluded from Eq. (5) that at least k data sets are required for unique parameter estimation.
Case 2:
$\alpha_{s+1} \dfrac{\partial f}{\partial p_{s+1}} = \cdots = \alpha_{s+l_1} \dfrac{\partial f}{\partial p_{s+l_1}}, \quad \cdots, \quad \alpha_{s+k-l_d+1} \dfrac{\partial f}{\partial p_{s+k-l_d+1}} = \cdots = \alpha_{s+k} \dfrac{\partial f}{\partial p_{s+k}}$   (12)

with $l_1 + l_2 + \cdots + l_d = k$, and

$\alpha_{s+k+1} \dfrac{\partial f}{\partial p_{s+1}} + \alpha_{s+k+2} \dfrac{\partial f}{\partial p_{s+l_1+1}} + \cdots + \alpha_{s+k+d} \dfrac{\partial f}{\partial p_{s+l_1+\cdots+l_{d-1}+1}} = 0$   (13)
where α_i ≠ 0, i = s + 1, ⋯, s + k + d. Similarly, the coefficients may be functions of the parameters and/or of the control inputs. In this case, there are d sets of pairwise correlated parameters (Eq. (12)). A set is not correlated with another set, but all sets are correlated together (Eq. (13)). The functional relationship in this case can be expressed by
$\phi_{sub}\big(\gamma_1(\cdot), \gamma_2(\cdot), \cdots, \gamma_d(\cdot)\big) = 0$   (14)

where each γ_j(⋅), j = 1, ⋯, d, denotes the function formed by the j-th set of pairwise correlated parameters.
Based on Eq. (12), the group with the maximum number of parameters, max(l_1, l_2, ⋯, l_d), is decisive for data acquisition. From Eq. (13), in the case of practical non-identifiability, data for at least d different inputs are required. Combining Eqs. (12) and (13) leads to the conclusion that max(l_1, l_2, ⋯, l_d, d) data sets with different inputs are needed to eliminate the parameter correlations in this case.
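As an illustration of this counting rule (our sketch, with a hypothetical stacked ∂f/∂p matrix), the pairwise-correlated groups of Eq. (12) can be found numerically by testing which columns remain proportional across all sampled states, after which max(l_1, ⋯, l_d, d) is immediate. The joint relation (13) would additionally show up in the null space, as in the earlier sketch.

```python
# Sketch (our illustration): group columns of a stacked df/dp that stay
# proportional (|cosine| ~ 1), then apply the rule max(l1, ..., ld, d).
import numpy as np

def correlation_groups(A, tol=1e-8):
    """Group column indices of A whose columns are proportional."""
    cols = [A[:, j] / np.linalg.norm(A[:, j]) for j in range(A.shape[1])]
    groups, assigned = [], set()
    for j in range(len(cols)):
        if j in assigned:
            continue
        group = [j]
        assigned.add(j)
        for k in range(j + 1, len(cols)):
            if k not in assigned and abs(abs(cols[j] @ cols[k]) - 1.0) < tol:
                group.append(k)
                assigned.add(k)
        groups.append(group)
    return groups

# Hypothetical matrix with columns (c, 2c, e, 3e): two pairwise groups
rng = np.random.default_rng(0)
c, e = rng.normal(size=10), rng.normal(size=10)
A = np.column_stack([c, 2 * c, e, 3 * e])

groups = [g for g in correlation_groups(A) if len(g) > 1]
print(groups)                                          # [[0, 1], [2, 3]]
print(max(max(len(g) for g in groups), len(groups)))   # data sets needed
```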
Case 3:
$\alpha_1 \dfrac{\partial f}{\partial p_{s+1}} + \alpha_2 \dfrac{\partial f}{\partial p_{s+2}} + \cdots + \alpha_k \dfrac{\partial f}{\partial p_{s+k}} = 0$   (15)
where α_i ≠ 0, i = 1, ⋯, k. In this case, the k parameters are not pairwise correlated, but they are correlated together in one group. The correlation equation in this case is expressed by

$\phi_{sub}(p_{s+1}, p_{s+2}, \cdots, p_{s+k}) = 0$   (16)

which means there is no set of pairwise correlated parameters in this case. The approach described above is able to identify pairwise and higher-order parameter correlations in the state equations (Eq. (1)). In the same way, correlations among the parameters q in the output equations (Eq. (2)) can also be detected, based on the first-order partial derivative functions in Eq. (6).
From the results of this correlation analysis, the maximum number of correlated parameters among the correlation groups can be determined. This corresponds to the minimum number of data sets required for unique estimation of all parameters in the model. Furthermore, it is noted that the initial state of the model has no impact on the parameter correlations, since the linear dependences of the columns of ∂f/∂p hold for any values of the states; this means that any initial state can be used for the experimental runs for data acquisition.
For complex models, the correlation equations (Eqs. (10), (14), (16)) cannot be analytically expressed. A numerical method has to be used to illustrate the relationships of correlated parameters of a given model, which is discussed in the next section.
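A minimal version of such a numerical method, for a hypothetical model x(t) = x_0e^{−p_1p_2t} whose parameters are correlated through their product: fix one parameter of the group on a grid, re-fit the remaining one, and trace the resulting zero-residual curve. The model and the SciPy-based fit are assumptions made for this sketch.

```python
# Sketch (our illustration): trace the correlation curve numerically by
# fixing p1 on a grid and re-fitting p2.  Hypothetical model with the
# closed-form solution x(t) = x0*exp(-p1*p2*t); true values p1*p2 = 1.
import numpy as np
from scipy.optimize import minimize_scalar

t = np.linspace(0.0, 2.0, 20)
x0 = 1.0
data = x0 * np.exp(-1.0 * t)          # idealized, noise-free data

def residual(p2, p1):
    # least-squares cost with unit weights
    return np.sum((data - x0 * np.exp(-p1 * p2 * t)) ** 2)

for p1 in [0.5, 1.0, 2.0, 4.0]:
    fit = minimize_scalar(residual, args=(p1,), bounds=(1e-3, 10.0),
                          method='bounded')
    print(f'p1 = {p1:4.1f}  ->  p1*p2 = {p1 * fit.x:.4f}')
# Every fit lands on the zero-residual curve p1*p2 = 1.
```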
Interpretation of parameter correlations
Here we give an interpretation of parameter correlations in a biological model. Geometrically, for NP parameters, i.e. p = [p_1, p_2, ⋯, p_{NP}]^T, the estimation task can be considered as searching for the true parameter point p* in the NP-dimensional parameter space. To do this, we need NP linearly independent surfaces in the parameter space which pass through p*. Mathematically, such surfaces are described by linearly independent equations in the unknown parameters. We define such equations based on the results of fitting the model Equations (1) to a data set (j) by minimizing the following cost function
$\sum_{i=1}^{n} \sum_{l=1}^{M} w_{i,l} \left( \hat{x}^{(j)}_{i,l} - x_{i,l}(p) \right)^2$   (17)
where M is the number of sampling points, n is the number of state variables, x̂_{i,l}^{(j)} denotes the measured data and x_{i,l}(p) the corresponding state variables predicted by the model, and w_{i,l} are weighting factors. The fitting results will depend on the data set resulting from the control inputs u^{(j)}, the values of w_{i,l}, and the noise level of the measured data. For a geometric interpretation of parameter correlations, we assume idealized measurement data, i.e. data without any noise. Under this assumption, the residual function (17) is zero when the true parameter set p* is applied, i.e.
$\sum_{i=1}^{n} \sum_{l=1}^{M} w_{i,l} \left( \hat{x}^{(j)}_{i,l} - x_{i,l}(p^*) \right)^2 = 0$   (18)
It is noted that Eq. (18) holds for any noise-free data set employed for the fitting, independently of the weighting factors w_{i,l}. Now we define a zero residual equation (ZRE) as
$\hat{x}^{(j)} - x(p) = 0$   (19)
This equation contains the parameters as unknowns and corresponds to a zero residual surface passing through the true parameter point p*; that is, a zero residual surface is formed by parameter values which lead to a zero residual value. This suggests that we can find p* by solving NP linearly independent ZREs. From the first-order Taylor expansion of Eq. (19), the linear dependences of ZREs can be detected from the columns of the following matrix

$\dfrac{\partial x^{(j)}}{\partial p} = \left[ \dfrac{\partial x^{(j)}}{\partial p_1},\; \dfrac{\partial x^{(j)}}{\partial p_2},\; \cdots,\; \dfrac{\partial x^{(j)}}{\partial p_{NP}} \right]$   (20)

where x^{(j)} = [x_{1,1}^{(j)}, x_{1,2}^{(j)}, ⋯, x_{1,M}^{(j)}, ⋯, x_{n,1}^{(j)}, x_{n,2}^{(j)}, ⋯, x_{n,M}^{(j)}]^T. Eq. (20) is exactly the state sensitivity matrix obtained by fitting the given data set (j). This means that, under the idealized data assumption, a zero residual value delivered by the fitting is associated with surfaces passing through the true parameter point. When there are no parameter correlations, the number of linearly independent ZREs will be equal to NP, and thus the true parameter point can be found by fitting the current data set.
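The matrix of Eq. (20) can also be approximated directly by finite differences of the model trajectory, without solving Eq. (3) analytically. For the hypothetical model used above (chosen for its closed-form trajectory), its rank deficiency reproduces the correlation found earlier from ∂f/∂p:

```python
# Sketch (our illustration): approximate the matrix of Eq. (20) by
# central finite differences of the trajectory of the hypothetical
# model x(t) = x0*exp(-p1*p2*t).
import numpy as np

t = np.linspace(0.1, 2.0, 15)
x0 = 1.0

def traj(p):
    p1, p2 = p
    return x0 * np.exp(-p1 * p2 * t)

def sensitivity(p, h=1e-6):
    cols = []
    for i in range(len(p)):
        dp = np.zeros(len(p))
        dp[i] = h
        cols.append((traj(p + dp) - traj(p - dp)) / (2.0 * h))
    return np.column_stack(cols)

S = sensitivity(np.array([2.0, 0.5]))
print(np.linalg.matrix_rank(S, tol=1e-6))   # 1: columns proportional
```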
If there are parameter correlations, the fitting will lead to zero residual surfaces in the subspace of the correlated parameters. For instance, for a group of k correlated parameters, the zero residual surfaces (Eq. (19)) reduce to a single ZRE represented by Eq. (10), Eq. (14), or Eq. (16). Therefore, in the case of practical non-identifiability, k data sets are needed to generate k linearly independent ZREs so that the k parameters can be uniquely estimated. In the case of structural non-identifiability, the correlation relations are independent of the data sets: fitting different data sets will lead to the same ZRE and thus the same surfaces in the parameter subspace.
If the measured data contain noise, the fitting will lead to a nonzero residual value and nonzero residual surfaces, i.e.

$\hat{x}^{(j)}_{i,l} - x_{i,l}(p) = \epsilon_{i,l}$   (21)

where ϵ_{i,l} ≠ 0. Thus the nonzero residual surfaces will not pass through the true parameter point. However, based on Eq. (20) and Eq. (21), the first-order partial derivatives remain unchanged. This means that parameter correlations do not depend on the quality of the measured data. Moreover, it can be seen from Eq. (19) and Eq. (21) that the zero residual surfaces and the nonzero residual surfaces are parallel in the subspace of the correlated parameters.
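A quick numerical check of this parallelism (our sketch): repeating the earlier grid scan with artificial noise added to the hypothetical data, the re-fitted parameter combinations still satisfy p_1p_2 ≈ const (the same correlation), while the residual no longer vanishes.

```python
# Sketch (our illustration): grid scan as before, but with noisy data.
# The fitted points still satisfy p1*p2 ~ const, yet the residual stays
# > 0: the surface is parallel to, not through, the true point.
import numpy as np
from scipy.optimize import minimize_scalar

t = np.linspace(0.0, 2.0, 20)
x0 = 1.0
rng = np.random.default_rng(42)
noisy = x0 * np.exp(-1.0 * t) + rng.normal(0.0, 0.02, t.size)

def residual(p2, p1):
    return np.sum((noisy - x0 * np.exp(-p1 * p2 * t)) ** 2)

for p1 in [0.5, 1.0, 2.0, 4.0]:
    fit = minimize_scalar(residual, args=(p1,), bounds=(1e-3, 10.0),
                          method='bounded')
    print(f'p1*p2 = {p1 * fit.x:.4f}, residual = {fit.fun:.4e}')
```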