Parameter optimization by using differential elimination: a general approach for introducing constraints into objective functions
© Horimoto et al. 2010
Published: 13 September 2010
The investigation of network dynamics is a major issue in systems and synthetic biology. One of the essential steps in a dynamics investigation is the parameter estimation in the model that expresses biological phenomena. Indeed, various techniques for parameter optimization have been devised and implemented in both free and commercial software. While the computational time for parameter estimation has been greatly reduced, due to improvements in calculation algorithms and the advent of high performance computers, the accuracy of parameter estimation has not been addressed.
We propose a new approach for parameter optimization by using differential elimination, to estimate kinetic parameter values with a high degree of accuracy. First, we utilize differential elimination, which is an algebraic approach for rewriting a system of differential equations into another equivalent system, to derive the constraints between kinetic parameters from differential equations. Second, we estimate the kinetic parameters introducing these constraints into an objective function, in addition to the error function of the square difference between the measured and estimated data, in the standard parameter optimization method. To evaluate the ability of our method, we performed a simulation study by using the objective function with and without the newly developed constraints: the parameters in two models of linear and non-linear equations, under the assumption that only one molecule in each model can be measured, were estimated by using a genetic algorithm (GA) and particle swarm optimization (PSO). As a result, the introduction of new constraints was dramatically effective: the GA and PSO with new constraints could successfully estimate the kinetic parameters in the simulated models, with a high degree of accuracy, while the conventional GA and PSO methods without them frequently failed.
The introduction of new constraints in an objective function by using differential elimination resulted in the drastic improvement of the estimation accuracy in parameter optimization methods. The performance of our approach was illustrated by simulations of the parameter optimization for two models of linear and non-linear equations, which included unmeasured molecules, by two types of optimization techniques. As a result, our method is a promising development in parameter optimization.
The investigation of network dynamics is a major issue in systems and synthetic biology. In general, a network model for describing the kinetics of constituent molecules is first constructed with reference to the biological knowledge, and then the model is mathematically expressed by differential equations, based on the chemical reactions underlying the kinetics. Finally, the kinetic parameters in the model are estimated by various parameter optimization techniques, from the time-series data measured for the constituent molecules. While the computational time for parameter estimation has been greatly reduced, due to the improvement in calculation algorithms and the advent of high performance computers, the accurate numerical estimation of parameter values for a given model remains a limiting step. Indeed, the parameter values estimated by various optimization techniques are frequently quite variable, due to the conditions for parameter estimation, such as the initial values. In particular, we cannot always obtain the data measured for all of the constituent molecules, due to limitations of measurement techniques and ethical constraints. In this case, one of the issues we should resolve is that the parameters are estimated from the data for only some of the constituent molecules. Unfortunately, it is quite difficult to estimate the parameters in such a network model including unmeasured variables.
Boulier and his colleagues developed differential elimination, derived from the Rosenfeld-Gröbner algorithm. Differential elimination rewrites a system of original differential equations into an equivalent system. This rewriting feature has been applied to the parameter optimization issue, especially in network dynamics including unmeasured variables [3, 5]. In those applications, the equations rewritten by differential elimination were used to estimate the initial values for the parameter optimization, which was then carried out by Newton-type numerical optimization.
Here, we propose a new method for optimizing the parameters, by using differential elimination. Our method partially utilizes a technique from a previous study, regarding the introduction of differential elimination into parameter optimization in a network including unmeasured variables. Instead of using differential elimination to estimate the initial values for the subsequent parameter optimization, the equations derived by differential elimination are directly introduced as constraints into the objective function for the parameter optimization. To validate the effectiveness of the constraint introduction, we performed simulations in two models of linear and nonlinear differential equations, where we assumed that the data for only one molecule among them were measured, by using two kinds of evolutionary optimization techniques. The accuracy of the parameter values estimated by the objective functions with and without the new constraints was compared. Finally, we discuss the merits and pitfalls of our method in terms of its extension to more realistic and complex models.
We first describe a perspective of our method, and then the two models are analyzed to illustrate its performance. The two models were chosen from representative kinetic models for biological phenomena at the molecular level: one model (Model 1) is composed of two variables, analogous to molecular binding and dissociation, such as affinity binding in an antibody cross-link, and the other model (Model 2) is composed of four variables, analogous to a molecular reaction cascade, such as phosphorylation in signal transduction. Notably, we assumed that only one variable is measured among the variables in the two models.
The key point of this study is the introduction of new constraints obtained by differential elimination into the objective function, to improve the parameter accuracy. Following an explanation of differential elimination, the method of introducing the constraints is briefly described.
Differential algebra aims at studying differential equations from a purely algebraic point of view [6, 7]. Differential elimination theory is a subtheory of differential algebra, based on the Rosenfeld-Gröbner algorithm. Differential elimination rewrites the input system of differential equations into another equivalent system according to a ranking (an order of terms). Here, we provide an example of differential elimination, as shown below, according to Boulier [3, 5].
When we define the left sides of the above system as $C_{1,t}$ and $C_{2,t}$, $C_{2,t}$ is composed of $x_1$, its derivatives, and the parameters obtained by eliminating $x_2$, and $C_{1,t}$ is composed of $x_1$, its derivatives, the parameters and $x_2$. Note that $x_2$ in $C_{1,t}$ can be expressed by $x_1$, its derivatives and the parameters in $C_{2,t}$. Then, the values of $C_{1,t}$ and $C_{2,t}$ can be calculated, if we have time-series data of $x_1$, and they would be zero, if all parameters were exactly estimated. Thus, $C_{1,t}$ and $C_{2,t}$ can be regarded as a kind of error function that expresses the difference between the measured and estimated data.
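To make the shape of such constraints concrete, the following worked derivation uses a hypothetical two-variable linear system. It is not the example from Boulier [3, 5], and the symbols $k_1$ and $k_2$ are illustrative assumptions. It only sketches how eliminating an unmeasured variable $x_2$ can yield one equation that still contains $x_2$ and one that involves $x_1$, its derivatives and the parameters alone.

```latex
% Hypothetical toy system (assumed for illustration only)
\begin{align*}
\dot{x}_1 &= -k_1 x_1 + k_2 x_2, &
\dot{x}_2 &= k_1 x_1 - k_2 x_2 .
\end{align*}
% Differentiate the first equation and substitute the second:
\begin{align*}
\ddot{x}_1 &= -k_1 \dot{x}_1 + k_2 \dot{x}_2
            = -k_1 \dot{x}_1 + k_1 k_2 x_1 - k_2 (k_2 x_2) .
\end{align*}
% The first equation gives k_2 x_2 = \dot{x}_1 + k_1 x_1, hence
\begin{align*}
\ddot{x}_1 &= -k_1 \dot{x}_1 + k_1 k_2 x_1 - k_2(\dot{x}_1 + k_1 x_1)
            = -(k_1 + k_2)\,\dot{x}_1 .
\end{align*}
% Equivalent system, written with zero right-hand sides:
\begin{align*}
C_1 &:\; \dot{x}_1 + k_1 x_1 - k_2 x_2 = 0, &
C_2 &:\; \ddot{x}_1 + (k_1 + k_2)\,\dot{x}_1 = 0 .
\end{align*}
```

In this toy case, the left side of $C_2$ can be evaluated from time-series data of $x_1$ alone, and it vanishes only when $k_1 + k_2$ is correct.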
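These constraints, denoted C, are introduced into the objective function together with the standard error function E. The two terms are balanced by a weighting factor α, which is approximately estimated from Pareto optimal solutions for E and C, and then is manually modified (see details in Methods).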
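The idea of combining the error function with a DE constraint can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy model ($x_1' = -k_1 x_1 + k_2 x_2$, $x_2' = k_1 x_1 - k_2 x_2$), the constraint $x_1'' + (k_1 + k_2) x_1' = 0$ derived from it, the finite-difference derivative estimates, the use of absolute residuals, and the convex combination `alpha * E + (1 - alpha) * C` are all illustrative choices, since the source only states that α weights the two functions.

```python
import math


def simulate_x1(k1, k2, times, x1_0=1.0, x2_0=0.0):
    """Closed-form x1(t) for the assumed toy model (x1 + x2 is conserved)."""
    total = x1_0 + x2_0
    rate = k1 + k2
    x1_inf = k2 * total / rate
    return [x1_inf + (x1_0 - x1_inf) * math.exp(-rate * t) for t in times]


def derivatives(values, dt):
    """Central finite differences for x' and x'' at the interior time points."""
    inner = range(1, len(values) - 1)
    d1 = [(values[i + 1] - values[i - 1]) / (2 * dt) for i in inner]
    d2 = [(values[i + 1] - 2 * values[i] + values[i - 1]) / dt ** 2 for i in inner]
    return d1, d2


def error_term(params, times, measured):
    """E: squared difference between measured and simulated x1."""
    estimated = simulate_x1(params[0], params[1], times)
    return sum((m - e) ** 2 for m, e in zip(measured, estimated))


def constraint_term(params, measured, dt):
    """C: absolute residual of x1'' + (k1 + k2) * x1' = 0 on the measured data."""
    k1, k2 = params
    d1, d2 = derivatives(measured, dt)
    return sum(abs(b + (k1 + k2) * a) for a, b in zip(d1, d2))


def objective(params, times, measured, alpha=0.5):
    """Assumed combined objective: alpha * E + (1 - alpha) * C."""
    dt = times[1] - times[0]
    return (alpha * error_term(params, times, measured)
            + (1 - alpha) * constraint_term(params, measured, dt))


# Synthetic "measurement" of x1 only, generated with k1 = 0.8 and k2 = 0.4.
times = [0.05 * i for i in range(100)]
measured = simulate_x1(0.8, 0.4, times)
of_true = objective((0.8, 0.4), times, measured)
of_wrong = objective((0.3, 0.2), times, measured)
```

With noise-free data, `of_true` is close to zero (it is limited only by the finite-difference error), whereas `of_wrong` is penalized by both terms. In this toy case the constraint alone fixes only the sum $k_1 + k_2$, so the error term is still needed to separate the two parameters.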
The introduction of DE constraints into the objective function clearly improved the parameter accuracy. Indeed, the parameter value sets were correctly estimated with the DE constraints in the objective function, while they were falsely estimated without them. Furthermore, the parameter sets estimated with the constraints were sharply distributed near the correct values in all cases, in contrast to the wide distribution obtained without them. In general, the derivatives carry information on the curve form of the measured time-series data, such as the slope, extremal points and inflection points. This indicates that the new objective function evaluates the difference in not only the values but also the forms between the measured and estimated data, while the standard objective function evaluates only the difference in values. Note that the DE constraints are derived from the original system of differential equations for a given model in a mathematically rigorous manner. Thus, our approach is expected to be a general approach in parameter optimization for improving the parameter accuracy.
As expected, the new objective function requires more computational time than an objective function with only a standard error function, due to the additional functions in the DE constraints. Indeed, the computational time of our method was longer than that of the standard method in Models 1 and 2; the computational times for the standard method and our method were 0.4 and 2.3 hours in Model 1, and 0.03 and 0.22 hours in Model 2 (32 CPUs of Intel(R) Xeon(R) X5550 2.67GHz). In addition to the computational time, a pitfall of our method is the equation size of the DE constraints. In the equivalent systems, the number of terms frequently increases (see Additional file 3), and this may make it difficult to apply our method to a complex or large model. Although we have not yet reached a clear conclusion on how to overcome this difficulty, two ways can be considered. One way is an approximation method and the other is a mathematical manipulation method. As for the former method, in the DE constraints, the terms with a higher order of derivatives in the differential equations appeared frequently in the equivalent system (see Additional files 2 and 3). The magnitude of the estimated values of the higher order derivatives was relatively smaller than those of the lower order derivatives. If the terms with higher order derivatives can be neglected, then the computational time will be reduced. As for the latter method, we can use some equation-simplification methods by symbolic computation (personal communication from Drs. A. Sedoglavic, F. Lemaire and F. Boulier of Lille University). Indeed, the size of the DE constraints for the negative feedback model with oscillation was reduced from 7.4MB, obtained by pure differential elimination in the present procedure, to 0.1MB after the equation simplification by symbolic computation (data not shown).
Further studies will be needed to shorten the computational time by the combination of the approximation and the simplification of the DE constraints.
One possible use of our method is its application to network inference without known structure. Since the present method is designed with the assumption of a known network structure, the application range of our method to network inference is naturally restricted. However, our method can select the most plausible network structure among networks with similar structures. Indeed, we designed a similar procedure for evaluating the network structures with measured data. In our previous approach, we adopted the transformation of a system of differential equations into the equivalent system of algebraic equations by Laplace transformation. In this case, the system must be linear, due to the Laplace transformation. Furthermore, the numeric optimization in the previous approach frequently faces difficulties, due to the existence of the pole in the Laplace domain. In contrast, these pitfalls are overcome in the present method, by introducing the constraints by differential elimination. This supports the application of the present method to the model selection issue.
Various models for describing biological phenomena are available. In particular, several feedback models are important for describing the biological phenomena [12, 13]. Although the performance of our approach for the two representative models in biological phenomena was tested in this study, further tests for the performance of the DE constraint introduction remain for the models that are important in systems and synthetic biology. In the near future, we will report the evaluation of our approach in the cases of various models, in addition to the reduction of computational time and the trials of model selection.
The introduction of the constraints by using differential elimination effectively improved the parameter accuracy in two models of linear and nonlinear equations, especially when we assumed that unmeasured variables were included, with two optimization techniques. This clearly indicates that the ability of our method to estimate the parameter values was far superior to that of the methods with only the standard error function. Although the present study focused on two simple models, our method is a feasible approach for parameter estimation in network dynamics.
We assume that the model expresses the binding and dissociation between two molecules, and that only one complex, $x_{AB}$, can be measured.
We assume that the molecules, $x_2$, $x_3$, and $x_4$, activate $x_1$ with linear relationships, and that only one molecule, $x_1$, can be measured.
Two well-known parameter optimization techniques, the genetic algorithm (GA) [15–19] and particle swarm optimization (PSO) [20, 21], were used. In the parameter optimization, two thresholds were set to stop the optimization: the average value of the error function over time points, E/T, and the number of generations per optimization. In this study, we performed the optimization 200 times with each technique; the thresholds of E/T were set to 0.01 for Model 1 and 0.001 for Model 2, and the threshold for the generation number was set to 2000. As a result, the numbers of successes in 200 trials for Model 1 were 200 without DE constraints, and 51 by GA and 11 by PSO with DE constraints; for Model 2, the number of successes was 200 in all cases.
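The two stopping rules described above (a threshold on a quality measure such as E/T, and a cap on the number of generations) can be illustrated with a bare-bones particle swarm optimizer. This is a generic textbook-style sketch, not the PSO variant or settings used in the study: the swarm size, the inertia and acceleration coefficients, the `stop_metric` argument and the test function `sphere` are all assumptions made for illustration.

```python
import random


def pso(objective, bounds, stop_metric=None, threshold=1e-3,
        max_generations=2000, n_particles=30,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective`; stop when stop_metric(best) <= threshold
    or after max_generations. Returns (best position, generations, success)."""
    stop_metric = stop_metric or objective
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best

    for generation in range(1, max_generations + 1):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and keep it inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        # First stopping rule: quality threshold on the current global best.
        if stop_metric(g_pos) <= threshold:
            return g_pos, generation, True
    # Second stopping rule: generation limit reached without success.
    return g_pos, max_generations, False


def sphere(p):
    """Hypothetical test objective with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, generations, success = pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)], threshold=1e-6)
```

Passing a separate `stop_metric` allows the swarm to minimize a combined objective while a trial is counted as a success only when a different measure, for example the error function averaged over time points, falls below its threshold.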
where N and T are the number of variables and the time points, respectively: N was 2 for Model 1 and 4 for Model 2, and T was 100.
where L and T are the numbers of equivalent equations and time points, respectively: L was 2 for Model 1 and 5 for Model 2.
where α is the weight of the two functions, which is approximately estimated from Pareto optimal solutions for E and C and then is manually modified. In the present study, α was set to 0.1 in Model 1 and 0.9999999 in Model 2. As a result, our computational task is to determine a set of parameter values that minimizes OF.
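The role of Pareto optimal solutions in choosing α can be illustrated with a small sketch. The non-dominated filter below is the standard definition of a Pareto front for two minimized quantities. The `balancing_alpha` heuristic, which picks α so that the two weighted terms have comparable size along the front, is purely a hypothetical starting point that assumes an objective of the form αE + (1 − α)C; the source does not specify how α was derived from the Pareto solutions beyond an approximate estimate followed by manual adjustment.

```python
def pareto_front(points):
    """Return the (E, C) pairs that no other pair dominates (both minimized)."""
    front = []
    for i, (e_i, c_i) in enumerate(points):
        dominated = any(
            e_j <= e_i and c_j <= c_i and (e_j < e_i or c_j < c_i)
            for j, (e_j, c_j) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((e_i, c_i))
    return front


def balancing_alpha(front):
    """Hypothetical heuristic: alpha such that alpha * mean(E) == (1 - alpha) * mean(C)."""
    mean_e = sum(e for e, _ in front) / len(front)
    mean_c = sum(c for _, c in front) / len(front)
    if mean_e + mean_c == 0.0:
        return 0.5  # both terms vanish, so any weight is equivalent
    return mean_c / (mean_e + mean_c)


# Made-up (E, C) values for five candidate parameter sets.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(candidates)
alpha0 = balancing_alpha(front)
```

A value obtained this way would only be an initial guess, to be adjusted manually as described above.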
All of the symbolic computations for the differential elimination were performed using the diffalg package of Maple 10. In the performance of differential elimination, the ranking of variables was: $x_A$ ≻ $x_B$ ≻ $x_{AB}$ in Model 1 and P(Pool) ≻ $x_4$ ≻ $x_3$ ≻ $x_2$ ≻ $x_1$ in Model 2. Subsequently, we converted the polynomial equations derived by differential elimination to Java code by using the CodeGeneration feature in Maple 10.
This work was partly supported by a project grant, ‘Development of Analysis Technology for Gene Functions with Cell Arrays’, from The New Energy and Industrial Technology Development Organization (NEDO). KH was partly supported by a Grant-in-Aid for Scientific Research on Priority Areas "Systems Genomics" (grant 20016028) and for Scientific Research (A) (grant 19201039) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. In particular, the authors would like to express their gratitude to Drs. Alexander Sedoglavic, Francois Lemaire, and Francois Boulier of Lille University, for valuable discussions during the course of this work.
This article has been published as part of BMC Systems Biology Volume 4 Supplement 2, 2010: Selected articles from the Third International Symposium on Optimization and Systems Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1752-0509/4?issue=S2
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.