How informative is your kinetic model?: using resampling methods for model invalidation
- Dicle Hasdemir^{1, 2},
- Huub CJ Hoefsloot^{1, 2},
- Johan A Westerhuis^{1, 2} and
- Age K Smilde^{1, 2}
https://doi.org/10.1186/1752-0509-8-61
© Hasdemir et al.; licensee BioMed Central Ltd. 2014
Received: 21 February 2014
Accepted: 14 May 2014
Published: 22 May 2014
Abstract
Background
Kinetic models can present mechanistic descriptions of molecular processes within a cell. They can be used to predict the dynamics of metabolite production, signal transduction or transcription of genes. Although there has been tremendous effort in constructing kinetic models for different biological systems, not much effort has been put into their validation. In this study, we introduce the concept of resampling methods for the analysis of kinetic models and present a statistical model invalidation approach.
Results
We based our invalidation approach on the evaluation of a kinetic model’s predictive power through cross validation and forecast analysis. As a reference point for this evaluation, we used the predictive power of an unsupervised data analysis method which does not make use of any biochemical knowledge, namely Smooth Principal Components Analysis (SPCA), on the same test sets. Through a simulation study, we showed that overly simple mechanistic descriptions can be invalidated by our SPCA-based comparative approach up to high levels of noise in the experimental data. We also applied our approach to an eicosanoid production model developed for humans and concluded that the model could not be invalidated with the available data, despite the simplicity of its reaction kinetics. Furthermore, we analysed the high osmolarity glycerol (HOG) pathway in yeast to question the validity of an existing model, as another realistic demonstration of our method.
Conclusions
With this study, we have presented the potential of two resampling methods, cross validation and forecast analysis, in the analysis of kinetic models’ validity. Our approach is easy to grasp and to implement, is applicable to any ordinary differential equation (ODE) type biological model and does not suffer from the computational difficulties which seem to be a common problem for approaches that have been proposed for similar purposes. Matlab files needed for invalidation using SPCA cross validation and our toy model in SBML format are provided at http://www.bdagroup.nl/content/Downloads/software/software.php.
Background
With the concept of ‘systems biology’ coming to the stage of biological research, the construction of kinetic models has been the primary focus of a substantial number of studies [1–4]. Kinetic models are mechanistic representations of biological systems. They include information on two main levels. The first level of information comprises the metabolites, enzymes, signaling molecules and chemical reactions involved in the model, together with the formulation of the reaction kinetics, such as Michaelis-Menten kinetics. Knowledge about inhibition, activation and allosteric regulation of enzymes is also part of this level. The second level of information consists of the numerical values of all the parameters defined in the first level. Those parameters include but are not limited to rate parameters for chemical reactions, such as the production of new metabolites in metabolic models, post-translational modifications of proteins in signaling pathways and transcription processes in genetic regulatory circuits.
At present, kinetic models are usually restricted to small scale systems. The median numbers of reactions and species in the 462 curated kinetic models in the BioModels Database [5] are only 12 and 11, respectively. Yet the information they provide at both levels increases very rapidly. This is usually accomplished by in vitro experiments, which give insight into appropriate formulations of the enzyme kinetics. Values of the parameters can also be determined by in vitro experiments with isolated enzymes. Another common route is the use of in vivo experiments in which metabolite concentrations are measured. Optimal values of the parameters can then be estimated from the concentration data [6]. However, in vitro and in vivo kinetics can be very different, not only in the values of the parameters but, more importantly, also in the formulation [3]. This points to the need for careful investigation of a model’s validity on the first information level defined above.
Most of the time, models are assessed qualitatively based on the goodness of their fit to concentration data [2]. In other cases, new datasets in different biological conditions are generated and a qualitative analysis is made of the model’s ability to predict those new datasets [7]. However, multiple candidate models with different structures often show very similar goodness of fit, and similar predictive performance in another experimental condition. This stems from the high adaptability of these models. One could argue that all candidate models are good as long as they perform reasonably well in prediction. However, rapid elimination of less informative models would be very beneficial to the metabolic modeling community. It would ease the way to trustworthy libraries of models, providing researchers with speed and accuracy for larger scale models. To this aim, model selection and invalidation algorithms supply a quantitative framework.
Model selection criteria borrowed from the statistical literature, such as the Akaike and Bayesian Information Criteria (AIC and BIC, respectively), are among the most popular approaches introduced for the selection of systems biology models [8–10]. Model selection based on AIC has also been successfully implemented in software packages which aim to select the best model within a family of automatically generated models derived from one master model by adding or removing species or interactions [11, 12].
However, those criteria always favor one model without attaching any significance to the decision [13] and cannot produce clear results when many parameters are involved [12]. An alternative which is capable of ranking different models according to their plausibility was introduced within a Bayesian perspective, using Bayes factors [14]. Unfortunately, this family of Bayesian methods remains unemployed in the field, due to the need for careful assumptions on the parameters’ prior distributions and the cost of calculating bulky integrals, despite promising efforts regarding the second obstacle [15, 16]. In some studies, robustness based measures were proposed for model selection [17, 18]. For oscillating systems, the robustness of a model can support its preference over different models. However, this might not hold true for the whole family of kinetic models in systems biology.
Although not employed regularly, the systems biology community has been provided with tools to select between different model structures. However, invalidation of a model structure without an alternative to compare with has not received much consideration in the related literature. An analytical approach suggests the use of barrier certificates, which are functions whose existence proves that the model behaviour can never intersect the experimental data [19]. The existence of a barrier certificate proves the invalidity of the model. However, the approach is purely analytical and very complex, so it is not easily applied by biologists. Another drawback is the difficulty of constructing barrier certificates for complicated system descriptions, as the authors also elaborate in their paper.
In this article, we introduce a statistical measure for the invalidation of kinetic models which is hampered neither by complex model descriptions nor by large scale models. We use the predictive power of Smooth Principal Components Analysis (SPCA), an unsupervised data analysis method, as a threshold to assess the predictive power of kinetic metabolic models. By using this threshold value, we can determine which model structures are informative enough to deserve further attention and which model structures should be abandoned. Our approach rests on a basic assumption: if a totally unsupervised data analysis method without any prior biochemical knowledge predicts better than a kinetic model does, that points to an inaccuracy or incompleteness in the information which the kinetic model provides us with.
With this paper, we also want to bring the attention of the systems biology community to the idea of using resampling methods, which have proven to be very useful in machine learning and data analysis. To our knowledge, the potential of these methods has not been fully exploited in the analysis of kinetic systems biology models.
Using synthetic data generated from metabolic models has been widely adopted in the literature as a way of testing algorithms in a controlled context [20]. Here, we also employed this approach and used a toy metabolic model and a real signaling model for the generation of data. By using these data, we also tested models with lower complexity than the true model, to assess the sensitivity and specificity of our approach.
We also applied our method to an eicosanoid production model in human white blood cells. Eicosanoids are a subclass of fatty acyls. Fatty acyls constitute one of the six major classes of lipids and are related to inflammation, rheumatoid arthritis, sepsis and asthma. Eicosanoids are divided into different groups, one of which is the prostaglandin family. Prostaglandins have been found to be related to many symptoms of inflammation, such as fever and pain [2, 21, 22]. That makes the eicosanoids important targets for modeling studies, which can be used for predictive purposes in response to treatment with anti-inflammatory drugs. A kinetic model describing the production of prostaglandins from arachidonic acid was published in [2]. The model includes the substrate arachidonic acid, 8 downstream metabolites, signaling molecules and 4 different enzymes. All reactions were formulated by mass action kinetics. Due to the scarcity of information on enzyme activity regulation, rate parameters for enzymatic reactions were formulated as linear functions of enzyme-regulator molecules. Given the simplicity of the kinetics in the model and the limited number of components, we wanted to assess its informative level, and our results showed that the model could not be invalidated with the available data.
The other benchmark pathway we analysed was the well known high osmolarity glycerol (HOG) pathway in yeast. Osmo-adaptation in yeast started to receive increasing attention with the discovery of the associated mitogen-activated protein kinase (MAPK) cascade [23, 24]. Since then, the HOG pathway has proved to be a well studied model system for investigating the principles of signal transduction, since MAPK cascades are conserved eukaryotic signal transduction pathways. The pathway is in charge of regulating glycerol accumulation in the cell in response to the changing osmotic pressure of the environment. It is widely accepted that the upstream pathway consists of two redundant branches starting with two different transmembrane osmosensor proteins, Sho1p and Sln1p. The cascade proceeds with the phosphorylation of Pbs2p, the Pbs2p-Sho1p complex and Hog1p towards the transcriptional regulation of glycerol production [25, 26]. However, there is still active debate on the post-translational interactions and transient feedback mechanisms involved in the signal transduction [26, 27]. Therefore, we analysed a recently published comprehensive model of the HOG pathway to check its predictive properties given part of the experimental data used to build the model [26, 27]. We also used the model as a basis for our simulation studies, in which we generated data according to the published level of complexity and questioned a simplified version for its validity.
Methods
Toy metabolic model
Comparison of predictive power by cross validation
Comparison of predictive power by forecast analysis
Forecasting refers to predicting the future outcome of a variable of interest. It is commonly used in many disciplines where modeling is crucial, ranging from economics to meteorology. In forecast analysis, models are established using past data and extrapolated into the future. Variations of forecast analysis exist depending on the type of model, the needs of the field, the partitioning of the training and test sets and the type of measure used to assess the prediction error [30].
Here, we used a basic scheme which fits both SPCA and kinetic modeling. In each run, we left out approximately the last 20% of the time points of one metabolite as the test set. In this way, we could assign a certain percentage of the end of the time profiles to a test set at a time, and the total prediction error on those time points gave us a measure of the predictive power of the models.
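The forecast partitioning described above can be expressed as a binary weighting matrix, anticipating the WPCA formulation used later; the following Python sketch (with hypothetical dimensions, not the paper's data) illustrates it:

```python
import numpy as np

# Hypothetical data matrix: rows = time points, columns = metabolites.
rng = np.random.default_rng(0)
X = rng.random((20, 5))                   # 20 time points, 5 metabolites

def forecast_split(X, metabolite, frac=0.2):
    """Mask roughly the last `frac` of one metabolite's time points as test set.

    Returns a binary weight matrix W (1 = training point, 0 = test point),
    matching the weighting-matrix convention used for WPCA below.
    """
    n_time = X.shape[0]
    n_test = int(np.ceil(frac * n_time))
    W = np.ones_like(X)
    W[n_time - n_test:, metabolite] = 0   # hide the tail of this metabolite
    return W

W = forecast_split(X, metabolite=2)       # hold out the last 20% of metabolite 2
```

The prediction error of a model on the points where `W == 0` then measures its forecasting power.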
Kinetic modeling
Smooth principal components analysis
The other key feature of our approach is its comparative nature. The reference method we used for comparison was Smooth Principal Components Analysis (SPCA) [32]. SPCA is an extension of the well known dimension reduction method Principal Components Analysis (PCA) [29, 33] with roughness penalties on the scores.
The reference method is completely unsupervised, making use neither of the kinetic model structure nor of any prior biochemical knowledge. Smooth Principal Components Analysis penalizes the non-smoothness of the scores and can thus exploit the underlying time profile when predicting the missing points in the data [32]. This makes it more efficient than normal PCA as a prediction method when the scores are expected to be smooth, as in the case of time series data.
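Concretely, one common way to write such a penalised PCA objective (a sketch of the general idea; the exact formulation in [32] may differ in detail) imposes the roughness penalty through a second-difference operator D acting on each score vector over time:

$$
\min_{Z,\,P}\; \lVert X - Z P^{\mathsf{T}} \rVert_F^2 \;+\; \lambda \sum_{k=1}^{K} z_k^{\mathsf{T}} D^{\mathsf{T}} D\, z_k, \qquad P^{\mathsf{T}} P = I,
$$

where the z_k are the columns of the score matrix Z. A larger λ penalises rough score vectors more heavily, so the reconstructed time profiles become smoother.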
We estimated the smooth scores (Z) and loadings (P) within a Weighted Principal Components Analysis (WPCA) formulation. WPCA is a variety of PCA in which data points are weighted proportionally to the measurement accuracy at those points by using a weighting matrix [34]. WPCA can also be used to handle PCA on data with missing points, using a binary weighting matrix whose entries corresponding to missing points are 0 [35]. This makes it a favored analysis method in multivariate statistics when there are missing points in the data [36], and also for performing cross validation, where some of the data points are excluded as test set elements [28]. Our application in this study follows the latter.
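One simple way to realise PCA with a binary weighting matrix is an iterative, EM-style imputation scheme; the Python sketch below is one such realisation (with an illustrative rank-2 matrix), not necessarily the exact algorithm of [34, 35]:

```python
import numpy as np

def wpca_impute(X, W, n_pc, n_iter=200):
    """Rank-n_pc PCA with a binary weighting matrix W (0 = missing/test point).

    Missing entries are initialised with column means and then iteratively
    replaced by the current low-rank reconstruction (an EM-style scheme).
    """
    col_means = np.nanmean(np.where(W == 1, X, np.nan), axis=0)
    Xf = np.where(W == 1, X, col_means)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
        Xhat = (U[:, :n_pc] * s[:n_pc]) @ Vt[:n_pc]   # best rank-n_pc fit
        Xf = np.where(W == 1, X, Xhat)                # keep observed, update missing
    return Xhat

# Toy check: recover a held-out entry of an exactly rank-2 matrix.
rng = np.random.default_rng(1)
X = rng.random((30, 2)) @ rng.random((2, 6))          # exactly rank 2
W = np.ones_like(X)
W[4, 3] = 0                                           # hide one data point as "test set"
Xhat = wpca_impute(X, W, n_pc=2)
print(abs(Xhat[4, 3] - X[4, 3]))                      # imputation error on the hidden point
```

The squared residual `(Xhat - X)**2` at the masked entries is exactly the per-point prediction error summed up in the cross validation.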
Prior to using SPCA, the number of principal components (PCs) and the value of the smoothing parameter (λ) have to be calibrated for each specific problem. We used cross validation for this purpose as well. After the test set elements (outer test sets), which we also used in the Kinetic modeling section, were removed from the dataset, the remaining part was again divided into test (inner test sets) and training sets for a 10-fold cross validation with 10 repetitions. We applied SPCA with a particular value of λ and a particular number of PCs on every training set. The average prediction error on all inner test sets over the 10 repetitions gave us a measure of how well the inner test set points could be predicted with that particular parameter combination. We repeated the same procedure with increasing λ values and increasing numbers of PCs until the predictions on the inner test sets no longer improved with additional PCs and started to deteriorate with increasing λ beyond certain limits. These limits gave us the optimal values of the parameters. This approach is known as “double cross validation”, since it makes use of cross validation at two different levels, and it leads to unbiased prediction errors [37].
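The control flow of this double cross validation can be sketched as follows; `spca_error` is a hypothetical stand-in for fitting SPCA and scoring an inner test set, stubbed here so that the loop has a known optimum mimicking the values reported below:

```python
import numpy as np

rng = np.random.default_rng(3)

def spca_error(train_idx, test_idx, n_pc, lam):
    """Hypothetical stand-in: fit SPCA on the training points and return the
    prediction error on the inner test points.  Stubbed with a known optimum
    at (n_pc=4, lam=0.005); a real SPCA fit would go here."""
    return (n_pc - 4) ** 2 + (np.log10(lam) - np.log10(0.005)) ** 2

def double_cv(n_samples, pcs, lambdas, k=10, reps=10):
    """Inner loop of double cross validation: grid search over the number of
    PCs and lambda, scored by repeated k-fold CV on the data that remains
    after the outer test set has been removed."""
    best, best_err = None, np.inf
    for n_pc in pcs:
        for lam in lambdas:
            errs = []
            for _ in range(reps):
                idx = rng.permutation(n_samples)
                for fold in np.array_split(idx, k):
                    train = np.setdiff1d(idx, fold)
                    errs.append(spca_error(train, fold, n_pc, lam))
            if np.mean(errs) < best_err:
                best, best_err = (n_pc, lam), np.mean(errs)
    return best

best = double_cv(100, pcs=[1, 2, 3, 4, 5], lambdas=[0.001, 0.005, 0.05])
```

Only the winning `(n_pc, λ)` pair is then used when SPCA predicts the outer test sets, so the outer prediction errors stay unbiased.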
In forecast analysis, we followed the same approach with a small variation. There, in each run, we left out windows of data consisting of 5 consecutive time points of the same metabolite as inner test sets. This helped us to infer the optimal parameters for the accurate prediction of the end time points, because in forecast analysis the purpose is also to predict consecutive time points, in contrast to cross validation, where the outer test set points were not consecutive.
Results and discussion
Toy model
Table 1 All the invalidation decisions made by using cross validation

| σ_noise | MRN (%) | ODE_S | ODE_T |
|---|---|---|---|
| 0.001 | <1 | 100 | 0 |
| 0.01 | 2.2 | 100 | 0 |
| 0.025 | 5.4 | 100 | 4 |
| 0.03 | 6.5 | 100 | 8 |
| 0.05 | 10.8 | 75 | 14 |
Table 2 All the invalidation decisions made by using forecast analysis

| σ_noise | MRN (%) | ODE_S | ODE_T |
|---|---|---|---|
| 0.001 | <1 | 100 | 0 |
| 0.01 | 2.2 | 100 | 3 |
| 0.025 | 5.4 | 86 | 17 |
| 0.03 | 6.5 | 81 | 17 |
Up to this noise level, cross validation determined the optimal value of the λ parameter as 0.005 for all different realizations of the data. Cross validation also gave the optimal number of principal components as 4 in all of these cases, covering more than 99% of the variance in the data. We estimated the optimal values of the parameters to be the same across noise realizations due to the low amount of noise in the data. However, starting at this noise level, we had to determine the values of the SPCA parameters separately for each noise realization. This clearly showed that the datasets in the 100 different noise realizations had different characteristics, due to the increasing differences between realizations of the added noise. The difference in the parameters was more apparent for the forecast analysis than for the cross validation.
Noise level affects the plausibility of model simplifying approximations:
As a small demonstration of a specific research question for which our approach can be used, we investigated the plausibility of model simplifying approximations in kinetic modeling.
We used a moderate value (0.33) for the first Michaelis constant (Km_1) while generating the data. Its value was well within the range of the substrate concentration ([S]∈[0.2,1]). If it were much higher than the substrate concentration, the substrate concentration term in the denominator of the first rate equation (see Equation 1) could have been neglected. The model simplification from ODE_T to ODE_S could then have been performed with very little information loss. This approximation is widely employed in model fitting studies to justify the simplification of Michaelis-Menten kinetics to linear kinetics, which helps to decrease the number of parameters in the model. However, the ranges of parameter values in which this approximation is plausible are never clear.
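The quality of this approximation can be checked numerically: for v = Vmax·[S]/(Km + [S]), the relative error of the linearised rate (Vmax/Km)·[S] equals [S]/Km, so it is small only when Km greatly exceeds the substrate concentration. A Python sketch with illustrative values (Vmax is not taken from the model):

```python
import numpy as np

Vmax = 1.0                              # illustrative value, not from the model
S = np.linspace(0.2, 1.0, 50)           # substrate range used in the text

rel_err = {}
for Km in (0.33, 1.4, 3.0, 33.0):       # the three studied Km_1 values, plus Km >> [S]
    v_mm = Vmax * S / (Km + S)          # Michaelis-Menten rate
    v_lin = (Vmax / Km) * S             # linearised (first-order) rate
    rel_err[Km] = np.max(np.abs(v_lin - v_mm) / v_mm)   # equals max(S)/Km
    print(f"Km = {Km:5.2f}: max relative error of linearisation = {rel_err[Km]:.0%}")
```

At Km_1 = 0.33 the linearisation is off by a factor of several, while for Km far above the substrate range the error drops to a few percent.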
By using our SPCA-based invalidation approach, we could investigate how the invalidation decisions for the simplified model changed with the value of the Michaelis constant. This helped us to assess the plausibility of the approximation based on the degree of support by the available data. We could also observe how that assessment became more difficult with increasing noise in the data. For this purpose, we used three different Km_1 values in data generation. We performed the simulations with noise levels between σ_noise = 0.01 and σ_noise = 0.04.
Table 3 The number of cases where the model simplification was acceptable

| Km_1 | σ_noise = 0.01 | σ_noise = 0.02 | σ_noise = 0.04 |
|---|---|---|---|
| 0.33 | 0 | 0 | 10 |
| 1.4 | 44 | 70 | 82 |
| 3 | 100 | 97 | 94 |
The change in the accuracy of the plausibility assessment proved to be an even more important observation. Table 3 shows that under low levels of noise, when the Michaelis constant was only slightly above the range of the substrate concentration at 1.4, ODE_S was not invalidated in 44 of the realizations. This means that the simplification was supported in nearly half of the realizations. The number of realizations in which ODE_S could not be invalidated increased to 82 when the measurements were more erroneous, at σ_noise = 0.04 (Mean Relative Noise ≈ 8%). This clearly shows that noise is an important factor that interferes with the plausibility of model simplification. At low noise levels, it is easier to pick out the correct kinetic mechanism from the rest of the simpler candidates. When more noise is present in the data, the poorer predictions of simpler mechanisms become harder to detect. Models that are in fact too simple to explain the mechanistic behaviour can wrongly be regarded as plausible candidates when the measurement accuracy of the experiments is low.
Eicosanoid production model
We used the mean of all replicates in the calculations. However, the replicates in the data allowed us to estimate the noise level, and we calculated the mean relative noise (MRN) in the data as 8%. That level of noise corresponds to the medium to high noise range covered in our simulation study. Based on the results of that study, we could expect high sensitivity and specificity of our approach in this noise range.
In SPCA, we preprocessed the data in accordance with the kinetic modeling approach. Therefore, we first scaled every concentration value in the data matrix by the maximum concentration of the corresponding metabolite over all time points and carried out SPCA on that scaled data matrix. It is highly recommended to scale the data prior to any type of PCA application if the order of magnitude of the data values changes substantially between columns, since this allows a fairer distribution of the loadings of the variables over the most important principal components. The smoothing parameter then applies more equally to every metabolite and we can achieve better smoothing of all the time profiles.

We used an 8-fold diagonal cross validation scheme with 5 repetitions. The first test set involved consecutive time points from consecutive metabolites, as shown in Figure 2. The other 4 test sets involved time points with increasing intervals from different metabolites. With this approach we could obtain very diverse test sets, and all data points except the first and last time points of each metabolite were included in a test set five times. We also weighted the resulting residuals by the maximum concentration before summing them into the final value, and averaged by the number of repetitions.
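The max-scaling step can be sketched as follows (illustrative matrix, not the eicosanoid data):

```python
import numpy as np

rng = np.random.default_rng(5)
# Illustrative matrix: rows = time points, columns = metabolites whose
# concentrations differ by orders of magnitude.
X = rng.random((12, 8)) * np.array([1, 10, 100, 0.1, 1, 5, 50, 0.5])

# Divide each column by the metabolite's maximum over all time points,
# putting every profile on a comparable scale before SPCA.
X_scaled = X / X.max(axis=0)
```

After this scaling, every column peaks at 1, so no single high-concentration metabolite dominates the principal components or the smoothing penalty.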
HOG signaling model in yeast
Synthetic data
We used the model depicted in Figure 10 to generate data by using the optimal parameter values determined in [26]. The synthetic data consisted of the time profiles of 4 different species measured following two different osmotic shocks, at 0.4 and 0.5 M NaCl, in wild type cells. The species were phosphorylated Hog1p, glycerol, Hog1-dependent protein (mainly Gpd1p) and the associated mRNA. We set the number of measurement points to 43, spanning the dynamic part of the profiles between the shock and the steady state around one hour later. Following the generation of model values, we added heterogeneous noise to the data. Noise was drawn from a standard normal distribution with two different values of the standard deviation and multiplied by the concentration value of the species at that time point. The standard deviation was 0.01 and 0.2 in the low and high noise levels, respectively. We carried out kinetic modeling with the true model ODE_T that we used to generate the data and a simplified model ODE_S which lacked the post-translational regulation of glycerol production by phosphorylated Hog1p (see Figure 10). During both kinetic modeling and SPCA we used a weighting matrix which normalizes the difference between the data and the model predictions by the mean of the concentration values of each species over all time points. Weighting serves the purposes explained in the previous section.
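The multiplicative noise scheme above can be sketched as follows (Python, with an illustrative matrix standing in for the simulated profiles):

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in for the simulated profiles: 43 time points, 4 species.
X = rng.random((43, 4)) + 0.5

def add_heterogeneous_noise(X, sigma, rng):
    """Standard-normal noise, scaled by sigma and by the concentration at
    each time point, i.e. x -> x * (1 + sigma * eps) with eps ~ N(0, 1)."""
    return X * (1.0 + sigma * rng.standard_normal(X.shape))

X_low = add_heterogeneous_noise(X, 0.01, rng)    # low noise level
X_high = add_heterogeneous_noise(X, 0.2, rng)    # high noise level
```

Because the noise scales with the concentration, large and small species are perturbed by comparable relative amounts, which is what the weighting matrix later compensates for.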
Table 4 All the invalidation decisions made for the HOG pathway models

| σ_noise | ODE_S | ODE_T |
|---|---|---|
| 0.001 | 100 | 0 |
| 0.02 | 100 | 16 |
Real data
We used a part of the experimental data from [26] and [27] to question the best HOG signaling model reported in [26]. The real data included 4 different species. The first species was phosphorylated Hog1p, whose concentration values were normalized by its maximum concentration in wild type cells in the same osmotic shock experiment. It was measured for the Sho1 and Sln1 deletion mutants at 6 different levels of osmotic shock. The other species were glycerol, protein and the associated mRNA, measured in wild type cells following 0.5 M NaCl treatment. Those species’ concentrations were also normalized by their corresponding maximum concentrations throughout their time profiles. We used only the dynamic part of the time profiles, which starts after the osmotic shock. Some of the interior time points were missing in the original data, so we interpolated between the existing data points to obtain a full data matrix of 13 time points and 15 columns. We needed a full data matrix because calculating the prediction residuals for the comparison of the two approaches is an essential step in our analysis, and for this purpose we need to know the real concentration values at the data points that we leave out as test sets. Therefore, we imputed the missing values by interpolation prior to SPCA and ODE modeling. In total, more than half of the time points were calculated by interpolation for the Hog1-dependent proteins (mainly Gpd1p) and the glycerol. We questioned two different models, as in the case of the synthetic data. The simplified model lacked the post-translational regulation of glycerol production by Hog1p.
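Filling missing interior time points by linear interpolation, as done here, can be sketched as follows (illustrative times and values, not the HOG data):

```python
import numpy as np

# One species' time profile with missing interior points marked as NaN.
t = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])
y = np.array([1.0, np.nan, 0.6, np.nan, 0.3, 0.25])

observed = ~np.isnan(y)
# Linear interpolation between the observed points yields a full profile.
y_full = np.interp(t, t[observed], y[observed])
```

The resulting full matrix lets every candidate test-set point be compared against a concrete value when the prediction residuals are computed.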
The results showed that the model in question was not sufficient to explain the real data from [26] and [27] that we used in our study. However, we used only part of the data that was available. Furthermore, we had to impute many missing values prior to our calculations, as mentioned above. The reason for this is that we preferred to use the minimum amount of data that would suffice for the parameterization of the ODE model. Therefore, the results we highlight here should be regarded as a more realistic demonstration of our approach rather than as a basis for strict biological conclusions.
Conclusions
We introduced the use of two resampling methods, namely cross validation and forecast analysis, for the analysis of kinetic systems biology models. Cross validation and forecast analysis allowed us to use one part of the available time series metabolite concentration data to infer the proposed model’s kinetic parameters and the remaining part of the same dataset to assess the predictive power of the model. In this way, we have shown that resampling strategies eliminate the need for additional datasets for the assessment of the predictive capabilities of models. We used those two approaches within a Smooth Principal Components Analysis (SPCA)-based comparative approach for the invalidation of models.
Our approach depends on the assumption that correct kinetic model descriptions can predict the test data better than unsupervised data analysis methods which do not make use of any biochemical knowledge. Therefore, a deficiency of a kinetic model in prediction, compared to prediction by unsupervised data analysis methods, tells us that the model cannot describe the data sufficiently well. A solid measure of this level of ‘sufficiency’ is needed by the biochemical modeling community because, most of the time, we aim at the simplest model which is still competent in explaining the data, as was also given as a guideline in [12]. On the other hand, it is very important to emphasize that this kind of comparison to unsupervised methods is only needed for the assessment of kinetic models’ validity. We do not intend to underestimate the role of kinetic modeling by showing that there can be cases where unsupervised data analysis methods are superior to some kinetic models. Every kinetic model in systems biology is valuable and deserves attention, because it aims at providing the mechanistic explanations which the unsupervised data analysis methods of statistics lack. That independence from kinetic model structure is also exactly the reason why we used the predictive power of unsupervised data analysis methods as a reference point in this study. We used Smooth Principal Components Analysis for this purpose. SPCA offers better predictive capabilities than normal PCA since it can also make use of the underlying time profile, and hence is more suitable for time series data. SPCA is also very robust against small changes in the smoothing parameter λ, proving to be a stable reference point.
With our simulation study using synthetic data generated by a toy model, we showed that, up to high levels of experimental noise in the data, the SPCA cross validation prediction error can serve as a threshold to invalidate a too simple kinetic model with high specificity and sensitivity. It is, however, important to note that for an accurate comparison of predictive power, the inferred parameters of the kinetic model have to be optimal. Although this is not an easy task, many methods have been proposed in the literature to overcome the local minima problems encountered during parameter inference [38–40].
Forecast analysis requires higher penalties for smoothing of the scores in SPCA and noise is more influential. Predictions by SPCA forecasting and kinetic modeling are more dependent on the noise realization in the data compared to cross validation with interior time points. Therefore, we need to be more aware of the estimated noise level in the data if we want to use SPCA forecasting prediction error as an invalidation measure.
Our SPCA-based invalidation approach can also be employed iteratively for model reduction. The analysis of model families derived from a master model has proved to be a popular approach in biochemical modeling [11, 12, 26, 41]. In this approach, a master model is manipulated in certain directions, either by changing the interactions and the species involved or by changing the kinetic laws of the model. In this way, a very large number of models with very different numbers of parameters are created and analyzed. Here, selection of the best model within the large family of models is a critical task, and our invalidation approach can be very useful at that stage. The most complex models within the model family can be questioned first for their validity. Later, they can be subjected to step-wise simplification by removal of interactions or simplification of reaction kinetics. At a certain stage, the models will be invalidated by our approach, meaning that they fail to explain the data sufficiently well. The models are then in their simplest acceptable form one step before the invalidation decision. However, at that step there would still be a number of models with different characteristics which could not be invalidated. The problem of model invalidation thus turns into a problem of model selection between a number of models of similar complexity. At that point, we can make use of model selection criteria such as AIC or BIC, complementary to our invalidation approach, for the ultimate selection of the best model.
Declarations
Acknowledgements
This project was financed by the Netherlands Metabolomics Centre (NMC) which is a part of the Netherlands Genomics Initiative/Netherlands Organisation for Scientific Research. The authors thank Maikel Verouden for the m-files performing SPCA.
Authors’ Affiliations
References
- Kotte O, Zaugg J, Heinemann M: Bacterial adaptation through distributed sensing of metabolic fluxes. Mol Syst Biol. 2010, 6: 355.
- Gupta S, Maurya MR, Stephens DL, Dennis EA, Subramaniam S: An integrated model of eicosanoid metabolism and signaling based on lipidomics flux analysis. Biophys J. 2009, 96 (11): 4542-4551. 10.1016/j.bpj.2009.03.011.
- Teusink B, Passarge J, Reijenga CA, Esgalhado E, van der Weijden CC, Schepper M, Walsh MC, Bakker BM, van Dam K, Westerhoff HV, Snoep JL: Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing biochemistry. Eur J Biochem. 2000, 267 (17): 5313-5329. 10.1046/j.1432-1327.2000.01527.x.
- du Preez FB, van Niekerk DD, Kooi B, Rohwer JM, Snoep JL: From steady-state to synchronized yeast glycolytic oscillations I: model construction. FEBS J. 2012, 279 (16): 2810-2822. 10.1111/j.1742-4658.2012.08665.x.
- Le Novère N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, Li L, Sauro H, Schilstra M, Shapiro B, Snoep JL, Hucka M: BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 2006, 34 (Database issue): D689-D691.
- Timmer J, Müller TG, Swameye I, Sandra O, Klingmüller U: Modeling the nonlinear dynamics of cellular signal transduction. Int J Bifurcation Chaos. 2004, 14 (6): 2069-2079. 10.1142/S0218127404010461.
- du Preez FB, van Niekerk DD, Snoep JL: From steady-state to synchronized yeast glycolytic oscillations II: model validation. FEBS J. 2012, 279 (16): 2823-2836. 10.1111/j.1742-4658.2012.08658.x.
- Akaike H: A new look at the statistical model identification. IEEE Trans Automatic Control. 1974, 19 (6): 716-723. 10.1109/TAC.1974.1100705.
- Turkheimer FE, Hinz R, Cunningham VJ: On the undecidability among kinetic models: from model selection to model averaging. J Cereb Blood Flow Metab. 2003, 23 (4): 490-498.
- Link H, Kochanowski K, Sauer U: Systematic identification of allosteric protein-metabolite interactions that control enzyme activity in vivo. Nat Biotechnol. 2013, 31 (4): 357-361. 10.1038/nbt.2489.
- Flotmann M, Schaber J, Hoops S, Klipp E, Mendes P: ModelMage: a tool for automatic model generation, selection and management. Genome Inform. 2008, 20: 52-63.
- Haunschild MD, Freisleben B, Takors R, Wiechert W: Investigating the dynamic behavior of biochemical networks using model families. Bioinformatics. 2005, 21 (8): 1617-1625. 10.1093/bioinformatics/bti225.
- Cedersund G, Roll J: Systems biology: model based evaluation and comparison of potential explanations for given biological data. FEBS J. 2009, 276 (4): 903-922. 10.1111/j.1742-4658.2008.06845.x.
- Vyshemirsky V, Girolami M: Bayesian ranking of biochemical system models. Bioinformatics. 2008, 24 (6): 833-839. 10.1093/bioinformatics/btm607.
- Toni T, Stumpf MPH: Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics. 2010, 26: 104-110. 10.1093/bioinformatics/btp619.
- Milias-Argeitis A, Porreca R, Summers S, Lygeros J: Bayesian model selection for the yeast GATA-factor network: a comparison of computational approaches. 49th IEEE Conference on Decision and Control (CDC). 2010, 3379-3384.
- Bates DG, Cosentino C: Validation and invalidation of systems biology models using robustness analysis. IET Syst Biol. 2011, 5 (4): 229-244. 10.1049/iet-syb.2010.0072.
- Hafner M, Koeppl H, Hasler M, Wagner A: ‘Glocal’ robustness analysis and model discrimination for circadian oscillators. PLoS Comput Biol. 2009, 5 (10): e1000534. 10.1371/journal.pcbi.1000534.
- Anderson J, Papachristodoulou A: On validation and invalidation of biological models. BMC Bioinformatics. 2009, 10: 1-13.
- Mendes P, Camacho D, de la Fuente A: Modelling and simulation for metabolomics data analysis. Biochem Soc Trans. 2005, 33 (Pt 6): 1427-1429.
- Buczynski MW, Dumlao DS, Dennis EA: Thematic review series: proteomics. An integrated omics analysis of eicosanoid biology. J Lipid Res. 2009, 50 (6): 1015-1038. 10.1194/jlr.R900004-JLR200.
- Yang K, Ma W, Liang H, Ouyang Q, Tang C, Lai L: Dynamic simulations on the arachidonic acid metabolic network. PLoS Comput Biol. 2007, 3 (3): e55. 10.1371/journal.pcbi.0030055.
- Hohmann S: Osmotic stress signaling and osmoadaptation in yeasts. Microbiol Mol Biol Rev. 2002, 66 (2): 300-372. 10.1128/MMBR.66.2.300-372.2002.
- Gustin MC, Albertyn J, Alexander M, Davenport K: MAP kinase pathways in the yeast Saccharomyces cerevisiae. Microbiol Mol Biol Rev. 1998, 62 (4): 1264-1300.
- Posas F, Wurgler-Murphy SM, Maeda T, Witten EA, Thai TC, Saito H: Yeast HOG1 MAP kinase cascade is regulated by a multistep phosphorelay mechanism in the SLN1–YPD1–SSK1 “two-component” osmosensor. Cell. 1996, 86 (6): 865-875. 10.1016/S0092-8674(00)80162-2.
- Schaber J, Baltanas R, Bush A, Klipp E, Colman-Lerner A: Modelling reveals novel roles of two parallel signalling pathways and homeostatic feedbacks in yeast. Mol Syst Biol. 2012, 8: 622.
- Macia J, Regot S, Peeters T, Conde N, Sole R, Posas F: Dynamic signaling in the Hog1 MAPK pathway relies on high basal signal transduction. Sci Signal. 2009, 2 (63): ra13.
- Bro R, Kjeldahl K, Smilde AK, Kiers HAL: Cross-validation of component models: a critical look at current methods. Anal Bioanal Chem. 2008, 390 (5): 1241-1251. 10.1007/s00216-007-1790-1.
- Alpaydin E: Introduction to Machine Learning. 2004, Cambridge: MIT Press.
- Box GEP, Jenkins GM: Time Series Analysis: Forecasting and Control. 1976, San Francisco: Holden-Day.
- Coleman TF, Li Y: An interior trust region approach for nonlinear minimization subject to bounds. SIAM J Optim. 1996, 6 (2): 418-445. 10.1137/0806023.
- Verouden MP: Fusing prior knowledge with microbial metabolomics. PhD thesis, University of Amsterdam. 2012.
- Jolliffe I: Principal Component Analysis. 2002, New York: Springer-Verlag.
- Jansen JJ, Hoefsloot HCJ, Boelens HFM, van der Greef J, Smilde AK: Analysis of longitudinal metabolomics data. Bioinformatics. 2004, 20 (15): 2438-2446. 10.1093/bioinformatics/bth268.
- Kiers HAL: Weighted least squares fitting using ordinary least squares algorithms. Psychometrika. 1997, 62 (2): 251-266. 10.1007/BF02295279.
- Josse J, Husson F: Handling missing values in exploratory multivariate data analysis methods. J de la Société Française de Stat. 2012, 153 (2): 79-99.
- Smit S, van Breemen MJ, Hoefsloot HC, Smilde AK, Aerts JM, de Koster CG: Assessing the statistical validity of proteomics based biomarkers. Anal Chim Acta. 2007, 592 (2): 210-217. 10.1016/j.aca.2007.04.043.
- Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MP: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface. 2009, 6 (31): 187-202. 10.1098/rsif.2008.0172.
- Moles CG, Mendes P, Banga JR: Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 2003, 13 (11): 2467-2474. 10.1101/gr.1262503.
- Mendes P, Kell D: Non-linear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics. 1998, 14 (10): 869-883. 10.1093/bioinformatics/14.10.869.
- Kuepfer L, Peter M, Sauer U, Stelling J: Ensemble modeling for analysis of cell signaling dynamics. Nat Biotechnol. 2007, 25 (9): 1001-1006. 10.1038/nbt1330.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.