A model comparison study of the flowering time regulatory network in Arabidopsis

Abstract

Background

Several dynamic models of a gene regulatory network of the light-induced floral transition process in Arabidopsis have been developed to capture the behavior of gene transcription and infer predictions based on experimental observations. It has been proven that the models can make accurate and novel predictions, which generate testable hypotheses.

Two major issues were addressed in this study. First, constructing dynamic models of gene regulatory networks requires mathematical modeling with equations that contain a large number of parameters. Second, the binding mechanism of the transcription factor with DNA requires detailed modeling. The first issue was tackled by adopting optimization algorithms, and the second was addressed by comparing the performance of three alternative modeling approaches, namely the S-system, the Michaelis-Menten model and the Mass-action model. The efficiency of parameter estimation and the modeling performance were assessed using the least squares error (O(p)), the mean relative error (MRE) and the Akaike Information Criterion (AIC).

Results

We compared three models to describe gene regulation of the flowering transition process in Arabidopsis. The Mass-action model is the simplest and has the fewest parameters. It is therefore the least computation-intensive and has the smallest AIC value. The disadvantage, however, is that it assumes the system is simply a second-order reaction, which is not the case in our study. The Michaelis-Menten model also assumes the system is homogeneous and ignores the intracellular protein transport process. The S-system model has the best performance and it does describe the diffusion effects. A disadvantage of the S-system is that it involves the most parameters. Its AIC value, the largest of the three, also implies that over-fitting may occur in parameter estimation.

Conclusions

Three dynamic models were adopted to describe the dynamics of the gene regulatory network of the flowering transition process in Arabidopsis. Based on the MRE, the least squares error and the sensitivity analysis, the S-system has the best performance. However, the fact that it has the highest AIC suggests that over-fitting may occur in parameter estimation. The results of this study should therefore be applied with care when modeling complex gene regulatory networks.

Background

Arabidopsis thaliana is a plant in the mustard family that has frequently been chosen as the model organism in plant science research. It has a small size, diploid genetics, a small genome and a relatively short generation time. The transition of Arabidopsis from vegetative to reproductive growth is an important developmental step that is under tight genetic control. The floral transition has been shown to be integrated by a complex gene regulatory network.

For Arabidopsis, floral organ specification has been successfully linked to spatial gene expression patterns according to floral transition and floral development. This model has five pathways that can explain various external (photoperiod, vernalization, ambient temperature) and internal (autonomous, age, gibberellins) conditions to regulate the floral transition through an elaborated genetic network [15].

Recently, gene expression data sets have become available for the genes involved in the regulation of floral transition and flower development in Arabidopsis. In [6], time series of gene expression were presented for each class of genes in the floral transition group. Most of the floral transition genes, such as APETALA1 (AP1) and LEAFY (LFY), encode transcription factors whose mode of activation is known from experiments [7]. Together, these two information sources open the door for developing mathematical models of flowering time regulation in Arabidopsis.

Inferring gene regulatory networks from time course data has been one of the main challenges in systems biology. In recent years, technological advances have driven the development of experimental methods that generate in vivo time course data to characterize regulatory interactions, and the number of publications on model construction has increased significantly. Some examples include: cell fate determination in Arabidopsis flowers [8], a model study of the role of proteins (CLV1, CLV2, CLV3 and WUS) in the shoot apical meristem of Arabidopsis [9], integration of developmental and hormonal pathways in the Arabidopsis flower [10], and gene regulatory network models for plant development [11].

However, a major challenge with such models is that the microscopic details of the transcription factor binding process are usually unclear; therefore these models may deviate from reality. In addition, a dynamic model requires extensive mathematical formulation and a large amount of experimental data, which are often not available. Alternatively, a large-scale gene regulation model can be constructed based on stoichiometry without a large number of fitted parameters. Although these models can be used to predict regulation behaviour using flux analysis, they fail to capture the transient behaviour of genes. For instance, Mahadevan [12] proposed dynamic flux balance analysis for situations where there is knowledge available; Yugi [13] proposed a method that aims to reduce the number of kinetic parameters in building a dynamic model.

Many studies on dynamic simulation of gene regulation systems have been reported in the literature. Spieth [14] used linear weight matrices, S-system and H-system models, and different optimization algorithms to model a nonlinear dynamic system. Rafael et al. [15] compared the Michaelis-Menten, power-law and generalized mass action models for representing an E. coli central carbon metabolic network.

In this study, we modeled the regulatory interactions in the flowering of Arabidopsis with a series of kinetic functions. The first case considers the condition that mRNA is produced immediately after transcription factor binding. This process is formulated as a mass action model. The second case assumes that a complex state is formed between the transcription factor and its target gene. The production of mRNA is delayed due to the stability of the complex state. This process is formulated as a Michaelis-Menten model. The third case assumes that the binding process of the transcription factor is limited by 1-D and 3-D diffusion, and that the production of mRNA is dominated by a diffusion-reaction process. Accordingly, the S-system was adopted in this study to model such a mechanism.

Results

Comparison of the models

Table 1 presents the reaction mechanisms depicted in the flowering time regulatory network of Arabidopsis. This gene regulatory network describes the flowering time (Photoperiodic) in Arabidopsis thaliana. The core of this regulatory network includes:

  1. The photoperiod activates the CO gene.

  2. After CO activates the expression of FLOWERING LOCUS T (FT), probably by binding to the FT regulatory regions, FT interacts with the bZIP transcription factor FD, and the FT/FD complex activates the expression of SOC1.

  3. SOC1 and AGL24 form a positive feedback loop and up-regulate LFY.

  4. AP1 and LFY are ultimately resolved in the up-regulation of the floral meristem identity genes.

Table 1 A dynamic model for the concentration of gene complexes of the flowering time regulatory network of Arabidopsis according to the reaction scheme depicted in Figure 4

In this study, we compared three dynamic models to reconstruct the behaviour of the flowering time regulatory network of Arabidopsis. The governing differential equations are listed in the following:

S-system

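\[
\begin{aligned}
\dot{X}_1 &= \alpha_1 X_9^{g_{19}} - \beta_1 X_1^{h_{11}} \\
\dot{X}_2 &= \alpha_2 X_1^{g_{21}} - \beta_2 X_2^{h_{22}} X_8^{h_{28}} \\
\dot{X}_3 &= \alpha_3 X_2^{g_{32}} X_8^{g_{38}} - \beta_3 X_3^{h_{33}} \\
\dot{X}_4 &= \alpha_4 X_3^{g_{43}} X_6^{g_{46}} - \beta_4 X_4^{h_{44}} \\
\dot{X}_5 &= \alpha_5 X_3^{g_{53}} - \beta_5 X_5^{h_{55}} \\
\dot{X}_6 &= \alpha_6 X_4^{g_{64}} - \beta_6 X_6^{h_{66}} \\
\dot{X}_7 &= \alpha_7 X_6^{g_{76}} - \beta_7 X_7^{h_{77}}
\end{aligned}
\]

$\dot{X}_8$ and $\dot{X}_9$ are constants.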
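The S-system equations above can be sketched in code as follows. This is only an illustration under stated assumptions, not the COPASI setup used in the study: the function names (`placeholder_params`, `s_system_rhs`, `euler_step`), the parameter values, the initial state and the forward-Euler step size are all hypothetical, and X8 and X9 are simply held fixed as inputs.

```python
# Sketch of the seven S-system equations. All numeric values are illustrative
# placeholders (assumptions), not the fitted values reported by the authors.

# Index pairs (i, j) of the kinetic orders that appear in the equations.
G_KEYS = [(1, 9), (2, 1), (3, 2), (3, 8), (4, 3), (4, 6), (5, 3), (6, 4), (7, 6)]
H_KEYS = [(1, 1), (2, 2), (2, 8), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)]


def placeholder_params(alpha=1.0, beta=1.0, order=0.5):
    """Build a hypothetical parameter set with one shared value per kind."""
    return {
        "alpha": {i: alpha for i in range(1, 8)},
        "beta": {i: beta for i in range(1, 8)},
        "g": {key: order for key in G_KEYS},
        "h": {key: order for key in H_KEYS},
    }


def s_system_rhs(x, p):
    """Return {i: dX_i/dt} for i = 1..7; x maps index (1..9) to concentration.

    X8 and X9 are read as fixed inputs, which is an assumption of this sketch.
    """
    a, b, g, h = p["alpha"], p["beta"], p["g"], p["h"]
    return {
        1: a[1] * x[9] ** g[1, 9] - b[1] * x[1] ** h[1, 1],
        2: a[2] * x[1] ** g[2, 1] - b[2] * x[2] ** h[2, 2] * x[8] ** h[2, 8],
        3: a[3] * x[2] ** g[3, 2] * x[8] ** g[3, 8] - b[3] * x[3] ** h[3, 3],
        4: a[4] * x[3] ** g[4, 3] * x[6] ** g[4, 6] - b[4] * x[4] ** h[4, 4],
        5: a[5] * x[3] ** g[5, 3] - b[5] * x[5] ** h[5, 5],
        6: a[6] * x[4] ** g[6, 4] - b[6] * x[6] ** h[6, 6],
        7: a[7] * x[6] ** g[7, 6] - b[7] * x[7] ** h[7, 7],
    }


def euler_step(x, p, dt):
    """Advance X1..X7 by one forward-Euler step; X8 and X9 stay unchanged."""
    dx = s_system_rhs(x, p)
    new_x = dict(x)
    for i, rate in dx.items():
        new_x[i] = x[i] + dt * rate
    return new_x


if __name__ == "__main__":
    state = {i: 0.5 for i in range(1, 10)}
    params = placeholder_params()
    for _ in range(1000):
        state = euler_step(state, params, dt=0.01)
    print({i: round(v, 3) for i, v in state.items()})
```

Forward Euler is chosen here only to keep the sketch dependency-free; a stiff or adaptive ODE solver would be the more usual choice for fitting real data.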

Michaelis-Menten model

\[
\begin{aligned}
\dot{X}_1 &= \frac{V_{m1} X_9}{K_{m1} + X_9} - dk_1 X_1 \\
\dot{X}_2 &= \frac{V_{m2} X_1}{K_{m2} + X_1} - dk_2 X_2 \\
\dot{X}_3 &= \frac{V_{m3} X_2 X_8}{K_{m3} + X_2 X_8} - dk_3 X_3 \\
\dot{X}_4 &= \frac{V_{m4a} X_3}{K_{m4a} + X_3} \cdot \frac{V_{m4b} X_6}{K_{m4b} + X_6} - dk_4 X_4 \\
\dot{X}_5 &= \frac{V_{m5} X_3}{K_{m5} + X_3} - dk_5 X_5 \\
\dot{X}_6 &= \frac{V_{m6} X_4}{K_{m6} + X_4} - dk_6 X_6 \\
\dot{X}_7 &= \frac{V_{m7} X_6}{K_{m7} + X_6} - dk_7 X_7
\end{aligned}
\]

$\dot{X}_8$ and $\dot{X}_9$ are constants.

Mass-action model

\[
\begin{aligned}
\dot{X}_1 &= kr_1 [X_9] - dk_1 [X_1] \\
\dot{X}_2 &= kr_2 [X_1] - dk_2 [X_2] \\
\dot{X}_3 &= kr_3 [X_2][X_8] - dk_3 [X_3] \\
\dot{X}_4 &= kr_{4a} [X_3] \cdot kr_{4b} [X_6] - dk_4 [X_4] \\
\dot{X}_5 &= kr_5 [X_3] - dk_5 [X_5] \\
\dot{X}_6 &= kr_6 [X_4] - dk_6 [X_6] \\
\dot{X}_7 &= kr_7 [X_6] - dk_7 [X_7]
\end{aligned}
\]

$\dot{X}_8$ and $\dot{X}_9$ are constants.

Table 2 summarizes the total number of parameters used in each model. The S-system model used 31 parameters to describe the floral transition pathway, whereas the Michaelis-Menten model and the Mass-action model used 23 and 15 parameters, respectively. The S-system used the most parameters because its reaction rates are described by a non-linear (power-law) function of the reactant concentrations, with a kinetic order attached to every concentration.

Table 2 The total number of parameters in each of the three models used in this study
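The parameter totals can be checked by counting the symbols that appear in the three equation systems. The snippet below is a small bookkeeping sketch; the list and function names are chosen here for illustration only.

```python
# Count the free parameters of each model from the symbols in its equations.

N_GENES = 7  # X1..X7 each have one differential equation

# Kinetic orders that appear in the S-system equations.
G_ORDERS = ["g19", "g21", "g32", "g38", "g43", "g46", "g53", "g64", "g76"]
H_ORDERS = ["h11", "h22", "h28", "h33", "h44", "h55", "h66", "h77"]

# Production terms in the Michaelis-Menten and Mass-action equations
# (gene 4 has two terms, 4a and 4b).
PRODUCTION_TERMS = ["1", "2", "3", "4a", "4b", "5", "6", "7"]


def parameter_counts():
    """Return the number of parameters per model."""
    return {
        # alpha_i and beta_i per gene, plus all kinetic orders
        "S-system": 2 * N_GENES + len(G_ORDERS) + len(H_ORDERS),
        # Vm and Km per production term, plus one dk per gene
        "Michaelis-Menten": 2 * len(PRODUCTION_TERMS) + N_GENES,
        # one kr per production term, plus one dk per gene
        "Mass-action": len(PRODUCTION_TERMS) + N_GENES,
    }


if __name__ == "__main__":
    print(parameter_counts())
```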

Parameter estimation can be regarded as a reverse engineering problem, which may be solved with local or global optimization methods. In this study, three optimization methods were employed: Hooke & Jeeves (HJ, local optimization), and Evolutionary Programming (EP) and Particle Swarm Optimization (PSO) (both global optimization). The O(p) and MRE were used to measure the quality of the fit for the estimated parameters. The values of O(p) calculated for the three models and the three optimization methods with four experimental data sets are shown in Figure 1. The result suggests that the PSO method was the most suitable for our dynamic models.

Figure 1

An analysis of the O(p) calculated for the three models and three optimization methods in four experimental data sets: (A) O(p) calculated for the col experimental data; (B) O(p) calculated for the ler experimental data; (C) O(p) calculated for the co experimental data; and (D) O(p) calculated for the ft experimental data.

The values of the MRE calculated for the three models and the four experimental data sets are presented in Table 3. The result suggests that the S-system model could achieve the best performance compared with the other two models, as the S-system has the smallest mean relative error (shown in Figure 2).

Table 3 The means and standard deviations of MRE calculated for the S-system, Michaelis-Menten model and Mass-action model
Figure 2

The MRE calculated for the three models with four experimental data sets.

The AIC values calculated for the three models, namely the S-system (col, 53.0331; ler, 52.6223; co, 46.2319; ft, 49.6211), the Michaelis-Menten model (col, 30.1816; ler, 38.1605; co, 24.6298; ft, 34.1275) and the Mass-action model (col, 26.1364; ler, 26.5598; co, 10.8465; ft, 22.8427), are presented in Table 4. Among the three, the Mass-action model is the simplest and has the fewest parameters, and it accordingly has the smallest AIC value for every data set.

Table 4 The Akaike Information Criterion (AIC) calculated for the S-system, Michaelis-Menten model and Mass action model
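The ranking implied by the reported AIC values can be reproduced with a few lines of code. The values below are copied from the text; the dictionary layout and the function name `rank_by_aic` are illustrative choices, and the snippet only sorts the reported numbers rather than recomputing AIC.

```python
# Rank the three models by the AIC values reported in the text (lower is better).

AIC = {
    "S-system": {"col": 53.0331, "ler": 52.6223, "co": 46.2319, "ft": 49.6211},
    "Michaelis-Menten": {"col": 30.1816, "ler": 38.1605, "co": 24.6298, "ft": 34.1275},
    "Mass-action": {"col": 26.1364, "ler": 26.5598, "co": 10.8465, "ft": 22.8427},
}


def rank_by_aic(dataset):
    """Return model names ordered from smallest to largest AIC for one data set."""
    return sorted(AIC, key=lambda model: AIC[model][dataset])


if __name__ == "__main__":
    for dataset in ("col", "ler", "co", "ft"):
        print(dataset, rank_by_aic(dataset))
```

The ordering is the same for all four data sets: Mass-action, then Michaelis-Menten, then S-system.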

These results suggest that the S-system model represents an option to simulate the dynamics of our gene regulatory network. It is understood that a limitation of the S-system model is that the parameters may not be identifiable when the concentrations and reaction rates are very small. However, considering that the cell interior is homeostatic, this condition is unlikely to occur during the flowering time of Arabidopsis. Another possible difficulty with the S-system is the low sampling intervals of the genes required for parameter identification, which is also a challenge for all other kinetic models. For parameter estimation we assumed that all the 8 genes are measurable.

The estimated parameter values are listed in Additional file 1. Figure 3 shows the simulated time course data for the following genes: CONSTANS (CO), FLOWERING LOCUS T (FT), protein FD (FD), SUPPRESSOR OF CONSTANS OVEREXPRESSION 1 (SOC1), APETALA1 (AP1), AGL24 and LEAFY (LFY). It is noted that the discrepancies between the S-system predicted values and the mRNA expression levels were relatively small for all of the modeled genes, suggesting that it may successfully replace the other two models to simulate time course data.

Figure 3

A comparison of the simulated regulation of the flowering time in Arabidopsis (0, 3, 5, and 7 days) for CONSTANS (CO), FLOWERING LOCUS T (FT), Protein FD (FD), SUPPRESSOR OF CONSTANS OVEREXPRESSION 1 (SOC1), APETALA1 (AP1), AGL24 and LEAFY (LFY): the ler expression data (black solid line) and the three models (S-system, Mass-action model, Michaelis-Menten model).

Performance of the three models

In this study, we compared three alternative models, namely the S-system, the Michaelis-Menten model and the Mass-action model, for the flowering time regulatory network of Arabidopsis. Both the Michaelis-Menten model and the Mass-action model ignore the diffusion effect in the reaction. The Mass-action model assumes that the TF initiates mRNA transcription immediately, whereas the Michaelis-Menten model assumes that the TF first binds to the DNA and that the mRNA is produced afterwards. The S-system models this process by adopting the 1D and 3D diffusion-reaction mechanisms. The 1D diffusion-reaction mechanism assumes that the TF binds to the target site and then activates the mRNA. The 3D diffusion-reaction mechanism assumes that the TF first binds to the DNA and then diffuses along the DNA to search for the target site, activating mRNA transcription in the final stage.

Figure 1 and Figure 2 show the O(p) and MRE, respectively, calculated for the seven floral transition genes. The results indicate that the S-system achieved the best agreement with the experimental data.

Sensitivity analysis of the models

Sensitivity analysis can be applied to estimate the effect of parameter changes, while MRE gives an estimate of a model’s rate of change based on local sensitivity analysis. In this section, we report the results of the time dependent sensitivity analysis [Eq. 8] for a time period of 100 seconds.

As shown in Figure 4, the sensitivity measure is positive for every gene. This implies that the mRNA concentration increases in response to the perturbation, because all of the interactions are activations. The results of the S-system show that fluctuations are limited to the first 10 seconds, which is relatively short compared to 50 seconds for the Michaelis-Menten model or the Mass-action model (see Additional file 2 and Additional file 3). This in turn suggests that the response is at most a transient effect. A sensitivity value near zero means that gene activity is not sensitive to the parameter perturbation. For a given gene, the response curves for the rate constants reflect the effect of the perturbation on the transcription rate. For the kinetic orders, the sensitivity results indicate the effects on the strength of the activation or suppression.

Figure 4

Time-dependent sensitivity analysis of the parameters in the S-system, where a and b are system function parameter vectors (alpha and beta) consisting of rate constants, and g and h are kinetic orders for genes CONSTANS (CO), FLOWERING LOCUS T (FT), Protein FD (FD), SUPPRESSOR OF CONSTANS OVEREXPRESSION 1 (SOC1), APETALA1 (AP1), AGL24 and LEAFY (LFY).

Although an FT/FD complex formation measurement was not available, as long as we assumed a steady state approximation, i.e., [FD/FT] = k[FD][FT], the S-system was still capable of giving a reasonable fitting for the complex’s regulated genes: SOC1 and AP1.

In this study, we focused on the experiments for fitting the parameters. With the sensitivity analysis we identified the most sensitive parameters and sampling time intervals; this may provide some directions for future experimental design for model refinement.

Discussions

Production of mRNA is dominated by the binding process between TF and its target gene. This binding process can be described by a diffusion-reaction mechanism. Although the Michaelis-Menten model has been widely used in gene regulation models, the results based on the MRE values show that the S-system is a better dynamic model for describing the flowering time of Arabidopsis. But the AIC values for the S-system (shown in Table 4) were larger than those of the Mass-action model, which implies an over-fitting may occur in parameter estimation.

The deviation between the simulated data and the experimental data is significant for the genes AGL24 and SOC1 in all three models. These two genes form a feedback loop in the regulatory network, and such interactions degrade the performance of the three models. For a more complex regulation model, additional factors should be considered in the transcription mechanism. The values obtained from the sensitivity analysis were all positive, which indicates that the mRNA production rate is proportional to the collision frequency as well as the binding force between the TF and its target gene.

One may suggest that the performance effect is more due to the modeling technique rather than the equation system. This study only compared three possible models for one dynamic system. As the Mass-action model and Michaelis-Menten model techniques do not consider diffusion, they did not perform as well as the S-system. While the S-system seems to be most promising, a general conclusion that it is a better approach for modeling complex large-scale networks is yet to be established.

We used three reaction mechanisms to describe the process by which a transcription factor binds to the promoter to generate mRNA. The Mass-action model assumes that this is a simple second-order reaction; the Michaelis-Menten model assumes that there is an intermediate product; and the S-system considers the diffusion effects in one dimension and three dimensions. Among the three, the Mass-action model is the simplest and has the fewest parameters. It is therefore the least computation-intensive and has the smallest AIC value (see Table 4). The disadvantage, however, is that it assumes the system is homogeneous, which is not the case in our study. The Michaelis-Menten model also assumes the system is homogeneous and ignores the intracellular protein transport process. The S-system model does describe the diffusion effects, which are the main driving force for mass transport in the cell. A disadvantage of the S-system is that it involves the most parameters. Its AIC value, the largest of the three, also implies that over-fitting may occur in parameter estimation.

We also considered the use of the Hill equation. In general, the Hill equation is employed to describe cooperative binding, in which the binding of one ligand to a macromolecule influences the binding of other ligands to the same macromolecule. The Hill coefficient quantifies this effect: a value of 1 indicates completely independent binding, a value greater than 1 indicates positive cooperativity, and a value less than 1 indicates negative cooperativity. In the context of transcription, this corresponds to several identical or different transcription factors binding in the promoter region of a gene, with the first transcription factor affecting the binding of the subsequent ones. However, in our gene regulatory system, most genes are regulated by a single transcription factor, and every transcription factor has only one binding site in the promoter region. Therefore the Hill equation was not applicable in our study.
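For reference, the standard textbook form of the Hill rate law can be sketched as below. It is not one of the three models fitted in this study; the function and argument names (`hill_rate`, `v_max`, `k_half`, `n`) are chosen here for illustration. With a Hill coefficient of 1 the expression reduces to the Michaelis-Menten form, which is consistent with the argument that a single, non-cooperative binding site does not call for the Hill equation.

```python
# Standard Hill rate law, shown only for comparison with Michaelis-Menten.

def hill_rate(tf, v_max, k_half, n):
    """v = v_max * tf**n / (k_half**n + tf**n); n is the Hill coefficient."""
    return v_max * tf ** n / (k_half ** n + tf ** n)


def michaelis_menten_rate(tf, v_max, k_m):
    """v = v_max * tf / (k_m + tf)."""
    return v_max * tf / (k_m + tf)


if __name__ == "__main__":
    # Illustrative values only: compare n = 1 (no cooperativity) with n = 4.
    for tf in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(tf, hill_rate(tf, 1.0, 1.0, 1), hill_rate(tf, 1.0, 1.0, 4))
```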

Conclusions

One of the major problems in establishing large-scale dynamic models is the lack of experimental data. The parameters are usually unknown, and so are the specific reaction rate laws. Moreover, for a large number of reactions, the only parameter values available in the literature were obtained under in vitro conditions.

In this study, we focused on the molecular mechanism of transcription to propose models for describing the gene regulatory interactions of the flowering transition processes in Arabidopsis. The S-system has the best performance. Although we assumed that the best performance had come from the consideration of diffusion effects, its highest AIC values indicated a possible over-fitting in parameter estimation. It is therefore necessary to carefully apply the S-system for modeling complex gene regulatory networks. The diffusion effects may as well be included in the parameters for the Mass action model and the Michaelis-Menten model.

Methods

The regulation of flowering time network

The state transition to flowering in plants is precisely controlled by environmental conditions and endogenous developmental cues so that plants produce their progeny under favorable conditions. The response to multiple factors suggests the existence of a complex network regulating this state transition in plants. In this study, the biology of flowering time (Photoperiodic) in Arabidopsis thaliana showed that a complex gene regulatory network that controls this transition integrates the responses based on various external and internal conditions. Consequently, the regulation of the flowering time has been a major adaptive trait during plant evolution and domestication. A large number of genes have been characterized as flowering time regulators, and several recent reviews have provided detailed descriptions of flowering time pathways [2].

Arabidopsis is a facultative, or quantitative, long day plant: it can still flower under short days, albeit much later. Key regulatory genes appear to be conserved between Arabidopsis and short day plants, which suggests that common pathways are utilized [16, 17]. The plant perceives the photoperiod, and the signal is transduced to a downstream signaling system. This light-dependent pathway controls flowering in response to seasonal changes.

Figure 5 shows the gene regulatory pathway for the flowering transition pathway in Arabidopsis[1], which is mediated by CONSTANS (CO):

  1. CO codes for a zinc finger and CCT domain-containing transcription factor that accumulates under long day conditions in leaves as a result of the combination of the rhythmic expression of CO mRNA.

  2. CO activates the expression of FLOWERING LOCUS T (FT), probably by binding to the FT regulatory regions [18, 19]. The FT protein is a component of the mobile flowering signal that, once expressed in the vascular tissue of leaves, moves to the shoot apex [20, 21].

  3. At the meristem, FT physically interacts with the bZIP transcription factor FD, and the FT/FD complex activates the expression of SOC1 [22].

  4. SOC1 and AGL24 form a positive feedback loop, and the two factors might form a complex for the up-regulation of LFY. Thus, the regulators of the floral transition form a small network with multiple interactions among themselves.

  5. AP1 and LFY are ultimately resolved in the up-regulation of the floral meristem identity genes.

Figure 5

Photoperiod pathway for the flowering transition process in Arabidopsis.

We used three different models for the flowering transition pathway in Arabidopsis to reconstruct the experimental data.

Experimental data

We obtained the experimental data from:

  1. Plant Expression Database (http://www.plexdb.org/), ID: AT4;

  2. NCBI GEO, ID: GSE576 and GSE577.

Both contain the microarray data of the eight genes included in our study. Of the two, AT4 covers flower development in Arabidopsis thaliana, which is controlled by several signaling pathways that converge on a set of genes, such as FT and SOC1, that function as pathway integrators. The photoperiod is regarded as the most important factor in promoting floral transition: Arabidopsis thaliana flowers earlier under long day conditions than under short day conditions. It is therefore considered a facultative long day plant. To monitor changes of gene expression during floral transition and early flower development, plants were grown under short days (9 hr light, 15 hr dark) for 30 days and then shifted to long days (16 hr light, 8 hr dark) to induce flowering. The RNA was isolated from micro-dissected apical tissue harvested 0, 3, 5, and 7 days after the shift and hybridized to the Affymetrix ATH1 microarray.

We analyzed not only the responses of Columbia (col) and Landsberg erecta (ler) but also the effect of mutations in the flowering time genes CONSTANS (co) and FT (ft). In this study, we used four experiments for parameter estimation: (1) Columbia (col), the most widely used wild type of Arabidopsis thaliana; (2) Landsberg erecta (ler), currently the second most widely used accession of Arabidopsis thaliana; (3) the CONSTANS mutant (co); and (4) the FT mutant (ft) [6].

We used four different experimental data sets based on optimized parameter estimation to find the most appropriate model.

Dynamic model

Before introducing the transcription mechanism of the binding process, the following assumptions were made in advance: (a) Transcription is initiated when all the activation binding sites are occupied, and all the repression binding sites regarding the same gene are empty; and (b) The cell size remains constant during the time course of flowering state transition.

S-system

Most biological systems are nonlinear. Although the Michaelis-Menten model has been widely used to approximate biological systems, one of its disadvantages is that it does not provide an explicit mathematical form for all cases, especially for complex processes such as diffusion-reaction interactions. The S-system consists of a set of mathematical terms that is sufficient to capture most nonlinear phenomena in nature, including diffusion-reaction interactions. The S-system [23] has been shown to provide a good approximation in many cases, and there are efficient procedures for estimating its parameter values [24, 25]. Protein-DNA interactions such as the binding between a transcription factor and its target gene have been studied for many years. Experimental observations have led to the proposition that both 3-D diffusion and 1-D crawling along the DNA are critical in the binding process. Early studies yielded the unexpected result that the binding rate of the Lac repressor protein to its binding site on DNA is approximately 100-1000 times faster than the maximal 3-D diffusion rate in solution [26]. This phenomenon is called facilitated diffusion [27]. Facilitated diffusion is shown schematically in Figure 6.

Figure 6

A protein searching for its target on DNA. $k_{on}$ and $k_{off}$ are the adsorption and dissociation rate constants of the protein, and λ is the average length that each protein moves along the DNA.

The process can be described by the reaction

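\[
\begin{aligned}
\mathrm{TF} &\rightleftharpoons \mathrm{TF_{ns}} \\
\mathrm{TF_{ns}} + \mathrm{BS} &\rightarrow \mathrm{mRNA}
\end{aligned}
\]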

where TF represents the transcription factor, $\mathrm{TF_{ns}}$ represents the transcription factor non-specifically adsorbed on the DNA, and BS represents the binding site. The first step is the adsorption of the transcription factor on the DNA. The second step is mRNA production after the adsorbed transcription factor has bound to its target gene. The mRNA production rate can be formulated as [28]:

\[
v = \frac{\lambda\, \mathrm{TF_{ns}}}{L\, \tau} \tag{1}
\]

where λ is the average length that the transcription factor moves along the DNA, L is the total length of the DNA, and τ is the sum of the 3-D diffusion time and the retention time of the transcription factor on the DNA.

The Langmuir isotherm is not suitable for describing protein adsorption because of the diverse conformations and multiple adsorption sites involved in the adsorption process [29]. The Freundlich adsorption isotherm is more consistent with experimental observations of protein adsorption, which depends strongly on the bulk concentration of the protein. In this case, the Freundlich adsorption isotherm of the transcription factor on the DNA can be expressed as

\[
\mathrm{TF_{ns}} = K\, \mathrm{TF}^{1/n} \tag{2}
\]

where K and n are constants at a particular temperature. From equations (1) and (2), the mRNA production rate can be determined as:

\[
v = \frac{\lambda K}{L\, \tau}\, \mathrm{TF}^{1/n} \tag{3}
\]

This power-law expression has the mathematical form of an S-system term. For this reason, we adopted the S-system function to describe the diffusion-reaction interactions in the mRNA transcription process. In the case of mRNA transcription, the S-system equation for the transcription rate is given by:

\[
v_i = \alpha_i \prod_{j=1}^{n} \mathrm{TF}_j^{\,g_{ij}} - \beta_i \prod_{j=1}^{n} \mathrm{TF}_j^{\,h_{ij}} \tag{4}
\]

where $n$ is the number of transcription factors, $\mathrm{TF}_j$ is the $j$-th transcription factor that regulates gene $i$, $\alpha_i$ and $\beta_i$ denote the positive rate constants, and $g_{ij}$ and $h_{ij}$ are referred to as the kinetic orders. If $g_{ij} > 0$, gene $j$ will induce the expression of gene $i$. On the contrary, gene $j$ will inhibit the expression of gene $i$ if $g_{ij} < 0$. The variable $h_{ij}$ has the opposite effect on gene expression compared to $g_{ij}$. In the present study, $g_{ij}$ and $h_{ij}$ range between 0 and 2.
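As a worked check of how Eq. (3) maps onto the production term of Eq. (4), the substitution can be written out for a single transcription factor. The identification of $\alpha$ and $g$ below is our reading of the derivation rather than a statement made explicitly in the text; note that $n$ in Eqs. (2) and (3) is the Freundlich constant, not the number of transcription factors in Eq. (4).

```latex
\begin{aligned}
v &= \frac{\lambda\, \mathrm{TF_{ns}}}{L\,\tau}
   && \text{Eq. (1)} \\
  &= \frac{\lambda}{L\,\tau}\, K\, \mathrm{TF}^{1/n}
   && \text{substituting Eq. (2)} \\
  &= \alpha\, \mathrm{TF}^{\,g},
   \qquad \alpha = \frac{\lambda K}{L\,\tau}, \quad g = \frac{1}{n}
   && \text{production term of Eq. (4) with one TF}
\end{aligned}
```

Under this reading, the rate constant $\alpha$ absorbs the diffusion-related quantities $\lambda$, $L$ and $\tau$, while the kinetic order $g$ reflects the Freundlich exponent.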

Michaelis–Menten model

The Michaelis-Menten model describes a catalysed reaction in which an intermediate complex forms after binding between enzyme and substrate. Thereafter, the complex (TF-BS) converts to the product and enzyme. In the case of mRNA transcription, the transcription process via such mechanism may be represented as:

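\[
\begin{aligned}
\mathrm{TF} + \mathrm{BS} &\rightleftharpoons \mathrm{TF\text{-}BS} \\
\mathrm{TF\text{-}BS} &\rightarrow \mathrm{mRNA}
\end{aligned}
\]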

Under the condition $[\mathrm{TF}] \gg [\mathrm{BS}]$, the production rate of mRNA for a gene with the diffusion effect ignored can be formulated as:

\[
v = \frac{V_{max}\, [\mathrm{TF}]}{K_m + [\mathrm{TF}]} \tag{5}
\]

where $V_{max}$ is the maximum production rate of mRNA and $K_m$ is the Michaelis constant. The delay effect on the mRNA production increases with the stability of the complex state.

Mass-action model

The law of mass action states that the reaction rate is proportional to the product of the active masses of the reactants. In the case of mRNA transcription, the reactants correspond to the transcription factor (TF) from the upstream gene and its specific binding site (BS) on the downstream gene. The transcription process in the cell may be represented as:

\[
\mathrm{TF} + \mathrm{BS} \rightarrow \mathrm{mRNA}
\]

Because the total number of binding sites for a specific gene is fixed, the production rate of mRNA of the downstream gene with diffusion ignored can be formulated as:

\[
v = k\, [\mathrm{TF}] \tag{6}
\]

where k is the rate constant and [TF] represents the concentration of the transcription factor. The delay effect on the mRNA production is assumed to be zero.
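The three production-rate laws, Eq. (6), Eq. (5) and the production term of Eq. (4) for a single transcription factor, can be compared side by side with a short sketch. The function names and the numeric values are illustrative assumptions, not fitted parameters.

```python
# Production-rate laws for one transcription factor under the three models.

def mass_action_rate(tf, k):
    """Eq. (6): rate proportional to the TF concentration."""
    return k * tf


def michaelis_menten_rate(tf, v_max, k_m):
    """Eq. (5): saturating rate that approaches v_max at high TF."""
    return v_max * tf / (k_m + tf)


def s_system_production(tf, alpha, g):
    """Production term of Eq. (4) with a single TF: power law in TF."""
    return alpha * tf ** g


if __name__ == "__main__":
    # Placeholder parameters chosen only to show the different shapes.
    for tf in (0.1, 1.0, 10.0, 100.0):
        print(
            tf,
            mass_action_rate(tf, k=0.5),
            michaelis_menten_rate(tf, v_max=2.0, k_m=1.0),
            s_system_production(tf, alpha=1.0, g=0.5),
        )
```

The mass-action rate grows without bound, the Michaelis-Menten rate saturates at `v_max`, and the power law grows sublinearly when the kinetic order is below 1.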

Parameter estimation

The objective of parameter estimation is to adjust the parameter values of a model via an optimization procedure so that the predictions based on the model can closely express the observation data. Parameter estimation can be performed through global methods and local methods [30]. However, one of the major challenges in modeling large-scale dynamic systems has been the existence of several local minima in the solution space. In this study, parameter estimation was performed by using the software tool “Complex Pathway Simulator” (COPASI Ver. 4.8) to fit the time series experimental data based on a dynamic model [31]. Evolutionary Programming (EP), Hooke & Jeeves (HJ) and Particle Swarm Optimization (PSO) were applied to search for an optimal solution, which may not converge to the minimum with different initial guesses. Among the three, EP [32, 33] was inspired by biological evolution, PSO [34] was inspired by social behavior and the movement dynamics of insects, birds and fishes, and HJ [35] was derived based on a hill climbing technique.
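To illustrate the idea behind a global search such as PSO, a minimal particle swarm minimizer is sketched below. It is a generic textbook variant written for this illustration and is not COPASI's implementation; the function name, the swarm settings (`inertia`, `c_personal`, `c_global`), the bounds and the toy objective are all assumptions.

```python
# Minimal particle swarm optimization sketch (generic variant, not COPASI's).
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iter=200,
                            seed=0, inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize objective(list_of_floats) inside box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                # personal best positions
    best_val = [objective(p) for p in pos]        # personal best values
    g_idx = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g_idx][:], best_val[g_idx]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


if __name__ == "__main__":
    # Toy objective with a known minimum at (0.3, 1.2); purely illustrative.
    def toy(p):
        return (p[0] - 0.3) ** 2 + (p[1] - 1.2) ** 2

    print(particle_swarm_minimize(toy, [(0.0, 2.0), (0.0, 2.0)]))
```

In a fitting context, `objective` would be the least squares error of the model for a candidate parameter vector, and `bounds` the allowed range of each parameter.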

These algorithms possess key advantages in large inverse problems of quantitative mathematical models [36]. The goodness of fit for each set of estimated parameters can be quantified by the least squares error O(p):

$$O(p) = \sum_{i=1}^{n} \sum_{j=1}^{t} \omega_i \left( X_{i,j} - Y_{i,j}(p) \right)^2 \tag{7}$$

where $p$ is the tested parameter set; $n$ and $t$ are the number of genes in the regulatory network and the number of samples in the time series data, respectively; $Y_{i,j}(p)$ is the time series predicted by the dynamic model for the parameter set $p$; and $X_{i,j}$ is the experimental time series data. The weighting factor is given by the mean square: $\omega_i = 1 / \langle X_i^2 \rangle$.
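Equation (7) can be sketched in a few lines of Python. The sketch assumes that the weight for gene i is the reciprocal of the mean of its squared experimental values, which is one reading of the mean-square weighting described above; the function name `weighted_sse` and the nested-list data layout (one row per gene, one column per time point) are illustrative choices.

```python
def weighted_sse(observed, predicted):
    """O(p) = sum_i sum_j w_i * (X[i][j] - Y[i][j])**2.

    Assumption: w_i = 1 / mean(X[i][j]**2 over j), i.e. mean-square weighting.
    `observed` and `predicted` are lists of per-gene time series.
    """
    total = 0.0
    for x_row, y_row in zip(observed, predicted):
        mean_square = sum(x * x for x in x_row) / len(x_row)
        weight = 1.0 / mean_square
        total += weight * sum((x - y) ** 2 for x, y in zip(x_row, y_row))
    return total
```

The weighting keeps a highly expressed gene from dominating the fit: doubling both the observed and predicted values of one gene leaves its contribution to O(p) unchanged.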

Model ranking and selection

In this study, we compared three models for the gene regulatory network of the flowering time regulation in Arabidopsis. We used the mean relative error (MRE) to quantify the response of each model subject to small perturbations [37]:

$$\mathrm{MRE} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{t} \left| x_{i,j} - y_{i,j} \right| / x_{i,j}}{n} \tag{8}$$

where $x_{i,j}$ denotes the experimental time series data for the $i$-th gene at time point $j$, $y_{i,j}$ is the simulation data for the $i$-th gene given by the model at time point $j$, $n$ is the total number of genes, and $t$ is the number of samples in the time series data.
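A minimal Python sketch of Equation (8) follows. It assumes the relative error is taken as an absolute value and, as in the equation, divides the double sum by the number of genes n; the function name and data layout are hypothetical.

```python
def mean_relative_error(experimental, simulated):
    """MRE = (sum_i sum_j |x_ij - y_ij| / |x_ij|) / n, with n the number of genes.

    `experimental` and `simulated` are lists of per-gene time series.
    Experimental values must be non-zero, otherwise the ratio is undefined.
    """
    n_genes = len(experimental)
    total = 0.0
    for x_row, y_row in zip(experimental, simulated):
        for x, y in zip(x_row, y_row):
            total += abs(x - y) / abs(x)
    return total / n_genes
```

Because each term is scaled by the experimental value, genes with low expression levels contribute on the same footing as highly expressed genes.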

Parameter sensitivity analysis

Dynamic models have been widely used to study metabolic networks and gene regulatory networks. These models are used to reconstruct experimental data and predict unobserved behaviors of a biological system. To address the many sources of uncertainty including error and noise in the experimental data, sensitivity analysis may be performed to identify the parameters in a model that have the strongest effect on the overall behavior.

Sensitivity analyses have the primary goal of determining how a given model responds to variations in a parameter. Local sensitivity analysis focuses on a particular point in the parameter space by changing one parameter at a time to obtain a local response of the model. Global sensitivity analysis tries to capture the entire parameter space all at once, allowing multiple parameters to be explored simultaneously [38].

In this study, we used the SBML-SAT software tool for Multi-Parametric Sensitivity Analysis (MPSA) [39]. The time-dependent normalized sensitivity response used in the analysis is defined as:

$$S_{ij}(X_j, p_i) = \frac{\partial \ln X_j}{\partial \ln p_i} \tag{9}$$

where $X_j$ is the mRNA concentration of the $j$-th gene, and $p_i$ is the $i$-th parameter in the dynamic model.
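The normalized sensitivity in Equation (9) can be approximated numerically at a single point in parameter space. The Python sketch below uses a central difference in log space; it is not the SBML-SAT procedure, and the function names, the relative step size and the toy power-law model are assumptions for illustration.

```python
import math


def log_sensitivity(model, params, index, rel_step=1e-4):
    """Central-difference estimate of d ln X / d ln p_i at `params`.

    `model` maps a list of positive parameters to a positive output X.
    """
    up = list(params)
    down = list(params)
    up[index] = params[index] * (1.0 + rel_step)
    down[index] = params[index] * (1.0 - rel_step)
    d_ln_x = math.log(model(up)) - math.log(model(down))
    d_ln_p = math.log(up[index]) - math.log(down[index])
    return d_ln_x / d_ln_p


def power_law(params):
    """Toy model X = a * TF**g with a fixed, hypothetical TF level of 2.0."""
    a, g = params
    tf_level = 2.0
    return a * tf_level ** g
```

For the toy model the sensitivity to the rate constant `a` is 1, while the sensitivity to the exponent `g` equals `g * ln(2)`, so a kinetic order can influence the output more strongly than a rate constant. A global analysis would repeat such evaluations over many sampled parameter sets.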

Akaike information criterion

We compared three models of flowering time regulation in Arabidopsis and estimated the parameters of each dynamic model with several parameter estimation methods.

Parameter estimation helps us quantify the regulatory abilities of the genes involved in flowering time. To determine whether a dynamic model is optimal, we employed a statistical criterion, the Akaike Information Criterion (AIC) [40], to validate the number of model parameters and determine the significance of the parameters.

The AIC, which attempts to include both the estimated residual variance and the model complexity in one statistic, decreases as the residual variance $S_e$ decreases and increases as the number of parameters $p$ increases. For a gene regulatory model with $p$ regulatory parameters fitted to a dataset of $N$ samples, the AIC can be written as [40]:

$$\mathrm{AIC} = 2p - 2 \ln L \tag{10}$$

where $L$ is the likelihood of the mathematical model and $p$ is the number of parameters in the model.
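Equation (10) translates directly into code. In the Python sketch below, the helper that converts residuals into a log-likelihood assumes independent Gaussian errors with a variance estimated from the data; that assumption, and both function names, are introduced here for illustration and are not stated in the text.

```python
import math


def aic(num_params, log_likelihood):
    """Equation (10): AIC = 2 * (number of parameters) - 2 * ln L."""
    return 2 * num_params - 2.0 * log_likelihood


def gaussian_log_likelihood(residuals):
    """Maximised log-likelihood for i.i.d. Gaussian residuals (an assumption).

    The variance is the mean squared residual, which gives
    ln L = -N/2 * (ln(2 * pi * variance) + 1).
    """
    n = len(residuals)
    variance = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * variance) + 1.0)
```

Two models that fit the data equally well then differ in AIC by twice the difference in their parameter counts, which is how the criterion penalizes the more heavily parameterized model.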

References

  1. Amasino R: Seasonal and developmental timing of flowering. Plant J. 2010, 61: 1001-1013. 10.1111/j.1365-313X.2010.04148.x.

  2. de Montaigu A, Toth R, Coupland G: Plant development goes like clockwork. Trends Genet. 2010, 26: 296-306. 10.1016/j.tig.2010.04.003.

  3. Greenup A, Peacock WJ, Dennis ES, Trevaskis B: The molecular biology of seasonal flowering responses in Arabidopsis and the cereals. Ann Bot. 2009, 103: 1165-1172. 10.1093/aob/mcp063.

  4. Kim DH, Doyle MR, Sung S, Amasino RM: Vernalization: winter and the timing of flowering in plants. Annu Rev Cell Dev Biol. 2009, 25: 277-299. 10.1146/annurev.cellbio.042308.113411.

  5. Yant L, Mathieu J, Schmid M: Just say no: floral repressors help Arabidopsis bide the time. Curr Opin Plant Biol. 2009, 12: 580-586. 10.1016/j.pbi.2009.07.006.

  6. Schmid M, Uhlenhaut NH, Godard F, Demar M, Bressan R, Weigel D, Lohmann JU: Dissection of floral induction pathways using global expression analysis. Development. 2003, 130: 6001-6012. 10.1242/dev.00842.

  7. Wellmer F, Riechmann JL: Gene networks controlling the initiation of flower development. Trends Genet. 2010, 26: 519-527. 10.1016/j.tig.2010.09.001.

  8. van Mourik S, van Dijk A, de Gee M, Immink RG, Kaufmann K, Angenent GC, van Ham RC, Molenaar J: Continuous-time modeling of cell fate determination in Arabidopsis flowers. BMC Syst Biol. 2010, 4: 101. 10.1186/1752-0509-4-101.

  9. Nikolaev SV, Penenko AV, Lavreha VV, Mjolsness ED, Kolchanov NA: A model study of the role of proteins CLV1, CLV2, CLV3, and WUS in regulation of the structure of the shoot apical meristem. Ontogenez. 2007, 38: 457-462.

  10. Kaufmann K, Muino JM, Jauregui R, Airoldi CA, Smaczniak C, Krajewski P, Angenent GC: Target genes of the MADS transcription factor SEPALLATA3: integration of developmental and hormonal pathways in the Arabidopsis flower. PLoS Biol. 2009, 7: e1000090.

  11. Alvarez-Buylla ER, Benitez M, Davila EB, Chaos A, Espinosa-Soto C, Padilla-Longoria P: Gene regulatory network models for plant development. Curr Opin Plant Biol. 2007, 10: 83-91. 10.1016/j.pbi.2006.11.008.

  12. Mahadevan R, Edwards JS, Doyle FJ: Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys J. 2002, 83 (3): 1331-1340. 10.1016/S0006-3495(02)73903-9.

  13. Yugi K, Nakayama Y, Kinoshita A, Tomita M: Hybrid dynamic/static method for large-scale simulation of metabolism. Theor Biol Med Model. 2005, 2 (1): 42-53. 10.1186/1742-4682-2-42.

  14. Spieth C, Hassis N, Streichert F: Comparing mathematical models on the problem of network inference. Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation: July 2006. 2006, Seattle, Washington, USA

  15. Costa RS, Machado D, Rocha I, Ferreira EC: Hybrid dynamic modeling of Escherichia coli central metabolic network combining Michaelis-Menten and approximate kinetic equations. Biosystems. 2010, 100 (2): 150-157. 10.1016/j.biosystems.2010.03.001.

  16. Goff S, Ricke D, Lan T, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H: A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science. 2002, 296: 92-100. 10.1126/science.1068275.

  17. Mouradov A, Cremer F, Coupland G: Control of flowering time: integrating pathways as a basis for diversity. Plant Cell. 2002, 14: 111-130.

  18. Takada S, Goto K: Terminal flower2, an Arabidopsis homolog of heterochromatin protein, counteracts the activation of FLOWERING LOCUS T by CONSTANS in the vascular tissues of leaves to regulate flowering time. Plant Cell. 2003, 15: 2856-2865. 10.1105/tpc.016345.

  19. An H, Roussot C, Suarez-Lopez P, Corbesier L, Vincent C, Pineiro M, Hepworth S, Mouradov A, Justin S, Turnbull C, Coupland G: CONSTANS acts in the phloem to regulate a systemic signal that induces photoperiodic flowering of Arabidopsis. Development. 2004, 131: 3615-3626. 10.1242/dev.01231.

  20. Corbesier L, Vincent C, Jang S, Fornara F, Fan Q, Searle I, Giakountis A, Farrona S, Gissot L, Turnbull C, Coupland G: FT protein movement contributes to long-distance signaling in floral induction of Arabidopsis. Science. 2007, 316: 1030-1033. 10.1126/science.1141752.

  21. Tamaki S, Matsuo S, Wong HL, Yokoi S, Shimamoto K: Hd3a protein is a mobile flowering signal in rice. Science. 2007, 316: 1033-1036. 10.1126/science.1141753.

  22. Michaels SD, Himelblau E, Kim SY, Schomburg FM, Amasino RM: Integration of flowering signals in winter-annual Arabidopsis. Plant Physiol. 2005, 137: 149-156. 10.1104/pp.104.052811.

  23. Savageau MA: Rules for the evolution of gene circuitry. Pac Symp Biocomput. 1998, 3: 54-65.

  24. Gentilini R: Toward integration of systems biology formalism: the gene regulatory networks case. Genome Inform. 2005, 16: 215-224.

  25. Voit E: Computational Analysis of Biochemical Systems: A Practical Guide for Biochemists and Molecular Biologists. 2000, Cambridge University Press

  26. Houston PL: Chemical Kinetics and Reaction Dynamics. 2001, New York: McGraw Hill

  27. von Hippel PH, Berg OG: Facilitated target location in biological systems. J Biol Chem. 1989, 264: 675-678.

  28. Kolomeisky AB: Physics of protein-DNA interactions: mechanisms of facilitated target search. Phys Chem Chem Phys. 2011, 13: 2088-2095. 10.1039/c0cp01966f.

  29. Umpleby RJ, Baxter SC, Bode M, Berch JK, Shah RN, Shimizu KD: Application of the Freundlich adsorption isotherm in the characterization of molecularly imprinted polymers. Anal Chim Acta. 2001, 435: 35-42. 10.1016/S0003-2670(00)01211-3.

  30. Rodriguez-Fernandez M, Egea J, Banga J: Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinforma. 2006, 7: 483. 10.1186/1471-2105-7-483.

  31. Hoops S, Sahle S, Gauges R, Lee C, Pahle J, Simus N, Singhal M, Xu L, Mendes P, Kummer U: COPASI – a complex pathway simulator. Bioinformatics. 2006, 22: 3067-3074. 10.1093/bioinformatics/btl485.

  32. Rodriguez-Fernandez M, Mendes P, Banga JR: A hybrid approach for efficient and robust parameter estimation in biochemical pathways. BioSystems. 2006, 83: 248-265. 10.1016/j.biosystems.2005.06.016.

  33. Baker SM, Schallau K, Junker BH: Comparison of different algorithms for simultaneous estimation of multiple parameters in kinetic metabolic models. J Integrative Bioinformatics. 2010, 7: 1-9.

  34. Eberhart RC, Kennedy J: A new optimizer using particle swarm theory. Proceedings of the Sixth International Symposium on Micro Machine and Human Science: Oct 1995. 1995, Nagoya, Japan

  35. Hooke R, Jeeves TA: Direct search solution of numerical and statistical problems. J Assoc Comput Mach. 1961, 2: 212-229.

  36. Mendes P: Modelling large biological systems from functional genomic data: parameter estimation. Foundations in Systems Biology. Edited by: Kitano H. 2001, Cambridge, MA: MIT Press, 163-186.

  37. Kitayama T, Kinoshita A, Sugimoto M, Nakayama Y, Tomita M: A simplified method for power-law modelling of metabolic pathways from time-course data and steady-state flux profiles. Theor Biol Med Model. 2006, 3: 24. 10.1186/1742-4682-3-24.

  38. Tang Y, Reed P, Wagener T, van Werkhoven K: Comparing sensitivity analysis methods to advance lumped watershed model identification and evaluation. Hydrol Earth Syst Sci Discuss. 2006, 3: 3333-3395. 10.5194/hessd-3-3333-2006.

  39. Zi Z, Zheng Y, Rundell AE, Klipp E: SBML-SAT: a systems biology markup language (SBML) based sensitivity analysis tool. BMC Bioinforma. 2008, 9: 342-355. 10.1186/1471-2105-9-342.

  40. Akaike H: A new look at the statistical model identification. IEEE Trans Autom Control. 1974, 19: 716-723. 10.1109/TAC.1974.1100705.

Acknowledgments

This work of CCNW, KLN, PCYS and JJPT was supported by the National Science Council, Taiwan, under grant numbers NSC 99-2632-E-468-001-MY3 and NSC 102-2632-E-468-001-MY3. Our gratitude goes to Dr. Feng-Sheng Wang, Department of Chemical Engineering, National Chung Cheng University, Taiwan, for his advice on the S-system and parameter estimation methods.

Author information

Corresponding author

Correspondence to Charles CN Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

CCNW, PCC and KLN: mathematical model analysis, simulation studies, software development, data analysis and manipulation, and drafting of the article. CMC, PCYS and JJPT: co-wrote the article. All authors read and approved the manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

About this article

Cite this article

Wang, C.C., Chang, PC., Ng, KL. et al. A model comparison study of the flowering time regulatory network in Arabidopsis. BMC Syst Biol 8, 15 (2014). https://doi.org/10.1186/1752-0509-8-15

  • DOI: https://doi.org/10.1186/1752-0509-8-15

Keywords