Modeling biological systems with uncertain kinetic data using fuzzy continuous Petri nets

Background Uncertainties exist in many biological systems, which can be classified as random uncertainties and fuzzy uncertainties. The former can usually be dealt with using stochastic methods, while the latter have to be handled with such approaches as fuzzy methods. Results In this paper, we focus on a special type of biological systems that can be described using ordinary differential equations or continuous Petri nets (CPNs), but some kinetic parameters are missing or inaccurate. For this, we propose a class of fuzzy continuous Petri nets (FCPNs) by combining CPNs and fuzzy logics. We also present and implement a simulation algorithm for FCPNs, and illustrate our method with the heat shock response system. Conclusions This approach can be used to model biological systems where some kinetic parameters are not available or their values vary due to some environmental factors.


Background
Modeling and simulation is one of the main techniques that are used to study biological systems from a computational point of view [1][2][3]. It plays an essential role in understanding the mechanisms of biological systems and for making predictions for new biological experiments.
So far, a variety of approaches have been proposed for modeling different types of biological systems. These approaches generally can be classified in the following three categories: (1) qualitative approaches such as Boolean networks [4], fuzzy rules [5] and (qualitative) Petri nets [6]; (2) quantitative approaches such as differential equations, Bayesian networks [7] and stochastic/continuous Petri nets [8]; and (3) hybrid approaches by combining qualitative and quantitative methods such as fuzzy stochastic Petri nets [9]. It is natural to employ *Correspondence: feiliu@scut.edu.cn; sehjsong@scut.edu.cn 1 School of Software Engineering, South China University of Technology, 510006 Guangzhou, People's Republic of China Full list of author information is available at the end of the article quantitative (qualitative) approaches when quantitative data are (not) available. However, for many biological systems, not all required quantitative data can be measured completely and precisely, so some kinetic parameters cannot be accurately estimated. In such a case, if we confine ourselves to qualitative methods, any available quantitative data become useless; however if we adopt quantitative methods, some kinetic data are missing or inaccurate. In order to make full use of all the available data for a biological system, hybrid methods could be a good option.
The uncertainties in biological systems usually come from their intrinsic internal noises and external environmental factors, or are caused by measurements. These uncertainties can be further classified according to their sources in two categories: random uncertainties and fuzzy uncertainties. If there are insufficient or missing data for a biological system, the modeling of the system could be accompanied with fuzzy uncertainties. Random uncertainties usually can be dealt with using stochastic methods such as stochastic Petri nets or stochastic differential equations, while fuzzy uncertainties have to be handled with such methods as fuzzy methods.
In this paper, we focus on a class of biological systems, which can be described using ordinary differential equations (ODEs) or continuous Petri nets (CPNs) [10], but have some inaccurate or missing kinetic parameters. Because of the existence of fuzzy uncertainties caused by insufficient data, we may have to combine fuzzy methods with quantitative methods such as ODEs to accomplish a trustworthy modeling of such a class of biological systems. In order to achieve this, we are going to propose a class of fuzzy continuous Petri nets (FCPNs).
Petri nets have been widely used in systems biology these years and have been extended in many ways, e.g., stochastic Petri nets by associating a stochastic time delay with each transition [11], and continuous Petri nets (CPNs) by associating deterministic rates with transitions and allowing tokens on places to be real values [12]. The underlying semantics of a CPN is a set of ODEs, which means a CPN model is nothing else than a graphical representation of a set of ODEs. By doing so, biologists can easily construct biological models described by ODEs and the constructed models are less error prone. On the other hand, fuzzy logic [13] was proposed to deal with fuzzy uncertainties, and has been applied in many fields. Fuzzy logic has also been combined with differential equations (DEs), and different types of fuzzy DEs have been proposed [14,15]. This offers good means to cope with quantitative systems involving uncertainties.
In this paper, we aim at the modeling of uncertain biological systems, and propose a class of biologically interpreted FCPNs by allowing some kinetic parameters to be fuzzy numbers. Considering the fact that analytical methods are impossible for large models [14], we present an appropriate algorithm for simulating large FCPNs. We illustrate the application of our method with a mediumsized model.
We believe that our approach can be applied in the following biological scenarios.
• For a biological model, if some of its kinetic parameters are not available, thus precluding ODE simulation, we can still use FCPNs to quantitatively analyze the model by giving an uncertain band of any output, rather than crisp values. • For a biological model where some of its parameters vary due to environmental effects or other factors and stochastic methods are not appropriate, then we can represent each variable (or uncertain) parameter as a fuzzy number. By running fuzzy simulation, we can obtain an uncertain band of an output, which describes the effect of the variable parameters.
Taking into account the fact that many ODE and CPN models of biological systems do exist and intrinsic uncertainties are associated with many biological systems, we believe that FCPNs offer a new means for reexamining these existing ODE and CPN models, revealing potentially new insights into the corresponding biological systems.
Our fuzzy modeling approach vs parameter estimation methods. Parameter estimation is an essential step in constructing quantitative (biological) models, which aims to infer kinetic parameters from experiment observations [16]. In contrast, our approach is one of the fuzzy modeling approaches, which aims to derive the uncertainties of outputs from the uncertainties of input parameters. In this paper, we will present a workflow for using our approach, where we will clearly see that parameter estimation can be considered as a key step of our fuzzy modeling approach.
The structure of the paper is as follows. In the section of methods, we describe fuzzy sets and continuous Petri nets. In the section of results and discussion, we present a class of fuzzy continuous Petri nets together with a fuzzy simulation algorithm, and discuss how to use the approach for modeling and analyzing biological systems illustrated by a medium-sized biological model. Finally, the conclusions are given.

Methods
In this section, we introduce fuzzy sets and continuous Petri nets.

Fuzzy sets
Fuzzy sets, proposed by Zadeh [13], are a generalization of classical sets and can handle uncertainty associated with imprecision and vagueness. Fuzzy theory is different from probability theory that deals with randomness.
A fuzzy setξ is defined on a universal set X by its membership function which maps a real value μξ (x) ∈ [0, 1] to each element x ∈ X. That is, in a fuzzy set, each element has a membership degree between 0 and 1, which is different from any element in a crisp set, whose membership degree is either 0 or 1. The α-cut of a fuzzy setξ at a level α ∈ [0, 1] is defined as the crisp subset of X with all the elements whose membership degree is greater than or equal to the given α, i.e., A fuzzy number is a special convex normalized fuzzy set over the real set R. Many different types of fuzzy numbers have been defined such as triangular, trapezoidal and bell-shaped fuzzy numbers. A triangular fuzzy number, denoted byξ = (a, b, c), a ≤ b ≤ c, is defined as (see Fig. 1): Its α-cut for any α ∈ [0, 1] is simply written as In the following, we arbitrarily chose to consider triangular fuzzy numbers to illustrate our approach, without loss of generality. We denote by the set of triangular fuzzy numbers whose lower bound is greater than 0, i.e. a > 0.
The extension principle is a powerful tool of fuzzy logic, which offers a general procedure for extending crisp domains to fuzzy domains. Assume f : X n → Y, andÃ is a fuzzy set on X such that Then, applying the extension principle yields the following fuzzy set The extension principle will be used in the paper to achieve fuzzy simulation of FCPNs.
Fuzzy differential equations (FDEs). As a typical class of FDEs, a differential equation with fuzzy parameters can be defined as where C is a fuzzy parameter, represented as a fuzzy number. The analytical solution of Eq. 5 can be obtained via a family of differential inclusions [17], or by considering dx/dt as the fuzzy generalized derivative [18]. However, all these existing analytical methods are only applicable for one or two equations. In order to address a number of FDEs, numerical methods have to be employed. In this paper, we will present a numerical approach for solving fuzzy differential equations.

Continuous Petri nets
Petri nets (PNs) [19] are weighted, directed, bipartite multigraphs. A PN consists of places, transitions (both of which are called nodes) and arcs (or edges) that connect nodes of both types. In the biological scenario, places may represent chemical species or their compounds, e.g., genes, mRNA, proteins or protein complexes; transitions may represent any kind of chemical reactions (e.g. association, disassociation, translation or transcription) or any positive/passive behavior such as molecular movement [20,21]. The tokens on places represent the number of molecules or the concentration levels of species, which only allow discrete integers.
For modeling a variety of scenarios, PNs have been extended in many ways, one of which is continuous Petri nets (CPNs) by allowing tokens on places to be real values to represent the concentration of species. The formal definition of a CPN is given as follows [11].
• P is a finite, non-empty set of continuous places.
• T is a finite, non-empty set of continuous transitions.
is a finite set of directed arcs.
• f : F → R + 0 is a function that assigns a non-negative real number to each arc a ∈ F. R + 0 denotes the set of non-negative real numbers.
• v : T → H is a function that assigns a firing rate function h t to each transition t ∈ T, whereby R denotes the set of real numbers. • t denotes the preplaces of transition t. • M 0 : P → R + 0 gives the initial marking, which assigns a non-negative real number to each place p ∈ P.
In a CPN, besides the continuous token values on places, the transitions are also continuous. That is, each transition is associated with a firing rate function and can continuou sly fire during simulation, if its preplaces allow to do so.
In fact, the underlying semantics of a CPN is a system of ODEs, where each equation describes the continuous token change on a given place, i.e., continuously increasing by its pretransitions' flow and decreasing by its posttransitions' flow. An equation for a place p takes the following form: where • p and p • denote the pretransitions and posttransitions of place p, respectively. Besides, the rate function related to each transition can be further written in the following form: where θ is a rate constant (also called kinetic constant). For example, Fig. 2 gives a CPN model of a decaydimerization network, which includes the following four biochemical reactions:

Fig. 2
A Petri net model of a decay-dimerization network [26]. This network consists of a degradation reaction (r 1 ), two reversible dimerization reactions (r 2 and r 3 ), and a conversion reaction (r 4 ) This CPN model generates the following set of ODEs under the mass action semantics.
If we run numerical simulation of the CPN model, we obtain the simulation result that is illustrated in Fig. 3 by taking the species S 2 as an example.

Results and discussion
In this section, we present a class of fuzzy continuous Petri nets together with a fuzzy simulation algorithm, and discuss how to use the approach for modeling and analyzing biological systems illustrated by a medium-sized biological model.

Fuzzy continuous Petri nets
In Eq. 6, the kinetic constant θ is usually obtained using parameter estimation methods. If θ is not precise, the CPN model will not produce the correct result. In this case, we have to abandon CPNs or ODEs and turn to other methods able to deal with uncertainties at the cost of losing all available data.
To overcome this issue, we may turn to hybrid methods by replacing the values of those uncertain parameters with fuzzy numbers. Thus, in such a model, we allow that some parameters are crisp, but others fuzzy numbers. In this way, we can make full use of all the available data of a system. In this case, the rate function of each transition turns into the following form (either a real value or a fuzzy number in ): Formally, an FCPN is defined as a six-tuple N =< • P is a finite, non-empty set of continuous places.
• T is a finite, non-empty set of continuous transitions.
0 is a function that assigns a non-negative real number to each arc a ∈ F.
• v : T → H is a function that assigns a firing rate function h t to each transition t ∈ T, whereby H := t∈T {h(t, θ)} is the set of all firing rate functions, and v(t) = h t for all transitions t ∈ T, and h(t, θ) in defined by Eq. (7). • M 0 : P → R + 0 gives the initial marking, which assigns a non-negative real number to each place p ∈ P.
With FCPNs, we can model θ as a crisp value if precise quantitative data are available or as a fuzzy number if the parameter cannot be measured precisely. As a result, we may use fuzzy analytical or fuzzy simulation methods to obtain an uncertain band for each output according to the uncertain band of the parameters. Therefore, we still obtain a quantitative analysis of a biological model that lacks some quantitative data. By presenting FCPNs, we offer a new method for modeling and analyzing biological systems with uncertain information.
The semantics of an FCPN is described by a combination of a set of FDEs in the form of Eq. 5 and a set of ODEs. Due to the existence of a couple of FDEs for a biological model, we have to resort to a simulation approach to analyze an FCPN model. Using the extension principle, we obtain the uncertainty of model outputs according to the uncertainty of model parameters.
For example, when we replace the crisp values of parameters θ 3 and θ 4 with the fuzzy numbers given in Table 1, we obtain an FCPN model for the decaydimerization network, which looks exactly the same as the one in Fig. 2 except of the two fuzzy kinetic parameters. Besides, Fig. 4 gives a simulation plot of the model, which is an uncertain band of an output due to the uncertain parameters. We can check the correctness of the model by analyzing the uncertain band of each output.   Table 1 for the values of parameters

Simulation algorithm
A biological model constructed as FCPNs usually generates a number of ODEs and/or FDEs. Analytical methods that are applicable for several simple FDEs can hardly be applied for analyzing such models. Therefore, numerical simulation becomes essential, especially for larger models. For achieving numerical simulation of an FCPN model, we adopt the following idea. We first represent each fuzzy number as a union of its α-cuts according to Zadeh's extension principle. By sampling the α-cut at each α level, we obtain a combination of samples for all parameters. For each combination, we run numerical simulation on the corresponding CPN model at an α level. After running simulations for all considered α levels, we compose all the α-cuts to obtain the membership function of each output at each simulation time point. That is, we obtain the uncertainties of outputs caused by the uncertainties of kinetic parameters.
This procedure is given in Algorithm 1 and explained in depth in the following.
(1) Determine the appropriate number of α levels to decompose the membership functions of uncertain parameters (Line 1). Each level is denoted by α j , j = 1, 2, . . . , J, where J is the total number of α levels to be considered. Each parameter takes the same number of α levels.
(2) For each α level, compute the corresponding α-cut of each fuzzy numberθ i , i = 1, 2, . . . , I, according to Eq. 4, where I is the total number of fuzzy numbers contained in the uncertain parameters. Each α-cut is represented as Figure 5 illustrates the result by performing the first two steps.
(3) Discretize each α-cut L j i , U j i and obtain a set of crisp values for each fuzzy number (Line 4). To do this, Fig. 5 Decompose a membership function into its α-cuts for a fuzzy kinetic parameter. This is done according to Zadeh's extension principle we use the same sampling step for simplicity, but of course we can adopt any other sampling method. But in order to improve the computational efficiency, we have to optimize the sampling or even minimize the sample number. Assume we sample K discretization values for each α-cut. We then obtain K I combinations of these sampling values for I fuzzy numbers at each α level. Thus we have K I × J combinations for all α levels. That is, the time complexity of the algorithm can be represented as O K I × J . For an uncertain model, the number of its uncertain parameters, i.e. I, usually can be fixed. Therefore, we should try to decrease both K and J.
(4) For each combination c ∈ K I × J, we replace each fuzzy number with its corresponding sampling value and then obtain a sample (a set of ODEs) of the FCPN model (Line 7). After that, we perform numerical simulation on the corresponding ODEs with a numerical integration method such as the Runge-Kutta method, and obtain simulation results (Lines 8-9).
(5) We compose the simulation traces of all the α-cuts to obtain the membership function of each output at each time point, which is illustrated in Fig. 6 (Lines 11-16). Such membership function reflects more accurately the effects of uncertain parameters.
(6) By computing the maximum and minimum values over all the simulation traces at each time point for each output, we obtain an uncertain band for the output (Lines [18][19][20][21]. Such uncertain band roughly reflects the effects of uncertain parameters.
A more efficient sampling method. The sampling method we use above involves a large number of samples, i.e., K I × J; however, there may be some redundancy at some α levels. By reducing this redundancy without affecting the accuracy, we propose a new sampling method, which works as follows.
For each fuzzy parameter i, we only discretize the αcut at α J = 0 into k crisp values, while for other α levels, we only consider the start and end points, i.e. L j i and U j i . Thus we have in total K I + 2 I × (J − 1) combinations, which result in the same number of simulation traces, each of which is denoted by Tr ijk for an output. The new sampling method substantially reduces the number of samples compared with the method above. When the α level m is equal to J, we use the same method as given in Algorithm 1 to compute the membership function and uncertain band of an output. When m is unequal to J, we first obtain the traces to be used at this α level, which consist of the traces in Tr imk and the others in Tr iJk . The latter can be obtained in the following way. If v iJk ∈ L m i , U m i for all fuzzy parameters, where v iJk is the crisp parameter value via discretation at the α level J, the corresponding trace Tr iJk is selected. After that, we still use the same method given in Algorithm 1 to compute the membership function and uncertain band of an output.

A workflow to use FCPNs for modeling biological systems
In this section, we give a general workflow to use FCPNs to model and analyze biological systems.
(1) Collect quantitative data and qualitative knowledge for the biological system to be studied. A model usually consists of two main parts, structure and kinetic parameters. We first determine the structure of the model by means of the available data and knowledge. At this step, we can obtain a qualitative Petri net model, which can be read as a set of ODEs in which parameters are not assigned values. We then divide the kinetic parameters of the model into two categories: precise or uncertain. Obtain a set of ODEs for each combination; 8: Run numerical simulation on the set of ODEs; 9: Obtain a trace TR j c c ∈ K I ; 10: end for 11: for each time point τ ∈ TT (TT is the simulation time length) do 12: TRMax j (τ ) = max c∈K I TR j c (τ ) ; 13: TRMin j (τ ) = min c∈K I TR j c (τ ) ; 14: end for 15: end for 16: Compute the membership function of each output in terms of TRMax j and TRMin j ; 17: //Compute the uncertain band for each output; 18: for each time point τ ∈ TT do 19: UpperBound(τ ) = max j=1,2,...,J (TRMax j (τ )); 20: LowerBound(τ ) = min j=1,2,...,J (TRMin j (τ )); 21: end for (2) For parameters with sufficient quantitative data, we can use well-established parameter estimation methods to obtain their precise values. See e.g., [22,23] for performing parameter estimation of biological models described by ODEs.
(3) For each uncertain parameter, we can adopt two methods to specify their fuzzy values. (a) Perform parameter estimation based on the available incomplete quantitative data. When parameter estimation is performed, we usually specify the initial parameter search space and then refine this space based on the data. If the data is insufficient, we cannot obtain a precise parameter value, although we may be able to reduce the search space. In this case, we will use the reduce parameter space to specify the fuzzy value of the parameter. (b) Employ experts to directly give a fuzzy value for a parameter. If quantitative data are not available, we can ask experts to assign a fuzzy number to a parameter according to their experience. By assigning values to all parameters, either crisp or fuzzy, we obtain a complete FCPN model for a biological system.
(4) We then run fuzzy simulation using the algorithm given above to obtain uncertain outputs and analyze them. After the model is validated, we can use it to explain the corresponding biological phenomena and potentially make predictions for further experiments.

Case study
In this section, we use a heat shock response model given in [24] to illustrate the approach presented in the paper.

Modeling
The heat shock response is a highly conserved genetic network that acts as a main defense mechanism against cell stress and protein damage. The biochemical reactions considered in [24] are repeated in Table 2, and the species involved in the model are explained in Table 3 together with their initial values [24].
The heat shock factor (hsf ) has two phosphorylation states hsf 2 and hsf 3 ; these three species can be converted from one to another by phosphorylation or dephosphorylation. The heat shock protein (hsp) plays the key role in preventing misfolding and facilitating protein folding. The hsp-encoding genes can be transactivated through the binding of hsf 3 to the heat shock element (hse), which forms hsf 3 _hse. hsp may also bind to hsf, forming hsp_hsf , or bind to hsf 2 or hsf 3 . hsp can also bind to the heat-induced misfolded proteins mfp which are the drivers of the whole heat  Table 4 for the values of the kinetic parameters The initial values are taken from [24] shock response, forming hsp_mfp. mfp is converted from an unfolded or native protein (prot) induced by the heat. According to these reactions, we build a CPN/FCPN model, depending on the type of parameters, illustrated in Fig. 7. From the model, we see that we map each reaction to a transition of the CPN/FCPN model; a reversible reaction is considered as two irreversible reactions. We also assume that the principle of mass action applies to each reaction.

Structural analysis
As a member of the family of Petri nets, FCPNs likewise enjoy all Petri net analysis techniques. Here we will analyze T-invariants [6] of our constructed model by feeding the FCPN model given in Fig. 7 to Charlie [25], an analysis tool of Petri net models.
Our model in hand is covered by 10 minimal T-invariants, i.e., {(r 1 , r 2 ), (r 3 , r 4 ), (r 5 , r 6 ), (r 8 , r 9 ), (r 15 , r 16 ), (r 1 , r 9 , r 10 ), (r 1 , r 3 , r 9 , r 11 ), (r 1 , r 3 , r 5 , r 9 , r 12 ), (r 14 , r 15 , r 17 ), (r 7 , r 13 ) }. Each T-invariant is an elementary behavioral mode of a system, and reproduces a system state (or marking) [6]. The first four T-invariants correspond to the four reversible reactions given above, while the others reveal further elementary behavioral modes that cannot be easily deduced from the reaction equations. For example, the T-invariant (r 7 , r 13 ) shows that dissociation (of hsf 3 _hse) and degradation (of hse) form an elementary behavioral mode, and in fact this kind of mode is a basic and widely-seen biological module. Besides, we can also structure the model using these T-invariants.

Simulation analysis
We first validate our model with the parameters given in [24] (see Table 4), and obtain the simulation result. For example, a plot of hsf 3 _hse is given in Fig. 8, which is completely consistent with the one given in [24].
We now illustrate our approach by setting the values of parameters T and k 8 to fuzzy numbers. For example, a plot of hsf 3 _hse is given in Fig. 9 with four α levels. With the decrease of α (from 1.0 to 0), the uncertain band decreases too. That is, bigger uncertainties of inputs cause larger uncertainties of outputs. In fact, α = 1.0 corresponds to the case that all parameters have crisp values. This means the plot of Fig. 9a is the same as the one given in Fig. 8. Besides, Fig. 10 gives a 3D plot of hsf 3 _hse for the FCPN model, where we show the curve at each α level. This figure describes the effects of uncertainties. For comparison, we give another 3D plot of hsf 3 , illustrated in Fig. 11.
Moreover, by composing the simulation results of all the α-cuts at the simulation end time point, we obtain the membership function of each output, which is illustrated in Fig. 12. From this figure, we can Here T is the temperature Fig. 8 A simulation plot of the CPN model in Fig. 7. See Table 4 for the values of the kinetic parameters clearly observe the uncertain range of each output. For example, the uncertain range of hsf 3 _hse is between 2.6 × 10 −4 and 4.7 × 10 −4 . According to this, we can carefully check the effects of different uncertain parameters.

Simulative model checking.
Considering plenty of traces generated by the simulation of an FCPN model, we want to further check if these traces are similar or distinct in terms of their shape. By doing this, we may deduce those parameters that cause severe changes of the model. To address this issue, we apply PLTL model checking [16] to analyze traces from the FCPN model. From Fig. 8 Tr] > 0))))] Here we use the function d(species) to get the derivative of the concentration of the species at each time point. The first query checks if there is a peak in a trace Tr, which is evaluated to true for all the traces of the FCPN model in Fig. 7. However, this query cannot answer the uniqueness of the peak. So we use the second query to check if there is a second peak, which is evaluated to false for all the traces. Of course, we can write more complicated queries to check more complex shapes of traces. For more details about simulative model checking, please refer to [16].

Conclusions
In this paper, we present an FCPN approach for modeling and analyzing biological systems with uncertain kinetic parameters. An FCPN model is equivalent to a set of both FDEs and ODEs. Considering the fact that ODE/CPN modeling is widely used in the field of systems biology and the fact that uncertainties exist in many biological systems, we believe our approach could offer a good means to study uncertain biological systems.
In a next step, we will continue to explore the uncertain modeling issue in the field of systems biology, and then develop appropriate approaches for solving this issue. Specifically, we will concentrate on several typical biological systems and give some case studies with fuzzy features. Fig. 10 A three-dimensional simulation plot of hsf 3 _hse for the FCPN model in Fig. 7. The same values of the kinetic parameters are used as those given in Fig. 9

Fig. 11
A three-dimensional simulation plot of hsf 3 for the FCPN model in Fig. 7. The same values of the kinetic parameters are used as those given in Fig. 9 Fig. 12 The membership functions of nine species at the simulation end time point. The same values of the kinetic parameters are used as those given in Fig. 9. a hsp, b hsf2, c hsf3, d hse, e hsf3_hse, f hsp, g hsp_hsf, h mfp, i prot