- Software
- Open access
- Published:
GraTeLPy: graph-theoretic linear stability analysis
BMC Systems Biology volume 8, Article number: 22 (2014)
Abstract
Background
A biochemical mechanism with mass action kinetics can be represented as a directed bipartite graph (bipartite digraph), and modeled by a system of differential equations. If the differential equations (DE) model can give rise to some instability such as multistability or Turing instability, then the bipartite digraph contains a structure referred to as a critical fragment. In some cases the existence of a critical fragment indicates that the DE model can display oscillations for some parameter values. We have implemented a graph-theoretic method that identifies the critical fragments of the bipartite digraph of a biochemical mechanism.
Results
GraTeLPy lists all critical fragments of the bipartite digraph of a given biochemical mechanism, thus enabling a preliminary analysis on the potential of a biochemical mechanism for some instability based on its topological structure. The correctness of the implementation is supported by multiple examples. The code is implemented in Python, relies on open software, and is available under the GNU General Public License.
Conclusions
GraTeLPy can be used by researchers to test large biochemical mechanisms with mass action kinetics for their capacity for multistability, oscillations and Turing instability.
Background
Biochemical mechanisms are often modeled by differential equations (DE) systems. Instabilities, such as multistability, oscillations, or Turing instability, are ubiquitous in DE models of biochemical mechanisms. Methods from bifurcation analysis are usually applied in order to analyze DE models for instabilities [1]. Bifurcation analysis methods are easily applied when the DE model has one or two concentration species (phase plane analysis) or has a relatively small number of parameters (numerical bifurcation analysis). However, it is both difficult and expensive to apply bifurcation methods to analyze large DE models with many variables for instabilities.
On the other hand a biochemical mechanism can be represented as a directed bipartite graph (bipartite digraph), which is a graph with two different sets of nodes representing species and reactions, and directed edges connecting a species and a reaction node. The existence of structures referred to as critical fragments in the bipartite digraph of a biochemical mechanism is necessary for the existence of multistability or Turing instability in the DE model [2–4]. Thus biochemical mechanisms that do not have the potential for multistability or Turing instability can be ruled out early in the modeling process. The existence of a critical fragment that does not contain all species nodes can indicate that oscillations exist for some parameter values for the DE model [3]. Thus graph-theoretic methods can be used to determine the potential of various biochemical mechanisms to exhibit some desired behavior, including multistability related to cell decision [5, 6], oscillations related to circadian rhythms [7], or Turing instability related to pattern formation [8].
Graph-theoretic methods are applicable to mechanisms with any number of species and reactions, which enables the screening of large biochemical mechanisms for potential instabilities. However, application of graph-theoretic methods by hand becomes challenging for large mechanisms, making a computational implementation highly desirable.
The graph-theoretic method implemented by GraTeLPy identifies all critical fragments in the bipartite digraph of a biochemical mechanism that can give rise to some instability (multistability, oscillations and Turing instability) [3, 4]. GraTelPy is implemented in Python and can run in parallel on computer clusters which increases the size of testable biochemical mechanisms.
Other software packages implement theoretical and computational methods for studying chemical reaction networks for multistability. Using the deficiency theory developed by M. Feinberg and collaborators [9], it can be shown that a chemical network model does not admit multistability for any choice of parameter values. The CRNT toolbox [10] developed originally by M. Feinberg implements the Deficiency One algorithm [11], that can be used to detect if a given network has the capacity for multistability [12]. If a given network admits multiple positive equilibria, in many cases the CRNT toolbox returns rate constant values such that the corresponding model system has at least two positive equilibria. In recent years, the CRNT toolbox has been extended to implement an algorithm for the mass-action injectivity test. A special case of this test is the Jacobian criterion, which provides a sufficient condition for excluding the existence of multiple positive equilibria and is based on the theory developed in [2, 13, 14].
Related software packages include BioNetX [15] and CoNtRol [16]. BioNetX is based on the work of M. Banaji and G. Craciun [17, 18] and is created by C. Pantea. BioNetX is used to analyze uni-molecular and bi-molecular reaction networks for the existence of multiple positive equilibria in [19, 20]. CoNtRol [16] is a web-based software package that employs matrix and graph-theoretic methods based on the DSR graph [17, 18, 21]. In particular CoNtRol provides information about the capacity of a given chemical network for multistability based on the DSR graph and on some additional tests. In addition CoNtRol calculates the deficiency of a network and checks if a network is weakly reversible. BioNetX and CoNtRol are available to download for free, they are open-source and are conveniently web-based.
We describe in Section Mathematical background the DE model and the bipartite digraph of a biochemical mechanism, as well as the instability criteria. In Section Implementation we describe the algorithm for finding critical fragments. In Section Results and discussion we present several examples along with some concluding remarks. A guide for downloading and installing GraTeLPy for Mac, Windows and Linux operating systems is available in the Additional file 1.
Mathematical background
Here we introduce the differential equations model and the bipartite digraph representation of a biochemical mechanism. In this section we also briefly describe the instability criteria for multistability, oscillations and Turing instability. More details on the instability criteria are available in [3, 4].
Mathematical model
A biochemical mechanism withn species A i , i=1,…,n, and m elementary reactions B j can be written as
where k j >0, j=1,…,m are the rate constants. The constants α ji ≥0 and β ji ≥0 in (1) are small integers called stoichiometric coefficients that account for the number of molecules of species A i participating in the j th elementary reaction. An example of a biochemical mechanism, the reversible substrate inhibition mechanism, is given below:
where the first two reactions represent an inflow and outflow reaction, respectively.
We will assume that every species A k in (1) is consumed and produced in at least one true reaction, i.e, a reaction which is different from an outflow reaction A k →∅ or an inflow reaction ∅→A k . However, we do not specifically require that all species participate in an inflow and an outflow reaction.
We further assume mass action kinetics for the mechanism (1) with rate functions
where u k (t) is the concentration at time t of a species A k , k=1,…,n.
The ordinary differential equations (ODE) model of a mass-action biochemical mechanism (1) can be written in vector form as
where u(t)=(u 1(t),…,u n (t))T is the concentration vector of the chemical species of (1), S ji =β ji −α ji are the entries of the stoichiometric matrix S and w(u)=(w 1(u),…,w m (u))T is the vector of rate functions (3). Throughout the paper it will be assumed that the ODE system ( 4) has a positive equilibrium.
The model equations of the reversible substrate mechanism (2) are given below
The rank of the stoichiometric matrix S of (5) equals 3 since there is one conservation relationship u 2+u 3+u 4=c 0.
Since
where S ji are the stoichiometric matrix entries and w j (u) are the rate functions (3), the Jacobian matrix J(u,w) has entries that can be written as
Note that the concentrations u k , k=1,…,n and the rate functions w j (u), j=1,…,m (both considered evaluated at a positive equilibrium) are used as parameters in (6). The rank of the Jacobian (6) equals the rank of the stoichiometric matrix S[3].
The Jacobian matrix of the model (5) parametrized in (u,w) has rank 3 and is given below
The characteristic polynomial of J(u,w) is
where I is the identity matrix. Note that the coefficients a i =a i (u,w), i=1,…,n of (8) are also functions of (u,w). For example, the last non-zero coefficient of the characteristic polynomial of the Jacobian (7) is
The bipartite digraph of a biochemical mechanism
For the convenience of the reader, in this section we present definitions regarding the bipartite digraph of a biochemical mechanism (1) [3, 4, 22]. To illustrate the definitions in this section we will continue to use as an example the reversible substrate mechanism (2).
A directed bipartite graph (bipartite digraph) has a node set that consists of two disjoint subsets, V 1 and V 2, and each of its directed edges (arcs) has one end in V 1 and the other in V 2[23].
The bipartite digraph G of a biochemical reaction network (1) is defined as follows. The nodes are separated into two sets, one for the chemical species V 1={A 1,A 2,…,A n } and one for the elementary reactions V 2={B 1,B 2,…,B m }. We draw an arc from A k to B j if and only if species A k is a reactant in reaction j, i.e., if the stoichiometric coefficient α jk >0 in (1). Similarly, we draw an arc from B j to A i if and only if A i is a product in reaction j, i.e., if the stoichiometric coefficient β ji >0 in (1). Therefore the set of arcs E(G) consists of arcs such as (A k ,B j ) and (B j ,A i ). Hence the bipartite digraph can be defined as G={V,E(G)} where V=V 1∪V 2 is the set of nodes and E(G) is the set of arcs. If an arc is not weighted explicitly, we assume that its weight equals 1. The corresponding bipartite digraph of the reversible substrate inhibition mechanism (2) is shown in Figure 1.
The element [ A k ,B j ] is an edge if α jk >0, i.e., if species A k is a reactant in reaction j. The weight of an edge E=[ A k ,B j ] is defined as
For example, the edge E=[ A 1,B 3] in Figure 1 has weight K E =−1.
If α jk β ji >0, then the arcs (A k ,B j ) and (B j ,A i ) form a positive path [ A k ,B j ,A i ] that corresponds to the production of A i from A k in a reaction j. The weight of the positive path [ A k ,B j ,A i ] is defined as α jk β ji . For example, the positive path [ A 1,B 3,A 3] in Figure 1 has weight 1.
If α jk α ji >0, then the arcs (A k ,B j ) and (A i ,B j ) form a negative path that corresponds to A k and A i interacting as reactants in reaction j. The weight of the negative path is defined as −α jk α ji . Note that the negative paths and are considered to be different since they start at a different species node. For example, both and in Figure 1 are negative paths with weight −1. We note that the direction of the arcs is followed in the positive paths but not in the negative paths.
A cycle C of G is a sequence of distinct paths with the last species node of each path being the same as the first species node of the next path , , …, , . A cycle will be denoted by , where the number of species nodes defines its order. The set of species nodes in a cycle is distinct, but there may be a repetition among the reaction nodes. This is because negative paths containing the same nodes are considered different depending on the starting species node. For example, in Figure 1 is a cycle formed by the two negative paths and .
A cycle is positive if it contains an even number of negative paths and negative if it contains an odd number of negative paths. The sign of a cycle C can also be determined by the cycle weight which is a product of all corresponding weights of negative and positive paths of C
For example, (see Figure 1) is a negative cycle of order 2 with weight K C =−1. The cycle (see Figure 1) is a positive cycle of order 2 with weight K C =1.
A subgraph g={L1,L2,…,L s } of G consists of edges or cycles L i , i=1,…,s, where each species is the beginning of only one edge, or one path participating in a cycle. In other words, the edges and cycles in a subgraph are species mutually disjoint. The number of species nodes in a subgraph is defined as its order. The subgraph weight is defined using the product of the cycle weights (11) and the edges weights (10) of the cycles and edges in g
where c is the number of cycles in g. For example, the subgraph with weight K g =−1 is shown in Figure 2 (bottom right).
Since more than one path can exist between species nodes via different reaction nodes in a bipartite digraph, the number of subgraphs through the same node sets may be greater than one. The set of all subgraphs g of order k with the same species nodes and reaction nodes sets is called a fragment of order k and is denoted by . For a fragment we define the number
as the fragment weight. If , then S k is a critical fragment.
For example, the fragment is shown in Figure 2 (top left) together with its three subgraphs , and g 3={[ A 1,B 5],[ A 2,B 3],[ A 3,B 4]}. Each of the first two subgraphs g 1 and g 2 contains a positive cycle, and thus is a critical fragment since
In [3, 22] it is shown that the coefficients of the characteristic polynomial (8) have the following graph-theoretic representation
Note that similar terms in a k have been combined using summation over the subgraphs of a fragment (13) and (14) is in a simplified form. It follows by (14) that the correspondence between a fragment and a non-zero term in a k (u,w) is one-to-one. For example, the negative coefficient in a 3(u,w) given in (9) corresponds to the critical fragment shown in Figure 2 (top left).
Critical fragments corresponding uniquely to negative terms in (14) are important for the existence of instabilities as it is explained next.
Instability criteria for the Jacobian and the bipartite digraph
Here we summarize classical results from bifurcation analysis [1] and more recent results relating graph-theoretic methods to instabilities [3, 4, 22].
Multistability often arises from a saddle-node bifurcation in an ordinary differential equations (ODE) model, [1, 24]. If a saddle-node bifurcation occurs, then a real eigenvalue λ(u,w) of J(u,w) changes sign as the parameters (u,w) change values. Hence, a necessary condition for multistability arising from a saddle-node bifurcation is a n (u,w)= det(−J(u,w))=0 for some parameter values of (u,w) [1].
Often ODE models of biochemical mechanisms (4) have mass conservation relations reducing the rank r of the stoichiometric matrix S and the Jacobian J(u,w) to r<n, which means that the last non-zero coefficient in (8) is a r (u,w). Thus if a saddle-node bifurcation exists, then a r (u,w)=0 for some values of (u,w) [3]. Therefore a critical fragment of order r, corresponding uniquely to a negative term in (14) for k=r, is required for a saddle-node bifurcation, and thus for multistability [3, 22]. Thus the potential of a biochemical mechanism (1) for multistability depends on the structure of its bipartite digraph.
Oscillations in ODE models of biochemical mechanisms (1) often arise from Hopf bifurcation. It is shown in [3], that if a coefficient a k (u,w)≥0, k∈{1,…,n−1} is close to zero, then it is possible to choose parameter values for (u,w) such that oscillations arising from Hopf bifurcation occur.
The existence of a critical fragment of order k∈{1,…,n−1} makes it possible to minimize a k (u,w)≥0, k<n for some parameter values of (u,w) by increasing the magnitude of the corresponding negative term in a k (u,w). If there are mass conservation relations reducing the rank of the Jacobian matrix to r<n, a critical fragment of order k<r is required to detect possible oscillations in an ODE model (4) of a biochemical mechanism (1). Thus, the existence of oscillations in the ODE model of a biochemical mechanism (1) can also be determined by the structure of the bipartite digraph.
Patterns in a corresponding reaction–diffusion model to (4) usually arise as a result of Turing instability. Turing instability arises when a spatially homogeneous equilibrium is asymptotically stable in the absence of diffusion and becomes unstable when diffusion is added to the model [25]. For the existence of Turing instability, we study the matrix J(u,w)−μ D, where J(u,w) is the Jacobian matrix (6), D is a diagonal matrix with positive diffusion coefficients d i >0, i=1,…,n on the diagonal and μ>0 is a parameter (μ represents an eigenvalue of the negative Laplacian) [25]. Turing instability is associated with a real eigenvalue of the matrix J(u,w)−μ D passing through zero from left to right as parameter values are varied. In [4, 22] it is shown that a necessary condition for Turing instability is the existence of a critical fragment of order k<n. Thus, the potential of a biochemical mechanism to display Turing instability can be inferred from the structure of its bipartite digraph.
Implementation
Recall that the existence of a critical fragment in the bipartite digraph of a biochemical mechanism, where r is the rank of the stoichiometric matrix S, can induce multistability. Similarly a critical fragment of order k<n can induce Turing instability or even oscillations. On the other hand if no critical fragments of order r are found, then the existence of multistability can be excluded for any values of the parameters. Similarly if no critical fragments of order k<n are found, then the existence of Turing instability can also be excluded for any values of the parameters.
GraTeLPy enumerates all critical fragments of user-defined order k for a given biochemical mechanism, thus providing the user with information on the potential of a biochemical mechanism for multistability, oscillations or Turing instability.
We present in Figure 3 a flowchart that describes schematically the algorithm implemented by GraTeLPy.
First a biochemical mechanism is read from an user-provided input text file and its bipartite digraph is generated. Then all fragments of an user-defined order k are enumerated and placed in a computational queue. Each fragment from the queue will be further processed in order to compute its weight (top diamond nodes, Figure 3).
For each fragment in the queue, a linear sequence of operations is carried out (central rectangular nodes, Figure 3). First all subgraphs g of a fragment are enumerated, and the weight K g of each subgraph g is computed. Then the weights of all subgraphs g contained in are added to compute the weight of the given fragment. At this point it is decided based on the sign of the weight , if the fragment is critical, i.e., is satisfied.
Once all of the fragments from the queue have been processed, an output based on the potential of the biochemical mechanism for some desired instability is created (bottom diamond nodes, Figure 3). The information in the output includes the number of critical fragments of an user-defined order found by GraTeLPy. Based on the number of critical fragments found, GraTeLPy states if a biochemical mechanism meets the necessary condition for multistability or Turing instability, and if the mechanism can exhibit oscillations for some parameter values. In addition a list of all critical fragments of a given order detected by GraTeLPy can be provided.
Processing the queue of the enumerated fragments (central rectangular nodes, Figure 3) is inherently parallel as each fragment may be handled independently of all other fragments. To use this parallelism to our advantage, two scripts are created for GraTeLPy that implement a server role and a client role, respectively.
The server script takes care of actions in the top and bottom diamond nodes in Figure 3. The server creates the bipartite digraph, enumerates all fragments and places them in a queue (top diamond nodes). At the end the server collects all computed data from the client processes before displaying them for the user (bottom diamond nodes).
The client script deals with actions in the central rectangular nodes in Figure 3. The client fetches a fragment from the queue presented by the server, generates all subgraphs of the fragment, computes the weights of the subgraphs, computes the weight of the fragment, and reports all computed results back to the server. If the fragment queue has not been exhausted, then the client fetches another fragment and repeats these steps.
This server-client architecture allows the user to run one or multiple instances of the client script to analyze several fragments of a large mechanism in parallel. We discuss the technical details of the parallelization in more detail in Implementation challenges below.
In the following subsections we describe in detail the implementation of both fragment and subgraph enumeration as these parts presented considerable technical challenges during development.
Fragment enumeration
It follows by the definition of a fragment given in Section The bipartite digraph of a biochemical mechanism that fragments are identified by the species and reaction indices of their subgraphs. A fragment of order k contains k unique species indexed by {i 1,…,i k }, and k possibly repeated reactions indexed by {j 1,…,j k }.
Suppose that a given biochemical mechanism has N species and R reactions. Using a combinatorial approach, we can generate all fragments of order k by pairing the unique combinations of species with R k combinations of reaction nodes. This approach generates possible fragments that need to be filtered. This is because many of the combinatorially generated fragments do not exist in the bipartite digraph of a given biochemical mechanism.
To save computational time and cost we use a different approach. We note that each fragment contains one subgraph that consists of edges , s=1,…,k. Let us denote by |E i | (i=1,…,n i ) the number of edges that a species A i in a biochemical mechanism induces, or, is the starting node of. If we assume that each species A i is on average the starting node of |E|=Avg(E i ) edges, then this approach generates approximately fragments. Empirically, we observe that |E| is usually considerably less than some common values for the number of reactions R. Hence this latter approach generates fewer fragments than the former combinatorial approach. In fact, since fragments correspond uniquely to the subgraphs consisting of edges, using this method we generate only the fragments that are present in the bipartite digraph.
By using the method of one-to-one correspondence between fragments and subgraphs consisting of edges, we reduce the number of fragments generated by the combinatorial approach by multiple orders of magnitude. A reduction in the number of the generated fragments translates directly to a reduction in computational cost. Hence the latter approach for fragment generation is an important development in the implementation of GraTeLPy that allows for analyzing larger biochemical mechanisms. To highlight this reduction in computational cost we plot the number of fragments (of varying order) generated with both methods for the double-layer mitogen-activated protein kinase (MAPK) mechanism in Figure 4. The double-layer MAPK mechanism is discussed in more detail in the last example in Section Results and discussion.
Subgraph enumeration
Given a fragment we generate all edges , positive paths and negative paths , where l,s=1,…,k, that are induced by the species and reactions of the fragment. We will refer collectively to edges, and positive and negative paths of a subgraph as subgraph components.
The subgraph components of a fragment are stored in a lookup table that lists for each species and corresponding reaction all subgraph components induced by the pair . The subgraph components of a fragment are generated as follows:
-
(i)
For a fragment each edge (s=1,…,k) is identified and stored in the lookup table.
-
(ii)
For each edge in the lookup table, arcs starting at , such as (l=1,…,k) are identified. This way all positive paths induced by are generated and added to the lookup table as part of the record for species .
-
(iii)
Similarly to (ii), for each edge in the lookup table, arcs ending at , such as are identified. This way all negative paths induced by are generated and added to the lookup table as part of the record for species .
To gain some intuition on how subgraphs can be generated, we first describe a simple combinatorial approach before we introduce the method implemented by GraTeLPy. Suppose that for a fragment there are subgraph components induced by each species . We can generate combinatorially a subgraph g of by selecting at random one subgraph component per species since each species must be the starting node of exactly one component [3]. Using this approach we can generate combinatorially all possible combinations of subgraph components , that represent all possible subgraph candidates of a fragment . For a fragment with |L|=Avg(L i ) subgraph components per species on average, this method generates |L|k possible subgraphs.
Even though this method guarantees that each species is the starting node of exactly one subgraph component, there may be combinations of paths that do not form cycles as defined in Sec. Mathematical background. This is because the end species node of a path has to be the starting species node of another path in a cycle [3]. If we use the combinatorial method for generating subgraphs, then all candidate subgraphs that do not satisfy the definition of a subgraph given in Sec. Mathematical background need to be removed which would increase the computational cost.
In the next two subsections we introduce the path graph and the cycle graph that will allow us to generate only the subgraphs that belong to a given fragment. The implementation of the algorithms associated with the path graph and the cycle graph by GraTeLPy will allow us to further reduce the computational cost.
Cycle detection: the path graph
We can avoid generating invalid subgraphs if paths are not joined combinatorially, but rather only paths that form cycles are joined. Recall that a cycle is a sequence of paths where the end species node of each path is the starting species node of exactly one other path in the sequence.
Next, we introduce expanded paths, where a negative path is converted into two expanded paths [ A i ,B m ,A j ] and [ A j ,B m ,A i ] that are positive. This expansion is necessary as negative paths can be traversed in both directions as explained in Section The bipartite digraph of a biochemical mechanism. To enumerate all cycles of a given fragment , we construct the directed graph (digraph) Φ. The nodes of Φ correspond uniquely to the expanded negative paths and the positive paths of a fragment . We connect the nodes of the digraph Φ representing paths whose end nodes and starting nodes are the same. For example, there is a directed edge in Φ that starts at a node representing and ends at a node representing . Self-loops in Φ from a node back to itself are also permitted and they correspond to paths of the form .
To summarize, we generate a digraph Φ with the following properties:
-
The nodes Φ i of Φ are the expanded negative paths and the positive paths of a given fragment .
-
A directed edge (Φ i ,Φ j ) exists if and only if the end species node of the path corresponding to Φ i is the starting species node of the path corresponding to Φ j . Self-loops (Φ i ,Φ i ) are permitted and correspond to positive paths of the form [ A i ,B j ,A i ].
We refer to the digraph Φ as the path graph. For a fragment with a total of P paths that include all expanded negative paths and all positive paths of , the generation of the path graph Φ has time complexity .
To detect the cycles of the path graph Φ, and ultimately the cycles of a given fragment , an implementation of Johnson’s algorithm [26] provided by NetworkX [27] is used by GraTeLPy. For a fragment with a total number of P expanded negative paths and positive paths (corresponding uniquely to the nodes of Φ), P E sequential relations between these paths (corresponding uniquely to the directed edges of Φ), and P C cyclic sequential relations (corresponding uniquely to the cycles of Φ), the enumeration of all P C cycles requires units of time [26].
Next we illustrate the construction and usage of the path graph Φ described above. The path graph Φ for the critical fragment (see Figure 2) is shown in Figure 5, together with the cycles c 1 and c 2 produced by Johnson’s algorithm.
Some of the cycles of Φ enumerated by NetworkX correspond to closed paths with revisited nodes in the bipartite digraph G, and are therefore not cycles of G. This is the case because Johnson’s algorithm finds all cycles of all lengths of the path graph Φ. In our current implementation, we remove cycles of Φ that correspond to closed paths with revisited nodes of the bipartite digraph G. However, further optimization of Johnson’s algorithm is likely possible, so that only cycles that exist in the bipartite digraph G are generated in the first place.
Cycle combinations: the cycle graph
Suppose that P C valid cycles of a given fragment have been found using the algorithm from the previous subsection. Possible candidates for subgraphs of can be constructed by creating all possible combinations of cycles. In total, there are possible ways to combine P C cycles into combinations of k cycles with no repeating cycles. Then, edges may have to be added to the combinations of cycles in order to construct the subgraphs of a fragment .
Generally, not all combinations of cycles or edges form subgraphs since such combinations may not contain every species of a fragment exactly once. Suppose that a given set of cycles has mutually disjoint species sets, but the orders of the cycles sum to less than k. In order to form a subgraph of a fragment we need to amend such a cycle combination with a set of edges whose species nodes are in , but not in any of the cycles. If on average a species A i is the starting node of E edges and if on average we have to add μ edges to a cycle combination, then we generate combinatorially possible subgraphs.
Many of the combinatorially generated cycle and edges combinations will have repeated species nodes, thus rendering such a combination of edges or cycles invalid as a subgraph. Hence we need to verify which of the generated combinations of cycles or edges are subgraphs of . If validating a subgraph requires units of time then, on average, validating all possible candidates for subgraphs has time complexity . In reality, validating a combination of cycles or edges as a subgraph has greater time complexity than . Therefore, the computational cost will be greatly reduced if we can generate only subgraphs that require no further validation steps.
We use a similar approach to the one for finding the cycles of a given fragment. We will reduce the problem of generating cycle or edge combinations forming subgraphs to a problem that can be solved with available algorithms from the literature. To this end we generate an undirected graph Γ whose nodes correspond uniquely to the cycles of a given fragment , that are found using the path graph Φ. Drawing an edge between two nodes (representing cycles of ) means that these two cycles do not share species nodes and can be combined as a part of a subgraph. Next, we formally define the undirected graph Γ
-
A node Γ i of Γ represents a cycle of a given fragment .
-
An edge (Γ i ,Γ j ) exists if and only if the set of species nodes of the cycle represented by Γ i and the set of species nodes of the cycle represented by Γ j are disjoint.
We refer to the undirected graph Γ defined above as a cycle graph.
If a given set of cycles does not contain a number of species equal to the order of the subgraph constructed, then species-disjoint edges need to be added. To this end the problem of generating a subgraph of a given fragment can be reduced to finding all cliques in Γ. Recall that a clique is a set of nodes of an undirected graph such that every node is connected to every other node from the set [23]. To find all subgraphs of a fragment , its corresponding undirected graph Γ should be searched for all cliques. This is a standard problem in graph theory, known as clique enumeration, that can be solved using existing algorithms from the literature [28].
As an example, we construct the cycle graph Γ corresponding to the fragment of the Reversible Inhibition reaction (2) shown in Figure 2 (top left). We use the fact that the cycles c 1 and c 2 of the path graph Φ, shown in Figure 5 have paths that share at least one species. Hence, the cycle graph Γ consists of two nodes corresponding to the two valid cycles c 1 and c 2 with no edge connecting them. Since the cycle graph Γ constructed from the valid cycles in Figure 5 is completely disconnected, we can choose one cycle at a time and attempt to construct a valid subgraph by adding edges to the cycle. If c 1 is chosen, then the remaining nodes A1 and B5 form the edge [ A1, B5] yielding a valid subgraph of order 3, Figure 2 (bottom right). If c 2 is chosen, then no other nodes remain. Thus the cycle c 2 forms a valid subgraph of order 3, Figure 2 (bottom left).
When generating subgraphs of a fragment combinatorially, the number of subgraph candidates depends on the number of subgraph components of a given fragment. Using the improved algorithm (based on the path and cycle graphs) implemented by GraTeLPy, the number of generated subgraphs depends on the number of cycles in the path graph.
To compare the computational cost of the two approaches, we count the number of subgraphs generated in both cases for 100 randomly selected fragments of varying order for the double-layer MAPK network. The results are presented in Figure 6, and show that multiple orders of magnitude fewer subgraphs are generated by the path and cycle graph method in comparison to the combinatorial method.
Implementation challenges
As an overarching principle, we have strived to reduce code duplication hence we reuse as many components as possible from open source libraries. To this end, we have used combinatorial standard libraries distributed with the programming language Python [29], NetworkX [27] for all graph-related operations, and matplotlib [30] for graphical output. We note however that matplotlib is an optional package and is not required for the core functionality of GraTeLPy.
Over the course of implementation of GraTeLPy we encountered combinatorial blowup and memory usage as major challenges. Thus we have designed GraTeLPy to minimize storage of fragments, subgraphs, and intermediate structures in memory. To this end we make considerable use of the standard Python library itertools and the concept of generators that allow us to transfer many results from one method to another with minimal memory footprint.
The cycle enumeration method provided by NetworkX [27] stores all detected cycles, causing memory shortage and overflow due to the large number of generated invalid cycles revisiting species. We have amended this library method to only store and return valid cycles of the bipartite digraph, i.e., those cycles that do not revisit species nodes.
Analyzing large networks is computationally unfeasible when only a single processor is used. Hence, we have parallelized the code. Python’s global interpreter lock causes threaded code to run slowly, so we have used the Multiprocessing module [29], which operates by using subprocesses rather than threads. We have implemented two parallelized versions of the code:
-
1.
Single multiprocessor machine. Here we use the Multiprocessing.Pool system, whereby the Multiprocessing module itself launches the requisite subprocesses. The code allows the user to specify the number of subprocesses that should be run (ideally this should match the number of processors available on the machine).
-
2.
Multiple machines client/server. In this case, we launch a server process that generates the list of fragments to be analyzed. Clients can then be launched on any machine with a network connection to the server. These clients receive fragments from the server and pass back to the server the results of the analysis of the fragments. The server collates the responses.
Because the time spent analyzing a fragment is orders of magnitude higher than the time required to pass a representation of the fragment between a client and the server, the parallelization is extremely efficient. We have tested this client/server implementation with over 500 client processes, and the processing time scales very well.
Results and discussion
GraTeLPy allows the user to enumerate critical fragments of an user-defined order. Thus biochemical mechanisms can be analyzed for their potential for some instability in an efficient way. The existence of multistability requires at least one critical fragment of order r, which is the rank of the stoichiometric matrix. If a critical fragment of order k<n exists, then oscillations may exist for some parameter values. The existence of Turing instability requires at least one critical fragment of order k<n, where n is the number of species.
Several examples of biochemical mechanisms of different sizes are presented in this section. We have used GraTeLPy to find the critical fragments of a given order in the bipartite digraph of each biochemical mechanism. The first three examples are smaller mechanisms and are used to verify the correctness of implementation of GraTeLPy, since their critical fragments have already been found elsewhere. Furthermore, we show that by using GraTeLPy, finding critical fragments in larger biochemical mechanisms such as the MAPK single-layer and MAPK double-layer networks becomes feasible. The median running time for finding the critical fragments for each biochemical mechanism is presented.
The models and data discussed in this section are available at https://github.com/gratelpy/gratelpy-supplementary-information.
Reversible substrate inhibition
The reversible substrate inhibition model is analyzed for multistability in [3] using the graph-theoretic method presented here.
GraTeLPy reads in the biochemical mechanism from a plain text file.
Recall that the bipartite digraph of (15) shown in Figure 1.
The bipartite digraph of (15) contains one critical fragment of order 3 found in [3]. GraTeLPy reproduces this fragment and its constituent subgraphs shown in Figure 2.
The median running time with one processor for finding the critical fragment is 0.05 sec.
Remark
Note that the critical fragment corresponds to the negative term in the last non-zero coefficient (9) of the characteristic polynomial of the Jacobian matrix (7). This suggests how we may choose parameter values for (u,w) so that a saddle-node bifurcation and multistability occur. The inequality w 4>w 1 should be satisfied, otherwise a 3(u,w)>0. Also u 4≫u i , i=1,2,3 so that a 3(u,w) is close to zero. In general if is a critical fragment, then the species concentrations at equilibrium , s=1,…,k should be chosen small and the rate functions , s=1,…,k should be chosen large in order for a saddle-node bifurcation to occur.
As a future extension of GraTeLPy, we plan to make parameter choices for (u,w) such that some desired instability occurs available to the user.
Glycolysis-Gluconeogenesis switch
Critical fragments of order smaller than n, the number of species in a biochemical mechanism, can induce Turing instability in a reaction–diffusion model [4] as well as oscillations in an ODE model [3].
The biochemical mechanism
represents a glycolysis-gluconeogenesis switch and is studied for oscillations in [31]. It has been shown that the critical fragments (identified here by GraTeLPy as well) are the structural reason for the oscillations [31]. Based on the existence of the critical fragments, parameter values are chosen such that oscillations occur [31]. Similarly the mechanism (15) is studied for the existence of Turing instability in [4]. In fact, parameter values are found such that Turing instability exists in the reaction–diffusion model of (15).
The bipartite digraph of (15) is shown in Figure 7. The stoichiometric matrix associated with the biochemical mechanism (15) has rank 5. The biochemical mechanism meets the necessary criterion for Turing instability since critical fragments of order 1≤k≤5 exist [4]. In Figure 8 we show the critical fragments of order 2 and 3 reported in [4] and identified by GraTeLPy.
The median running time with one processor for finding the critical fragments of the bipartite digraph of the glycolysis-gluconeogenesis switch is 5.9 sec.
Cdc42 network in yeast
A biochemical mechanism that describes the Cdc42 dynamics of yeast and cell polarity is studied in [32]. The corresponding reaction-diffusion model displays Turing instability and patten formation for some parameter values [32].
The biochemical mechanism of the Cdc42 network is given below
where RD and RT denote the membrane-bound inactive and active form of Cdc42 respectively; I denotes cytoplasmic GDI that forms a membrane-bound complex with RD, RDIm, that detaches from the membrane and diffuses as RDIc in the cytoplasm. The enzyme E is a complex that contains Cdc42-activating Cdc24 and exists in both a cytoplasmic and membrane-bound form, Ec and Em, respectively. If E is on the membrane, it can form a catalytic complex M together with RT, that aids activation of membrane-bound RD.
The bipartite digraph of the Cdc42 network is shown in Figure 9. The Cdc42 network has a corresponding stoichiometric matrix of rank 5. The necessary condition for Turing instability requires that a critical fragment of order 1≤k≤5 exists. GraTeLPy identifies 35 critical fragments – among which we find the two critical fragments reported in [32] and shown in Figure 10.
The median running time with one processor for finding the critical fragments of the bipartite digraph of the Cdc42 network is 9.7 sec, and with two processors 6.1 sec.
Single-layer MAPK network
As the size of biochemical mechanisms increases, enumerating the critical fragments of their corresponding bipartite digraphs by hand becomes tedious and difficult. Using GraTeLPy we can find the critical fragments of a given order of larger mechanisms in a short period of time.
An example of a larger biochemical mechanism that is difficult to analyze by hand is the single-layer MAPK network
whose bipartite digraph is shown in Figure 11. The MAPK network is a well-known example of a multistable system [5, 33]. The necessary condition for multstability requires the existence of a critical fragment of order equal to the rank of the stoichiometric matrix. Since the rank of the stoichiometric matrix for the MAPK network equals 6, using GraTeLPy, we enumerate all critical fragments of order 6. The 9 critical fragments of order 6 of the MAPK network are shown in Figure 12.
The median running time with two processors for finding the critical fragments of the bipartite digraph of the single-layer MAPK network is 10.7 sec.
Double-layer MAPK network
For large biochemical mechanisms the number of critical fragments of a given order may grow into the dozens or hundreds. Thus the task of enumeration by hand of all critical fragments of a given order becomes unfeasible, but can be accomplished with the help of GraTeLPy.
An example of a larger biochemical mechanism is provided by the double-layer MAPK network which has 12 species and 18 reactions
Similarly to the single-layer MAPK network, the double-layer MAPK network is known to display multistability. The stoichiometric matrix for the double-layer MAPK network has rank 9. Therefore the necessary condition for multistability requires the existence of at least one critical fragment of order 9. GraTeLPy detects 88 critical fragments of order 9. The list of all critical fragments of order 9 of the bipartite digraph of (17) can be obtained from https://github.com/gratelpy/gratelpy-supplementary-information.
The median running time of each client for finding the critical fragments of the bipartite digraph of the double-layer MAPK network is as follows: 141 sec with 100 clients, 270 sec with 50 clients and roughly 4 hours with a single client.
Conclusions
We have implemented a graph-theoretic method that allows for parameter-free model testing of biochemical mechanisms with mass action kinetics for multistability, oscillations and Turing instability. GraTeLPy is open-source and is based on a free software. GraTeLPy enables users to identify the graph structures referred to as critical fragments that can be responsible for the existence of some instability in a differential equations model of a biochemical mechanism (1).
At present, GraTeLPy expects that the user converts a biochemical mechanism to a text format such as the one presented in (15). In a future release we plan to include additional functionality so that biochemical mechanisms provided in SBML [34] and other formats can be analyzed. A list of the critical fragments of a user-defined order is provided upon completion.
We plan a future extension of GraTeLPy where choices of parameter values such that some desired instability occurs will be offered to the user. This extension will be based on the existence of a critical fragment and its one-to-one correspondence with a negative term in a coefficient of the characteristic polynomial (See Remark in the Reversible substrate inhibition Example).
We also plan to combine GraTeLPy with a new analytic method, local perturbation analysis (LPA) [35, 36], in order to test biochemical mechanisms for pattern formation.
An extension of GraTeLPy to multigraphs [37] that can be used for the analysis of gene regulatory networks [38] is also left as a future extension.
Availability and requirements
GraTeLPy is available from https://github.com/gratelpy/gratelpy and has the following requirements: Python 2.6 or 2.7 and NetworkX 1.6 or above.
References
Kuznetsov Y: Elements of Applied Bifurcation Theory, Volume 112. 1998, New York: Springer
Craciun G, Feinberg M:Multiple equilibria in complex chemical reaction networks. II. SIAM J Appl Math. 2006, 66 (4): 1321-1338. 10.1137/050634177.
Mincheva M, Roussel MR:Graph-theoretic methods for the analysis of chemical and biochemical networks. I. J Math Biol. 2007, 55: 61-86. 10.1007/s00285-007-0099-1.
Mincheva M, Roussel MR:A graph-theoretic method for detecting potential Turing bifurcations. J Chem Phys. 2006, 125 (20): 204102-10.1063/1.2397073.
Markevich NI, Hoek JB, Kholodenko BN:Signaling switches and bistability arising from multisite phosphorylation in protein kinase cascades. J Cell Biol. 2004, 164 (3): 353-359. 10.1083/jcb.200308060.
Ozbudak EM, Thattai M, Lim HN, Shraiman BI, Van Oudenaarden A:Multistability in the lactose utilization network of Escherichia coli. Nature. 2004, 427 (6976): 737-740. 10.1038/nature02298.
Smolen P, Baxter DA, Byrne JH:Modeling circadian oscillations with interlocking positive and negative feedback loops. J Neurosci. 2001, 21 (17): 6644-6656.
Murray J: Mathematical Biology, Volume 2. 2002, New York: Springer
Feinberg M:Chemical reaction network structure and the stability of complex isothermal reactors. I. Chem Eng Sci. 1987, 42 (10): 2229-2268. 10.1016/0009-2509(87)80099-4.
Ellison P, Feinberg M, Ji H:Chemical Reaction Network Toolbox. Available online at http://www.crnt.osu.edu/CRNTWin 2011,
Feinberg M:Chemical reaction network structure and the stability of complex isothermal reactors–II. Multiple steady states for networks with deficiency one. Chem Eng Sci. 1988, 43: 1-25. 10.1016/0009-2509(88)87122-7.
Ellison P, Feinberg M:How catalytic mechanisms reveal themselves in multiple steady-state data. I. J Mol Catal A-Chem. 2000, 154 (1–2): 155-167.
Craciun G, Feinberg M:Multiple equilibria in complex chemical reaction networks: I. The injectivity property. SIAM J Appl Math. 2005, 65 (5): 1526-1546. 10.1137/S0036139904440278.
Ji H: Uniqueness of equilibria for complex chemical reaction networks . PhD thesis. Mathematics, Ohio State University; 2011,
Pantea C:BioNetX. Available online at http://bionetx.uwbacter.org/ 2010,
Banaji M, Donnell P, Marginean A, Pantea C:CoNtRol-Chemical reaction network analysis tool. Available online at http://math.wvu.edu/~cpantea/ 2013,
Banaji M, Craciun G:Graph-theoretic approaches to injectivity and multiple equilibria in systems of interacting elements. Comm Math Sci. 2009, 7 (4): 867-900. 10.4310/CMS.2009.v7.n4.a4.
Banaji M, Craciun G:Graph-theoretic criteria for injectivity and unique equilibria in general chemical reaction systems. Adv Appl Math. 2010, 44 (2): 168-184. 10.1016/j.aam.2009.07.003.
Pantea C, Craciun G:Computational methods for analyzing bistability in biochemical reaction networks. Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium. 2010, Paris: IEEE, 549-552.
Pantea C, Koeppl H, Craciun G:Global injectivity and multiple equilibria in uni-and bi-molecular reaction networks. Discr Cont Dyn Syst B. 2012, 17: 2153-3170.
Banaji M, Pantea C:Some results on injectivity and multistationarity in chemical reaction networks. Available on http://arxiv.org/pdf/1309.6771v2.pdf 2013,
Volpert A, Ivanova A:Mathematical models in chemical kinetics. Mathematical modeling (Russian). 1987, Nauka, Moscow, 57-102.
Harary F: Graph Theory. 1969, Reading, MA: Addison-Wesley
Tyson J:Classification of instabilities in chemical reaction systems. J Chem Phys. 1975, 62 (3): 1010-1015. 10.1063/1.430567.
Conway ED: Diffusion and Predator-Prey Interaction: Pattern in Closed Systems, Volume 101. 1984, Boston: Pitman
Johnson D:Finding all the elementary circuits of a directed graph. SIAM J Comput. 1975, 4: 77-84. 10.1137/0204007.
Hagberg AA, Schult DA, Swart PJ:Exploring network structure, dynamics, and function using NetworkX. Proceedings of the 7th Python in Science Conference (SciPy2008). 2008, Pasadena: 7th Annual Python in Science Conference, 11-15.
Zhang Y, Abu-Khzam FN, Baldwin NE, Chesler EJ, Langston MA, Samatova NF:Genome-scale computational approaches to memory-intensive applications in systems biology. Supercomputing, 2005. Proceedings of the ACM/IEEE SC 2005 Conference. 2005, Washington, DC: IEEE Computer Society, 12-12.
Python Software Foundation:Python Language Reference, Version 2.7. Available at http://www.python.org 2010,
Hunter JD:Matplotlib: A 2D graphics environment. Comput Sci Eng. 2007, 9 (3): 90-95.
Goldstein B, Maevsky A:Critical switch of the metabolic fluxes by phosphofructo-2-kinase: fructose-2, 6-bisphosphatase. A kinetic model. FEBS Lett. 2002, 532 (3): 295-299. 10.1016/S0014-5793(02)03639-6.
Goryachev A, Pokhilko A:Dynamics of Cdc42 network embodies a turing-type mechanism of yeast cell polarity. FEBS Lett. 2008, 582 (10): 1437-1443. 10.1016/j.febslet.2008.03.029.
Conradi C, Flockerzi D, Raisch J:Multistationarity in the activation of a MAPK: Parametrizing the relevant region in parameter space. Math Biosci. 2008, 211: 105-131. 10.1016/j.mbs.2007.10.004.
Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, the SBML Forum:The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19 (4): 524-531. 10.1093/bioinformatics/btg015.
Holmes WR:An efficient, nonlinear stability analysis for detecting pattern formation in reaction diffusion systems. Bull Math Biol. 2014, 76: 157-183. 10.1007/s11538-013-9914-6.
Walther GR, Marée AF, Edelstein-Keshet L, Grieneisen VA:Deterministic versus stochastic cell polarisation through wave-pinning. Bull Math Biol. 2012, 74 (11): 2570-2599.
Mincheva M, Craciun G:Multigraph conditions for multistability, oscillations and pattern formation in biochemical reaction networks. Proc IEEE. 2008, 96 (8): 1281-1291.
Karlebach G, Shamir R:Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Bio. 2008, 9 (10): 770-780. 10.1038/nrm2503.
Acknowledgements
We are grateful to Verônica Grieneisen, Robert Ietswaart, Richard Morris and Marc Roussel for useful feedback on the manuscript. MH and GRW are grateful to the John Innes Centre for financial support.
Author information
Authors and Affiliations
Corresponding author
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
GRW and MH implemented the software. MM provided the mathematical background. GRW, MH and MM wrote the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Walther, G.R., Hartley, M. & Mincheva, M. GraTeLPy: graph-theoretic linear stability analysis. BMC Syst Biol 8, 22 (2014). https://doi.org/10.1186/1752-0509-8-22
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/1752-0509-8-22