 Research article
 Open access
 Published:
Properties of Boolean dynamics by node classification using feedback loops in a network
BMC Systems Biology volume 10, Article number: 83 (2016)
Abstract
Background
Biological networks keep their functions robust against perturbations. Many previous studies through simulations or experiments have shown that feedback loop (FBL) structures play an important role in controlling the network robustness without fully explaining how they do it. Hence, there is a pressing need to more rigorously analyze the influence of FBL structures on network robustness.
Results
In this paper, I propose a novel node classification notion based on the FBL structures involved. More specifically, I classify a node as a noFBLinupstream (NFU) or noFBLindownstream (NFD) node if no feedback loop is involved with any upstream or downstream path of the node, respectively. Based on those definitions, I first prove that every NFU node is eventually frozen in Boolean dynamics. Thus, NFU nodes converge to a fixed value determined by the upstream source nodes. Second, I prove that a network is robust against an arbitrary state perturbation subject to a nonsource NFD node. This implies that a network state eventually sustains the attractor despite a perturbation subject to a nonsource NFD node. Inspired by this result, I further propose a perturbationsustainable probability that indicates how likely a perturbation effect is to be sustained through propagations. I show that genes with a high perturbationsustainable probability are likely to be essential, disease, and drugtarget genes in large human signaling networks.
Conclusion
Taken together, these results will promote understanding of the effects of FBL on network robustness in a more rigorous manner.
Background
It is well known that biomolecular networks can keep their regulatory functions robust against various types of external or internal perturbations. For instance, the fate decision mechanism in a bacteriophage life cycle [1], the chemotaxis process in Escherichia coli [2], and segmental polarization in Drosophila melanogaster [3] were shown to be robust against noisy environments. It is more interesting that the dynamics of a biological network can be highly related to its structural characteristics [4]. In particular, many recent studies have shown that a feedback loop (FBL), a circular chain of interactions, can play an important role in controlling the robustness or susceptibility of networks [5, 6]. For instance, the negative FBL between MDM2 and p53 maintains an optimal level of p53 and creates appropriate dynamics of p53 expression level changes for a given level of DNA damage [7]. The Xenopus cell cycle is also robustly controlled against a certain level of perturbation with the help of several FBLs [8]. It was shown that a high proportion of coherently coupled FBLs can enhance the robustness of a network against state perturbations [9]. The number of FBLs involved at a node was also found to be positively correlated with the node’s functional importance [10]. Those simulation studies, however, cannot fully explain how the FBLs influence the network robustness. Hence, there is a pressing need to more rigorously analyze the relationship between FBL structures and network dynamics.
To measure network robustness, I herein use a synchronous Boolean network model [9, 11, 12] in which a node state is represented by a Boolean value, and the states of all nodes are synchronously updated at every discrete time step. Every network state moves to another state, and a series of consecutive transitions are represented by a network state trajectory that eventually converges to a fixedpoint or cyclic attractor. The attractor can describe various dynamic behaviors in a biological system, such as multistability and oscillations. If a node state is perturbed, the trajectory might converge to a different attractor. Therefore, a network is considered robust if the attractor does not change against a perturbation. Some tools have been proposed to quantify the network robustness by simulating the state transitions after randomly initializing the node states [13–17]. They have a limitation in network size for analysis, though, due to the exponential complexity of attractor computation. Therefore, it is a critical issue to find analytic results that can identify trivial parts that do not require further computation of state transitions.
In this paper, I focus on the effects of an FBL on Boolean converging dynamics. A state of a node is propagated to other nodes along a path in a chain of consecutive interactions. Therefore, the state cannot be fed back to the original node if it is not involved in an FBL. In other words, the current state of a node will eventually disappear unless a downstream path constructs an FBL. From that idea, I developed an FBLbased notion to classify nodes in a network. In particular, I defined two sets of nodes, noFBLinupstream (NFU) and noFBLindownstream (NFD), according to whether the upstream and downstream paths, respectively, involve any FBLs and proved two theorems regarding NFU and NFD nodes. One is that every NFU node is always frozen irrespective of the initial states of other nodes. This implies that the converging values of all NFU nodes are eventually fixed to a value determined by the upstream source nodes. It also means that a network is likely to be susceptible to a perturbation subject to the source nodes. The other is that a network is robust against an arbitrary perturbation subject to a nonsource NFD node. In other words, a network state eventually converges to the same attractor despite a state perturbation subject to a nonsource NFD node. Inspired by those results, I further developed a perturbationsustainable probability which indicates how likely it is that a perturbation effect will be sustained through a network state trajectory and showed that it can adequately identify functionally important genes, such as essential, diseaseassociated, and drugtarget genes, in large human signaling networks. Taken together, all of these results will promote understanding of the effects of FBLs on Boolean converging dynamics and reduce the computational costs of state transitionbased simulation tools.
Methods
Structural classification of nodes in a network
In this study, a biological network is represented by a directed graph G(V, E) where V = {v _{1}, v _{2}, ⋯, v _{ N }} is a set of nodes and E = {e _{1}, e _{2}, ⋯, e _{ A }} is a set of directed edges (interactions); an edge e ∈ E is an ordered pair of nodes (v _{ i }, v _{ j }) where v _{ i }, v _{ j } ∈ V. I use some notions from graph theory, including FBL and upstream/downstream paths, to represent the biological networks as follows.
Definition
A node u is an input node of v if there exists an interaction from u to v (i.e., (u, v) ∈ E). In addition, indegree of v means the number of input nodes of v.
Definition
A node v is a source node if indegree of v is zero. It is assumed that the state of a source node is fixed to its initial value over all the time.
Definition
Given a network G(V, E), a path P of a length L(≥1) is represented by a sequence of ordered nodes u _{1} u _{2} ⋯ u _{ L + 1} with interactions from u _{ i } to u _{ i + 1} ((u _{ i }, u _{ i + 1}) ∈ E for∀ i ∈ {1, 2, ⋯, L}) with no repeated nodes except u _{1} and u _{ L + 1}. In addition, P is called a feedback loop (FBL) if u _{1} = u _{ L + 1}.
Definition
Given a network G(V, E), an upstream (resp., downstream) path P = u _{1} u _{2} ⋯ u _{ L + 1} of a node v ∈ V is a path in which the last (resp., first) node u _{ L + 1} (resp., u _{1}) is v. Note that if P is a feedback loop, then it is both an upstream and downstream path of v. In addition, P = u _{1} u _{2} ⋯ u _{ L + 1} is a maximal upstream (resp., downstream) path if there is no longer path such as wu _{1} u _{2} ⋯ u _{ L + 1} (resp., u _{1} u _{2} ⋯ u _{ L + 1} w) for some w ∈ V.
Based on those terms, I define noFBLinupstream and noFBLindownstream nodes as follows:
Definition
Given a network G(V, E), a node v is called a noFBLinupstream (NFU) (resp., noFBLindownstream; NFD) node if there is no upstream (resp., downstream) path P = u _{1} u _{2} ⋯ u _{ L } v (resp., P = vu _{1} u _{2} ⋯ u _{ L }) such that for some i ∈ {1, 2, ⋯, L}, u _{ i } is involved in any feedback loop.
Figure 1 shows an example of NFU and NFD nodes in a network. This network contains five NFU nodes v _{1}, v _{2}, v _{5}, v _{9}, and v _{11} and four NFD nodes v _{5}, v _{9}, v _{10} and v _{11}. Note that v _{5}, v _{9} and v _{11} are both NFU and NFD nodes. On the other hand, it also contains five nodes that are neither NFU nor NFD nodes because an FBL is both upstream and downstream of each of them. In particular, note that v _{6} is not directly involved in an FBL. In this paper, I will show that the NFU and NFD nodes can induce interesting dynamic properties in a perturbation analysis.
A perturbation analysis in a Boolean network model
To define the robustness or sensitivity of a network, I use a synchronous Boolean network model used in previous studies [9, 12, 16]. In a Boolean network G(V, E), each v _{ i } ∈ V has a value of 1 (on) or 0 (off) that represent the possible states of the corresponding elements. For example, the values 1 and 0 represent the “turnon” and “turnoff” states of a gene, respectively. A directed interaction (v _{ i }, v _{ j }) can represent a positive (activating) or negative (inhibiting) relationship from v _{ i } to v _{ j }. The value of each variable v _{ i } at time t + 1 is determined by the values of k _{ i } other variables \( {v}_{i_1},{v}_{i_2},\cdots, {v}_{i_{k_i}} \) with an interaction to v _{ i } at time t by a Boolean update function \( {f}_i:{\left\{0,1\right\}}^{k_i}\to \left\{0,1\right\} \) where f _{ i } is a constant value if v _{ i } is a source node. All the Boolean variables are synchronously updated by a set of update functions F = {f _{1}, f _{2}, ⋯, f _{ N }}, and each update rule can be written as \( {v}_i\left(t+1\right)={f}_i\left({v}_{i_1}(t),{v}_{i_2}(t),\cdots, {v}_{i_{k_i}}(t)\right) \).
Many studies have been performed to elucidate the dynamic behaviors of biological networks. In particular, I address robustness against perturbations in terms of Boolean dynamics. In a Boolean network G(V, E) , a network state at time t is represented by an ordered list v(t) = [v _{1}(t), v _{2}(t), ⋯, v _{ N }(t)] ∈ {0, 1}^{N}. Then, for a subset U = {u _{1}, u _{2}, ⋯, u _{ M }} ⊆ V, a subset state of U at time t denoted by v _{ U }(t) = [u _{1}(t), u _{2}(t), ⋯, u _{ M }(t)] ∈ {0, 1}^{M} is an ordered list consisting of values u _{1} through u _{ M } at time t. A state trajectory of V starts from an initial state and eventually converges to either a fixedpoint or limitcycle attractor. These attractors can represent diverse biological network behaviors, such as multistability and oscillation [18–20]. This notion of an attractor introduces robustness in terms of the Boolean network dynamics as follows. If a network sustains an attractor against a perturbation, it is called robust against the perturbation. This concept has been widely used [5, 21–24]. Here, I consider an initialstate perturbation. Given an initial state v(0) = [v _{1}(0), v _{2}(0), ⋯, v _{ N }(0)] at t = 0, an initialstate perturbation subject to node v _{ x } ∈ V represents a situation in which v(0) is mutated to v ^{'}(0) = [v _{1}(0), ⋯, 1 − v _{ x }(0), ⋯, v _{ N }(0)], i.e., the corresponding initial value is switched to \( \overline{v_x}(0) \) (the negation of v _{ x }(0)). An initialstate perturbation represents the abnormal (or malfunctioning) status of a protein or gene caused by a mutation. The attractors to which v(0) and v ' (0) will converge can be compared to each other. The network is called robust or sensitive against the perturbation according to whether the attractors are the same as or different from each other, respectively. Based on this concept, I define some terms more rigorously with respect to the Boolean dynamics, as follows.
Definition
The sequence of states to which v(0) eventually converges is called the attractor induced from v, which is denoted by an ordered list of network states ζ(v) = [v(τ), v(τ + 1), …, v(τ + p − 1)] where v(t) = v(t + p) for ∀ t ≥ τ (p is a length of the attractor), and v(i) ≠ v(j) for ∀ i ≠ j ∈ {τ, τ + 1, …, τ + p − 1}. In addition, for a subset U ⊆ V, ζ(v _{ U }) = [v _{ U }(τ), v _{ U }(τ + 1), …, v _{ U }(τ + p − 1)] represents the states sequence of U in the attractor induced from v.
Definition
For U ⊆ V, ζ(v _{ U }) is frozen if there exists a time step τ such that v(t) = v(t + 1) for ∀ v ∈ U and ∀ t ≥.
Definition
Given two attractors with a same length, ζ(v) = [v(τ), v(τ + 1), …, v(τ + p − 1)] and ζ(v ^{'}) = [v ^{'}(τ ^{'}), v ^{'}(τ ^{'} + 1), …, v ^{'}(τ ^{'} + p − 1)], they are equivalent to each other if there exists a time step offset t ≥ 0 such that v(τ + i) = v ^{'}(τ ^{'} + (i + t) mod p) for ∀ i ∈ {0, 1, …, p − 1}.
Definition
Consider an arbitrary initial state v(0) = [v _{1}(0), v _{2}(0), ⋯, v _{ N }(0)] and its perturbed state at v _{ x } ∈ V, v ^{'}(0) = [v _{1}(0), ⋯, 1 − v _{ x }(0), ⋯, v _{ N }(0)]. The Boolean network is called robust against the perturbation subject to v _{ x } if ζ(v) is equivalent to ζ(v ^{'}). Otherwise, it is called sensitive or susceptible.
Datasets of signaling networks and functionally important genes
In this study, I derive an estimated probability with which a perturbation effect is sustained in a network. To show the usefulness of it, I used two largescale human signaling networks. One is the signaling network of 1659 genes and 7964 interactions constructed in a previous study [25] by integrating all the human signaling pathways in the KEGG (Kyoto Encyclopedia of Genes and Genomes) database [26] (see Additional file 1: Table S1). The other signaling network consists of 6306 genes and 62,937 interactions (version 6) downloaded from http://www.bri.nrc.ca/wang [27] (see Additional file 1: Table S2). In this work, I call them the KEGG and WANG networks, respectively. In addition, I considered essential, disease, and drugtargeted genes to represent functionally important genes. By using the DEG (Database of Essential Genes, version 5.4) database [28], I found 473 and 1519 essential genes included in the KEGG and WANG networks, respectively. I also found 403 and 1557 disease genes in the KEGG and WANG networks, respectively, using the OMIM (Online Mendelian Inheritance in Man) database [29]. Finally, I identified 353 and 1116 drug target genes in the KEGG and WANG networks, respectively, using the DrugBank database [30].
Results
Dynamic properties of NFU nodes
In this section, I show that NFU nodes are eventually frozen in Boolean network dynamics. A Boolean network G(V, E) with a set of update functions F is given.
Lemma 1
Given an initial state v(0) and a node v _{ k } ∈ V, let U = {u _{1}, u _{2}, ⋯, u _{ M }} be the set of input nodes of v _{ k }. If ζ(v _{ U }) is frozen, then \( \zeta \left({\mathbf{v}}_{\left\{{v}_k\right\}}\right) \) is also frozen.
Proof
For ∀ u _{ i } ∈ U, there exists a time step τ _{ i } such that u _{ i }(t) = u _{ i }(t + 1) for ∀ t ≥ τ _{ i } since \( \zeta \left({\mathbf{v}}_{\left\{{u}_i\right\}}\right) \) is frozen. Let \( \tau =\underset{i\in \left\{1,2,\dots, M\right\}}{ \max}\left({\tau}_i\right) \). Then, v _{ k }(t + 2) = f _{ k }(u _{1}(t + 1), ⋯, u _{ M }(t + 1)) = f _{ k }(u _{1}(t), ⋯, u _{ M }(t)) = v _{ k }(t + 1) for ∀ t ≥ τ. Thus, \( \zeta \left({\mathbf{v}}_{\left\{{v}_k\right\}}\right) \) is also frozen ■
Lemma 2
An initial state v(0) and a node v ∈ V are given. If every maximal upstream path of v includes at least one node u where ζ(v _{{u}}) is frozen, then ζ(v _{{v}}) is also frozen.
Proof
Let P _{1}, P _{2}, ⋯, P _{ M } be the list of maximal upstream paths of v, and let u _{ i } ∈ P _{ i } (i = 1, 2, ⋯, M) be the node where \( \zeta \left({\mathbf{v}}_{\left\{{u}_i\right\}}\right) \) is frozen. Then consider a subpath P ^{'}_{ i } of P _{ i } starting from u _{ i } and ending at v. Define W = {w ∈ Vw ∈ P ^{'}_{ i } for some i} and let l(w) be the length of the longest path from w to v. Assuming that L = max_{ w ∈ W } l(w), W can be divided into L + 1 disjoint subsets W _{0}, W _{1}, ⋯ and W _{ L } where W _{ k } = {w ∈ Wl(w) = k}. Then ζ(v _{{w}}) for ∀ w ∈ W is frozen by mathematical induction with respect to l(w), as follows. When k = L, it is obvious that ζ(v _{{w}}) of every w ∈ W _{ L } is frozen because w ∈ {u _{1}, u _{2}, …, u _{ M }}. Assume that ζ(v _{{w}}) of every w ∈ W _{ k + 1} is frozen and consider an arbitrary element w ^{'} ∈ W _{ k }. Then every input node of w ^{'} is an element of W _{ k + 1}. By lemma 1, \( \zeta \left({\mathbf{v}}_{\left\{{w}^{\hbox{'}}\right\}}\right) \) is frozen. Thus, ζ(v _{{w}}) of every w ∈ W _{ k } is also frozen. By mathematical induction, ζ(v _{{w}}) of every w ∈ W is frozen. Therefore, ζ(v _{{v}}) is also frozen because v ∈ W■
Lemma 2 provides a sufficient condition for the frozenness of a node. This can be extended to the case of NFU nodes as follows.
Theorem 1
An initial state v(0) is given. If v ∈ V is an NFU node, then ζ(v _{{v}}) is frozen.
Proof
By the definition of NFU nodes, every maximal upstream of v starts with a source node u whose ζ(v _{{u}}) is frozen. By Lemma 2, ζ(v _{{v}}) is also frozen ■
This theorem implies that the states of NFU nodes are dependent on the states of source nodes, and this might make the network tend to be susceptible to perturbations subject to source nodes.
Corollary
An initial state v(0) is given. If there is no FBL then ζ(v) is frozen.
Proof
Since there are no FBLs, every v ∈ V is an NFU node. By Theorem 1, it follows that ζ(v) is always frozen irrespective of the initial states ■
Theorem 1 and its corollary explain the effect of FBLs on the frozenness of the converging state sequences. More specifically, every NFU node is frozen to a value determined by the set of source nodes included in its upstream paths. I also note that this result is strongly related to previous studies based on synchronous or asynchronous Boolean network models [31–33]. In particular, the corollary corresponds to a previous result having stated that the Boolean dynamics converges to a unique fixed point in an acyclic Boolean network [31]. It is also relevant to the previous results showed that limitcycle attractors can be induced by negative feedback loops [32, 33]. In addition, Theorem 1 can reduce the computation of attractors in a large scale network by easily obtaining the converging values of the NFU nodes.
Dynamic properties of NFD nodes
In the lemmas and a theorem of this section, I investigate the effect of FBLs on robustness. I consider an arbitrary initial state v(0) = [v _{1}(0), v _{2}(0), ⋯, v _{ N }(0)] and a perturbed state at v _{ x } ∈ V from v(0), v ^{'}(0) = [v _{1}(0), ⋯, v _{ x − 1}(0), 1 − v _{ x }(0), v _{ x + 1}(0), ⋯, v _{ N }(0)] in the following lemmas 3 and 4, and theorem 2. I denote the value of a node w ∈ V at time t in the trajectories starting from v(0) and v ^{'}(0) by v _{{w}}(t) and v _{{w}} ^{'}(t), respectively.
Lemma 3
Let v(0) be an initial state, v ^{'}(0) a perturbed state at v _{ x } ∈ V from v(0), and w ∈ V an arbitrary node. If there is no path from v _{ x } to w then v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ 0.
Proof
The state value of node w is updated irrespective of that of node v _{ x } because there is no path from v _{ x } to w. Thus, the lemma holds ■
Lemma 4
Let v(0) be an initial state, v ^{'}(0) a perturbed state at v _{ x } ∈ V from v(0), w ∈ V an arbitrary node. Let Y = {y ∈ Vy is included in some path from v _{ x } to w} and l(y) the length of a longest path from v _{ x } to y ∈ Y. If v _{ x } is a nonsource node and no node in Y is involved with any FBL, then v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ l(w) + 1.
Proof
By mathematical induction with respect to l(y), I show that for every y ∈ Y, v _{{y}}(t) = v _{{y}} ^{'}(t) for ∀ t ≥ l(y) + 1, as follows. When l(y) = 0, it is obvious that y = v _{ x }. Then \( {\mathbf{v}}_{\left\{{v}_x\right\}}(t)={{\mathbf{v}}_{\left\{{v}_x\right\}}}^{\hbox{'}}(t) \) for ∀ t ≥ 1 because v _{ x } is a nonsource node and involved with no FBL. To prove the inductive step, I assume that the property holds for l(y) ≤ k − 1. Consider an arbitrary y ∈ Y such that l(y) = k and let U be the set of input nodes of y. For every u ∈ U, there are two cases to consider: either u ∈ Y or u ∉ Y. In case of u ∈ Y, it is obvious that l(u) ≤ k − 1 by the definition of l(⋅). By the induction hypothesis, v _{{u}}(t) = v _{{u}} ^{'}(t) for ∀ t ≥ l(u) + 1. In case of u ∉ Y, it means that there is no path from v _{ x } to u. Then v _{{u}}(t) = v _{{u}} ^{'}(t) for ∀ t ≥ 0 by lemma 3. From both cases, for ∀ u ∈ U, v _{{u}}(t) = v _{{u}} ^{'}(t) for ∀ t ≥ k. Then v _{{y}}(t) = v _{{y}} ^{'}(t) for ∀ t ≥ k + 1, thereby showing the property holds when l(y) = k. Since w ∈ Y, the lemma holds ■
Theorem 2
Let v(0) be an initial state, v ^{'}(0) a perturbed state at v _{ x } ∈ V from v(0), and w ∈ V an arbitrary node. If v _{ x } is an NFD and nonsource node, then the network is robust against a state perturbation subject to v _{ x }.
Proof
I show that there exists a constant time T such that v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ T in the following three cases. (i) Case that = v _{ x } : Because w is an NFD and nonsource node, v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ 1. (ii) Case that w is not connected by any path from v _{ x } : By lemma 3, v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ 0. (iii) Case that w is connected by at least one path from v _{ x } : Let Y = {y ∈ Vy is included in some path from v _{ x } to w} and l(w) be a longest length of those paths, respectively. Because v _{ x } is an NFD node, no node included in Y is involved with any FBL. By lemma 4, v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ l(w) + 1. By (i),(ii), and (iii), there exists a constant time T such that v _{{w}}(t) = v _{{w}} ^{'}(t) for ∀ t ≥ T and ∀ w ∈ V. Accordingly, the attractors starting at v(0) and v ^{'}(0) are equivalent to each other. Therefore, the network is robust against the state perturbation subject to v _{ x }■
Theorem 2 indicates that biological networks might be robust against perturbations subject to NFD nodes. To support this result, I compared NFD and nonNFD gene groups with respect to the proportions of essential genes, disease genes, and drug targets in two human signaling networks, the KEGG and WANG networks (Fig. 2; see Additional file 1: Tables S3 and S4 for details). As shown in Fig. 2, the proportions of essential genes, disease genes, and drug targets among NFD genes were significantly smaller than those among nonNFD genes in both networks (all pvalues<10^{−10}). I assume that essential genes, disease genes, and drug targets are likely to be susceptible to mutations, perturbations, or other external changes. In this regard, the relatively low proportions of essential genes, disease genes, and drug targets in the NFD group in the largescale signaling networks support Theorem 2. In addition, I further examined the proportions of essential genes, disease genes, and drug targets in NFD group in random networks to examine if the observed result is specific to the signaling networks (see Additional file 2: Figure S1). I created each set of 100 random networks by rewiring the interactions of the KEGG (Additional file 2: Figure S1(A)) and WANG (Additional file 2: Figure S1(B)) networks so that the indegree and the outdegree of the nodes are conserved, and observed that there is little difference between the NFD and nonNFD groups with respect to the proportions of essential genes, disease genes, and drug targets. This implies that the functionally important genes in the real signaling networks are not randomly distributed in terms of NFD classification.
Estimation of sustainability of a perturbation effect
In the previous section, Theorem 2 showed that a network state is robust against a perturbation as long as the perturbation effect is not sustained by downstream FBLs. In other words, the existence of downstream FBLs is a necessary condition to make a network susceptible to a perturbation. Inspired by that result, I have derived an estimated probability that a perturbation effect will be sustained. Given a node v _{ x } ∈ V subject to a perturbation, Lemma 3 shows that only downstream paths of v _{ x } need to be considered, and Lemma 4 shows that only those involved with an FBL need to be considered. I first estimate the probability with which a perturbation subject to v _{ x } is sustained through a single path involved with a FBL. Figure 3 shows an example of a downstream path P = v _{ x } u _{1} u _{2} ⋯ u _{ L } of v _{ x } which includes an FBL, and I consider v(0) and v ^{'}(0) which are an initial state and a perturbed state at v _{ x } ∈ V from v(0), respectively. It is said that the effect of a perturbation starting at v _{ x } at the initial time is sustained through propagations in a sequence of u _{1}, u _{2}, ⋯, u _{ L } if \( {\mathbf{v}}_{\left\{{u}_i\right\}}(i)\ne {{\mathbf{v}}_{\left\{{u}_i\right\}}}^{\hbox{'}}(i) \) for ∀ i ∈ {1, ⋯, L}. Herein, it is assumed that a probability with which u _{ i }(i) is differently updated by the flipped value of u _{ i − 1}(i − 1), denoted by \( \Pr \left({\mathbf{v}}_{\left\{{u}_i\right\}}(i)\ne {{\mathbf{v}}_{\left\{{u}_i\right\}}}^{\hbox{'}}(i)\Big{\mathbf{v}}_{\left\{{u}_{i1}\right\}}\left(i1\right)\ne {{\mathbf{v}}_{\left\{{u}_{i1}\right\}}}^{\hbox{'}}\left(i1\right)\right) \), is the inverse of the indegree of u _{ i } because of the following reason (for simplicity of explanation, u _{0} = v _{ x } is assumed). Let \( W=\left\{{w}_{1,}{w}_2,\cdots, {w}_{d_i}\right\} \) be the set of input nodes of u _{ i } where d _{ i } is the indegree of u _{ i }. By assuming that the input nodes have an even degree of influence on updating u _{ i }, i.e., \( \Pr \left(Y\Big{X}_1\right)=\cdots = \Pr \left(Y\Big{X}_{d_i}\right) \) where Y and X _{ k }(k ∈ {1, ⋯, d _{ i }}) denote two events \( {\mathbf{v}}_{\left\{{u}_i\right\}}(i)\ne {{\mathbf{v}}_{\left\{{u}_i\right\}}}^{\hbox{'}}(i) \) and \( {\mathbf{v}}_{\left\{{w}_k\right\}}\left(i1\right)\ne {{\mathbf{v}}_{\left\{{w}_k\right\}}}^{\hbox{'}}\left(i1\right) \), respectively. It is also assumed that u _{ i } is always differently updated given a perturbation has occurred at one of the input nodes. Accordingly, \( \Pr \left(Y\Big{\mathsf{U}}_{k\in \left\{1,\cdots, {d}_i\right\}}{X}_k\right)= \Pr \left(Y\Big{X}_1\right)+\cdots + \Pr \left(Y\Big{X}_{d_i}\right)=1 \), and thus Pr(YX _{ k }) = 1/d _{ i } (∀ k ∈ {1, ⋯, d _{ i }}). Note that u _{ i − 1} ∈ W because u _{ i − 1} is one of the input nodes of u _{ i }. With this result, the probability with which the perturbation subject to v _{ x } is sustained through a path P can be derived as follows:
Let P _{1}, P _{2}, ⋯, P _{ M } be the set of all downstream paths of v _{ x }. Then I define the perturbationsustainable probability γ(v _{ x }), the probability that the perturbation subject to v _{ x } will be sustained through at least one FBL, as follows:
If a gene with a relatively high γ(v _{ x }) value is subject to a perturbation, the network is likely to induce an abnormal dynamics due to the well conserved perturbation effect. In this regard, the perturbationsustainable probability can indicate how much a gene is functionally or dynamically important in a signaling network. To show the usefulness of this probability, I examined the relationship between γ(v _{ x }) and the proportions of putatively susceptible genes in human signaling networks (Fig. 4; see Additional file 1: Tables S3 and S4 for details). Given a threshold value β, the proportions of essential genes, disease genes, and drug targets among the set of genes such that {v _{ x }γ(v _{ x }) ≥ β} are plotted against the threshold value in the KEGG (Fig. 4a) and WANG (Fig. 4b) networks. As shown in the figure, the genes with a high perturbationsustainable probability are more likely to be essential genes, disease genes, and drug targets in both networks. This implies that the perturbationsustainable probability can adequately identify the functionally important genes in human signaling networks. In addition, it is notable that the relation of the perturbationsustainable probability to the functionally important genes was not observed in the random networks created by rewiring the interactions of the signaling networks (see Additional file 2: Figure S2). As in the results of Additional file 2: Figure S1, this also implies that the functionally important genes in the real signaling networks are not randomly distributed in terms of NFD classification. Taken together, it is interesting that such a simple topological measurement of genes based only on FBLs can efficiently predict the functionally important genes in human signaling networks.
Discussion
In this study, I did not address the dynamics of nonNFU and nonNFD nodes, i.e., nodes that are involved in FBLs, and the analysis of their dynamics remains an open problem. In addition, the updaterule perturbation, another wellknown type of perturbations, was not considered in this study because it influences the network robustness in a different way than the initialstate perturbation by changing the state transition diagram. Therefore, a future study should include analyses of genes that are neither NFU nor NFD nodes, and analysis of robustness against updaterule perturbations. Finally, it should be noted that the analyses in this study might not be effective for other types of biological networks than the signaling networks. For example, NFU/NFD classification was not meaningful in the largescale gene regulatory networks [34, 35] because most genes were classified to NFD nodes. This implies that another method to further classify NFD nodes is required for analysis of those networks.
Conclusions
It is well known that biological networks can keep their regulatory functions robust against external or internal perturbations. More interestingly, the network robustness is highly related to the network’s structural characteristics, including FBLs. However, previous results [2, 9, 10] have been presented mainly through simulation and experiment studies because of the complexity of real biological networks. That raised a pressing need to develop various analytical approaches to validate the promising conjectures. In this paper, I used a synchronous Boolean network model in which a node state is represented by a Boolean value and updated by a logical rule. A network is considered robust if the attractor does not change against a state perturbation. Based on that assumption, I created a novel concept to characterize the nodes with respect to FBL structures: noFBLinupstream (NFU) and noFBLindownstream (NFD). This FBLbased characterization is different from other FBLbased measures [10, 36] in that it focuses on involvement with FBLs in the upstream or downstream paths, not with the node itself. Based on that notion, I proved two simple but useful theorems. One is that an NFU node is always frozen irrespective of the initial states of other nodes. Thus, the converging dynamics of an NFU node can be simply determined. The other is that a network is robust against an arbitrary perturbation subject to a nonsource NFD node. This result shows that a network state eventually converges to the same attractor despite a perturbation subject to nonsource NFD nodes. Note that the two theorems hold for arbitrary update functions as well as initial states. In addition, the second theorem led me to develop a function to approximately compute the perturbationsustainable probability. I verified its effectiveness by showing that the higher the probability, the larger the proportion of essential, diseaseassociated, and drugtarget genes in human signaling networks. I believe these results will promote understanding of the effects of FBLs on network dynamics and reduce the cost of computing robustness in existing tools which simulate a network state trajectory [13–17].
Abbreviations
 FBL:

Feedback loop
 NFD:

NoFBLindownstream
 NFU:

NoFBLinupstream
References
Little JW, Shepley DP, Wert DW. Robustness of a gene regulatory circuit. EMBO J. 1999;18(15):4299–307.
Yi TM, Huang Y, Simon MI, Doyle J. Robust perfect adaptation in bacterial chemotaxis through integral feedback control. Proc Natl Acad Sci U S A. 2000;97(9):4649–53.
Ingolia NT. Topology and robustness in the Drosophila segment polarity network. PLoS Biol. 2004;2(6):e123.
Klein C, Marino A, Sagot MF, Vieira Milreu P, Brilli M. Structural and dynamical analysis of biological networks. Brief Funct Genomics. 2012;11(6):420–33.
Kitano H. Biological robustness. Nat Rev Genet. 2004;5(11):826–37.
Kitano H. Towards a theory of biological robustness. Mol Syst Biol. 2007;3:137.
Lev BarOr R, Maya R, Segel LA, Alon U, Levine AJ, Oren M. Generation of oscillations by the p53Mdm2 feedback loop: a theoretical and experimental study. Proc Natl Acad Sci U S A. 2000;97(21):11250–5.
Morohashi M, Winn AE, Borisuk MT, Bolouri H, Doyle J, Kitano H. Robustness as a measure of plausibility in models of biochemical networks. J Theor Biol. 2002;216(1):19–30.
Kwon YK, Cho KH. Coherent coupling of feedback loops: a design principle of cell signaling networks. Bioinformatics. 2008;24(17):1926–32.
Kwon YK, Choi SS, Cho KH. Investigations into the relationship between feedback loops and functional importance of a signal transduction network based on Boolean network modeling. BMC Bioinformatics. 2007;8:384.
Kauffman S, Peterson C, Samuelsson B, Troein C. Random Boolean network models and the yeast transcriptional network. Proc Natl Acad Sci U S A. 2003;100(25):14796–9.
Le DH, Kwon YK. A coherent feedforward loop design principle to sustain robustness of biological networks. Bioinformatics. 2013;29(5):630–7.
Gonzalez AG, Naldi A, Sanchez L, Thieffry D, Chaouiya C. GINsim: a software suite for the qualitative modelling, simulation and analysis of regulatory networks. BioSystems. 2006;84(2):91–100.
Mussel C, Hopfensitz M, Kestler HA. BoolNetan R package for generation, reconstruction and analysis of Boolean networks. Bioinformatics. 2010;26(10):1378–80.
Zheng J, Zhang D, Przytycki PF, Zielinski R, Capala J, Przytycka TM. SimBoolNeta Cytoscape plugin for dynamic simulation of signaling networks. Bioinformatics. 2010;26(1):141–2.
Trinh HC, Le DH, Kwon YK. PANET: a GPUbased tool for fast parallel analysis of robustness dynamics and feedforward/feedback loop structures in largescale biological networks. PLoS One. 2014;9(7):e103010.
Bock M, Scharp T, Talnikar C, Klipp E. BooleSim: an interactive Boolean network simulator. Bioinformatics. 2014;30(1):131–2.
Bhalla US, Ram PT, Iyengar R. MAP kinase phosphatase as a locus of flexibility in a mitogenactivated protein kinase signaling network. Science. 2002;297(5583):1018–23.
Ferrell Jr JE, Machleder EM. The biochemical basis of an allornone cell fate switch in Xenopus oocytes. Science. 1998;280(5365):895–8.
Pomerening JR, Sontag ED, Ferrell Jr JE. Building a cell cycle oscillator: hysteresis and bistability in the activation of Cdc2. Nat Cell Biol. 2003;5(4):346–51.
Ciliberti S, Martin OC, Wagner A. Robustness can evolve gradually in complex regulatory gene networks with varying topology. PLoS Comput Biol. 2007;3(2):e15.
Huang S, Eichler G, BarYam Y, Ingber DE. Cell fates as highdimensional attractor states of a complex gene regulatory network. Phys Rev Lett. 2005;94(12):128701.
Kitano H. Cancer as a robust system: implications for anticancer therapy. Nat Rev Cancer. 2004;4(3):227–35.
Li F, Long T, Lu Y, Ouyang Q, Tang C. The yeast cellcycle network is robustly designed. Proc Natl Acad Sci U S A. 2004;101(14):4781–6.
Kim JR, Kim J, Kwon YK, Lee HY, HeslopHarrison P, Cho KH. Reduction of complex signaling networks to a representative kernel. Sci Signal. 2011;4(175):ra35.
Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30.
Cui Q, Ma Y, Jaramillo M, Bari H, Awan A, Yang S, Zhang S, Liu L, Lu M, O’ConnorMcCourt M, Purisima EO, Wang E. A map of human cancer signaling. Mol Syst Biol. 2007;3:152.
Zhang R, Lin Y. DEG 5.0, a database of essential genes in both prokaryotes and eukaryotes. Nucleic Acids Res. 2009;37(Database issue):D455–8.
Amberger J, Bocchini CA, Scott AF, Hamosh A. McKusick’s Online Mendelian Inheritance in Man (OMIM). Nucleic Acids Res. 2009;37(Database issue):D793–6.
Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V, Djoumbou Y, Eisner R, Guo AC, Wishart DS. DrugBank 3.0: a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res. 2011;39(Database issue):D1035–41.
Robert F. Discrete iterations: a metric study. Berlin: Springer Verlag; 1986.
Thomas R, Thieffry D, Kaufman M. Dynamical behaviour of biological regulatory networksI. Biological role of feedback loops and practical use of the concept of the loopcharacteristic state. Bull Math Biol. 1995;57(2):247–76.
Kwon YK, Cho KH. Boolean dynamics of biological networks with multiple coupled feedback loops. Biophys J. 2007;92(8):2975–81.
Bovolenta LA, Acencio ML, Lemke N. HTRIdb: an openaccess database for experimentally verified human transcriptional regulation interactions. BMC Genomics. 2012;13:405–216413405.
Liu ZP, Wu C, Miao H, Wu H. RegNetwork: an integrated database of transcriptional and posttranscriptional regulatory networks in human and mouse. Database (Oxford). 2015. doi:10.1093/database/bav095. Print 2015.
Kwon YK, Cho KH. Quantitative analysis of robustness and fragility in biological networks based on feedback dynamics. Bioinformatics. 2008;24(7):987–94.
Acknowledgements
Not applicable.
Funding
This work was supported by the 2015 Research Fund of University of Ulsan.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.
Authors’ contributions
YKK conceived the study, designed the analyses, wrote and revised the manuscript.
Competing interests
The author declares that he has no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Author information
Authors and Affiliations
Corresponding author
Additional files
Additional file 1: Table S1.
Dataset of KEGG network. It consists of 1659 genes and 7964 interactions constructed by integrating all the human signaling pathways in the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. Table S2. Dataset of WANG network. It consists of 6306 genes and 62,937 interactions downloaded from http://www.bri.nrc.ca/wang. Table S3. Detailed analysis result of the KEGG network. It includes information about essential genes, disease genes, drug targets, NRU/NRD genes, and perturbationsustainable probabilities in the KEGG network. Table S4. Detailed analysis result of the WANG network. It includes information about essential genes, disease genes, drug targets, NRU/NRD genes, and perturbationsustainable probabilities in the WANG network. (ZIP 352 kb)
Additional file 2: Figure S1.
Comparison between groups of NFD and nonNFD genes in random networks with respect to the proportions of essential genes, disease genes, and drug targets. In each subfigure, a set of 100 random networks were generated by rewiring the interactions of the signaling networks such that the indegree and the outdegree of all nodes are conserved. (A) Result of random networks shuffled from KEGG network. The average numbers of NFD and nonNFD genes were 509 and 1150, respectively. The average proportions of essential genes in the NFD and the nonNFD groups were 0.2849 and 0.2852, respectively. The proportions of disease genes in the NFD and the nonNFD groups were 0.2417 and 0.2435, respectively. The proportions of drugtargets in the NFD and the nonNFD groups were 0.2141 and 0.2122, respectively. (B) Result of random networks shuffled from WANG network. The average numbers of NFD and nonNFD genes were 1544 and 4761, respectively. The proportions of essential genes in the NFD and the nonNFD groups were 0.2390 and 0.2413, respectively. The proportions of disease genes in the NFD and the nonNFD groups were 0.2455 and 0.2472, respectively. The proportions of drugtargets in the NFD and the nonNFD groups were 0.1768 and 0.1769, respectively. Figure S2. Changes in the proportion of functionally important genes over the threshold value of the perturbationsustainable probability in random networks. In each subfigure, a set of 100 random networks were generated by rewiring the interactions of the signaling network such that the indegree and the outdegree of all nodes are conserved. Given a threshold value β, the yaxis values indicate the average proportions of essential genes, disease genes, and drug targets over the set of candidate genes whose perturbationsustainable probability is larger than or equal to β in random networks. (A) Results in random networks shuffled from KEGG network. For a reliable comparison, the maximal β was set to 0.0266 which generates 131 candidate genes on average. (B) Results in random networks shuffled from WANG network. The maximal β was set to 0.0908, which results in 121 candidate genes on average. (ZIP 362 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Kwon, YK. Properties of Boolean dynamics by node classification using feedback loops in a network. BMC Syst Biol 10, 83 (2016). https://doi.org/10.1186/s129180160322z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s129180160322z