- Research article
- Open Access
- Published:

# Monotonicity, frustration, and ordered response: an analysis of the energy landscape of perturbed large-scale biological networks

*BMC Systems Biology***volume 4**, Article number: 83 (2010)

## Abstract

### Background

For large-scale biological networks represented as signed graphs, the index of frustration measures how far a network is from a monotone system, i.e., how incoherently the system responds to perturbations.

### Results

In this paper we find that the frustration is systematically lower in transcriptional networks (modeled at functional level) than in signaling and metabolic networks (modeled at stoichiometric level). A possible interpretation of this result is in terms of energetic cost of an interaction: an erroneous or contradictory transcriptional action costs much more than a signaling/metabolic error, and therefore must be avoided as much as possible. Averaging over all possible perturbations, however, we also find that unlike for transcriptional networks, in the signaling/metabolic networks the probability of finding the system in its least frustrated configuration tends to be high also in correspondence of a moderate energetic regime, meaning that, in spite of the higher frustration, these networks can achieve a globally ordered response to perturbations even for moderate values of the strength of the interactions. Furthermore, an analysis of the energy landscape shows that signaling and metabolic networks lack energetic barriers around their global optima, a property also favouring global order.

### Conclusion

In conclusion, transcriptional and signaling/metabolic networks appear to have systematic differences in both the index of frustration and the transition to global order. These differences are interpretable in terms of the different functions of the various classes of networks.

## Background

For complex systems such as biological networks, rather than a precise description of the dynamics, which requires a quantity of kinetic details rarely accessible in large scale systems, it is often more reasonable to use a minimal representation, such as a graph of interactions between the molecular variables of interest [1–4] and perhaps a sign describing the mode of the pairwise interaction. Such graphical approaches have been extensively used in recent years to model transcriptional [5, 6] and signaling networks [7–10]. Apart from biological systems, signed adjacency graphs have been investigated in several different contexts, such as economics [11, 12], social balance [13], and in the theory of frustrated spin systems [14, 15], see [16] for a survey. In spite of the minimal amount of information it contains, a signed graph can already be used to study dynamical systems properties. Among the various approaches that have been used for this scope, we recall for example the characterizations of multistationarity of [17], stability [18], and the boolean network analysis of e.g. [10, 19, 20]. In particular, in [21] signed graphs are linked to the theory of monotone dynamical systems [22] and the latter is used as a paradigm to explain the highly predictable and ordered response of biological systems to perturbations. In a biological network, a response to a perturbation propagating incoherently through the network may result in an unpredictable or contradictory behavior of the system, see example in Fig. 1. When its dynamics are always free from such contradictory responses then the system is said monotone [21, 22], see Methods for a more rigorous definition. In dynamical systems language, a monotone system exhibits an ordered response because it lacks sustained oscillations and chaotic behavior, thereby rendering the behavior of the system particularly simple. Hence the investigation of how close a biological system is to being monotone has been the subject of intense research in recent years [21, 23–26].

From a statistical physics perspective, the problem of determining monotonicity (or near monotonicity) is equivalent to checking when an Ising model with signed interactions has no (or little) frustration [21, 23]. In terms of the signed graph, frustration corresponds to undirected cycles having an odd number of negative edges [21]. See also [27] for another recent use of Ising models in the context of complex networks. In this work we are interested in computing the frustration of biological networks of various types: transcriptional, signaling and metabolic. When modeling these different classes of networks as signed graphs, we have to use different levels of resolution: for signaling and metabolic networks we start from a set of stoichiometric reactions and obtain the signed graph by taking the signature of the Jacobian of the corresponding reaction kinetics, hence an edge represents the contribution of a molecular specie to a kinetic reaction, see [8, 23, 26] and the Methods Section. For transcriptional networks, on the contrary, we model interactions at functional level, i.e., we take an edge to represent the entire action of activation/inhibition of a transcription factor on a target gene, and in doing so we lump together many important molecular steps, from the binding of the transcription factor to the promoter region of a target gene to the final release of the newly synthetized mRNA molecule. Energetically, such complex process is various (or many) times more relevant than a signaling event or a metabolic reaction. Also the corresponding time scales differ by several orders of magnitude [4, 28]. Of course, we are forced to use this coarser level of resolution because the stoichiometric details are different for different transcriptional actions, and are not known systematically (see [29] for the only example we know of in this direction). Notice that a similar functional representation, oriented at capturing the "information flow" rather than the "mass flow", is possible also for signaling networks [4, 7, 9, 10]. Although it may elucidate better the causal transfer of "information" along the pathways, it seems less appropriate to describe the energetic content of the biochemical transformations necessary for the propagation of the signal than the stoichiometric level which we use in this paper, see Supplementary Notes in Additional File 1 for a more detailed discussion. In any case, the qualitative difference in the modeling assumptions made should always be kept in mind, and the classes of networks analyzed should be connotated accordingly as "transcriptional, at functional level" and "signaling/metabolic, at stoichiometric level".

Under these assumptions, the frustration index we observe varies considerably according to the type of network analyzed: it is very low for gene regulatory, networks and much higher for signaling and metabolic networks. In this paper we propose an interpretation of this different behavior based on the characteristic "energy" associated to the interactions of a graph. We assume that the costs of the interactions (i.e., the weights of the edges) are all comparable on each class of networks, but not across classes of networks. In particular, transcriptional edges have a much higher cost than the other classes of interactions, and we can speculate that on an evolutionary scale this may have strongly disfavored the development of interactions leading to frustration, i.e., of incoherent or contradictory transcriptional orders. For the "cheaper" signaling and metabolic interactions, instead, such a tight control may not be required, especially since a higher frustration may induce a richer and more complex dynamical behavior.

We know from the theory of Ising models that it is energetically favorable for neighbouring spins to be aligned when the interaction constant is positive and to be antialigned when it is negative. If we associate to the frustration index the global optimum of an "energy" function describing the amount of such unsatisfied interactions, then we can say that networks with low frustration will have a "ground state" (i.e., a global optimum) of lower energy than more frustrated networks. In addition, rather than just focusing on the energy of the optimal configuration, we can average the state of the system over all possible perturbations, and study what is the average frustration of a network. In particular, then, if we take the strength of the interactions of a network as "cooling" parameter, we can use statistical physics arguments [15] to describe how the probability of occupancy of the global minimum of the energy varies with the interaction strength, and therefore how monotonically a network behaves in average in response to random perturbations. What we observe is that the more frustrated signaling/metabolic networks achieve "order" (i.e, tend to populate their global minimum of energy) in a range of interaction energies which is lower than for the transcriptional networks, meaning that these networks (in average) tend to respond to perturbations as coherently as they can even for moderate values of energy. This behavior partially compensates for the higher frustration, which, as already mentioned, might be instrumental to the achievement of more complex dynamics than those required for the transcriptional networks. The transcriptional networks, on the other hand, only contain strong interactions and are therefore not concerned with the lower energetic regime. Coherently, they show a topological structure richer in tree-like subgraphs, which disfavor the transition to ordered behavior, and which are absent in the other classes of networks.

That signaling and metabolic networks may require a lower energetic content to experience a transition to ordered behavior is also confirmed by the structure of their energy landscapes which, unlike for the transcriptional networks, lack high and neat energetic barriers around the global optima, meaning that reconfiguration to the ground state can be easily achieved even at modest energies.

## Results

The representation of a biological network as an n-dimensional signed adjacency matrix is given by a matrix of elements *J*_{
ij
}∈ {±1, 0}, *i*, *j* = 1,..., *n*. is assumed symmetric (i.e., the frustrated cycles we seek are in the underlying undirected graph), and with zero diagonal (i.e., no self-loops), see [21] and the Methods Section for details on the formulation of the problem. As explained in the Methods, for a stoichiometric system we can assume that corresponds to the signature of the Jacobian linearization around an equilibrium point. For networks represented as functional activatory/inhibitory actions, the interpretation is even more straightforward. Coherently with our choice of model, we assume that also the perturbations affecting the system around its equilibrium point are of unit magnitude in each component, *s*_{
i
}∈ {±1}, *i* = 1,..., *n*. In correspondence of a vector **s** = [*s*_{1} ... *s*_{
n
}]^{T}of such signed perturbations, or "spin" variables, let us consider the "energy" function

which expresses the total cost associated to the perturbation **s**. Assuming that all interactions of a network have the same strength, |*J*_{
ij
}| = 1 whenever *J*_{
ij
}≠ 0, the cost of each interaction depends on the sign of each nonzero *J*_{
ij
}: for *J*_{
ij
}> 0 (activator) the aligned *s*_{
i
}, *s*_{
j
}spin configuration is more energetically favorable (-*J*_{
ij
}*s*_{
i
}*s*_{
j
}= -1 < 0) than the antialigned one (-*J*_{ij}*s*_{
i
}*s*_{
j
}= 1 > 0) and viceversa for *J*_{
ij
}< 0. Of all 2^{n}possible spin assignments, those respecting monotonicity will be such that *J*_{
ij
}*s*_{
i
}*s*_{
i
}> 0 on each edge of the graph, i.e., those contributing to minimizing *h*(s). A spin system is said frustrated when not all these constraints *J*_{
ij
}*s*_{
i
}*s*_{
j
}> 0 can be satisfied simultaneously by any assignment. Computing how far a given network is from being monotone corresponds to computing the ground state **s**_{ground}, i.e., the spin assignment that globally minimizes (1). It has been shown [23] that this is an NP-hard problem, equivalent to the MAX-CUT problem or, in terms of the Ising model, to computing the exact frustration index of the network [21, 30], call it *δ*. In [26] (see also Supplementary Notes in Additional File 1 for a quick recap), we proposed efficient heuristic algorithms providing fairly tight upper and lower bounds for *δ* in biological networks of the size of the thousands nodes. From the theory of monotone systems (see [21, 22] and the Methods), is monotone if and only if there exists a diagonal signature matrix *D*_{
σ
}(i.e., a matrix having on the diagonal the vector *σ* of elements *σ*_{
i
}∈ {±1}) such that has all nonnegative entries, see Lemma 2.1 in [22]. _{
σ
}and have different sign patterns but the same frustration index *δ*, as *D*_{
σ
}is a change of sign through a cut set of the graph of and such "gauge transformations" *D*_{
σ
}[31] leave the sign of each cycle of the graph (and hence *δ*) unaltered.

Let us consider first as an illustrative example the yeast cell cycle network introduced in [19] in the context of boolean networks, see Fig. 1. With respect to the original graph of [19], we drop the self-loops and consider the underlying undirected graph (only a pair of edges is incompatible with this symmetrization of the adjacency matrix). The number of negative signs on the symmetrized adjacency matrix is 10. However, a gauge transformation on the three nodes Cib1,2 Clb5,6 and Cln1,2 yields a _{
σ
}with only 4 negative edges, which is a global optimum for the frustration index *δ*, see Fig. 1(a) . The presence of frustrated cycles in a network leads to a lack of coherence in the response of the system to perturbations.

This can be observed in Fig. 1(b), where the response of the yeast cell cycle and of a monotone network built on the same graph are compared. The behavior of the non-monotone cell cycle network is less predictable and potentially contradictory (see also Fig. S2 for analogous considerations on the simpler feedforward loop example [5]). It is then important to have an estimate of how close a true network is to being monotone i.e., frustration-free. Our algorithms allow to obtain a _{
σ
}with as low as possible number of negative signs also for large-scale networks. This number is typically close to *δ*, meaning that it is now much easier to localize on the graph of _{
σ
}the potentially frustrated edges (or, more properly, the frustrated cycles). Another consequence is that the candidate ground state for _{
σ
}that globally minimizes (1) is now straightforward to identify, as it corresponds to the "all spins up" configuration, call it **1**. Hence, the candidate ground state for the original can be found reversing the gauge transformation: **s**_{
ground
}= *D*_{
σ
}**1**. Approximate values for the frustration index *δ* and for the corresponding energy minimum not very far from the true ones can therefore be computed.

### Frustration in large-scale biological networks

For the eight large-scale biological networks listed in Table 1, four transcriptional *(E.coli, Yeast, B.subtilis* and *Corynebacterium*), two signaling *(EGFR* and *Toll-like*) and two metabolic *(E.coli* and *Yeast* networks), we considered the corresponding signed graphs (see Tables S1-S2 for further details on the networks and the Supplementary Notes in Additional File 1 for the construction procedure followed) and estimated δ through the algorithms mentioned above, see Table 2. The theory of signed graphs provides us with a theoretical upper bound on the frustration index (see Supplementary Notes in Additional File 1), call it *δ*_{
max
}, which is a function only of the number of nodes, edges and cycles of the networks. The ratio *δ/δ*_{
max
}, Fig. 2(a), shows a marked difference between transcriptional networks and signaling/metabolic networks, with the former exhibiting a consistently lower level of frustration than the latter. The upper bound *δ*_{
max
}, however, disregards completely the topological structure and the sign arrangements of a network. To take into account also these parameters, we constructed a null-model of the networks, obtained by randomly reshuffling the signs of the edges, while maintaining the same number of positive and negative edges of the original graph, see Supplementary Notes in Additional File 1 for more details. For the Z-score of this null model, a negative value means that the edges are arranged in order to decrease frustration. We can observe in Fig. 2(b) that all the transcriptional networks have a negative Z-score, and only them (p-values of the Z-score in Table 2). The characteristic property of the transcriptional networks that enhances monotonicity is the tendency of many nodes to have a skewed distribution of signs in their edges, see Fig. 3. Up to a gauge transformation, in fact, highly asymmetric sign distributions correspond to highly positive sign concentrations, hence closer to monotone than random sign distributions. The "packing" of signs on certain nodes is primarily due to the mode of action of the transcription factors. Although dual role (i.e., both activator and repressor) transcription factors exist in both prokaryotes and eukaryotes [32, 33], most transcription factors seem to be playing only one role on their target genes. The nature of this single role is sometimes associated to the regulatory domains found on the proteins, especially for activator domains, which are usually enriched in proline, glutamine or acidic amino acid residues [34–36]. The dual role transcription factors are usually able to perform opposite functions according to possible different positions of their binding sequence with respect to the gene sequence, or according to different cellular contexts, or simply enhancing only the formation of the closed complex DNA-RNA polymerase [32]. For example, 71% of the *E.coli* transcription factors function only as activators or repressors, Fig. 3(b). The ontological analysis of the dual role transcription factors is significantly enriched for categories such as interfacing the cell with its extracellular environment and for the elaboration of external stimuli (see Table S5). Hence mixed role transcription factors are more often mediating signaling events than their single role counterparts. It is shown in Fig. 3(a) that all transcriptional networks (and only them) have sign arrangements on the edges that are more skewed than expected (with respect to a binomial distribution model, see Supplementary Notes in Additional File 1 and Table S6) and also this property contributes to their monotonicity (Fig. S5). Another structural difference between transcriptional and signaling/metabolic networks is the overrepresentation in these last classes of short frustrated cycles. As explained in the Supplementary Notes in Additional File 1, this characteristic is encoded in the level of detail (stoichiometric) that we choose to represent our networks, and expresses the lack of global monotonicity of a biochemical reaction involving multiple reagents, see also [21, 23, 24, 26].

### Average frustration and ordered response

The values of δ and *h*(**s**_{
ground
}) alone are not enough to characterize how monotonically the system behaves *in average*. In fact, the energy landscape of frustrated Ising spin systems is known to be usually rugged [37, 38], and the presence of a single deep minimum in (1) is not enough to guarantee that the energy averaged over all configurations **s** (corresponding to all possible multinode perturbations) is indeed more negative than in other systems whose energy landscape is characterized by valleys which are maybe less deep but with larger basins. In other words, to characterize how monotone is the response of the system to arbitrarily complex perturbations we have to consider the average value that *h*(**s**) assumes over all possible spin assignments, weighted by the probability of each **s**. This "internal energy", call it ⟨*h*⟩, is an indicator of how coherently the system is behaving in average: the more negative ⟨*h*⟩ is, the less the responses of the system to perturbations are "contradictory" at some fan-in node or along directed cycles. Denote with the partition function of the system, *β* ∈ ℝ_{+}. As usual in statistical physics, the partition function *Z* is the normalization factor that renders the frequencies of the various spin states true probability densities. For spin systems, *β* has the meaning of an inverse temperature and it is normally used as "cooling" parameter, i.e., when *β*→ ∞ the probability of the state **s**, *p*(**s**), tends to concentrate on the ground states: *p*(**s**_{
ground
}) → 1 as *β* → ∞. In the context of biological networks, the temperature is taken as ~ 298 K and it is not a varying parameter. However, we can use *β* to describe the strength of the interactions of a network. Recall that in forming the energy (1), was taken as a signed adjacency matrix with all interactions equal to 1, regardless of the nature of the network studied. As a matter of fact, metabolic, signaling and transcriptional interactions are characterized by widely different energetic costs. In particular, if in our stoichiometric representation a metabolic reaction or a signaling event might have a comparable energetic content, a link in a gene regulatory network describes the entire cascade of events in which the transcription of a gene can be broken down and overall its cost is much higher than in the other networks. Hence, in our fixed temperature context, taking into account the interaction cost *β* rescales *h*(**s**) to the "absolute" energy *βh*(**s**). The probability of a given configuration **s**, *p*(**s**) = e^{-βh(s)}*Z*(*β*)^{-1}, is a function of *β* and is maximized in the (usually degenerate) ground state **s**_{
ground
}. As for spin systems, , i.e., when *β* is large enough, in average the system will always be found in the configuration **s**_{
ground
}which minimizes the energy (1) and which exhibits the least frustration for the network.

Using *β* as a Lagrange multiplier, the internal energy ⟨*h*⟩ is defined as the expectation value of *h*(**s**),

and expresses this simultaneous weighting of the configurations by their degeneracy and energetic content. The more negative ⟨*h*⟩ is, the more we expect the system to respond coherently to a generic perturbation. For any *β* > 0, ⟨*h*⟩ < 0 and, as *β* increases, ⟨*h*⟩ reaches a stationary value, see Fig. 1(c). For spin systems, small values of *β* represent a regimen where thermal fluctuations are dominant and all states tend to be equally populated. As *β* increases, a spin system usually undergoes a phase transition characterized by the appearance of long range correlations in the expectation values assumed by the *s*_{
i
}. For our biological networks, when *β* (i.e., the energetic content of an edge of the network) is too small, the behavior of the network tends to be random (and all states **s** equiprobable) regardless of the monotonicity of the network, a clear obstacle to carrying out any meaningful task. On the other hand, when *β* → ∞, the probability concentrates exclusively on the ground states (Z(*β*) becomes a Dirac delta function) and the behavior of the system becomes as ordered as its frustration index allows, i.e., the system response is as coherent and coordinated as possible, regardless of the type of perturbation, see Fig. 1(c) . It is then important to see how the probabilities of the various states p(**s**) and the internal energy (⟨*h*⟩) vary as a function of *β* on the various categories of networks under exam. Computing p(**s**) and ⟨*h*⟩ exactly is impossible for systems larger than a few tens of nodes. For larger networks we shall make use of a mean field approximation for heterogeneous networks [39, 40]. This approximation, see the "Methods" Section for details, allows to estimate the expectation value ⟨**s**_{σ}⟩ in the gauge transformed basis, and the corresponding mean field energy *h*_{
mf
}. Fig. 4 shows the behavior of ⟨**s**_{σ}⟩ and *h*_{
mf
}for a transcriptional, a signaling and a metabolic network as function of *β*. In all three cases, the population concentrates in the ground state when *β* grows, and, correspondingly, *h*_{
mf
}achieves its global minimum. The true characteristic value of the interaction strength *β* at which each of the classes of networks should be computed is unknown, except for the suggestion that *β*_{
transcr
}≫ *β*_{
signal
}~ *β*_{
metab
}. Interestingly, as *β* grows, the transcriptional network is slower to reach its energetic minimum than the other two networks, and likewise for the other 5 networks, see Table S4 and Fig. S7. This shift of the coherence barrier towards the low energetic regions is a consequence of the topology of the networks. In fact, as can be seen on Fig. 4, also the completely monotone networks built on the same graphs (blue curves) as well as other networks with random sign assignment to the same edges as our (green curves) present the same characteristic patterns in spite of different δ. A feature behind this difference is the already mentioned overrepresentation of closed undirected cycles of short length in the structure of metabolic and signaling networks. Also the lower dispersion in the number of connectivity degree classes *k* in these networks contributes to the quick convergence of ⟨**s**_{σ}⟩ to 1. However, the main reason behind the different thresholds for *β* is the presence or less of leaves in the graph. For example, the *E.coli* transcriptional network has 38% of the nodes that are not involved in any (undirected) cycle, see Table 1. Dropping these nodes and concentrating on the 2-core of the undirected graph, we obtain mean field plots in which the threshold for order is lower, and similar to those of the signaling/metabolic networks, see Fig. S8. All of our transcriptional networks have a high percentage of nodes that are leaves, much higher that the signaling/metabolic networks, see Table 1. The complete lack of feedback, characteristic of tree-like subnetworks, disfavours the achievement of a globally ordered behavior, which is more easily achieved when short cycles, like the 3-node motifs of signaling/metabolic networks, are abundant. This is expected from the theory of spin systems, where long-range correlations are more easily obtained in dense graphs than in sparse ones. Of course adding leaves to a graph does not change its monotonicity properties (a tree is always monotone).

The qualitative difference in the phase transition to order between transcriptional and signaling/metabolic networks suggests an interpretation coherent with the different energetic content associated to the classes of networks. In fact, we can say that since *β*_{
transcr
}is high, it is much less plausible for a transcriptional network to be operating in a regimen of low *β* than it is for signaling/metabolic networks. On the contrary, for these last two classes of networks, it is not unlikely to have interactions of medium-low strength. Hence it gets much more important that ⟨**s**_{σ}⟩ → 1 even in correspondence of moderate values of *β*, because this helps in maintaining a coherent behavior in response to perturbations, as required in order to carry out correctly a biological task.

### Sampling the energy landscape

Further information, from a different perspective, can be obtained studying the structure of the energy landscape of the different networks [37]. In order to have a picture of how this landscape looks like, we have applied our frustration minimization algorithms to uniformly distributed initial conditions and registered the local and global minima achieved in the process (see Fig. 3 and Table S3). Fig. 5 shows these distributions of minima as a function of the relative Hamming distance. For the transcriptional network of *E.coli* and the *Yeast* metabolic network, the global minima are all localized in a small region, while *EGFR* has two broader valleys of global minima. In all three cases, the global minima are surrounded by many local minima, thus confirming the ruggedness of the landscapes. As can be seen on Fig. S11, unlike *EGFR* and the metabolic network, the local minima of the transcriptional network of *E.coli* tend to have an energetic difference from the global ones which grows linearly with the distance. In addition, the separation between the well of global minima and its surroundings is much more neat in *E.coli* than in the other two networks, as can be seen on the Montecarlo trajectories of Fig. 6 and even more clearly on the average gradient of *h*(**s**) (bottom part of Fig. 6). See also Figs. S10, S12, and S13 for analogous consideration on the remaining 5 networks. Overall, it appears that global and local minima in the transcriptional networks are separated by high and steep energetic barriers, while on the other networks there always exist low-energy routes between random spin configurations and global minima, possibly passing through low-energy local minima. This of course facilitates the achievement of the ground state and the creation of global order even in a regime of moderate values of *β*.

## Discussion

For a gene regulatory network, an edge represents the cost of the entire action of transcription of a gene. This is a complex, multistep process: for prokaryotes, for example, it includes the binding of the transcription factor to the DNA, the recruitment of a polymerase, the unwinding of the DNA helix, the detachment of the σ -factor and the conformational changes in the polymerase preceeding elongation, the release of both the DNA and of the complete mRNA at the termination phase. The energetic cost and time constant of such a complex process are relevant for a cell. Hence, especially in lower organisms, it is natural to expect that in a transcriptional network the genes behave in concert and that the fraction of the gene-gene interactions that contribute to minimizing the energy in response to perturbations is substantially larger than in a metabolic or signaling network, as a frustrated bond costs much more to the cell and its effect lasts much longer. In particular, frustrations manifest themselves on the cycles of the underlying undirected graph of the network as contradictory transcriptional orders. While changing the transcriptional commands is necessary to cope with e.g. different environmental conditions, encoding them as frustrated cycles can easily lead to unpredictable or erroneous dynamical behavior. Therefore, in spite of the presence of certain characteristic motifs leading to frustration (the incoherent feedforward loops mentioned in [5, 6] for the *E.coli* and *Yeast* transcriptional networks are common examples), overall the transcriptional networks we analyze are indeed near-monotone. Both the topology and the sign assignments to the nodes of the transcriptional networks contribute to achieve a degree of monotonicity which is higher than expected from null models. On the contrary, incoherent signaling or metabolic actions are energetically much less relevant than a single transcriptional event and can be easily tolerated by the cell, especially since nonmonotone patterns favour a richer dynamical behavior. While the level of detail at which we model our networks (functional for transcriptional networks, stoichiometric for signaling and metabolic networks) certainly contributes to the systematic differences in the frustration index, other factors such as the tendency of the transcriptional networks to have skewed sign distributions are also crucial in attaining a low frustration. It is interesting, then, to notice that in *E.coli* the transcription factors violating this rule are primarily involved in the mediation of external signaling, rather than in regulatory or structural functions (Table S5).

For spin systems, the tendency to satisfy pairwise all interactions grows when the temperature decreases, although in a frustrated Ising spin system all the conditions can never be satisfied simultaneously. In this paper, we consider the strength of the interactions as the key factor that determines the increase in the probability of finding the system in its ground state (i.e., in its least frustrated/maximally monotone configuration). If we parametrize the networks by the interaction strength and study the probability of finding the system response in the ground state as a function of this cost, we observe that for signaling/metabolic networks it is higher than for transcriptional networks in the region of medium/low values of the interactions. This behavior, which is due to the topological structure of the networks and to the energy landscape it determines, could reflect the tendency of signaling/metabolic networks to attain a globally ordered response in spite of the weaker energetic content of their interactions. As such, it helps maintaining coherence of the response in spite of the higher level of frustration of these networks (which, again, favors a richer dynamical behavior). For transcriptional networks, on the other hand, owing to the strong interactions, the regime of low energies is less important, hence tree-like motifs, which hinder the establishment of long-range correlations, are abundant.

A Montecarlo investigation of the energy landscape of the networks [37, 41] suggests that transcriptional networks tend to have a more funneled landscape than the other networks (at least around the global optima), with a single deep well of global minima delimited by high barriers, while in signaling and metabolic networks the optima are surrounded by local minima of comparable energy. Order in these classes of networks is favored also by the lack of neat energetic barriers separating local and global optima, which enables the reconfiguration to the global optimum through low-energy paths.

Several are the *caveat* and limitations of our study. First of all, the different levels of resolution for the different classes of networks may be a source (or *the* source) of the systematic differences we are observing. Hints in this direction come for example from the observation that networks at functional level tend to have less cycles than networks at stoichiometric level (see Supplementary Notes in Additional File 1 for the origin of this fact), and that functional models of signaling pathways may also have asymmetric sign distributions (for example non-specific kinases catalyzing the phosphorylation of various proteins will have many positive edges, while non-specific phosphatases will have multiple negative edges). This is observed to some extent in the functional model of the hippocampal signaling network proposed in [9]. Notice that this network has a large fraction (approximately a third) of interactions representing protein-protein or protein-ligand bindings, to which it is unclear how to associate a sign in an unambiguous manner. The ambiguity of course also propagates to the level of frustration one obtains correspondingly. More generally, we are not aware of any systematic way to map the pathway charts available at stoichiometric level to the functional level, allowing to univocally assign a sign to each edge without at the same time loosing in this process a large part of the molecular species involved. Notice also that the opposite option, namely representing transcriptional networks at stoichiometric level, is *de facto* impossible with our current knowledge.

Another important source of uncertainty comes from the limited coverage of the biological networks currently available. In particular, for the transcriptional networks, the fraction of target genes having at least a transcription factor is below 50% of the genes. Furthermore, our considerations about an higher than expected monotonicity may very well be overturned once more complex organisms (for whom the regulatory mechanisms are expected to be much more complex) are taken into account.

## Conclusion

In conclusion, we have observed that distinct classes of biological networks seem to be characterizable by different features in response to perturbations. At least when we model transcriptional networks at functional level (i.e., as activation/inhibition links) and signaling and metabolic networks at stoichiometric level, we can observe that transcriptional networks appear to be less frustrated than expected and much less frustrated than signaling and metabolic networks, meaning that they might admit highly coherent responses to perturbations. On the other hand, the signaling/metabolic networks seem to have the ability to achieve an average ordered response in a lower range of interaction strengths than the transcriptional networks. We explain the first feature as the need to avoid as much as possible erroneous or contradictory transcriptional actions which would cost much more to the cell than analogous incoherent signaling/metabolic events. The second feature may partially compensate for the higher frustration of these last networks, by lowering the interaction strength needed for a transition to ordered response (in average), and thereby ensuring the effectiveness of this reduced coherent behavior in an energetic range more critical for these classes of networks.

## Methods

**Model formulation: the signed adjacency matrix of a dynamical system** For an n-dimensional system of differential equations

consider the linearization around an operating point x_{o}. If **z** = **x** - **x**_{
o
},

where is the Jacobian matrix computed at . The system is at rest at , implying that **z** = 0 is an equilibrium point. A perturbation around **x**_{o} is then any vector **z** around 0 (assuming both positive and negative values on each of its components *z*_{
i
}). For a large-scale biological network it is very difficult to have a precise knowledge of the functional form of *f*(·) or even of the Jacobian matrix . It is often more reasonable to assume that only the sign pattern is known of :

i.e., _{
ns
}of elements *J*_{
ns
},_{
ij
} ∈ {±1,0} is the signed adjacency matrix of a directed graph representing our network. Coherently with *J*_{
ns
},_{
ij
} ∈ {±1,0}, also the magnitude of the perturbations **z** is to be considered as unknown except for its sign: **s** = *sign*(**z**), meaning *s*_{
i
}∈ {±1} ∀i = 1,... n. The entries *J*_{
ns
},_{
ij
}of the matrix _{
ns
}represent the effect of the *j*-th variable on the *i*-th variable which can be activatory, *J*_{
ns
},_{
ij
}> 0, inhibitory, *J*_{
ns
},_{
ij
}< 0, or inexisting, *J*_{
ns
},_{
ij
}= 0. In general, this effect can change of sign with the operating point **x**_{o}[21], but we shall not consider this scenario here. As a matter of fact, it is worth remarking that for common choices of f(**x**), such as mass-action or Michaelis-Menten, the partial derivatives have indeed constant sign .

If, rather than in the directed graph of adjacency matrix _{
ns
}, we are interested in the underlying undirected graph (resulting by dropping the arrows in the edges), then this is obtained symmetrizing the matrix _{
ns
}Denote such symmetric signed adjacency matrix. The symmetrization operation is always possible as long as edge pairs *J*_{
ns
},_{
ij
}and *J*_{
ns
},_{
ji
}are compatible, i.e., *J*_{
ns
},_{
ij
}*J*_{
ns
},_{
ji
}≥ 0. In all of our networks, the symmetrization operation leads to very few or no conflicting signs at all, see Table S1.

### Monotone dynamical system

A partial order in ℝ^{n}is a signature vector σ = [σ_{1} ...σ_{
n
}], σ_{
i
}∈ {±1}, which defines an order relation among vectors in ℝ^{n}: *x*' and *x*'' ∈ ℝ^{n}are said ordered with respect to the partial order . A system is monotone with respect to the partial order *σ* if for any pair of initial conditions *x*'(0) ≤_{σ}*x*''(0) one has that *x*'(t) ≤_{σ}*x*''(t) for every *t ≥* 0. In terms of the signature adjacency matrix _{
ns
}of the Jacobian linearization , a system is monotone if and only if

As explained in detail in [21], the non strict inequality for monotonicity allows to test such conditions (3), rather than in the original directed graph of (2), on its underlying undirected counterpart, in which we conventionally drop the self-loops (for which σ_{
i
}σ_{
i
}*J*_{
ns,ii
}> 0 if and only if *J*_{
ns,ii
}> 0, i.e., the order relations (3) are trivial). Therefore, from now on we shall consider only the symmetrized version of _{
ns
},with all diagonal elements fixed to 0, i.e., the matrix . Practically, this symmetrization operation means that we are interested not only to "true" directed cycles and their frustration, but also to multiple directed paths starting and ending on the same nodes (and forming cycles on the underlying undirected graph). See the feedforward examples in Supplementary Notes in Additional File 1 and Fig. S2.

### Mean field approximation in heterogeneous signed networks

Mean field approximations [15] are necessary to compute estimates of quantities such as *Z*, *p*(**s**) and ⟨*h*⟩. The approximation described here is suitable for heterogeneous networks, i.e., networks in which the connectivity of the nodes is not constant. It extends the approach proposed in [39, 40] to systems with frustration. For a given signed network , apply first the gauge transformation D_{σ} required to minimize the overall number of negative signs on the edges, while maintaining the frustration index *δ* invariant. Denote then *k*(1),..., *k*(£) the £ different connectivity degrees of the nodes of _{σ}, of probabilities *p*_{
k
}(1),...,*p*_{
k
}(£), and ⟨*k*⟩ the average connectivity degree of . The nodes having degree *k* will have a certain distribution of positive and negative edges. Let *k*_{
pn
}(1),..., *k*_{
pn
}(£) be the differences between positive and negative edges averaged over all nodes of each degree class. As after preprocessing with *D* σ each node has more positive than negative edges, we are guaranteed that *k*_{
pn
}≥ *k*/2. In order to compute the expectation value of **s**_{σ} on each degree class, we use the self-consistency equation for heterogeneous networks. Following [39], the self-consistency equation on the degree class *k* is given by

where

is the effective "field magnetization" of each node from its neighboring nodes and the subindex σ in **s** indicates that the value is computed in the gauge transformed basis. The use of *k*_{
pn
}instead of the degree *k* corrects the equations (4)-(5) for the frustration of the system. In practice, for our gauge transformed networks the number of negative signs is at most 20% (often much less), meaning that *k*_{
pn
}~ *k* for most degree classes. From (4) and (5), we have an expression for the mean field expectation value ⟨**s**_{σ}⟩ weighted with respect to the degree classes:

and, neglecting fluctuations around each ⟨**s**_{σ}⟩ the mean field Hamiltonian is

As , the energy is invariant to the gauge transformation *D*_{σ}. In fact, from **s**_{σ} = *D*_{σ}**s**, we have . Hence the mean field calculations for _{σ} are valid also in the original . In addition, however, as **s**_{σ,ground}= **1**, in the gauge transformed system we have that, as *β* → ∞, ⟨**s**_{σ}⟩ → 1, a property which is in general not verified in the original basis, which will instead concentrate at its own ground state . Therefore, for all practical purposes, ⟨**s**_{σ}⟩ can be taken as "order parameter" of the spin glass. In fact, since the gauge transformation minimizes the number of negative edges, it also maximizes the number of spins whose value is +1 in the ground state. Hence, just like in a ferromagnet (i.e., in a spin system in which for all nonzero *J*_{
ij
}one has *J*_{
ij
}= +1), the average value of **s**_{σ} (i.e., the "magnetization") tends to 1 when the system is "cooled".

## References

- 1.
Jamshidi N, Palsson B: Formulating genome-scale kinetic models in the post-genome era. Mol Syst Biol. 2008, 4: 171- 10.1038/msb.2008.8

- 2.
Kholodenko BN: Cell-signalling dynamics in time and space. Nat Rev Mol Cell Biol. 2006, 7: 165-176. 10.1038/nrm1838

- 3.
Kwon YK, Cho KH: Coherent coupling of feedback loops: a design principle of cell signaling networks. Bioinformatics. 2008, 24 (17): 1926-1932. 10.1093/bioinformatics/btn337

- 4.
Papin JA, Hunter T, Palsson BØ, Subramaniam S: Reconstruction of cellular signalling networks and analysis of their properties. Nat Rev Mol Cell Biol. 2005, 6 (2): 99-111. 10.1038/nrm1570

- 5.
Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: simple building blocks of complex networks. Science. 2002, 298 (5594): 824-827. 10.1126/science.298.5594.824

- 6.
Shen-Orr SS, Milo R, Mangan S, Alon U: Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 2002, 31: 64-68. 10.1038/ng881

- 7.
Klamt S, Saez-Rodriguez J, Lindquist JA, Simeoni L, Gilles ED: A methodology for the structural and functional analysis of signaling and regulatory networks. BMC Bioinformatics. 2006, 7: 56-56. 10.1186/1471-2105-7-56

- 8.
Fages F, Soliman S: From reaction models to influence graphs and back: a theorem. Formal Methods in Systems Biology. 2008, 90-102. full_text. Springer-Verlag LNBI 5054,

- 9.
Ma'ayan A, Jenkins SL, Neves S, Hasseldine A, Grace E, Dubin-Thaler B, Eungdamrong NJ, Weng G, Ram PT, Rice JJ, Kershenbaum A, Stolovitzky GA, Blitzer RD, Iyengar R: Formation of regulatory patterns during signal propagation in a Mammalian cellular network. Science. 2005, 309: 1078-1083. 10.1126/science.1108876

- 10.
Samaga R, Saez-Rodriguez J, Alexopoulos LG, Sorger PK, Klamt S: The logic of EGFR/ErbB signaling: theoretical properties and analysis of high-throughput data. PLoS Comput Biol. 2009, 5 (8):

- 11.
Harary F: Graph theoretic methods in the management sciences. Management Sci. 1959, 5: 387-403. 10.1287/mnsc.5.4.387.

- 12.
Quirk J, Ruppert R: Qualitative economics and the stability of equilibrium. Rev Econ Stud. 1965, 32: 311-326. 10.2307/2295838.

- 13.
Antal T, Krapivsky PL, Redner S: Dynamics of social balance on networks. Phys Rev E. 2005, 72 (3): 036121-10.1103/PhysRevE.72.036121.

- 14.
Chowdhury D: Spin Glasses and other frustrated systems. 1986, Princeton University Press,

- 15.
Fischer K, Hertz J: Spin Glasses. 1991, Cambridge University Press,

- 16.
Zaslavsky T: Bibliography of signed and gain graphs. Electr J Combinatorics. 1998, DS8-

- 17.
Soulé C: Graphic Requirements for Multistationarity. ComplexUs. 2003, 1: 123-133. 10.1159/000076100.

- 18.
Thieffry D: Dynamical roles of biological regulatory circuits. Brief Bioinform. 2007, 8 (4): 220-225. 10.1093/bib/bbm028

- 19.
Li F, Long T, Lu Y, Ouyang Q, Tang C: The yeast cell-cycle network is robustly designed. Proc Natl Acad Sci USA. 2004, 101 (14): 4781-4786. 10.1073/pnas.0305937101

- 20.
Chaves M, Albert R, Sontag ED: Robustness and fragility of Boolean models for genetic regulatory networks. J Theor Biol. 2005, 235 (3): 431-449. 10.1016/j.jtbi.2005.01.023

- 21.
Sontag ED: Monotone and near-monotone biochemical networks. Systems and Synthetic Biology. 2007, 1: 59-87. 10.1007/s11693-007-9005-9

- 22.
Smith HL: Systems of ordinary differential equations which generate an order preserving flow. A survey of results. SIAM Review. 1988, 30: 87-113. 10.1137/1030003.

- 23.
DasGupta B, Enciso GA, Sontag E, Zhang Y: Algorithmic and complexity results for decompositions of biological networks into monotone subsystems. Biosystems. 2007, 90: 161-178. 10.1016/j.biosystems.2006.08.001

- 24.
Ma'ayan A, Iyengar R, Sontag E: Proximity of Intracellular Regulatory Networks to Monotone. IET Systems Biology. 2008, 2: 103-112. 10.1049/iet-syb:20070036

- 25.
Sontag E, Veliz-Cuba A, Laubenbacher R, Jarrah AS: The effect of negative feedback loops on the dynamics of boolean networks. Biophys J. 2008, 95 (2): 518-526. 10.1529/biophysj.107.125021

- 26.
Iacono G, Ramezani F, Soranzo N, Altafini C: Determining the distance to monotonicity of a biological network: a graph-theoretical approach. IET Systems Biology. 2010, 4: 223-235. 10.1049/iet-syb.2009.0040

- 27.
Ma'ayan A, Cecchi GA, Wagner J, Rao AR, Iyengar R, Stolovitzky G: Ordered cyclic motifs contribute to dynamic stability in biological and engineered networks. Proc Natl Acad Sci USA. 2008, 105 (49): 19235-19240. 10.1073/pnas.0805344105

- 28.
Fell D: Understanding the control of metabolism. 1997, Portland Press,

- 29.
Thiele I, Jamshidi N, Fleming RM, Palsson BØ: Genome-scale reconstruction of Escherichia coli's transcriptional and translational machinery: a knowledge base, its mathematical formulation, and its functional characterization. PLoS Comput Biol. 2009, 5 (3): 10.1371/journal.pcbi.1000312.

- 30.
Barahona F: On the computational complexity of Ising spin glass models. J Phys A Math Gen. 1982, 15: 3241-3253. 10.1088/0305-4470/15/10/028.

- 31.
Toulouse G: Theory of the frustration effect in spin glasses: I. Communications on Physics. 1977, 2: 115.

- 32.
Roy S, Garges S, Adhya S: Activation and repression of transcription by differential contact: two sides of a coin. J Biol Chem. 1998, 273 (23): 14059-14062. http://www.hubmed.org/display.cgi?uids=9603899 10.1074/jbc.273.23.14059

- 33.
Ma J: Crossing the line between activation and repression. Trends Genet. 2005, 21: 54-59. http://www.hubmed.org/display.cgi?uids=15680515 10.1016/j.tig.2004.11.004

- 34.
Han K, Manley JL: Transcriptional repression by the Drosophila even-skipped protein: definition of a minimal repression domain. Genes Dev. 1993, 7 (3): 491-503. http://www.hubmed.org/display.cgi?uids=8095483 10.1101/gad.7.3.491

- 35.
Rhodius VA, Busby SJ: Positive activation of gene expression. Curr Opin Microbiol. 1998, 1 (2): 152-159. http://www.hubmed.org/display.cgi?uids=10066477 10.1016/S1369-5274(98)80005-2

- 36.
Stepchenko A, Nirenberg M: Mapping activation and repression domains of the vnd/NK-2 homeodomain protein. Proc Natl Acad Sci USA. 2004, 101 (36): 13180-13185. http://www.hubmed.org/display.cgi?uids=15340160 10.1073/pnas.0404775101

- 37.
Brooks CL, Onuchic JN, Wales DJ: Statistical thermodynamics. Taking a walk on a landscape. Science. 2001, 293: 612-613. 10.1126/science.1062559

- 38.
Seyed-allaei H, Seyed-allaei H, Ejtehadi MR: Energy-landscape networks of spin glasses. Physical Review E. 2008, 77 (3): 031105-10.1103/PhysRevE.77.031105.

- 39.
Leone M, Vázquez A, Vespignani A, Zecchina R: Ferromagnetic ordering in graphs with arbitrary degree distribution. Eur Phys J B. 2002, 28 (2): 191-197. 10.1140/epjb/e2002-00220-0.

- 40.
Bianconi G: Mean field solution of the Ising model on a Barabási-Albert network. Phys Lett A. 2002, 303 (2-3): 166-168. 10.1016/S0375-9601(02)01232-X.

- 41.
Wang J, Huang B, Xia X, Sun Z: Funneled landscape leads to robustness of cell networks: yeast cell cycle. PLoS Comput Biol. 2006, 2: e147- 10.1371/journal.pcbi.0020147

- 42.
Salgado H, Gama-Castro S, Peralta-Gil M, Diaz-Peredo E, Sanchez-Solano F, Santos-Zavaleta A, Martinez-Flores I, Jimenez-Jacinto V, Bonavides-Martinez C, Segura-Salazar J, Martinez-Antonio A, Collado-Vides J: RegulonDB (version 5.0):

*Escherichia coli*K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucleic Acids Res. 2006, D394-D397. 34 Database, - 43.
Sierro N, Makita Y, de Hoon M, Nakai K: DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucleic Acids Res. 2008, 36: D93-96. 10.1093/nar/gkm910

- 44.
Baumbach J: CoryneRegNet 4.0 - A reference database for corynebacterial gene regulatory networks. BMC Bioinformatics. 2007, 8: 429- 10.1186/1471-2105-8-429

- 45.
Oda K, Matsuoka Y, Funahashi A, Kitano H: A comprehensive pathway map of epidermal growth factor receptor signaling. Mol Syst Biol. 2005, 1: 2005- 10.1038/msb4100014

- 46.
Oda K, Kitano H: A comprehensive map of the toll-like receptor signaling network. Mol Syst Biol. 2006, 2: 2006-0015. 10.1038/msb4100057.

- 47.
Reed JL, Vo TD, Schilling CH, Palsson BØ: An expanded genome-scale model of

*Escherichia coli*K-12 (iJR904 GSM/GPR). Genome Biol. 2003, 4 (9): R54- 10.1186/gb-2003-4-9-r54 - 48.
Förster J, Famili I, Fu P, Palsson BØ, Nielsen J: Genome-scale Reconstruction of the

*Saccharomyces cerevisiae*Metabolic Network. Genome Res. 2003, 13 (2): 244-253. 10.1101/gr.234503

## Acknowledgements

The authors would like to thank K. Nakai for providing the transcriptional network of *B.subtilis*. This work was sponsored in part by a grant from Illy Caffé, Trieste, Italy.

## Author information

## Additional information

### Authors' contributions

GI and CA participated in the conceiving of the study, in the statistical analysis and in the drafting of the manuscript. Both authors read and approved the final manuscript.

## Electronic supplementary material

## Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

## Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

#### Received

#### Accepted

#### Published

#### DOI

### Keywords

- Gauge Transformation
- Metabolic Network
- Biological Network
- Energy Landscape
- Boolean Network