An approach to evaluate the topological significance of motifs and other patterns in regulatory networks
© Goemann et al; licensee BioMed Central Ltd. 2009
Received: 28 October 2008
Accepted: 19 May 2009
Published: 19 May 2009
The identification of network motifs as statistically over-represented topological patterns has become one of the most promising topics in the analysis of complex networks. Studies commonly focus on how motifs operate by means of their internal organization, whereas their contribution to a network's global architecture is poorly understood. Assessing this contribution requires switching from the abstract view of a topological pattern to the level of its instances. Here, we show how a recently proposed metric, the pairwise disconnectivity index, can be adapted to survey whether and which kinds of topological patterns and their instances are most important for sustaining the connectivity within a network.
The pairwise disconnectivity index of a pattern instance quantifies the dependency of the pairwise connections between vertices in a network on the presence of this pattern instance. In particular, it considers how the coherence between the unique constituents of a pattern instance relates to the rest of the network. We have applied the method exemplarily to the analysis of 3-vertex topological pattern instances in the transcription networks of a bacterium (E. coli), a unicellular eukaryote (S. cerevisiae) and higher eukaryotes (human, mouse, rat). We found that in these networks the removal of only very few pattern instances breaks many of the pairwise connections between vertices. Among them, network motifs do not prevail. Rather, those patterns that are shared by the three networks exhibit a conspicuously enhanced pairwise disconnectivity index. Additionally, these pattern instances are often located in close vicinity to each other or even overlap, since a small number of genes is repeatedly present in most of them. Moreover, the evidence indicates that the importance of these pattern instances is due to synergistic rather than merely additive effects between their constituents.
We have proposed a new method that makes it possible to evaluate the topological significance of various connected patterns in a regulatory network. Applying this method to the transcriptional networks of three largely distinct organisms, we could show that it is well suited to identifying the most important pattern instances, but that neither motifs nor any other pattern appears to play a particularly important role per se. From the results obtained so far, we conclude that the pairwise disconnectivity index will most likely also prove useful in identifying other (higher-order) pattern instances in transcriptional and other networks.
Network analysis is increasingly recognized as a powerful approach to understand the organization of intracellular systems. The topology (i.e., the architecture) of a network describes how its elements are interconnected, thereby providing the necessary structural basis for the subsequent analysis of the dynamics of the system. Various biological networks, such as metabolic or protein interaction networks, share global statistical features, i.e., (i) the small-world property, referring to short paths between any two vertices combined with highly clustered connections, and (ii) the scale-free property, indicating that the vertex degrees follow a power-law distribution [1–7]. This implies a certain hierarchy of connectedness, as most vertices have a low degree and few vertices (hubs) have a markedly increased number of immediate neighbors.
This hierarchy is reflected in the modular organization of biological regulatory systems with each module performing its special functional task, separable from the functions of other modules [8, 9]. Such a modularity of networks can be characterized topologically, whereby their scale-free organization coincides with hierarchical modularity. These hierarchical networks comprise many small clusters that are densely interconnected rather than consisting of independent groups of vertices. Accordingly, modules may overlap with each other so that a nested type of organization is possible with smaller modules being part of bigger ones. It has been observed for various biological networks that the clustering coefficient of the vertices is approximately inversely proportional to their degree, which has been understood as the most important indication of hierarchical modularity of a network [3, 11–13]. Understanding the organization of modules and their structural and functional roles emerges as a new challenge when studying biological networks. The corresponding analyses require proceeding from the level of vertices and their edges to the level of groups of these elements. It has been shown that 'network motifs' are an important feature of biological networks and may represent the simplest building blocks from which the bigger functional modules and whole networks are made [8, 14, 15]. They appear to relate to the lowest level of a hierarchical modularity.
Network motifs depict distinct topological patterns that occur more often in a given network than in random networks with the same size and degree distribution [14, 15]. In contrast, significantly under-represented patterns are known as anti-motifs. Proteins belonging to specific motifs in the yeast protein interaction network tend to be highly conserved across species during evolution, which suggests that their respective motifs may also have an important, evolutionarily selected biological function [17, 18]. The same network motifs have been found in diverse organisms from bacteria and yeast to plants and animals, as has been reviewed. The concept of network motifs as the building blocks of evolution has become one of the central topics in the analysis of complex networks. Usually, studies focus on how each network motif can carry out particular information-processing functions by means of its specific internal organization [19–23].
So far, little attention has been paid to the role of motifs within a whole network, i.e., how they are embedded and how important they are for supporting the global architecture. Motifs are not isolated entities but integral parts of the whole network. For instance, the targeted removal of the links among the vertices of all feed-forward loops and bi-fan clusters from the transcription regulatory network of E. coli fragmented this network into many small, isolated subgraphs. Although this observation already indicates that motifs may be of great importance for the structure of a whole network, it hides the impact of a single feed-forward loop or bi-fan representative in E. coli. It is unclear whether such a fragmentation is caused by only a limited number of these representatives and whether the significance of a representative is tied to a particular kind of motif, such as the feed-forward loop. Furthermore, networks contain topological patterns other than motifs, and it remains to be seen whether they play only a minor role for the topology of a network. Therefore, further studies are necessary, and they require switching from the abstract view of a topological pattern to the level of its various representatives, the instances of a pattern. In general, a topological pattern depicts a unique kind of organization between a defined number of vertices, which is given by the edges between these vertices. A pattern instance refers to a distinct set of vertices and all edges between them, so that the arrangement of the edges reflects the respective pattern. To estimate the significance of such an instance for the topology of a whole network, one has to consider how it relates to the rest of the network, i.e., its environment, and thus which kind of influence it may have. No practical methods or theoretical approaches are yet available for this purpose.
To evaluate the topological significance of individual components in complex biological systems, we have recently introduced a new topological parameter, the pairwise disconnectivity index of a network's element. Such an element might be a vertex (i.e., molecule, gene), an edge (i.e., reaction, interaction), as well as a group of vertices and/or edges. The pairwise disconnectivity index quantifies how essential an element is for sustaining the communication ability between all connected ordered pairs of vertices in a network. It can be viewed as a measure of sensitivity (robustness) of this network to the presence (absence) of each element. Here, we show how this concept can be used to estimate the topological significance of a pattern instance and to determine the role of the corresponding pattern within a whole network. Subsequently, we apply this approach exemplarily to the analysis of 3-vertex topological patterns in transcription networks from different organisms: a bacterium (E. coli), a unicellular eukaryote (S. cerevisiae) and higher eukaryotes (mammals, mainly human, mouse, and rat).
The topological significance of a pattern instance
In Eq. 1, N is the total number of ordered pairs of vertices in a graph G = (V, E) that are connected by at least one directed path of any length. It is supposed that N > 0, i.e., there exists at least one edge in the network that links two different vertices. N' is the number of such connected ordered pairs of vertices in the subgraph G' = (V, E') of G, where E' is obtained from E by removing the intrinsic edges of the pattern instance. The pairwise disconnectivity index of a pattern instance ranges between 0 and 1, where zero indicates that the removal of its intrinsic edges does not disconnect any vertices within the network and one denotes the case in which no pair of vertices is connected any more.
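Read from the definitions of N and N' and from the stated range of 0 to 1, Eq. 1 can be taken to have the following form. The symbols Dis, P_i and E(P_i) (the set of intrinsic edges of the instance P_i) are notation assumed here for illustration only:

```latex
\mathrm{Dis}(P_i) \;=\; \frac{N - N'}{N} \;=\; 1 - \frac{N'}{N},
\qquad E' = E \setminus E(P_i)
```

With N' = N the index equals 0 and with N' = 0 it equals 1, in agreement with the range given above.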
Eq. 2 gives the pairwise disconnectivity index of a pattern that consists of J instances. It thereby also states the topological significance of a randomly chosen instance of the pattern.
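As a minimal sketch of how the two indices could be computed, the following Python code counts the ordered pairs of distinct vertices that are connected by a directed path, before and after removing the intrinsic edges of an instance. Several things here are assumptions rather than the authors' implementation: the function names, the toy network, the convention that a vertex is not paired with itself, and the reading of the pattern-level index as the arithmetic mean over its J instances.

```python
def count_connected_pairs(vertices, edges):
    """Count ordered pairs (u, v), u != v, joined by a directed path u -> ... -> v."""
    successors = {v: set() for v in vertices}
    for u, v in edges:
        successors[u].add(v)
    total = 0
    for source in vertices:
        reached = set()
        stack = list(successors[source])
        while stack:
            node = stack.pop()
            if node not in reached:
                reached.add(node)
                stack.extend(successors[node] - reached)
        reached.discard(source)  # assumption: a vertex is not paired with itself
        total += len(reached)
    return total


def instance_index(vertices, edges, instance_vertices):
    """(N - N') / N after removing all edges between the vertices of one instance."""
    edges = set(edges)
    members = set(instance_vertices)
    intrinsic = {(u, v) for (u, v) in edges if u in members and v in members}
    n = count_connected_pairs(vertices, edges)
    if n == 0:
        raise ValueError("the network has no connected pair of vertices")
    n_prime = count_connected_pairs(vertices, edges - intrinsic)
    return (n - n_prime) / n


def pattern_index(vertices, edges, instances):
    """Assumed reading of the pattern-level index: mean over the instance indices."""
    values = [instance_index(vertices, edges, inst) for inst in instances]
    return sum(values) / len(values)


# Hypothetical toy network: a feed-forward loop (a, b, c) feeding into d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

ffl_value = instance_index(vertices, edges, ("a", "b", "c"))
chain_value = pattern_index(vertices, edges, [("b", "c", "d"), ("a", "c", "d")])
```

In this toy network six ordered pairs are connected. Removing the three edges of the feed-forward loop leaves only the pair (c, d), so the instance index is 5/6, whereas the two chain instances give 4/6 and 3/6.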
Applying the pairwise disconnectivity index to the analysis of topological patterns in regulatory networks
We have applied our approach to the characterization of three-vertex topological patterns in transcription regulation networks from three different organisms: a bacterium (Escherichia coli), a unicellular eukaryote (the yeast Saccharomyces cerevisiae) and higher eukaryotes (mammals: human, mouse, rat) [25, 26]. 3-vertex motifs were identified by means of the Z-Score as proposed by Alon and colleagues. This normalized value states whether the abundance of a pattern in the real network exceeds its occurrence in a number of random ensembles: a positive Z-Score refers to an over-representation in the real network, whereas a negative Z-Score means under-representation. Since there is no commonly accepted threshold Z-Score value for defining motifs, we consider patterns with Z-Score > 0 as motifs and all others as non-motifs. For the networks of E. coli and S. cerevisiae, 3-vertex motifs were already identified [14, 15], whereas for the mammalian transcription network this is reported for the first time. To distinguish between different motifs, many of which have no commonly accepted names, we used the identification numbers (IDs) of small connected graphs as provided by the FANMOD software [27, 28]. The name of a pattern instance was generated by combining a prefix E, Y or M, referring to E. coli, S. cerevisiae or mammals, respectively, with the corresponding ID, followed by the pairwise disconnectivity index rank of the instance among all instances of a given pattern.
Bacterial transcription network
Yeast transcription network
The transcription network of S. cerevisiae consists of 688 vertices and 1079 edges. It features three additional patterns besides those that have already been identified in E. coli. A positive Z-Score is attributed to four patterns in S. cerevisiae, although the patterns ID = 102 and ID = 166 occur only once (Figure 3). As observed in E. coli, the average topological significance of the motif ID = 6 is lower than that of the feed-forward loop (FFL). On average, a randomly selected FFL instance breaks the connection between less than 1% of all connected pairs of genes, which is lower than for instances of the pattern ID = 12. The mean pairwise disconnectivity index of the latter is about 0.0135 and appears to be the highest of all patterns in the S. cerevisiae network with a negative Z-Score.
The highest pairwise disconnectivity index is about 0.08 (Figure 6) and refers to a feed-forward loop instance that comprises the genes RME1, IME1 and IME1_UME6 (Figure 7). RME1 is known to encode a zinc finger protein that can repress the transcription of IME1. RME1 and IME1 are the master regulators of meiosis in S. cerevisiae [33–35]. An ime1 disruption prevents expression of almost all meiotic genes and all tested meiotic events. RME1 is essential for sustaining the communication abilities between many gene pairs, similar to the genes MCM1, SNF2_SWI1 and SWI5. The gene MCM1 is central to the transcription control of cell-type specific genes and the pheromone response. The SNF2/SWI complex is an evolutionarily conserved ATP-dependent chromatin remodeling complex that plays an important role in DNA damage repair, DNA replication and stress response. SWI5 activates the expression of cell cycle genes. Altogether, these genes exert vital functions in S. cerevisiae and each of them appears quite frequently among the pattern instances with the highest topological significance.
Mammalian transcription network
The third network represents genes coding for transcription factors in mammalian species (human, mouse, and rat) and their interplay. This mammalian network consists of 279 vertices and 657 edges and has been extracted from the contents of the TRANSPATH® database on signal transduction and the TRANSFAC® database on eukaryotic cis-acting regulatory DNA elements and trans-acting factors. Unlike the other two networks, it contains all of the thirteen possible 3-vertex patterns. Although five patterns display positive Z-Scores, only four of them indicate a clear over-representation (Figure 3). In addition, one might find it difficult to classify the pattern ID = 102 as a motif due to its low frequency. Nevertheless, the FFL is a motif in mammals and the only pattern that is over-represented in all three networks. Although its occurrence rises with the increasing density and complexity of the networks, its topological significance decreases notably. In fact, a low average pairwise disconnectivity index can be observed for almost all motifs in mammals, with motif ID = 174 as the only exception.
Three of the seven patterns with a negative Z-Score have been found in the networks of E. coli and S. cerevisiae too, but, unlike in mammals, the pattern ID = 6 is a motif in them. Yet, its average topological significance for these networks does not differ greatly. The same applies to the pattern ID = 12, which exhibits one of the highest mean pairwise disconnectivity indices here as well. In contrast, the pattern ID = 36 seems to play just a minor role although it is the second most common one. On average, the other non-motif patterns in the mammalian network are mostly crucial for linking only 1% of gene pairs. Nevertheless, their appearance hints at the more complex organization of transcription regulation in higher organisms. Thus, it seems fitting that the pattern ID = 238 can be found only here (Figure 3): it represents the mutual transcription control of three retinoic acid receptor isoforms with the vertices RAR-alpha, RAR-beta and RAR-gamma. Note that this pattern does not even occur in any random network of similar size and degree distribution. On the other hand, it is still surprising that the pattern ID = 164 appears nearly 200 times in the mammalian network, but neither in the network of E. coli nor in the network of S. cerevisiae.
A note on the joint deletion of intrinsic edges
The unusually frequent appearance of the same links (i.e., intrinsic edges) between genes in the pattern instances with the highest pairwise disconnectivity indices in all three networks raises the question of their contribution to the estimated significance of these pattern instances. On the one hand, the removal of individual intrinsic edges may already destroy the connection between many gene pairs, so that their simultaneous removal is not as crucial. On the other hand, the edges may have a significant non-additive impact when taken together. Answering this question requires knowing the effect of deleting a single interaction (i.e., edge) in a network, which can be determined in a similar way as for a pattern instance. This measure has been introduced as the pairwise disconnectivity index of an edge and specifies the fraction of ordered pairs that become disconnected due to the removal of an individual edge.
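The edge-level index described in this paragraph can be sketched in the same style as the instance-level one. The code below is an illustration under assumptions, not the authors' implementation: the function names and the toy chain are hypothetical, and a vertex is assumed not to be paired with itself.

```python
def count_connected_pairs(vertices, edges):
    """Count ordered pairs (u, v), u != v, joined by a directed path u -> ... -> v."""
    successors = {v: set() for v in vertices}
    for u, v in edges:
        successors[u].add(v)
    total = 0
    for source in vertices:
        reached = set()
        stack = list(successors[source])
        while stack:
            node = stack.pop()
            if node not in reached:
                reached.add(node)
                stack.extend(successors[node] - reached)
        reached.discard(source)  # assumption: a vertex is not paired with itself
        total += len(reached)
    return total


def edge_index(vertices, edges, edge):
    """Fraction of connected ordered pairs lost when a single edge is removed."""
    edges = set(edges)
    n = count_connected_pairs(vertices, edges)
    if n == 0:
        raise ValueError("the network has no connected pair of vertices")
    n_prime = count_connected_pairs(vertices, edges - {edge})
    return (n - n_prime) / n


# Hypothetical toy chain a -> b -> c -> d with six connected ordered pairs.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]

middle = edge_index(vertices, edges, ("b", "c"))  # only (a, b) and (c, d) survive
first = edge_index(vertices, edges, ("a", "b"))   # (b, c), (b, d), (c, d) survive
```

In a chain, the middle edge carries more pairwise connections than an edge at either end, which is why its index is higher in this toy example.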
A pattern instance is positioned below the diagonal dotted lines in Figure 11 if there is considerable overlap between the sets of pairwise linked genes that become disconnected upon the separate removal of the intrinsic edges of the instance. For example, consider how the vertices 1 and 5 in Figure 1A are linked. To disconnect them, it is enough to delete either the edge 1 → 2 or the edge 2 → 5. Such dependencies seem to exist on a larger scale in the analyzed networks, pointing to many gene pairs that are connected in a linear chain-like manner as reflected by the pattern ID = 12 (Figure 3). There are almost no independent alternative paths between such gene pairs, so that the connection between them is very sensitive to the deletion of a single intrinsic edge. Accordingly, the pattern ID = 12 is found almost exclusively among the pattern instances below the diagonal dotted lines in Figure 11.
The concurrent elimination of the intrinsic edges of a pattern instance located above the diagonal dotted lines also breaks pairwise connections between genes that are not as easily disrupted as described above. At least two paths between such genes exist, each using a unique combination of intrinsic edges. Thus, they cannot be affected by eliminating only a single intrinsic edge. For example, in Figure 2 there are three paths linking vertex E2 with E6: the first one includes the intrinsic edge X → Z, the second consists of the intrinsic edges X → Y and Y → Z, and the third contains only the edge Y → Z. No matter which of the intrinsic edges is deleted, the vertex pair (E2, E6) remains connected since at least one of the three paths is still present. Their connection is disrupted only if the whole pattern is deleted. Such dependencies can be observed in Figure 11 for few pattern instances in E. coli, but increasingly in the other two networks. This trend is most distinctive in the mammalian network. Besides the pattern instances with a high pairwise disconnectivity index, a considerable number of motif instances appear in the lower left corner of the plot for the mammalian network (Figure 11, red triangles): their intrinsic edges have an extremely small or even no impact at all on pairwise connections between genes. But as motif instances, they are a bottleneck for linking many gene pairs.
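The two situations described above can be illustrated with a small sketch that compares the effect of removing all intrinsic edges jointly with the sum of the single-edge effects. Everything specific here is an assumption made for illustration: the function names, the use of the plain sum as the additive reference, and both toy graphs. The first graph is built only to reproduce the three paths between E2 and E6 described in the text, and the second contains only the two edges 1 → 2 and 2 → 5. Neither is a copy of Figure 2 or Figure 1A.

```python
def count_connected_pairs(vertices, edges):
    """Count ordered pairs (u, v), u != v, joined by a directed path u -> ... -> v."""
    successors = {v: set() for v in vertices}
    for u, v in edges:
        successors[u].add(v)
    total = 0
    for source in vertices:
        reached = set()
        stack = list(successors[source])
        while stack:
            node = stack.pop()
            if node not in reached:
                reached.add(node)
                stack.extend(successors[node] - reached)
        reached.discard(source)  # assumption: a vertex is not paired with itself
        total += len(reached)
    return total


def index_after_removal(vertices, edges, removed):
    """Fraction of connected ordered pairs lost when the given edges are removed."""
    edges = set(edges)
    n = count_connected_pairs(vertices, edges)
    return (n - count_connected_pairs(vertices, edges - set(removed))) / n


def joint_vs_additive(vertices, edges, intrinsic_edges):
    """Return (joint effect, sum of single-edge effects) for one pattern instance."""
    joint = index_after_removal(vertices, edges, intrinsic_edges)
    additive = sum(index_after_removal(vertices, edges, [e]) for e in intrinsic_edges)
    return joint, additive


# Toy 1 (hypothetical): three paths from E2 to E6 through the instance X, Y, Z.
v1 = ["E2", "X", "Y", "Z", "E6"]
e1 = [("E2", "X"), ("E2", "Y"), ("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Z", "E6")]
synergy = joint_vs_additive(v1, e1, [("X", "Y"), ("X", "Z"), ("Y", "Z")])

# Toy 2 (hypothetical): a linear chain 1 -> 2 -> 5 with overlapping single-edge effects.
v2 = [1, 2, 5]
e2 = [(1, 2), (2, 5)]
overlap = joint_vs_additive(v2, e2, [(1, 2), (2, 5)])
```

In the first toy graph the joint removal disconnects 7 of 10 pairs, while the single-edge effects add up to only 3 of 10, so the instance would lie above the diagonal. In the chain, each single edge already breaks 2 of 3 pairs, so the sum (4/3) exceeds the joint effect (1) and the instance would lie below it.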
A new method to assess the global role of patterns and motifs
The work presented here describes a method that has proved suitable for evaluating the role of topological patterns within a network. This holds true regardless of the size and complexity of these patterns. The method assesses the significance of a pattern depending on the contribution of its instances, i.e., connected subgraphs, to the connectivity of a network. The approach is based on the technique described previously, which estimates the necessity of a network element (e.g., a vertex or an edge) for sustaining the communication ability between connected pairs of vertices in a network. This is done in a way similar to wet-lab experiments: a gene (corresponding to a vertex in a graph) is knocked out and the effect of this removal is observed in the considered context. The same may be applied to a reaction (an edge in the graph), when a gene has been mutated and the encoded product (vertex) is still present, but unable to undergo a certain reaction.
In this work, we have proposed to proceed likewise for pattern instances, but by disturbing the interactions between the involved vertices rather than eliminating the vertices themselves. Consequently, only the causal links between these vertices are destroyed, and the respective pattern is thus removed in a minimally invasive way. This is conducted without making any a priori assumptions on the analyzed network and its properties. In contrast to the attempt made previously, we destroy the coherence between the edges of only one single pattern instance at a time, leaving the remainder of the network intact. On the one hand, different impacts on the network connectivity exerted by the various instances of a pattern can thus be discovered. On the other hand, the topological role of a pattern can be determined more realistically since an overrating is avoided.
3-Vertex patterns in transcriptional networks
We exemplarily applied the method developed and proposed here to the analysis of transcriptional regulation networks of three very distinct taxa (E. coli, S. cerevisiae and mammals, i.e., human, mouse, and rat); for simplicity, we focused here on 3-vertex topological patterns in these networks, but the method can easily be adapted to the analysis of more complex and larger patterns. A first check of which of the thirteen possible 3-vertex patterns are present in these networks at all revealed that all of them can be found in the mammalian network, whereas the S. cerevisiae network contains seven and that of E. coli only four of them. Moreover, these latter four patterns are shared by all three networks. Amongst them, only the "feed-forward loop" is statistically over-represented and, thus, could be considered as a "motif" (Figure 3).
As expected, the abundance of a pattern decreases with its complexity: 3-vertex patterns with two edges occur much more frequently than those with three edges, etc. The order of the abundance is almost the same in all three transcription networks. It is of interest that the network patterns "coupled feedback loop" (Figure 3, ID = 78) and "3-vertex-circuit" (Figure 3, ID = 140) do not exist in the networks of E. coli and S. cerevisiae and are clearly under-represented in the network of mammals (Figure 3), although they are widespread in signaling circuits of various bacterial and eukaryotic organisms [40–44]. We assume that this is an intrinsic property of transcriptional networks and cannot be explained by the incompleteness of the underlying knowledge, since other patterns of similar complexity (e.g., the mentioned feed-forward loop) are not consistently under-represented among these three networks.
All networks studied here appear to be rather robust against the elimination of a randomly chosen pattern instance. Accordingly, the various 3-vertex patterns in these networks display a low topological significance on average. The vast majority of the instances of a pattern have a rather small effect on the existing pairwise connections between genes; in most cases, less than 1% of all pairwise connections are affected.
Motifs do not seem to be more important than non-motif patterns for the global architecture of a network
Moreover, the motifs among the 3-vertex patterns examined did not exhibit a generally higher importance for the connectivity of the whole network than non-motif patterns, contrary to what one might have expected. This is, however, in agreement with previous studies on the evolutionary and functional assessment of motifs in the regulatory networks of different yeasts, which have provided evidence that motifs are not subject to any particular evolutionary pressure to preserve the corresponding interaction pattern [45, 46]. No simple relationships have been found between evolutionary conservation and over-representation of network patterns, on the one hand, and their functional enrichment, on the other hand, in the yeast regulatory network. In accordance with these observations, our results indicate that there is no positive correlation between the abundance (i.e., over-representation) of a network pattern and its topological significance. Thus, focusing on motifs exclusively rather than searching for important pattern instances in general would have led to a completely different and deceptive picture.
Pattern instances can be identified that are crucial for the connectivity of the network
In spite of the generally low impact of all types of patterns (including motifs) found in the analyzed networks, a few pattern instances cause a significant perturbation upon their removal. This trend is manifested in the heterogeneous distribution of the pairwise disconnectivity index among all the instances of a pattern (Figs. 3, 4, 5). Topologically, this may originate from the way a pattern instance is embedded, i.e., its particular position within the whole context of the respective network. Biologically, such heterogeneity might be caused by the influence of the genes in the network that form a pattern instance. In the networks, the topologically most significant pattern instances consist preferably of genes that provide basic functions for the organism. Interestingly, most of these instances belong to one of the patterns that are shared by the three networks, which may emphasize the importance of these patterns. Furthermore, such instances may indicate locations at which the networks are vulnerable to a targeted removal.
Among the pattern instances that are of particular importance for the network connectivity, motif instances again do not play a predominant role over instances from non-over-represented patterns. In the mammalian network, most of the outliers even belong to the non-motif patterns. Altogether, our data support the view that by no means all instances of any pattern (motif or not), but only a few of them, may play specific functional roles and thereby exhibit a strong impact on pairwise connections between genes in transcription networks.
Pattern instances of high topological significance tend to form clusters
In all the networks analyzed here, a limited number of genes repeatedly appear in the pattern instances displaying the highest topological significance. For example, in E. coli the gene flhDC is part of all pattern instances that disconnect at least 4% of the gene pairs, preferably together with the genes fliAZY or ompR_envZ. Similar observations can be made in S. cerevisiae for the genes MCM1, SIN3, SNF2_SWI1 and SWI5. Likewise, in the mammalian network the interaction between the genes c-myc and PAX3 participates in many of the pattern instances with a high pairwise disconnectivity index. Altogether, the common occurrence of genes and interactions between them underlines the key importance of these constituents for the corresponding organism. All these genes are engaged in important processes, and at least in E. coli and S. cerevisiae they are crucial for linking a significant number of gene pairs. Hence, their damage can be lethal for the respective organism. Furthermore, these pattern instances are not scattered across different regions of a network. They are connected with each other and seem to form a bigger pattern cluster that controls many pairwise connections between genes in these networks.
Edges of pattern instances display synergistic effects
In many cases, the intrinsic edges of a pattern instance contribute to its pairwise disconnectivity index in a synergistic manner, i.e., the simultaneous removal of the respective edges exerts a much higher than merely additive effect (Figure 11). Although the approach we used for this purpose is a conservative approximation, it shows a general tendency in these networks. More exact computations of this feature may be desirable, but developing suitable algorithms for this, which have to take into account the particular characteristics of every pattern separately, was beyond the scope of this paper. However, we find that our approach was adequate to disclose clearly that the intrinsic edges of certain pattern instances display synergistic effects. This is the case for the pattern instances with the highest pairwise disconnectivity index in each of the three networks. Some other candidates have been found in E. coli and increasingly more in S. cerevisiae and mammals. This trend parallels the increasing density of the networks (1.2 edges per vertex in the E. coli network, 1.6 in S. cerevisiae and 2.3 in mammals). The reason for this is obvious: a more densely connected network provides a higher average vertex degree and thereby offers more alternative paths between pairs of vertices. These paths need not share a similar set of edges, i.e., the connection of a pair becomes more robust and requires more edges to be removed in order to disconnect it.
Prospects of the proposed method
It should be noted that the observations reported here have been made for the networks as they are known at present. In particular, the mammalian network may still suffer from incomplete knowledge. However, our method can be used for monitoring changes in such networks obtained from updated pathway databases like TRANSPATH® in the future. We see our results as the starting point for more extensive work, which may address the analysis of increasingly larger patterns including more than 3 vertices. More regulatory networks of various types (e.g., signal transduction networks, protein-protein interaction networks, gene expression networks) from different organisms must be considered and tested in this regard in the future as well. First attempts with signaling networks have confirmed the basic conclusions drawn here in spite of small characteristic differences in some details. Thus, we feel that the basic trends reported here will hold true for the more complete transcriptional networks as well as for other types of networks that will emerge in the future with increased reliability of high-throughput approaches and their systematic application.
On the other hand, our method provides for the first time the possibility to assess the impact of patterns and motifs in general, as well as of individual pattern instances, on the overall connectivity of a graph. It is therefore suitable for identifying bottlenecks in a biological network, which may be particularly important for the normal function of a cell and may be top candidates for investigating disease mechanisms related to these functions. Since it identifies individual components in a network (vertices, edges, or pattern instances), it works independently of any a priori knowledge about the statistical over- or under-representation of certain network features. Though our approach was developed for the analysis of biological regulatory networks, it seems to be suitable for the analysis of other networks regardless of the particular nature of the processes they represent (e.g., ecological, social, technical networks).
We have developed a new method that quantifies how the elimination of a topological pattern instance affects the existing communication abilities within a network. We have applied this method exemplarily to the analysis of 3-vertex topological patterns and their instances in the transcription networks of a bacterium, a yeast and mammals.
The elimination of most 3-vertex pattern instances does not drastically affect the global structure of transcription networks. However, these networks are vulnerable to a targeted perturbation of a few pattern instances. In these cases, the links between their genes contribute to the pairwise disconnectivity index of the pattern instance in a synergistic manner, i.e., the simultaneous removal of the respective edges exerts a much higher than merely additive effect. The topological significance of an instance does not simply correlate with the abundance of the respective pattern in a network. Although motifs might play an essential role in their respective local contexts, they do not seem to be more important than non-motif patterns for the global architecture of a network. Rather, the topological role of a pattern instance is unique and mainly determined by its location and the way it is embedded in a given network.
Literature-based databases of experimentally verified direct relationships for Escherichia coli and Saccharomyces cerevisiae were used; E. coli V1.1 and S. cerevisiae V1.3 are available at http://www.weizmann.ac.il/mcb/UriAlon. The mammalian network of transcription factor genes (human, mouse, rat) was retrieved from the TRANSPATH® Professional database (release 8.3, made in 2007) on signal transduction and the TRANSFAC® Professional database (release 11.3, made in 2007) on eukaryotic cis-acting regulatory DNA elements and trans-acting factors. The network describes the causal relationships between genes that code for transcription factors, based on the regulation of these genes by transcription factors. However, the transcription factors themselves are not part of the network, i.e., the interaction chain "gene A codes for transcription factor A, which regulates gene B" has been summarized to "gene A → gene B", which is a commonly used technique when inferring gene regulatory networks. Furthermore, genes are represented at the level of "ortholog abstraction", at which all species-specific data (human, mouse, rat) that refer to mammalian genes have been summarized to corresponding generic entries.
Selected genes (vertices) in the yeast and mammalian transcription networks were checked for their viability using the BIOBASE Knowledge Library™ (http://www.biobase.de) and the Saccharomyces Genome Database (Stanford Genomic Resources).
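The summarization of the chain "gene A codes for transcription factor A regulates gene B" to "gene A → gene B" can be sketched as follows. This is an illustration only: the input layout (a gene-to-factor mapping and a list of factor-to-target relations), the function name and all gene and factor names are hypothetical and do not reflect the actual TRANSPATH® or TRANSFAC® data model.

```python
def collapse_to_gene_network(codes_for, regulates):
    """Collapse gene -> factor -> gene chains into direct gene -> gene edges.

    codes_for: dict mapping a gene to the transcription factor it encodes.
    regulates: iterable of (factor, target_gene) pairs.
    Only targets that themselves encode a transcription factor are kept,
    so that the result is a network of transcription factor genes.
    """
    factor_to_genes = {}
    for gene, factor in codes_for.items():
        factor_to_genes.setdefault(factor, set()).add(gene)

    edges = set()
    for factor, target in regulates:
        if target not in codes_for:
            continue  # target does not encode a transcription factor
        for gene in factor_to_genes.get(factor, ()):
            edges.add((gene, target))
    return edges


# Hypothetical input: geneA encodes TF_A, geneB encodes TF_B.
codes_for = {"geneA": "TF_A", "geneB": "TF_B"}
regulates = [("TF_A", "geneB"), ("TF_B", "geneX")]  # geneX encodes no factor

gene_edges = collapse_to_gene_network(codes_for, regulates)
print(gene_edges)
```

Here only the edge geneA → geneB survives, since geneX does not code for a transcription factor and therefore is not a vertex of the collapsed network.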
The networks were scanned for 3-vertex topological patterns using the FANMOD software with default settings [27, 28]. The statistical significance of the network motifs was evaluated by means of the Z-score, $Z = (M_{real} - M_{rand})/SD$, where $M_{real}$ is the number of appearances of the motif in the real network, $M_{rand}$ is its mean number of appearances in the randomized networks, and $SD$ is the standard deviation of the number of appearances in the randomized networks. The sign of edges (such as 'positive' for activation or 'negative' for inhibition) is not considered.
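As a small numerical illustration of the Z-score, the sketch below computes it from a motif count in the real network and a list of counts from randomized networks. It is not FANMOD's code; the function name and the counts are made up, and the use of the population standard deviation is an assumption, since the text does not state which estimator is applied.

```python
from statistics import mean, pstdev


def motif_z_score(count_real, counts_random):
    """Z = (M_real - M_rand) / SD for one pattern.

    count_real: appearances of the pattern in the real network.
    counts_random: appearances in each randomized network.
    Assumption: SD is the population standard deviation of counts_random.
    """
    m_rand = mean(counts_random)
    sd = pstdev(counts_random)
    if sd == 0:
        raise ValueError("zero variance in randomized counts; Z is undefined")
    return (count_real - m_rand) / sd


# Made-up counts: 40 appearances in the real network, four randomized networks.
z = motif_z_score(40, [8, 12, 8, 12])
print(z)  # mean 10, SD 2, hence Z = 15.0
```

A strongly positive Z indicates over-representation relative to the randomized ensemble; in practice far more than four randomized networks would be used.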
The pairwise disconnectivity index of an edge
For estimating the impact of a single intrinsic edge on the existing pairwise connections between genes, we applied the pairwise disconnectivity index of an edge as defined previously. It states the fraction of those ordered pairs of vertices that become disconnected upon the removal of an edge $e$, i.e., $Dis(e) = (N - N')/N = 1 - N'/N$. As in Eq. 1, $N$ is the number of connected ordered pairs of vertices in the network, and we assume $N > 0$. The term $N'$ stands for the number of ordered pairs of vertices that are still connected in the network obtained by deleting the edge $e$. Hence, $Dis(e) = 0$ if the edge $e$ is not crucial for linking any pair of vertices. In contrast, $Dis(e) = 1$ if no pair of vertices remains connected.
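The definition can be made concrete with a few lines of code. The sketch below is not the DiVa program; it assumes that two vertices count as connected if a directed path exists between them, that only pairs of distinct vertices are counted, and it uses hypothetical function names and a hypothetical three-vertex graph.

```python
def reachable_pairs(edges):
    """Number of ordered pairs (u, v), u != v, joined by a directed path u -> v."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    count = 0
    for start in succ:
        seen, stack = set(), [start]
        while stack:  # depth-first search from `start`
            node = stack.pop()
            for nxt in succ[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        seen.discard(start)  # a cycle back to `start` is not a pair
        count += len(seen)
    return count


def edge_disconnectivity(edges, e):
    """Dis(e) = (N - N') / N, with N' counted after deleting edge e."""
    n = reachable_pairs(edges)
    if n == 0:
        raise ValueError("Dis(e) is only defined for N > 0")
    n_prime = reachable_pairs([x for x in edges if x != e])
    return (n - n_prime) / n


# Hypothetical graph: a chain a -> b -> c with a shortcut a -> c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c")]
dis_values = {e: edge_disconnectivity(toy_edges, e) for e in toy_edges}
print(dis_values)
```

In this graph N = 3. Deleting the shortcut a → c loses no pair, so its index is 0, while deleting a → b or b → c each loses one of the three pairs. For a network consisting of a single edge, deleting that edge gives an index of 1.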
This work has been supported in part by grant 031U110A (Intergenomics) of the German Federal Ministry of Education and Research (BMBF) and by grant 503568 (COMBIO) within the 6th Framework Programme for Research, Technological Development and Demonstration of the European Commission.
- Jeong H, Mason SP, Barabási AL, Oltvai ZN: Lethality and centrality in protein networks. Nature. 2001, 411: 41-42. 10.1038/35075138
- Albert R, Jeong H, Barabási AL: Error and attack tolerance of complex networks. Nature. 2000, 406: 378-382. 10.1038/35019019
- Barabási AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet. 2004, 5: 101-113. 10.1038/nrg1272
- Watts DJ, Strogatz SH: Collective dynamics of 'small-world' networks. Nature. 1998, 393: 440-442. 10.1038/30918
- Dorogovtsev SN, Mendes JFF: Evolution of networks. Adv Phys. 2002, 51: 1079-1187. 10.1080/00018730110112519
- Newman MEJ: The structure and function of complex networks. SIAM Review. 2003, 45: 167-256. 10.1137/S003614450342480
- Albert R: Scale-free networks in cell biology. J Cell Sci. 2005, 118: 4947-4957. 10.1242/jcs.02714
- Hartwell LH, Hopfield JJ, Leibler S, Murray AW: From molecular to modular cell biology. Nature. 1999, 402: C47-C52. 10.1038/35011540
- Spirin V, Mirny LA: Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci USA. 2003, 100: 12123-12128. 10.1073/pnas.2032324100
- Ravasz E, Barabási AL: Hierarchical organization in complex networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2003, 67 (2 Pt 2): 026112.
- Dorogovtsev SN, Goltsev AV, Mendes JF: Pseudofractal scale-free web. Phys Rev E Stat Nonlin Soft Matter Phys. 2002, 65 (6 Pt 2): 066122.
- Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL: Hierarchical organization of modularity in metabolic networks. Science. 2002, 297: 1551-1555. 10.1126/science.1073374
- Potapov AP, Voss N, Sasse N, Wingender E: Topology of mammalian transcription networks. Genome Inf Ser. 2005, 16: 270-278.
- Shen-Orr SS, Milo R, Mangan S, Alon U: Network motifs in the transcriptional network of Escherichia coli. Nat Genet. 2002, 31: 64-68. 10.1038/ng881
- Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: Simple building blocks of complex networks. Science. 2002, 298: 824-827. 10.1126/science.298.5594.824
- Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, Ayzenshtat I, Sheffer M, Alon U: Superfamilies of evolved and designed networks. Science. 2004, 303: 1538-1542. 10.1126/science.1089167
- Wuchty S, Oltvai ZN, Barabási AL: Evolutionary conservation of motif constituents in the yeast protein interaction network. Nat Genet. 2003, 35: 176-179. 10.1038/ng1242
- Dobrin R, Beg QK, Barabási AL, Oltvai ZN: Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics. 2004, 5: 10. 10.1186/1471-2105-5-10
- Vázquez A, Dobrin R, Sergi D, Eckmann JP, Oltvai ZN, Barabási AL: The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci USA. 2004, 101: 17940-17945. 10.1073/pnas.0406024101
- Alon U: Network motifs: theory and experimental approaches. Nat Rev Genet. 2007, 8: 450-461. 10.1038/nrg2102
- Mangan S, Alon U: Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci USA. 2003, 100: 11980-11985. 10.1073/pnas.2133841100
- Mangan S, Zaslaver A, Alon U: The coherent feedforward loop serves as a sign-sensitive delay element in transcription networks. J Mol Biol. 2003, 334: 197-204. 10.1016/j.jmb.2003.09.049
- Kalir S, Mangan S, Alon U: A coherent feed-forward loop with a SUM input function prolongs flagella expression in Escherichia coli. Mol Syst Biol. 2005, 1: 2005.0006. 10.1038/msb4100010
- Potapov AP, Goemann B, Wingender E: The pairwise disconnectivity index as a new metric for the topological analysis of regulatory networks. BMC Bioinformatics. 2008, 9: 227.
- Krull M, Pistor S, Voss N, Kel A, Reuter I, Kroneberg D, Michael H, Schwarzer K, Potapov A, Choi C, Kel-Margoulis O, Wingender E: TRANSPATH®: An information resource for storing and visualizing signaling pathways and their pathological aberrations. Nucleic Acids Res. 2006, 34: D546-D551. 10.1093/nar/gkj107
- Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K, Voss N, Stegmaier P, Lewicki-Potapov B, Saxel H, Kel AE, Wingender E: TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 2006, 34: D108-D110. 10.1093/nar/gkj143
- Wernicke S, Rasche F: FANMOD: a tool for fast network motif detection. Bioinformatics. 2006, 22: 1152-1153. 10.1093/bioinformatics/btl038
- FANMOD. http://www.minet.uni-jena.de/~wernicke/motifs/
- Soutourina O, Kolb A, Krin E, Laurent-Winter C, Rimsky S, Danchin A, Bertin P: Multiple control of flagellum biosynthesis in Escherichia coli: role of H-NS protein and the cyclic AMP-catabolite activator protein complex in transcription of the flhDC master operon. J Bacteriol. 1999, 181: 7500-7508.
- Bertin P, Terao E, Lee EH, Lejeune P, Colson C, Danchin A, Collatz E: The H-NS protein is involved in the biogenesis of flagella in Escherichia coli. J Bacteriol. 1994, 176: 5537-5540.
- Pratt LA, Hsing W, Gibson KE, Silhavy TJ: From acids to osmZ: multiple factors influence synthesis of the OmpF and OmpC porins in Escherichia coli. Mol Microbiol. 1996, 20: 911-917. 10.1111/j.1365-2958.1996.tb02532.x
- Toone WM, Johnson AL, Banks GR, Toyn JH, Stuart D, Wittenberg C, Johnston LH: Rme1, a negative regulator of meiosis, is also a positive activator of G1 cyclin gene expression. EMBO J. 1995, 14: 5824-5832.
- Mitchell AP: Control of meiotic gene expression in Saccharomyces cerevisiae. Microbiol Rev. 1994, 58: 56-70.
- Bowdish KS, Yuan HE, Mitchell AP: Positive control of yeast meiotic genes by the negative regulator UME6. Mol Cell Biol. 1995, 15: 2955-2961.
- Rubin-Bejerano I, Mandel S, Robzyk K, Kassir Y: Induction of meiosis in Saccharomyces cerevisiae depends on conversion of the transcriptional repressor Ume6 to a positive regulator by its regulated association with the transcriptional activator Ime1. Mol Cell Biol. 1996, 16: 2518-2526.
- Osley MA, Tsukuda T, Nickoloff JA: ATP-dependent chromatin remodeling factors and DNA damage repair. Mutat Res. 2007, 618: 65-80.
- McBride HJ, Yu Y, Stillman DJ: Distinct regions of the Swi5 and Ace2 transcription factors are required for specific gene activation. J Biol Chem. 1999, 274: 21029-21036. 10.1074/jbc.274.30.21029
- Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med. 2004, 10: 789-799. 10.1038/nm1087
- Li J, Liu KC, Jin F, Lu MM, Epstein JA: Transgenic rescue of congenital heart disease and spina bifida in Splotch mice. Development. 1999, 126: 2495-2503.
- Venkatesh KV, Bhartiya S, Ruhela A: Multiple feedback loops are key to a robust dynamic performance of tryptophan regulation in Escherichia coli. FEBS Lett. 2004, 563: 234-240. 10.1016/S0014-5793(04)00310-2
- Brandman O, Ferrell JE, Li R, Meyer T: Interlinked fast and slow positive feedback loops drive reliable cell decisions. Science. 2005, 310: 496-498. 10.1126/science.1113834
- Ramsey SA, Smith JJ, Orrell D, Marelli M, Petersen TW, de Atauri P, Bolouri H, Aitchison JD: Dual feedback loops in the GAL regulon suppress cellular heterogeneity in yeast. Nat Genet. 2006, 38: 1082-1087. 10.1038/ng1869
- Kim D, Kwon YK, Cho KH: Coupled positive and negative feedback circuits form an essential building block of cellular signaling pathways. BioEssays. 2007, 29: 85-90. 10.1002/bies.20511
- Kim JR, Yoon Y, Cho KH: Coupled feedback loops form dynamic motifs of cellular networks. Biophys J. 2008, 94: 359-365. 10.1529/biophysj.107.105106
- Mazurie A, Bottani S, Vergassola M: An evolutionary and functional assessment of regulatory network motifs. Genome Biology. 2005, 6: R35. 10.1186/gb-2005-6-4-r35
- Meshi O, Shlomi T, Ruppin E: Evolutionary conservation and over-representation of functionally enriched network patterns in the yeast regulatory network. BMC Syst Biol. 2007, 1: 1. 10.1186/1752-0509-1-1
- Konagurthu AS, Lesk AM: On the origin of distribution patterns of motifs in biological networks. BMC Systems Biology. 2008, 2: 73-81. 10.1186/1752-0509-2-73
- Saccharomyces Genome Database. http://www.yeastgenome.org
- DiVa. Program for evaluating the pairwise disconnectivity index. http://www.bioinf.med.uni-goettingen.de/services/
- The R project for statistical computing. http://www.r-project.org
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.