- Research article
- Open Access
Organizational structure and the periphery of the gene regulatory network in B-cell lymphoma
© Simoes et al.; licensee BioMed Central Ltd. 2012
- Received: 24 November 2011
- Accepted: 14 May 2012
- Published: 14 May 2012
The physical periphery of a biological cell is mainly described by signaling pathways which are triggered by transmembrane proteins and receptors that are sentinels to control the whole gene regulatory network of a cell. However, our current knowledge about the gene regulatory mechanisms that are governed by extracellular signals is severely limited.
The purpose of this paper is threefold. First, we infer a gene regulatory network from a large-scale B-cell lymphoma expression data set using the C3NET algorithm. Second, we provide a functional and structural analysis of the largest connected component of this network, revealing that this network component corresponds to the peripheral region of a cell. Third, we analyze the hierarchical organization of network components of the whole inferred B-cell gene regulatory network by introducing a new approach which exploits the variability within the data as well as the inferential characteristics of C3NET. As a result, we find a functional bisection of the network corresponding to different cellular components.
Overall, our study highlights the peripheral gene regulatory network of B-cells and shows that it is centered around hub transmembrane proteins located at the physical periphery of the cell. In addition, we identify a variety of novel pathological transmembrane proteins such as ion channel complexes and signaling receptors in B-cell lymphoma.
- B-cell lymphoma
- Gene expression data
- Gene regulatory network
- Statistical network inference
The inference of gene regulatory networks from gene expression data is crucial for enhancing our understanding of relations between genes [1–3]. In general, a gene network describes a map of direct physical (biochemical) interactions among genes, gene products or metabolites that occur in the living cell [4, 5] and, hence, enables a systems biology approach [6–8]. It has been demonstrated that gene regulatory networks, as a specific type thereof, can be indirectly inferred from steady state gene expression data, which are measured under different conditions either in individual tissues or cell types [9–11].
In general, it is believed that the gene regulatory network is governed by major hub genes like transcription factors that directly bind specific DNA segments in the nucleus and activate or repress the expression of other genes [1, 12]. Further, it has been proposed that the genes in cellular networks are organized by a hierarchical and modular structure. This assumption has been studied, e.g., for metabolic networks. A hierarchical modularity implies functional community structures of interconnected layers in the network with a potentially heterogeneous modularity structure. For example, for the protein network of E. coli it has been demonstrated that the center of the network has a higher modularity than the periphery of the network.
The inference of gene interactions in a gene regulatory network from gene expression data is often discussed in connection with the nuclear transcriptional regulatory network [1, 16, 17]. In the simplified transcription factor vs target gene model, a transcription factor directly affects the expression of the mRNA of a target gene. This may give the impression that gene interactions inferred from expression data need to be interpreted in the context of transcription regulation. For this reason, networks inferred from gene expression data are frequently equated with the transcriptional regulatory network. However, this is not justified because expression data convey only information about the dynamic state of genes and, correspondingly, their mRNAs and, hence, do not provide direct information about any type of biochemical binding, including transcription regulation. Instead, interactions inferred from expression data are not limited to transcription regulation, but can also include protein-protein interactions. To emphasize this, we use the terminology gene regulatory network for a network that is inferred from gene expression data to point out that this is not necessarily a transcription regulatory network but a mixture of this and a protein-protein network.
The major purpose of this paper is to infer a gene regulatory network from a large-scale B-cell lymphoma gene expression data set, and to investigate its structural and biological organization. Immature B-cell lymphocytes are cells from the bone marrow that play an important role in the adaptive immune system. When B-cells are activated by an antigen they differentiate to memory B-cells, to antibody secreting plasma B-cells or proliferate intermediately to germinal centers (centroblasts and centrocytes). B-cells are one of the most interesting cell types for the study of mammalian signaling and cell differentiation processes due to their unique physiological properties governing the adaptive immune system. Malignancy of the different B-cell lymphocyte types leads to a variety of lymphoma and leukemia disease phenotypes such as B-cell chronic lymphocytic leukemia (BCLL, germinal center), Burkitt lymphoma (BL, germinal center), Diffuse large B-cell lymphoma (DLBCL, germinal center), Follicular lymphoma (FL, germinal center), Hairy cell leukemia (HCL, memory B-cells), Mantle cell lymphoma (MCL, immature B-cells) and Multiple myeloma (MM, plasma cells). For our analysis, we use the microarray data set with GEO accession GSE2350, which contains samples from the germinal centers of lymphoma patients and experimentally transformed germinal center cell types.
In a previous study, it has been found that the C3NET inference algorithm has a considerably higher true positive (TP) rate for leaf edges of genes in a network that are sparsely connected. For this reason we hypothesize that this method has characteristics which are very beneficial for the inference of peripheral regions of the gene regulatory network of B-cells. Due to the fact that B-cells are highly receptive to external stimuli, as described above, knowledge of these interactions seems viable for gaining a deeper functional understanding of the intricate differentiation processes.
In order to analyze the structural organization of B-cell lymphoma, we infer a gene regulatory network by using C3NET in combination with an ensemble approach. This means that, instead of applying the inference method to one data set, we apply it to a bootstrap ensemble of data sets. This allows us not only to assess local network-based measures down to the level of individual edges [23, 24] but also to obtain an average network structure which is amenable to a hierarchical analysis, as we will show in this article.
There are several large-scale B-cell lymphoma related gene expression data sets available of germinal center tumor samples from Diffuse large B-cell lymphoma (DLBCL), Follicular lymphoma (FL) and Burkitt lymphoma (BL) [25–29]. In this paper, we study the gene regulatory network from B-cell lymphoma by using the GSE2350 data set. For an independent validation of our results we study in addition two Diffuse large B-cell lymphoma data sets described in [25, 27].
In this paper, we infer the peripheral region of the gene regulatory network from a large-scale B-cell lymphoma gene expression data set by using the C3NET algorithm. We provide a functional and a structural analysis of the largest connected component of this network. Further, we analyze the hierarchical organization of the network components of the B-cell gene regulatory network as revealed by the bootstrap approach.
In the following section we present the methods and the data used for our analysis.
Simulated Gene Expression Data
We simulate gene expression data sets for a variety of different network structures by using SynTReN and GeNGe [30, 31]. For each network type, we generate 300 data sets with sample sizes of 100, 200, 500 and 1000. Further, for each of these data sets, a bootstrap ensemble of size b = 100 was generated by sampling with replacement.
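The bootstrap step described above can be sketched as follows. This is a minimal illustration, not the code used in the study; the function name `bootstrap_ensemble`, the fixed seed and the toy expression values are assumptions made for the example.

```python
import random

def bootstrap_ensemble(samples, b=100, seed=0):
    """Return b bootstrap data sets, each drawn with replacement from samples.

    Every bootstrap data set has the same number of samples as the original.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)] for _ in range(b)]

# Toy data set (hypothetical values): 5 samples, each with 3 gene expression values.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [0.9, 1.8, 3.2],
    [1.2, 2.2, 3.1],
    [1.0, 1.9, 3.0],
]
ensemble = bootstrap_ensemble(data, b=100)
print(len(ensemble), len(ensemble[0]))
```

Because sampling is done with replacement, a bootstrap data set usually contains some samples more than once and omits others.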
In addition, we generate simulated gene expression data sets for a network consisting of 8 network modules, which are organized in a hierarchical manner; see Figure 4 for a visualization. Each network module is generated using a Modular Topology Model (MTM) network model, each with a size of 25 genes. An MTM network has properties such as a scale-free degree distribution, high clustering coefficients and short path lengths as observed in real biological networks [32, 33]. We construct 5 different networks by weakly connecting the 8 individual modules with a different number of connections. Specifically, the individual network modules are connected by 0, 3, 5, 10 and 15 edges, resulting in a total of 5 networks, each consisting of 200 genes. For each of the 5 networks, we independently generate 100 gene expression data sets with sample size 500 by using netsim. Netsim generates time-series data. In order to obtain steady state expression data, each sample in a data set is taken after the 50th time point. The gene expression profiles are generated with a sigmoidal activator function.
Preprocessing of B-cell lymphoma microarray data sets
The microarray gene expression data used in this study are accessible from the NCBI Gene Expression Omnibus (GEO) under accession GSE2350. We denote the GSE2350 dataset that includes transformed and untransformed B-cell lymphoma samples as the Basso GSE2350 dataset. For our analysis we consider only samples for which raw gene expression data in the form of CEL files are available. From the total of 387 samples of the GSE2350 dataset, 344 samples were available with raw CEL files. In the following, we call this data set D. The data set includes two Affymetrix chip platforms, hgu95a and hgu95a_v2. We used the mixture CDF environment hgu95av12mixcdf_1.0.tar.gz available from http://bmbolstad.com/misc/mixtureCDF/MixtureCDF.html to include only probe sets that have the same probe set annotation.
For a cross-dataset validation of our study, we preprocessed two additional B-cell lymphoma data sets. We retrieved a Diffuse large B-cell lymphoma hgu133plus2 Affymetrix microarray data set with accession GSE11318, including 203 samples, and a Diffuse large B-cell lymphoma hgu133a Affymetrix microarray data set with accession GSE22470, including 271 samples. These two data sets contain only untransformed B-cell lymphoma samples. We denote these as the Lenz GSE11318 dataset and the Salaverria GSE22470 dataset.
We processed all CEL files for each data set using RMA, a quantile normalization and summarization [35–37]. We extracted the log2 expression intensities for each probe set. Because a gene can be represented by more than one probe set, we calculate the median expression value for each gene by mapping the annotation of Affymetrix-ID to Entrez gene IDs to obtain a summary value for the genes. The Basso GSE2350 dataset comprises a total of 9,684 genes and 344 samples, where we do not exclude any unmapped probesets.
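The gene-level summarization described above (median over all probe sets that map to the same gene) can be sketched as follows. The probe and gene identifiers and the expression values are hypothetical, and keeping unmapped probe sets under their own identifier is an assumption made to mirror the statement that unmapped probe sets are not excluded.

```python
from collections import defaultdict
from statistics import median

def summarize_by_gene(probe_expr, probe_to_gene):
    """Collapse probe-set rows to gene rows using the per-sample median."""
    grouped = defaultdict(list)
    for probe, values in probe_expr.items():
        # Assumption: a probe set without a gene mapping is kept under its own ID.
        gene = probe_to_gene.get(probe, probe)
        grouped[gene].append(values)
    # zip(*rows) iterates over samples, so the median is taken sample by sample.
    return {gene: [median(col) for col in zip(*rows)] for gene, rows in grouped.items()}

# Hypothetical log2 intensities for three probe sets measured in two samples.
probe_expr = {
    "probe_1": [8.0, 9.0],
    "probe_2": [10.0, 7.0],
    "probe_3": [5.0, 5.5],
}
probe_to_gene = {"probe_1": "gene_A", "probe_2": "gene_A"}  # probe_3 is unmapped
gene_expr = summarize_by_gene(probe_expr, probe_to_gene)
print(gene_expr)
```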
In order to perform a cross-dataset validation of the Basso GSE2350 dataset, we discarded all gene and probe set identifiers from the Lenz GSE11318 dataset and Salaverria GSE22470 dataset that are not present in the Basso GSE2350 dataset. After removal, the expression matrix of the Lenz GSE11318 dataset comprises 8,727 genes and 203 samples and the expression matrix of the Salaverria GSE22470 dataset comprises 8,664 genes and 271 samples.
Gene regulatory network inference
The F-score is defined as $F = \frac{2pr}{p + r}$. Here, $p$ is the precision and $r$ the recall.
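As an illustration of how precision and recall combine into an F-score, the following sketch compares an inferred edge list with a reference edge list. It assumes the standard F-score, $F = 2pr/(p + r)$, and treats edges as undirected; both the function name `f_score` and the toy edge lists are assumptions for the example.

```python
def f_score(inferred_edges, true_edges):
    """F-score of an inferred undirected network against a reference network."""
    inferred = {frozenset(e) for e in inferred_edges}  # ignore edge direction
    true = {frozenset(e) for e in true_edges}
    tp = len(inferred & true)  # correctly inferred edges
    if tp == 0:
        return 0.0
    p = tp / len(inferred)  # precision: fraction of inferred edges that are true
    r = tp / len(true)      # recall: fraction of true edges that were inferred
    return 2 * p * r / (p + r)

# Hypothetical reference network and inferred network.
true_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
inferred_edges = [("b", "a"), ("b", "c"), ("a", "e")]
f = f_score(inferred_edges, true_edges)
print(round(f, 4))
```

In this toy case two of the three inferred edges are correct (p = 2/3) and two of the four true edges are recovered (r = 1/2), which gives F = 4/7.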
The procedure for the Gene Ontology (GO) enrichment analysis was implemented in R using the Entrez gene to GO annotation from the hgu95a_v2 and the org.Hs.eg.db package, together with the topGO package from Bioconductor. The significance level of the enrichment for a GO term was determined by a hypergeometric test (Fisher’s Exact Test). Only terms assigned to more than 3 candidate genes are considered for the analysis.
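To make the enrichment test concrete, the sketch below computes the one-sided hypergeometric tail probability, which is the quantity a one-sided Fisher's exact test reports for over-representation. It is a plain-Python illustration, not the topGO implementation used in the study; the function name and the numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) when drawing n genes from N, of which K carry the GO term.

    N: number of genes in the background
    K: number of background genes annotated with the term
    n: number of genes in the candidate list (e.g. one network component)
    k: number of candidate genes annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical numbers: 20 background genes, 5 annotated, 4 candidates, 2 hits.
p_value = hypergeom_enrichment_p(N=20, K=5, n=4, k=2)
print(round(p_value, 4))
```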
Network gene centrality pathway analysis
For the cross-dataset validation of the B-cell C3NET gene regulatory networks inferred from different data sets, we conducted a pathway-based network comparison. This method allows us to identify functional subnetworks with the strongest structural similarities between pairs of gene regulatory networks.
Betweenness centrality measures the proportion of all shortest paths between gene $v_k$ and gene $v_l$ that traverse gene $v_i$, denoted by $p_{kl}(v_i)$, relative to all shortest paths between gene $v_k$ and gene $v_l$, denoted by $p_{kl}$.
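A minimal sketch of vertex betweenness for an undirected, unweighted network is given below. It assumes the standard definition in which the path proportions are summed over all pairs of genes $v_k \neq v_l$ different from $v_i$, and it uses Brandes' accumulation scheme; the article does not state which implementation was used, so both are assumptions. The toy path network is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized vertex betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # w is reached for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies, starting from the nodes farthest from s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted twice, once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical path network a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bc = betweenness(adj)
print(bc)
```

In the path network, the two inner genes each lie on the shortest paths of two gene pairs, while the end genes lie on none.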
For two given gene regulatory networks, $G_a$ and $G_b$, we estimate the betweenness centrality values for all genes from a Gene Ontology (GO) term. Then, for each GO term, we perform Spearman’s rank correlation test for the ranks of the betweenness centrality values. We adjust p-values using an FDR correction for a given significance level of α = 0.05. For the analysis we use the Gene Ontology (GO) annotation from the Bioconductor org.Hs.eg.db package.
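The two statistical ingredients of this comparison can be sketched in plain Python: Spearman's rank correlation (Pearson correlation of average ranks) and the Benjamini-Hochberg adjustment. The betweenness values and nominal p-values below are hypothetical, and the sketch does not compute the p-value of the correlation test itself.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical betweenness values of the same four genes in networks G_a and G_b.
bc_a = [0.0, 2.0, 5.0, 1.0]
bc_b = [1.0, 0.5, 9.0, 2.0]
rho = spearman(bc_a, bc_b)

# Hypothetical nominal p-values, one per GO term.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
print(round(rho, 2), significant)
```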
Hierarchical network organization
For our analysis we use U to define an error measure d2, which is described in section ‘Graph edit distance hierarchy error’.
For our analysis we use the resulting K × K distance matrix D for a hierarchical clustering in combination with the “Ward” method. The overall procedure is summarized in Figure 5.
Consistency of bootstrap ensembles
We start our analysis by performing simulations to compare the distributions of F-scores of an ensemble of independently generated data sets with two bootstrap ensembles. For an illustration of the generation of these bootstrap data and the difference between the three types of F-scores, see Figure 2. The colors of the arrows in this figure correspond to the colors of the boxplots shown in Figure 3. That is, the blue boxplots correspond to F-scores obtained for an ensemble of 300 independently generated data sets. The red boxplots correspond to 30000 ( = 300 × 100) F-scores obtained by bootstrapping each of the 300 data sets 100 times. We call this bootstrap ensemble BE 1. The boxplots in green show the 300 averaged F-scores, i.e., each F-score is averaged over 100 bootstrap samples. We call this bootstrap ensemble BE 2. Figure 3 shows the distribution of these F-scores for scale-free networks as a function of four different sample sizes.
In general, one can see that the distributions of F-scores of the two bootstrap ensembles are similar in range, median and interquartile range to the F-scores obtained for the ensemble of independently generated data sets. However, the F-scores for BE 1 (shown in red) contain some outliers. This can be expected, because the bootstrapping of the data leads in general to a loss of information, due to the fact that not all samples are available for the inference task. For this reason, the median F-scores decline slightly, as can be seen from Figure 3. However, this decline is rather moderate, e.g., compared to the overall increase for larger sample sizes. Further, there are only few outliers, indicating that only very few bootstrap data sets lead to atypical results. Hence, our analysis demonstrates that the usage of bootstrap ensembles leads to a good approximation compared to results for an ensemble of independently generated data sets.
Due to the fact that the latter data are only available in simulation studies, but not for real biological data, a bootstrap ensemble is a valid approach to estimate the variability of the population of inferred networks from an ensemble of data sets. We repeated the above analysis for different network topologies, including random networks and directed acyclic graphs (not shown), and found qualitatively similar results as for the scale-free networks shown above.
These results demonstrate that the bootstrap data lead to very similar results as the independent data, independent of the sample size. Hence, in the specific context of network inference bootstrapping data is an efficient means to generate an ensemble of data to resemble an independently generated ensemble.
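The two bootstrap ensembles compared in this section differ only in how the per-bootstrap F-scores are aggregated. The sketch below illustrates this with a small hypothetical table of F-scores (two data sets with three bootstrap samples each instead of 300 × 100); the function names are assumptions.

```python
from statistics import mean

def be1_scores(f_by_dataset):
    """BE 1: pool every bootstrap F-score (number of data sets x b values)."""
    return [f for row in f_by_dataset for f in row]

def be2_scores(f_by_dataset):
    """BE 2: one F-score per original data set, averaged over its bootstraps."""
    return [mean(row) for row in f_by_dataset]

# Hypothetical F-scores: one row per data set, one column per bootstrap sample.
f_by_dataset = [
    [0.40, 0.50, 0.45],
    [0.60, 0.55, 0.65],
]
be1 = be1_scores(f_by_dataset)
be2 = be2_scores(f_by_dataset)
print(len(be1), len(be2))
```

Averaging within each data set, as in BE 2, smooths out individual atypical bootstrap samples, which is consistent with outliers being visible for BE 1.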
Inferrability of a hierarchical organization
Dendrogram clustering error
Graph edit distance hierarchy error
In Figure 8 B we show the empirical cumulative distribution function (ecdf) of the graph edit distance hierarchy error d2. The values of d2 decrease with an increasing number of interconnecting edges between the network modules. This means adding edges between the network modules helps in reducing the inference error. For the networks with no interconnections between the network modules (black) d2 is largest, as expected. These results correspond to the absence of a hierarchy between the network modules. Overall, the results for d2 are similar to d1, demonstrating that regardless of the chosen error measure a relatively low number of interconnecting edges is sufficient to enable the recovery of at least parts of the present hierarchy in the network.
Analyzing network components of the B-cell C3NET gene regulatory network
For the B-cell C3NET gene regulatory network, the K = 25 largest network components (5% right quantile) have > 100 genes and comprise a total of 4,673 genes representing 48% of all genes in the network. The giant connected component consists of 884 genes and 883 edges. For the two DLBCL-C3NET gene regulatory networks the largest K = 25 network components of the inferred networks comprise 3,331/3,477 genes representing 38%/40% of all genes in the entire gene regulatory network. The giant connected components of the two DLBCL gene regulatory networks consist of 299/395 genes and 298/394 edges.
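Network components such as those reported above can be extracted from an edge list with a breadth-first search. The sketch below is a generic illustration with hypothetical gene names, not the code used in the study. It also checks whether the largest component has exactly one edge fewer than it has genes; a connected component with that property is a tree, which is the case for the reported giant connected component (884 genes, 883 edges).

```python
from collections import deque

def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:  # breadth-first search from the unvisited start node
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(sorted(comp))
    return sorted(components, key=len, reverse=True)

def edges_within(component, edges):
    """Count the edges whose two end points both lie in the component."""
    members = set(component)
    return sum(1 for u, v in edges if u in members and v in members)

# Hypothetical inferred network with two components.
edges = [("g1", "g2"), ("g2", "g3"), ("g2", "g4"), ("g5", "g6")]
components = connected_components(edges)
giant = components[0]
is_tree = edges_within(giant, edges) == len(giant) - 1
print(components, is_tree)
```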
Functional Network Analysis
In order to obtain a biological interpretation of the inferred B-cell C3NET gene regulatory network, we perform a Gene Ontology enrichment analysis for each of the K = 25 largest network components. To perform this analysis, the inferred network components are used to define gene lists for which we perform an enrichment analysis. Tables 1, 2 and 3 present results for the giant connected component. In these tables, the top 15 enriched GO terms with a significant p-value ≤ 5e−4 are shown. The three tables correspond to the Gene Ontology categories Biological Process (Table 1), Molecular Function (Table 2) and Cellular Component (Table 3). The genes in the giant connected component show an enrichment in biological processes for G-protein-coupled-receptor protein signaling pathway (89 genes), cell-cell signaling (87 genes) and calcium ion transport (26 genes) (Table 1). The cellular component analysis shows an enrichment, e.g., for plasma membrane proteins (264 genes), ion channel complexes (125 genes) and cell junction proteins (48 genes) (Table 3). The molecular function analysis shows an enrichment, e.g., for G-protein coupled receptor activity (60 genes) and ion channel activity (38 genes) (Table 2).
Table 1. GO category Biological Process: Enrichment analysis of the genes in the giant connected component
G-protein coupled receptor protein signaling pathway
neurological system process
multicellular organismal process
cell surface receptor linked signaling pathway
transmission of nerve impulse
metal ion transport
divalent metal ion transport
calcium ion transport
Table 2. GO category Molecular Function: Enrichment analysis of the genes in the giant connected component
G-protein coupled receptor activity
transmembrane receptor activity
signal transducer activity
molecular transducer activity
calcium ion binding
metal ion transmembrane transporter activity
cation channel activity
ion channel activity
gated channel activity
passive transmembrane transporter activity
transmembrane transporter activity
substrate-specific channel activity
Table 3. GO category Cellular Component: Enrichment analysis of the genes in the giant connected component
integral to membrane
intrinsic to membrane
plasma membrane part
integral to plasma membrane
intrinsic to plasma membrane
ion channel complex
cation channel complex
calcium channel complex
voltage-gated calcium channel complex
The numbers of the leaves in the dendrogram correspond to the rank-labels of the network components, where ‘1’ corresponds to the GCC. The provided GO terms correspond to the most frequently enriched terms found in the corresponding branches of the dendrogram. The hierarchical clustering based on the functional GO analysis of the network components separates the dendrogram into two principal branches. The first branch, shown in red, consists of highly enriched extracellular proteins, intrinsic and integral membrane proteins, cell junction and ion channel complex proteins. The second branch, shown in blue, is highly enriched for intracellular proteins from the nucleus, mitochondrion and cytoplasm.
In order to provide a comparison with the gene regulatory networks inferred from the two DLBCL gene expression data sets, we perform the same analysis for the Lenz GSE11318 dataset and the Salaverria GSE22470 dataset (Figure 10). The network components of the DLBCL gene regulatory networks show a similar bipartition as observed for the Basso GSE2350 gene regulatory network, separating into two principal branches for the peripheral and the intracellular regions of the cell. A major difference is that the DLBCL gene regulatory networks show a bipartition within the principal branch enriched with genes of the peripheral regions.
Hierarchical organization of the B-cell C3NET gene regulatory network
Next, we study the hierarchical organization of the K = 25 largest network components of the B-cell gene regulatory network. This analysis is conducted in the same way as for the simulated data, described in section ‘Inferrability of a hierarchical organization’. That means, first, we generate b = 100 bootstrap data sets from which we infer an ensemble of networks. Then, we determine from these networks a distance matrix D, which we use for a hierarchical clustering. As agglomeration clustering method we use again the “Ward” method. The resulting dendrograms are shown in Figure 10 B (second column). Also in these dendrograms, the rank-labels of the network components correspond to the leaf labels. As for the clustering of significantly enriched GO terms between the individual network components, we observe a bifurcation into two principal branches. Though the subgroupings of individual components differ to some extent in the respective categories, one can see that the same two principal branches are obtained as for the clustering of the GO terms in Figure 10 A. The first branch corresponds to the network components enriched with extracellular and membrane intrinsic proteins, and the second branch belongs to intracellular network components enriched with genes in the nucleus, mitochondria and cytoplasm.
We would like to emphasize that the generation of both dendrograms is based on complementary information. Figure 10 A is obtained from dissimilarity values among GO terms, not considering the inferred interactions among genes. In contrast, Figure 10 B is obtained from a structural analysis of the inferred network, not considering GO terms. This demonstrates that the extracted information from two complementary analysis methods leads to coinciding information with respect to the principal separation of cellular components of a biological cell.
Further, we compare the results of the B-cell C3NET gene regulatory network to the DLBCL-C3NET gene regulatory networks inferred from the Lenz GSE11318 dataset and the Salaverria GSE22470 dataset (Figure 10 B). Although the subgroupings between the functional and structural hierarchical clustering differ to some extent, overall, the network components of the two DLBCL gene regulatory networks show a similar clustering into two major branches of peripheral and intracellular regions. However, the bipartition of the structural network components (second column in Figure 10) is less pronounced than observed for the B-cell C3NET gene regulatory network for the Basso GSE2350 dataset.
Identification of novel key signaling pathways in B-cell lymphoma
Hub genes of the B-cell C3NET gene regulatory network are genes with the largest node degree among all genes in the network. Intuitively, such genes are the most interesting targets to study as they are more likely to be associated with multiple pathways, e.g., signaling pathways and thus form putative key regulators for a large diversity of biological processes.
From the entire B-cell C3NET gene regulatory network, we extracted the 25 largest hub genes, each with more than 20 connections. In Table 4 we give an overview of these hub genes including their gene identifiers and a selected GO term in order to facilitate the interpretation of their functional context. The selected hub genes play crucial roles in signaling processes such as receptors, ion channels and transporters, cell adhesion proteins and transcription factors. To our knowledge these genes have not been studied in B-cell lymphoma to date (Table 4).
Table 4. Top 25 hub genes with a degree (deg) larger than 20 found in the B-cell lymphoma gene regulatory network. Genes are described by their Entrez gene id, gene symbol, and, if available, one selected annotation term from GO (category Biological Process); bc refers to the betweenness centrality and the number in brackets to its rank with respect to the bc values.
calcium channel, voltage-dependent, L type, alpha 1F subunit (calcium ion transport GO:0006816)
glutathione S-transferase mu 5 (metabolic process GO:0008152)
tubby homolog (mouse) (response to stimulus GO:0050896)
claudin 9 (calcium-independent cell-cell adhesion GO:0016338)
5-hydroxytryptamine (serotonin) receptor 7 (adenylate cyclase-coupled) (signal transduction GO:0007165)
cytochrome P450, family 4, subfamily A, polypeptide 11 (long-chain fatty acid metabolic process GO:0001676)
calcitonin-related polypeptide alpha (endothelial cell proliferation GO:0001935)
Zic family member 2 (odd-paired homolog, Drosophila) (cell differentiation GO:0030154)
ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) (NA)
cytochrome P450, family 2, subfamily A, polypeptide 7 (oxidation-reduction process GO:0055114)
proline-rich protein HaeIII subfamily 1 (NA)
nuclear receptor subfamily 5, group A, member 1 (cell-cell signaling GO:0007267)
mitochondrial ribosomal protein L3 (translation GO:0006412)
sorting nexin 29 (cell communication GO:0007154)
solute carrier family 6 (neurotransmitter transporter, L-proline), member 7 (proline transport GO:0015824)
Rho GTPase activating protein 33 (signal transduction GO:0007165)
amiloride-sensitive cation channel 1, neuronal (sodium ion transport GO:0006814)
ephrin-A2 (cell-cell signaling GO:0007267)
transglutaminase 4 (prostate) (peptide cross-linking GO:0018149)
aquaporin 8 (water transport GO:0006833)
purinergic receptor P2X, ligand-gated ion channel, 6 (signal transduction GO:0007165)
gonadotropin-releasing hormone 2 (signal transduction GO:0007165)
proline-rich protein BstNI subfamily 4 (NA)
NADH dehydrogenase (ubiquinone) 1, alpha/beta subcomplex, 1, 8kDa (electron transport chain GO:0022900)
arginine vasopressin receptor 1B (signal transduction GO:0007165)
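The hub selection described above (node degree above a threshold, largest degrees first) can be sketched as follows. The function name, the toy edge list and the low threshold used in the example are assumptions; the study reports a threshold of more than 20 connections and the top 25 genes.

```python
from collections import Counter

def hub_genes(edges, min_degree=20, top=25):
    """Return up to `top` (gene, degree) pairs with degree > min_degree."""
    degree = Counter()
    for u, v in edges:  # each undirected edge adds one to both end points
        degree[u] += 1
        degree[v] += 1
    hubs = [(g, d) for g, d in degree.most_common() if d > min_degree]
    return hubs[:top]

# Hypothetical network: gene "h" is connected to three other genes.
edges = [("h", "a"), ("h", "b"), ("h", "c"), ("a", "b")]
hubs = hub_genes(edges, min_degree=2, top=25)
print(hubs)
```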
Influence of activator and repressor links
In this section we study the inferrability of activator and repressor links. First, we determine the correlation coefficient of all significant edges in the inferred network and obtain their corresponding p-values from testing for a vanishing Pearson correlation coefficient. Second, we conduct a multiple testing correction using the Benjamini-Hochberg procedure. The edges that are statistically significant are identified as activator or repressor edges if the sign of the correlation coefficient is positive or negative, respectively.
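The sign-based labelling of edges can be illustrated with a short sketch. It assumes that an edge has already passed the significance test and multiple testing correction, and it only shows the last step, the classification by the sign of the Pearson correlation coefficient. The expression profiles are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def classify_edge(x, y):
    """Label a (significant) edge by the sign of its correlation coefficient."""
    r = pearson(x, y)
    if r > 0:
        return "activator"
    if r < 0:
        return "repressor"
    return None  # a correlation of exactly zero gives no sign

# Hypothetical expression profiles of three genes over four samples.
gene_x = [1.0, 2.0, 3.0, 4.0]
gene_y = [2.0, 4.0, 5.0, 9.0]  # rises with gene_x
gene_z = [9.0, 7.0, 4.0, 1.0]  # falls as gene_x rises
labels = (classify_edge(gene_x, gene_y), classify_edge(gene_x, gene_z))
print(labels)
```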
In the inferred B-cell C3NET gene regulatory network, we identify a total of 847 repressor edges and 8,372 activator edges. The estimated true reconstruction rate for repressor and activator edges is obtained from the bootstrap ensemble. A two-sample Kolmogorov-Smirnov test comparing the distributions of the true reconstruction rates indicates a significant difference between these two distributions with a p-value of $p = 2.2 \times 10^{-16}$. Further, we find that activator edges are easier to infer than repressor edges, because activator edges have a statistically higher true positive rate than repressor edges.
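The statistic behind a two-sample Kolmogorov-Smirnov test is the largest absolute difference between the two empirical cumulative distribution functions. The sketch below computes only this statistic, not the p-value, and uses hypothetical reconstruction rates; it is an illustration of the test quantity rather than the analysis performed in the study.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest absolute difference between the ecdfs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the maximum is attained at one of the observed values
        fa = bisect_right(a, x) / len(a)  # ecdf of a evaluated at x
        fb = bisect_right(b, x) / len(b)  # ecdf of b evaluated at x
        d = max(d, abs(fa - fb))
    return d

# Hypothetical reconstruction rates of activator and repressor edges.
activator_rates = [0.90, 0.80, 0.95, 0.70]
repressor_rates = [0.30, 0.50, 0.75, 0.60]
d_stat = ks_statistic(activator_rates, repressor_rates)
print(d_stat)
```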
Relationship of node degrees in the gene regulatory network and gene expression values
Next, we investigate the node degrees of genes in the inferred B-cell C3NET gene regulatory network and compare these with the variances of their gene expression values. We perform a loess (locally weighted scatterplot smoothing) regression on the logarithm of the variances of the gene expression values and the corresponding node degree for each gene. We observe a positive correlation for genes up to a node degree of 7. In contrast, genes with a higher node degree show a negative correlation (results not shown). Thus, genes with a higher node degree in the inferred B-cell C3NET gene regulatory network show a smaller variation in their expression profile among the different samples of the expression data set.
Similarly, the connection between the gene expression variation and the node degrees in a protein-protein network has been studied previously. There, it was shown that with an increasing degree of the proteins, the gene expression variation decreases. Hence, for degrees larger than 7, both results coincide; for smaller degrees, however, there seem to be differences between a protein-protein network and a gene regulatory network.
Cross-dataset validation for cellular component subnetworks
We perform a cross-dataset validation studying the structural similarity of our B-cell C3NET gene regulatory network with two additional DLBCL-C3NET gene regulatory networks we inferred from observational germinal center tumor data sets (the Lenz GSE11318 dataset and the Salaverria GSE22470 dataset). In order to assess the structural similarity between networks, we use the (vertex) betweenness centrality measure in combination with Spearman’s rank correlation coefficient. We use Spearman’s rank correlation coefficient to test if structural components of two networks are similar to each other with respect to the order of the vertex betweenness centrality values of the genes. Specifically, in the following, we study two different scales of the networks. First, we compare the entire networks using all genes. This corresponds to a global comparison. Second, we compare subnetworks defined as cellular components according to the gene ontology database. This corresponds to a local comparison.
From the global comparison, we find that the B-cell C3NET gene regulatory network shows a significant correlation of r ∼ 0.12 (p ≤ 2.2e−16) to the DLBCL-C3NET gene regulatory network of the Salaverria GSE22470 dataset and r ∼ 0.14 (p ≤ 2.2e−16) to the DLBCL-C3NET gene regulatory network of the Lenz GSE11318 dataset. A comparison between the two DLBCL-C3NET gene regulatory networks also shows a significant correlation of r ∼ 0.24 (p ≤ 2.2e−16). For the local comparisons, we test a total of 435 cellular components (corresponding to gene sets) that can be found in the networks having more than 10 genes. From these cellular components, we identify the ones with a statistically significant Spearman rank correlation coefficient between profile vectors whose components correspond to the vertex betweenness centrality values of the genes in cellular components. To the resulting nominal p-values, we apply the Benjamini-Hochberg multiple testing correction procedure to control the FDR at a level of 5%.
From the comparisons of the B-cell C3NET gene regulatory network with the DLBCL-C3NET gene regulatory network obtained from the Lenz GSE11318 dataset, we identify 95 (21%) gene sets, and for the comparison of the B-cell C3NET gene regulatory network with the DLBCL-C3NET gene regulatory network obtained from the Salaverria GSE22470 dataset, we find 72 (16.5%) gene sets with a statistically significant correlation. In total, 58 terms are simultaneously significant in both network comparisons. These terms involve the basal part of cell, cell periphery, endosome and 17 gene sets sharing the parental term GO:0032991 macromolecular complex, e.g., histone methyltransferase complex, anaphase-promoting complex, ribosome and cation channel complex. In Table 5, we show the 30 Gene Ontology cellular component gene sets with the highest structural similarity between the B-cell C3NET gene regulatory network and the two DLBCL-C3NET gene regulatory networks. Each of the presented terms is statistically significant in both comparisons and the subscript ‘avg’ indicates the averaged values over these two comparisons.
Table 5. Network similarity analysis for cellular components between the B-cell C3NET gene regulatory network and the Lenz and Salaverria gene regulatory networks, for 30 of the 58 cellular component subnetworks with the highest correlation coefficient of the betweenness centrality that are significant in both comparisons. The columns denote the size (number of genes) of a Gene Ontology term represented in the subnetworks, betw avg (the average betweenness for the two comparisons), r avg (Spearman’s rank correlation coefficient) and pval avg (the FDR-adjusted p-value).
Cellular component terms listed in the table include:
- basal part of cell
- histone methyltransferase complex
- very-low-density lipoprotein particle
- high-density lipoprotein particle
- cullin-RING ubiquitin ligase complex
- endoplasmic reticulum lumen
- DNA-directed RNA polymerase II, core complex
- endoplasmic reticulum membrane
- endoplasmic reticulum part
- cation channel complex
- transcription factor TFIID complex
In this article, we inferred a B-cell gene regulatory network from B-cell lymphoma gene expression data using the C3NET algorithm. We found that the inferred B-cell C3NET gene regulatory network is characterized by individual network components that are organized into smaller interconnected network modules with intramodular hub genes. Further, we found that the giant connected component of the network is composed of 884 genes, which show a significant enrichment for plasma membrane proteins that are involved in G protein signaling pathways and ion channel complexes. From the literature, it is known that ion channels play a key role in the signal transduction mechanism in lymphocytes. Additionally, we found that the 25 largest components of the entire network can be categorized into two major classes. The first class, including the largest network component, is enriched for genes that are located at the membrane and the extracellular space at the physical periphery of the cell, whereas the second class comprises network components located in intracellular compartments such as the cytoplasm, nucleus and mitochondrion. Further, the hub genes of the B-cell C3NET gene regulatory network were identified to play crucial roles in cell signaling, adhesion and cell proliferation processes.
It is believed that B-cell lymphoma subtypes show characteristic gene expression profiles of B-cells that are arrested in specific developmental stages. The emergence of a lymphoma phenotype is thus understood to result from an impairment of pathways that control B-cell differentiation, proliferation and apoptosis processes. The organizational structure of gene regulatory networks is a rich source of information for studying specific molecular mechanisms of B-cell lymphoma. However, the combination of observational and experimental conditions from a variety of different B-cell lymphomas, including transformed and untransformed cells, as in our data, does not allow us to infer a gene regulatory network for one particular subtype of B-cell lymphoma. Thus, we are of the opinion that our inferred B-cell C3NET gene regulatory network is an average representation of B-cell lymphoma, reflecting the different phenotypic subtypes with which the information conveyed by the gene expression values is associated.
It has previously been demonstrated that not all regions within a network can be inferred with the same inference accuracy. That is, the inference of networks is heterogeneous with respect to distinct edges in the network. It has been shown that moderately interconnected genes are easier to infer. This corresponds to the edges of linearly connected genes and the edges toward the leaf nodes of the network that are at the ‘periphery’ of the network. These results were obtained for simulated data. However, for a real biological gene regulatory network it was unclear which genes correspond to the periphery of this cellular network. In this paper, we demonstrated that the periphery of the inferred B-cell C3NET gene regulatory network is centered around transmembrane proteins and that the linear parts of the gene regulatory network correspond to signaling pathways and transmembrane receptor or ion channel proteins involved in signaling cascades. We would like to note that these transmembrane proteins could form putative drug targets for B-cell lymphoma.
The C3NET algorithm selects at most one edge for each gene, namely the one with the maximum mutual information value. Therefore, this algorithm aims to capture only the conservative causal core of the whole regulatory network. This is in contrast to many other network inference methods [17, 21, 58]. For this reason, it is no surprise that a previous analysis of the same data set employing a different network inference method found that the inferred regulatory network is governed by major hub genes, which mark key regulators such as transcription factors. In particular, the network inferred by ARACNE consisted of 129,000 edges and its major hub genes are reported to be cell cycle regulators. In contrast to these results, our analysis yielded a network with 9,684 genes and 9,221 edges enriched for signaling pathways and transmembrane receptors, characterizing the physical periphery of a cell rather than its nucleus. From this and the conservative characteristics of C3NET, we conclude that the strongest signal within the data set actually comes from signaling pathways rather than from transcription regulation. Doubtlessly, the latter is present too, but with reduced strength.
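The selection rule described above, at most one edge per gene chosen by maximum mutual information, can be sketched as follows. This is a simplified illustration and not the C3NET implementation: the function name `select_max_mi_edges` is hypothetical, the mutual information values are assumed to be precomputed in a symmetric matrix, and a fixed `threshold` stands in for the statistical significance test.

```python
def select_max_mi_edges(mi, threshold):
    """Return undirected edges, with at most one edge chosen per gene.

    mi        -- symmetric matrix (list of lists) of mutual information values
    threshold -- assumed stand-in for the significance test on each value
    """
    n = len(mi)
    edges = set()
    for g in range(n):
        # Keep only partners whose mutual information with g passes the cutoff.
        candidates = [(mi[g][h], h) for h in range(n)
                      if h != g and mi[g][h] >= threshold]
        if not candidates:
            continue  # gene g contributes no edge and may stay isolated
        # Ties are broken in favor of the higher gene index (arbitrary choice).
        _, partner = max(candidates)
        edges.add((min(g, partner), max(g, partner)))
    return edges

# Toy example with four genes; gene 3 has no value above the cutoff.
toy_mi = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.5, 0.1],
    [0.2, 0.5, 0.0, 0.05],
    [0.1, 0.1, 0.05, 0.0],
]
toy_edges = select_max_mi_edges(toy_mi, threshold=0.15)
```

Because every gene proposes at most one edge, the number of edges cannot exceed the number of genes, which is consistent with the reported 9,684 genes and 9,221 edges and with the network splitting into separate components.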
Another difference from that study is that we introduced in this article a novel bootstrap approach to reveal the hierarchical organization of the B-cell C3NET gene regulatory network. Due to the inferential characteristics of C3NET, the network inferred from all 344 microarray samples consisted of several separate network components, which we used to define network modules. That is, there is no need to apply module finding algorithms [59–61]; we obtain such modules naturally by the application of C3NET. In order to infer the hierarchical organization of these modules, we utilized a bootstrap ensemble, from which we estimated an ensemble of networks. Combining this ensemble of networks with the information about the network components obtained from the complete data set allowed us to obtain a structural clustering reflecting the hierarchical organization of these network components. We would like to emphasize that this hierarchical clustering does not utilize information about GO terms. This is in contrast to the hierarchical clustering of GO terms presented in Figure 10A. Nevertheless, we identified major branches in Figure 10B that correspond well to the clustering of the GO terms in Figure 10A. We would like to point out that our results confirm earlier findings that the yeast and the E. coli protein networks can be separated into two highly modular subnetworks which showed a functional enrichment for intracellular and extracellular processes. Hence, this may hint at a fundamental organization scheme of cellular networks. A potential hypothesis derived from these results is that the hierarchy among the network components may reflect aspects of the information flow between these components [62, 63].
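One way to combine a bootstrap ensemble with the components of the full-data network is sketched below. It is an assumption-laden illustration rather than the procedure used in the paper: the helper name `bootstrap_component_links`, the callable `infer_edges` (any routine that returns gene pairs for a resampled data set) and the idea of counting how often bootstrap networks link genes from different components are all hypothetical choices made for this example.

```python
import random
from collections import Counter

def bootstrap_component_links(samples, infer_edges, component_of, n_boot, seed=0):
    """Count how often bootstrap networks connect two different components.

    samples      -- list of expression profiles (one entry per microarray sample)
    infer_edges  -- assumed callable: resampled data -> iterable of (gene, gene)
    component_of -- dict mapping each gene to its integer component label,
                    taken from the network inferred on the complete data set
    n_boot       -- number of bootstrap data sets to draw
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(samples)
    for _ in range(n_boot):
        # Draw n samples with replacement to form one bootstrap data set.
        resampled = [samples[rng.randrange(n)] for _ in range(n)]
        for a, b in infer_edges(resampled):
            ca, cb = component_of[a], component_of[b]
            if ca != cb:
                # Only edges that bridge two components are informative here.
                counts[(min(ca, cb), max(ca, cb))] += 1
    return counts
```

The resulting counts could be turned into a similarity matrix between components and passed to a standard hierarchical clustering routine; that step is left out of the sketch.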
Our approach has several advantages that we would like to highlight. First, our investigation of the hierarchical organization of the B-cell C3NET gene regulatory network is at the abstraction level of network components or modules, not genes. As such, it resembles a systems approach [64–66]. This leads to a tremendous reduction in the complexity of the problem, and specifically in the interpretation of the dendrograms shown in Figure 10. Second, on a technical note, the size of the bootstrap ensemble was chosen large enough that a further increase in its size does not modify the obtained clustering. For this reason, the obtained results are stable. Third, the merit of bootstrapping is well known in many branches of statistics [22, 67], where it is frequently used to quantify the variability within the data. In our approach, we utilize the data variability by exploiting mutual information values which, in the whole data set, are either too weak to pass a statistical test or are not the maximum mutual information value for any gene. For example, there may be genes that have several significant interactions with other genes whose mutual information values lie within a very small margin of each other. In such cases, the bootstrapping allows different gene pairs to be favored, because a slight change in the composition of a data set may change which gene pair has the maximum mutual information value.
Finally, we would like to note that the results from our reanalysis of the data set demonstrate that the biological information buried within large-scale high-throughput data is rich, allowing a multitude of different biological questions to be investigated.
With the increasing quality of network inference algorithms, we are heading toward the next major challenge of the post-genomic era, namely: What do the inferred networks mean? An analysis of the hierarchical organization of a network is just one aspect thereof, but, we think, an important one. Because one can study the hierarchy among genes, pathways, subnetworks or combinations thereof, the complexity of this problem might be unprecedented. The bootstrap approach presented in this paper is a simple yet flexible method for taming the complexity of the problem, and it additionally results in an interpretable structure.
We would like to thank Gökmen Altay and Ken Mills for fruitful discussions. For our numerical simulations we used R, SynTReN, GeNGe and netsim, and for the visualization of the networks igraph. This project is partly supported by the Department for Employment and Learning through its “Strengthening the all-Island Research Base” initiative.
- Guelzim N, Bottani S, Bourgine P, Kepes F: Topological and causal structure of the yeast transcriptional regulatory network. Nat Genet 2002, 31: 60. 10.1038/ng873
- Stolovitzky G, Califano A (Eds): Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference. Malden: Wiley-Blackwell; 2007.
- Xing B, van der Laan M: A causal inference approach for constructing transcriptional regulatory networks. Bioinformatics 2005, 21(21):4007. 10.1093/bioinformatics/bti648
- Barabási AL, Oltvai ZN: Network Biology: Understanding the Cell’s Functional Organization. Nat Rev 2004, 5: 101. 10.1038/nrg1272
- Emmert-Streib F, Glazko G: Network biology: A direct approach to study biological function. Wiley Interdiscip Rev Syst Biol Med 2010, 3(4):379.
- Alon U: An Introduction to Systems Biology: Design Principles of Biological Circuits. Boca Raton: Chapman & Hall/CRC; 2006.
- Dehmer M, Emmert-Streib F, Graber A, Salvador A (Eds): Applied Statistics for Network Biology: Methods for Systems Biology. Weinheim: Wiley-Blackwell; 2011.
- Palsson B: Systems Biology. New York: Cambridge University Press, Cambridge; 2006.
- Bulashevska S, Eils R: Inferring genetic regulatory logic from expression data. Bioinformatics 2005, 21: 2706. 10.1093/bioinformatics/bti388
- Emmert-Streib F, Dehmer M (Eds): Analysis of Microarray Data: A Network Based Approach. Weinheim: Wiley-VCH; 2008.
- Husmeier D: Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 2003, 19(17):2271. 10.1093/bioinformatics/btg313
- Lee TI, et al.: Transcriptional Regulatory Networks in Saccharomyces cerevisiae. Science 2002, 298(5594):799. 10.1126/science.1075090
- Ravasz E, Somera A, Mongru D, Oltvai Z, Barabasi A: Hierarchical organization of modularity in metabolic networks. Science 2002, 297: 1551. 10.1126/science.1073374
- Tamames J, Moya A, Valencia A: Modular organization in the reductive evolution of protein-protein interaction networks. Genome Biol 2007, 8: R94. 10.1186/gb-2007-8-5-r94
- Vinogradov A: Modularity of cellular networks shows general center-periphery polarization. Bioinformatics 2008, 24: 2814. 10.1093/bioinformatics/btn555
- Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA: Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol 2004, 14: 283. 10.1016/j.sbi.2004.05.004
- Margolin A, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A: ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 2006, 7(Suppl 1):S7. 10.1186/1471-2105-7-S1-S7
- Altay G, Emmert-Streib F: Inferring the conservative causal core of gene regulatory networks. BMC Syst Biol 2010, 4: 132. 10.1186/1752-0509-4-132
- Emmert-Streib F, Glazko G, Altay G, de Matos Simoes R: Statistical inference and reverse engineering of gene regulatory networks from observational expression data. Front Genet 2012, 3: 8.
- Kuppers R: Mechanisms of B-cell lymphoma pathogenesis. Nat Rev Cancer 2005, 5: 251. 10.1038/nrc1589
- Basso K, Margolin A, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet 2005, 37: 382. 10.1038/ng1532
- Efron B, Tibshirani R: An Introduction to the Bootstrap. New York: Chapman and Hall/CRC; 1994.
- Altay G, Emmert-Streib F: Revealing differences in gene network inference algorithms on the network level by ensemble methods. Bioinformatics 2010, 26: 1738. 10.1093/bioinformatics/btq259
- Emmert-Streib F, Altay G: Local network-based measures to assess the inferability of different regulatory networks. IET Syst Biol 2010, 4: 277. 10.1049/iet-syb.2010.0028
- Salaverria I, Philipp C, Oschlies I, Kohler C, Kreuz M, Szczepanowski M, Burkhardt B, Trautmann H, Gesk S, Andrusiewicz M, Berger H, Fey M, Harder L, Hasenclever D, Hummel M, Loeffler M, Mahn F, Martin-Guerrero I, Pellissery S, Pott C, Pfreundschuh M, Reiter A, Richter J, Rosolowski M, Schwaenen C, Stein H, Trumper L, Wessendorf S, Spang R, Kuppers R, Klapper W, Siebert R: Translocations activating IRF4 identify a subtype of germinal center-derived B-cell lymphoma affecting predominantly children and young adults. Blood 2011, 118: 139. 10.1182/blood-2011-01-330795
- Lenz G, Wright G, Dave S, Xiao W, Powell J, Zhao H, Xu W, Tan B, Goldschmidt N, Iqbal J, Vose J, Bast M, Fu K, Weisenburger D, Greiner T, Armitage J, Kyle A, May L, Gascoyne R, Connors J, Troen G, Holte H, Kvaloy S, Dierickx D, Verhoef G, Delabie J, Smeland E, Jares P, Martinez A, Lopez-Guillermo A, Montserrat E, Campo E, Braziel R, Miller T, Rimsza L, Cook J, Pohlman B, Sweetenham J, Tubbs R, Fisher R, Hartmann E, Rosenwald A, Ott G, Muller-Hermelink H, Wrench D, Lister T, Jaffe E, Wilson W, Chan W, Staudt L: Stromal gene signatures in large-B-cell lymphomas. N Engl J Med 2008, 359: 2313. 10.1056/NEJMoa0802885
- Lenz G, Wright G, Emre N, Kohlhammer H, Dave S, Davis R, Carty S, Lam L, Shaffer A, Xiao W, Powell J, Rosenwald A, Ott G, Muller-Hermelink H, Gascoyne R, Connors J, Campo E, Jaffe E, Delabie J, Smeland E, Rimsza L, Fisher R, Weisenburger D, Chan W, Staudt L: Molecular subtypes of diffuse large B-cell lymphoma arise by distinct genetic pathways. Proc Natl Acad Sci USA 2008, 105: 13520. 10.1073/pnas.0804295105
- Deffenbacher K, Iqbal J, Liu Z, Fu K, Chan W: Recurrent chromosomal alterations in molecularly classified AIDS-related lymphomas: an integrated analysis of DNA copy number and gene expression. J Acquir Immune Defic Syndr 2010, 54: 18.
- Hummel M, Bentink S, Berger H, Klapper W, Wessendorf S, Barth T, Bernd H, Cogliatti S, Dierlamm J, Feller A, Hansmann M, Haralambieva E, Harder L, Hasenclever D, Kuhn M, Lenze D, Lichter P, Martin-Subero J, Moller P, Muller-Hermelink H, Ott G, Parwaresch R, Pott C, Rosenwald A, Rosolowski M, Schwaenen C, Sturzenhofecker B, Szczepanowski M, Trautmann H, Wacker H, Spang R, Loeffler M, Trumper L, Stein H, Siebert R: A biologic definition of Burkitt’s lymphoma from transcriptional and genomic profiling. N Engl J Med 2006, 354: 2419. 10.1056/NEJMoa055351
- Hache H, Wierling C, Lehrach H, Herwig R: GeNGe: systematic generation of gene regulatory networks. Bioinformatics 2009, 25: 1205. 10.1093/bioinformatics/btp115
- Van den Bulcke T, Van Leemput K, Naudts B, van Remortel P, Ma H, Verschoren A, De Moor B, Marchal K: SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinformatics 2006, 7: 43. 10.1186/1471-2105-7-43
- Di Camillo B, Toffolo G, Cobelli C: A gene network simulator to assess reverse engineering algorithms. Ann N Y Acad Sci 2009, 1158: 125. 10.1111/j.1749-6632.2008.03756.x
- Newman MEJ: The Structure and Function of Complex Networks. SIAM Rev 2003, 45: 167. 10.1137/S003614450342480
- Barrett T, Troup D, Wilhite S, Ledoux P, Evangelista C, Kim I, Tomashevsky M, Marshall K, Phillippy K, Sherman P, Muertter R, Holko M, Ayanbule O, Yefanov A, Soboleva A: NCBI GEO: archive for functional genomics data sets - 10 years on. Nucleic Acids Res 2011, 39: D1005-D1010.
- Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP: Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 2003, 31(4):e15. 10.1093/nar/gng015
- Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19(2):185. 10.1093/bioinformatics/19.2.185
- Irizarry R, Hobbs B, Collin F, Beazer-Barclay Y, Antonellis K, Scherf U, Speed T: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003, 4: 249. 10.1093/biostatistics/4.2.249
- Meyer P, Kontos K, Lafitte F, Bontempi G: Information-theoretic inference of large transcriptional regulatory networks. EURASIP J Bioinform Syst Biol 2007, 2007: 79879.
- Olsen C, Meyer P, Bontempi G: On the impact of entropy estimation on transcriptional regulatory network inference based on mutual information. EURASIP J Bioinform Syst Biol 2009, 2009: 308959.
- Ashburner M, Ball C, Blake J, Botstein D, Butler H, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 2000, 25: 25. 10.1038/75556
- Alexa A, Rahnenfuhrer J, Lengauer T: Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 2006, 22: 1600. 10.1093/bioinformatics/btl140
- Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini A, Sawitzki G, Smith C, Smyth G, Tierney L, Yang J, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5: R80. 10.1186/gb-2004-5-10-r80
- Sheskin DJ: Handbook of Parametric and Nonparametric Statistical Procedures. Boca Raton: CRC Press; 2004.
- Freeman LC: A set of measures of centrality based on betweenness. Sociometry 1977, 40: 35. 10.2307/3033543
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc, Ser B (Methodological) 1995, 57: 125.
- Bunke H: What is the distance between graphs? Bull EATCS 1983, 20: 35.
- Bunke H: On a relation between graph edit distance and maximum common subgraph. Pattern Recogn Lett 1997, 18(9):689.
- Emmert-Streib F: The Chronic Fatigue Syndrome: A Comparative Pathway Analysis. J Comput Biol 2007, 14(7):961. 10.1089/cmb.2007.0041
- McRory J, Hamid J, Doering C, Garcia E, Parker R, Hamming K, Chen L, Hildebrand M, Beedle A, Feldcamp L, Zamponi G, Snutch T: The CACNA1F gene encodes an L-type calcium channel with unique biophysical properties and tissue distribution. J Neurosci 2004, 24: 1707. 10.1523/JNEUROSCI.4846-03.2004
- Zheng A, Yuan F, Li Y, Zhu F, Hou P, Li J, Song X, Ding M, Deng H: Claudin-6 and claudin-9 function as additional coreceptors for hepatitis C virus. J Virol 2007, 81: 12465. 10.1128/JVI.01457-07
- Dong Y, Reddy D, Green K, Chauhan M, Wang H, Nagamani M, Hankins G, Yallampalli C: Calcitonin gene-related peptide (CALCA) is a proangiogenic growth factor in the human placental development. Biol Reprod 2007, 76: 892. 10.1095/biolreprod.106.059089
- Lai P, Wang C, Chen W, Kao Y, Tsai H, Tachibana T, Chang W, Chung B: Steroidogenic Factor 1 (NR5A1) resides in centrosomes and maintains genomic stability by controlling centrosome homeostasis. Cell Death Differ 2011, 18(12):1836. 10.1038/cdd.2011.54
- Worby C, Dixon J: Sorting out the cellular functions of sorting nexins. Nat Rev Mol Cell Biol 2002, 3: 919. 10.1038/nrm974
- Cleveland WS, Devlin SJ: Locally weighted regression: An approach to regression analysis by local fitting. J Am Stat Assoc 1988, 83: 596. 10.1080/01621459.1988.10478639
- Zhou L, Ma X, Sun F: The effects of protein interactions, gene essentiality and regulatory regions on expression variation. BMC Syst Biol 2008, 2: 54. 10.1186/1752-0509-2-54
- Lewis R, Cahalan M: Ion channels and signal transduction in lymphocytes. Annu Rev Physiol 1990, 52: 415. 10.1146/annurev.ph.52.030190.002215
- Shaffer A, Rosenwald A, Staudt L: Lymphoid malignancies: the dark side of B-cell differentiation. Nat Rev Immunol 2002, 2: 920. 10.1038/nri953
- Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, et al.: Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles. PLoS Biol 2007, 5(1):e8. 10.1371/journal.pbio.0050008
- Fortunato S: Community detection in graphs. Phys Rep 2010, 486(3–5):75.
- Newman MEJ, Girvan M: Finding and evaluating community structures in networks. Phys Rev E 2004, 69: 026113.
- Rosvall M, Bergstrom C: An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci USA 2007, 104(18):7327. 10.1073/pnas.0611034104
- Emmert-Streib F, Dehmer M: Information processing in the transcriptional regulatory network of yeast: functional robustness. BMC Syst Biol 2009, 3: 35. 10.1186/1752-0509-3-35
- Emmert-Streib F, Dehmer M: Predicting cell cycle regulated genes by causal interactions. PLoS One 2009, 4(8):e6633. 10.1371/journal.pone.0006633
- von Bertalanffy L: An outline of general systems theory. Br J Philosophy Sci 1950, 1(2):134.
- Emmert-Streib F, Dehmer M: Networks for systems biology: conceptual connection of data and function. IET Syst Biol 2011, 5(3):185. 10.1049/iet-syb.2010.0025
- Vidal M: A unifying view of 21st century systems biology. FEBS Lett 2009, 583(24):3891. 10.1016/j.febslet.2009.11.024
- Davison A, Hinkley D: Bootstrap Methods and Their Application. Cambridge University Press; 1997.
- R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2008. [ISBN 3-900051-07-0]
- Csardi G, Nepusz T: The igraph software package for complex network research. InterJournal 2006, Complex Systems: 1695. [http://igraph.sf.net]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited