Detection of statistically significant network changes in complex biological networks

Background Biological networks contribute effectively to unveil the complex structure of molecular interactions and to discover driver genes especially in cancer context. It can happen that due to gene mutations, as for example when cancer progresses, the gene expression network undergoes some amount of localized re-wiring. The ability to detect statistical relevant changes in the interaction patterns induced by the progression of the disease can lead to the discovery of novel relevant signatures. Several procedures have been recently proposed to detect sub-network differences in pairwise labeled weighted networks. Methods In this paper, we propose an improvement over the state-of-the-art based on the Generalized Hamming Distance adopted for evaluating the topological difference between two networks and estimating its statistical significance. The proposed procedure exploits a more effective model selection criteria to generate p-values for statistical significance and is more efficient in terms of computational time and prediction accuracy than literature methods. Moreover, the structure of the proposed algorithm allows for a faster parallelized implementation. Results In the case of dense random geometric networks the proposed approach is 10-15x faster and achieves 5-10% higher AUC, Precision/Recall, and Kappa value than the state-of-the-art. We also report the application of the method to dissect the difference between the regulatory networks of IDH-mutant versus IDH-wild-type glioma cancer. In such a case our method is able to identify some recently reported master regulators as well as novel important candidates. Conclusions We show that our network differencing procedure can effectively and efficiently detect statistical significant network re-wirings in different conditions. When applied to detect the main differences between the networks of IDH-mutant and IDH-wild-type glioma tumors, it correctly selects sub-networks centered on important key regulators of these two different subtypes. In addition, its application highlights several novel candidates that cannot be detected by standard single network-based approaches. Electronic supplementary material The online version of this article (doi:10.1186/s12918-017-0412-6) contains supplementary material, which is available to authorized users.

copy number. Signatures of differentially expressed and/or methylated genes are the downstream effect of global cell de-regulation in different conditions such as cancer subtypes. Therefore, it is argued that driver mutations activate functional pathways described by different global re-wiring of the underlying gene regulatory network.
The identification of significant changes induced by the presence or the progression of disease can help to discover novel molecular diagnostics and prognostic signatures. For example, it is known that, according to the mutation of the gene IDH [15,16], the majority of malignant brain tumors can be divided two main macro-categories, which can be further divided in seven molecular and clinically distinct subtypes [17]. These two macro-groups are characterized by highly different global expression and epigenomic profiles. Hence, one of the main questions to understand the molecular basis of diseases is how to identify significant changes in the regulatory structure in different conditions.
Various techniques have been developed to compare two graphs including graph matching and graph similarity algorithms [18][19][20]. However, the problem addressed in this paper is different from popular graph theory problems including graph isomorphism [21] and sub-graph matching [22]. Here the goal is to identify statistically significant differences between two weighted networks (with or without labels).
One common statistic used to distinguish one graph, A from another B, having the same number of nodes N, is the Mean Absolute Difference (MAD) metric, defined as: d(A, B) = 1 N(N−1) i =j |a ij − b ij |, where a ij and b ij are edge weights corresponding to the topology of networks A and B. This distance measure is equivalent to the Hamming distance [23] and has been extensively used in literature to compare networks [24,25]. Another statistic used to test association between networks is the Quadratic Assignment Procedure (QAP) defined as: The QAP metric is used in a permutation-based procedure to differentiate two networks [26,27]. Ruan et al. showed that these metrics are not always sensitive to subtle topological variations [28].
Our aim is to detect statistically significant differences between two networks under the premise that any true topological difference between the two networks would involve only a small set of edges when compared to all the edges in the network. Recently, a Generalized Hamming Distance (GHD) based method was introduced to measure the distance between two labeled graphs [28], where it was shown that the GHD statistic is more robust than MAD and QAP metrics for identifying subtle variations in the topology of paired networks. In particular the authors showed that GHD permutation distribution follows a normal distribution with closed-form expression for first two moments under the null hypothesis that networks A and B are independent. Utilizing the moments, corresponding p-values were obtained in closed-form. They also propose a differential sub-network identification technique namely dGHD. The advantage of this technique is thatunlike previous differential network analysis techniques [25,29,30] -it provides a closed-form solution for p-values for the differential sub-network left after iterative removal of the least differential nodes. We propose an improvement over dGHD, namely Closed-Form approach that exploits the conditions for asymptotic normality which is computationally cheaper and attains better prediction performance than the dGHD algorithm. Computational efficiency and prediction accuracy is crucial in cancer contexts where networks have a large number of nodes and the topological difference is associated to few driver genes.

Preliminaries on generalized hamming distance
The Generalized Hamming Distance is a way to estimate the distance between two weighted graphs [28]. Let where a ij and b ij are mean centered edge-weights defined as: The edge weights, a ij and b ij , depend on the topology of the network and provide a measure of connectivity between every pair of nodes i and j in A and B. Different metrics have been adopted to measure the connectivity between pairs of nodes, including: topological overlap (TO) [31,32], cosine similarity and Pearson correlation [33]. In our experiments, we used the cosine similarity to capture first order interactions between the nodes in the network. Cosine similarity computation scales well for large sparse networks and can be used in place of TO, as it has nearly perfect correlation with it.
Given two networks A and B, a permutation π of the labels of the vertices of A (keeping the edges unchanged) generates a permuted network A π . The quantity GHD π (A π , B) represents the test statistics of an inferential problem having as null hypothesis H o : Graphs A and B are independent [28]. The distribution of GHD π can be obtained through an exhaustive calculation which can be approximated by a Monte Carlo approach. The authors of [28], indeed, simplified this calculation showing that under the null hypothesis it can be approximated well by a normal distribution with moments that can be obtained analytically.
This can be shown as: where μ π is the asymptotic value of the mean GHD statistic and σ π is the asymptotic value of the standard deviation of GHD statistic computed between A π and B. In order to calculate the μ π and σ π values we define: Here a t ij and b t ij are the edge weights with the power t. Furthermore, we require the following terms: Using these definitions the closed-form expression for mean μ π and variance σ 2 π are expressed as: Given a significance threshold α (e.g. 0.01), p-values > α indicate that there is no sufficient evidence to reject the null hypothesis (H o ) that graphs A and B are independent. Hence, higher p-values indicate more probability that the two graphs under consideration are independent.

Differential sub-network detection with GHD
The GHD distance is able to tell us to what extent are two graphs different but is not able to identify which parts of the graph are similar and which are different. In this work, we are interested in detecting which part of the graphs contribute to make the two graphs different. We call such different sub-graphs differential sub-networks.
The notion of differential sub-networks is based on the idea that when comparing two networks only a subset of edges would present altered interaction. The goal is to identify the set of nodes, namely V * , associated with such a subset of edges and the p-values p * corresponding to the nodes in V * . This goal, formulated as a statistical test, requires that for such a subset V * there is no sufficient evidence to reject the null hypothesis that the corresponding sub-networks The idea here is to adopt an iterative technique to identify the set of nodes V * which contributes more to the difference. We start from the dGHD algorithm proposed in [28]. The algorithm measures the edge connectivity with topological overlap metric and benefits from the closedform solution of p-value (Eq. (3)). In the dGHD algorithm, an iterative procedure is followed where at each iteration the change in centralized GHD (cGHD) i.e. cGHD = GHD(A, B) − μ π is estimated after the removal of one node. The node where the change in cGHD (i.e. difference in cGHD before and after removal of a node) is maximum is removed. The GHD statistic is computed for remaining sub-networks and the p-value is estimated. This process is repeated till a user specified minimal set size is reached or it is no-longer possible to have closed-form representation for p-values which happens for N ≤ 3 as shown in Eq. 3. The p-values are then adjusted for multiple testing by controlling the false discovery rate [34].
The dGHD algorithm suffers from the following limitations: a) During the i th iteration, the GHD measure is calculated N − i times on different sub-graphs with an overall time complexity The algorithm is prone to discovery more false positives since it uses the change in cGHD ( cGHD) as a model selection criterion. We overcome such limitations by proposing the following improvements: 1. Remove nodes by exploiting the Closed-Form. We use the idea that nodes which have similar topology in networks A and B will contribute the least to cGHD. So, we first calculate the closed-form contribution of each node in cGHD once using Eq. 4 and then iteratively remove nodes with least contributions. However, this process is continued till we observe that the p-value of the remaining sub-network becomes greater than a threshold θ. 2. Using a different model selection criterion. Once the p-value reaches θ, we follow a procedure similar to the dGHD algorithm but use the more intuitive criterion of selecting the node that when removed makes the cGHD value maximum rather than using the change in the cGHD value (before and after removal of a node) as a model selection criterion. By using this model selection criterion, we iteratively identify and remove that node whose contribution is least in the cGHD. The advantage of the Closed-Form approach is that we significantly reduce the computational complexity and improve the predictive performance. A simple alternative to the Closed-Form approach would be to sort all the nodes based on their contribution to cGHD and thus rank all the nodes based on their capability to differentiate the two networks with complexity (O (N log N)). However, then we will not be able to identify statistically different sub-networks between the two graphs as indicated in [28].

Closed-form approach
We propose a fast approach to perform differential subnetwork analysis taking into consideration the contribution of each node to GHD and μ π . Using Eqs. (1) and (3) this can mathematically be represented as: We observe that if we sum GHD(A, B)(i) and μ π (i)∀i ∈ V , we obtain GHD(A, B) and μ π . We use the idea that nodes which have similar topology in networks A and B will contribute the least to centralized GHD, i.e. GHD(A, B) − μ π . We calculate the Closed-Form contribution of each node in the centralized GHD (cGHD) once using Eq. (4) and then iteratively remove nodes with least contribution to the cGHD, i.e. nodes having similar topology in graphs A and B. Thus, we calculate cGHD once and sort all the nodes based on their contribution to the cGHD metric.
This process is continued till we observe that the pvalue of the remaining sub-network becomes greater than a threshold θ. Once the p-value reaches θ, we estimate the mean of the permutation distribution for the nodes (V K ) of the remaining sub-network. Furthermore, we define V K|i as the value of cGHD after removal of node i. We adopt a different model selection criterion than that proposed in [28] to remove non-differential nodes. We use the intuitive criterion of selecting that node after removal of which the cGHD value becomes maximum, i.e. the node which was most similar in terms of topology for the paired-graphs. Finally, the obtained p-values are adjusted for multiple testing by controlling the false discovery rate [34]. Provided the paired-graphs A and B, the calculation of V K|i can be done independently for each i. Details of the Closed-Form method is provided in Algorithm 1. The sensitivity of the Closed-Form approach with the param-eter θ is demonstrated in Experimental Results section. Table 1 summarizes the improvements with respect to the dGHD algorithm in terms of time complexity.

Algorithm 1: Closed-Form
Data: Graphs A and B with N vertices V. Result: Subset V * representing the set of nodes which comprise the differential sub-network & p-values for GHD measure. V * = {} // Empty Set for differential sub-network nodes.
Add t to V K // Perform in parallel. n * = max i V K // Select that node after removal of which cGHD becomes maximum. Remove node n * from V K i.e V K = V K \ n * and O = O \ n * else if p-value < θ then n * = min i (O) // Select node in the sub-network with least contribution. Remove node n * from O. // O is sorted so remove 1 st node. if p-value > 0.01 then Append n * to V * . N = N − 1. Adjust the p-values for false-discovery rate [34].

Alternative procedure (fast approximation)
We propose an alternative procedure to the Closed-Form approach namely the Fast Approximation method where we first calculate the cGHD value without including the i th node, ∀i ∈ V once. This helps to estimate the cGHD value after removal of the i th node and can be performed in parallel. Our aim is to quickly discard those nodes after removal of which the cGHD value becomes large Table 1 Time complexity comparison   dGHD Closed-form Here K represents the number of nodes for which p-value is greater than θ and generally K N. An important remark is that the cGHD calculation after removal of each node can be done independently in parallel. So, in case we have T processors, the complexity of the proposed approach will reduce ≈ linearly w.r.t. T thereby removing nodes which were contributing least to the cGHD value. This helps to reduce the dependence between the two sub-networks by removing nodes which have similar topology in graphs A and B. Again, the idea is motivated by the premise that only a subset of nodes will form the differential sub-networks in graph A and B.
In this approach, we iteratively discard those nodes after removal of which the cGHD value becomes maximal till the p-value for the remaining sub-network reaches a threshold θ. Once the p-value reaches θ, we return back to the procedure of estimating V K|i ∀i ∈ V K as described in the Closed-Form approach. We use the same model selection criterion of selecting that node after removal of which the cGHD value becomes maximum as used in the Closed-Form approach. We then adjust the obtained p-values for multiple testing by controlling the false discovery rate [34]. We refer to this technique as a Fast Approximation to the dGHD [28]. We explain the Fast Approximation technique in detail in Algorithm 2.

Algorithm 2: Fast Approximation
Data: Graphs A and B with N vertices V. Result: Subset V * representing the set of nodes which comprise the differential sub-network & p-values for GHD measure. V * = {} // Empty Set for differential sub-network nodes. V K = V // Initialize a copy of the set of vertices V.
// Estimate cGHD value after removal of node i. Add t to V K . // Perform in parallel.

Sort V K in descending order and keep in
Append n * to V * . N = N − 1 Adjust the p-values for false-discovery rate.
From our experiments, we observe that the results of the Closed-Form approach and the Fast Approximation technique are identical. Although, in the case of Closed-Form approach, we calculate closed-form contribution of each node in the cGHD value and remove the node with least contribution, while in case of Fast Approximation we select that node after removal of which cGHD value becomes maximum, the ordered list O obtained for both the methods is identical. Moreover, the computational complexity of the Fast-Approximation technique is the same as that of Closed-Form approach.

Inference of the glioma networks and master regulator analysis
We used the TCGA pan-glioma samples dataset including 1250 samples (463 IDH-mutant and 653 IDH-wild-type), 583 of which profiled with Agilent microarray and 667 with RNA-Seq Illumina HiSeq (REF) downloaded from the TCGA portal. The batch effects between the two platform were corrected using the COMBAT algorithm [35]. The final gene expression data matrix includes 12,985 genes and 1250 samples. We re-constructed two gene regulatory networks belonging to two different glioma subtypes: IDH-mutant and IDH-wild-type. Both networks were re-constructed with a four step procedure that follows ARACNe [36]: i) Computation of mutual information between gene expression profiles to determine interaction between Transcription Factors (TFs) and target genes [37]; ii) data processing inequality to filter out indirect relationships [36], iii) permutation test with 1000 re-samplings to keep only statistically significant relationships. We also assembled a global glioma network using all the available 1250 transcriptional profiles using the aforementioned method. In this last case we also used intersection with transcription factor (TF) binding sites to keep only relationships due to promoter binding. We used a set of 457 TF binding sites available in the MotifDB Bioconductor package.
Master Regulator Analysis (MRA) algorithm [38] was applied to the global glioma network in order to compute the statistical significance of the overlap between the regulon of each TF (i.e. its ARACNe inferred targets) and the differentially expressed gene list (Wilcoxon-Mann-Whitney test FDR ≤ 0.05) between IDH-mutant and IDH-wild-type samples. Given a gene interaction network, generated by ARACNe and a gene phenotype signature (e.g. a set of differentially expressed genes), the MRA algorithm computes for each TF the enrichment of the phenotype signature in the regulon of that TF. The regulon of a TF is defined as its neighborhood in the gene interaction network. There are two different methods to evaluate the enrichment of the signature in the regulon. One method uses the statistical Fisher's exact test, while the other approach uses Gene Set Enrichment Analysis (GSEA). Here we used this last method.
A Master Regulator (MR) gene is a TF which regulon exhibit a statistical significant enrichment of the given phenotype signature.

Validation in the Rembrandt dataset
We used an independent dataset to perform the same analysis of network differencing between IDH-mutant and IDH-wild-type gliomas and check the the overlap between the two analyses. Raw gene expression (Affymetrix U133 Plus 2.0) from the publically available Repository for Molecular Brain Neoplasia Data (Rembrandt) (https:// caintegrator.nci.nih.gov/rembrandt/) included 444 samples divided in 218 Glioblastoma, 148 Astrocytoma, 67 Oligodendrogliomas and 11 mixed histologies. Expression subtype and IDH status was inferred from gene expression following the procedure in [39] resulting in 153 wild-type and 162 mutant samples. These two set of expression profiles were used to generate two regulatory networks using the same approach reported above.

Results and discussion
For all our experiments, we used the Closed-Form approach (since results obtained from Closed-Form and Fast-Approximation techniques are identical) and compare it with the dGHD method [28].

Cosine similarity and topological overlap
The one-step topological overlap measure used to estimate the edge weights is defined as: In this work we use the cosine similarity to calculate the edge weights a ij . The cosine similarity takes into consideration one-step neighborhood of nodes i and j while constructing the edge weight and is very efficient to calculate for sparse matrices. The weights a ij are estimated as follows: where A ij represents the adjacency matrix. We perform an experiment to calculate the correlation between the one-step topological measure and the cosine similarity measure. For this experiment, we generated 250 random geometric networks using N = 250 and the connectivity parameter d = 0.15. Figure 1 shows that the cosine similarity metric is nearly perfectly correlated (Pearson correlation = 0.952) to the topological overlap measure.

Sensitivity to θ
In this experiment, we check the sensitivity of the proposed Closed-Form approach w.r.t. the heuristic θ. For this experiment, we first generated 100 random geometric (RG) networks. In a RG network nodes are generated by uniformly sampling N points on [0, 1] 2 . An edge is then drawn between these points if the Euclidean distance between the points is less than a parameter d. This parameter d controls the density of the RG network where smaller values of d result in sparse networks while larger values of d generates dense networks. In our case, we conducted experiments using two different settings. In the first case, we use d = 0.15, while in the second setting, we use d = 0.3. For both experiments we fix N = 250. For each value of d and for each generated RG network A, we permute the first 50 rows and columns of the network to generate network B. Therefore, the first 50 nodes in networks A and B form the gold-standard.
In order to test the sensitivity of the proposed approach w.r.t. θ, we estimate the fraction of permuted nodes correctly identified by the Closed-Form method for various values of θ. We used a grid of θ values varying from = 10 −50 , . . . , 10 −300 in multiplicative steps of 10 −20 . The goal of this experiment is to show that the fraction of correctly identified nodes w.r.t. various θ ∈ remains nearly constant for smaller values of θ. Figure 2 shows the result for RG networks with density parameter d = 0.15 and d = 0.3. From Fig. 2, we observe that the median fraction of permuted nodes identified by the proposed approaches increases slowly before it converges to a nearly constant value as we decrease the threshold θ (i.e. increase absolute log of threshold θ). From this experiment, we conclude that the fraction of truly differential nodes identified by the proposed methods increases as we decrease the threshold θ before it starts to converge for smaller values of threshold θ.
We performed further experiments using different θ for various values of N and observed that threshold θ behaves similarly independent of the value of N. We used the θ = 10 −50 as heuristic cut-off for future experiments.

Predictive performance comparison
Experimental Setup: The next simulation study that we carried out was to compare the predictive performance of the proposed approach w.r.t. the dGHD [28] technique. For this experiment, we generate 100 RG networks with N = 1, 000. For the first experiment we fix the density parameter d = 0.15 and permute first 100 nodes in network A to obtain network B. Thus, these first 100 nodes form the differential sub-network for the paired networks A and B.
In the second case, we use the density parameter d = 0.3 to generate the edges for network A. We then generate a small RG network with 100 nodes using density parameter d = 0.5. This small dense sub-network is then used to replace the network formed by first 100 nodes in the original network A to form network B. Thus, in the second experiment, these 100 nodes form the differential subnetwork for the paired networks A and B. This kind of mechanism can appear in real-life networks, for example, in case of cancer the transcription activity of some set of genes might get enhanced or suppressed in patients resulting in more or fewer edges in a sub-network of the gene or DNA methylation network. Hence, the networks generated in the first case are much sparser in comparison to the networks generated in the second case.
Evaluation Metrics: We define the following terms to be used in our analysis: • True Positives (TP) -Refers to the nodes that are correctly identified as part of a differential network. • False Positives (FP) -Refers to the nodes that are incorrectly identified as part of a differential network. • False Negatives (FN) -Refers to the nodes that are part of the differential sub-network but are not identified correctly as part of the sub-network. • True Negatives (TN) -Refers to the nodes that are correctly identified as nodes which are not part of the differential sub-network A * and B * .

ROC and PR curve comparisons:
We generate two set of plots including the receiver operating characteristic (ROC) curves and the precision-recall (PR) curves. To generate the plots as shown in Fig. 3, we use the 'ROCR' [40] package in R. It generates relatively smooth curves by automatically using different thresholds to estimate the true positive rate i.e.  The data in Fig. 3a and c shows that Closed-Form approach achieves better performance in case of differential sub-networks formed by permuted nodes and sub-networks with higher density. One of the reasons for relatively poor performance of the dGHD approach is that it has low true positive rate (TPR) and a high false positive rate (FPR) when the network has more edges. This is also reflected by the relatively low Recall and Precision values for the dGHD algorithm in Table 2 when d = 0.3 and d = 0.5. From Fig. 3c, we can observe that the performance of both the dGHD and Closed-Form algorithm improves w.r.t. ROC when the differential subnetwork is denser than the remaining network. However, the gap between the PR curves of Closed-Form and dGHD methods increases when the differential sub-network is denser.
AUC comparison: For all further simulated experiments, we use p-value 0.01 as cut-off in order to determine TP, TN, FP and FN respectively. We also evaluated the area under the ROC curve (AUC_ROC [41]) and area under PR curve (AUC_PR [41]) for 100 runs of Closed-Form and dGHD methods (using p-value 0.01 as cut-off ) as shown in Fig. 4.
We observe from Fig. 4a and b that the dGHD method has lower variance w.r.t. AUC_ROC and AUC_PR metrics in comparison to Closed-Form approach in the case of permuted differential sub-network. However, in case of denser differential sub-network, the Closed-Form approach has much smaller variance in comparison to dGHD algorithm w.r.t. AUC_ROC and AUC_PR metrics as depicted in Fig. 4c and d respectively. This suggests that the performance of Closed-Form technique is better than dGHD method when differential sub-networks are formed either using permuted nodes or higher density. In order to test for significance we performed the Student's t-test under the null that the difference in the mean values of the two ROC distributions is zero i.e. μ AUC_ROC A − μ AUC_ROC B = 0. At a significance level of 5%, we obtain   p-value of 0.48 in case of permuted sub-network, thereby accepting the null i.e. the difference between the two distributions is not significant. In the case of paired networks with a denser differential sub-network (i.e. d = 0.5), we obtain p-value of 3.42 × 10 −14 for the Student's t-test, thereby rejecting the null. Similarly for the two PR distributions we obtained p-value of 0.42 in case of permuted sub-network and p-value of 2.64 × 10 −20 for the denser differential sub-network.

Comparison with community detection techniques
The task of identifying differential sub-networks can also be rephrased as one of finding heavy sub-networks on a single network (say C) constructed by considering the absolute difference in the edge weights between the topological graph of network A and the topological graph of network B i.e. C ij = a ij − b ij , ∀i, j ∈ V . This problem can then be construed as one of identifying dense modules in the network C i.e. from the previous experiments we want to discover a module corresponding to the set of nodes which have permuted or identify the denser sub-network forming the differential sub-network as a module.
The task of identifying dense/heavy modules in a network (C) is often referred as community detection or graph partitioning or graph clustering. There is a plethora of research associated with the problem of community detection including [42][43][44][45][46][47][48][49]. Several of these methods such as jActiveModules [50] and Spinglass algorithm [45] have also been applied to identify biologically meaningful modules (like functional modules, protein complexes, disease associated genes etc.) in biological networks as shown in [51,52]. For our task of identifying dense modules in network C we applied 3 different community detection methods namely Louvain [43], Infomap [44] and Spinglass [45] techniques to have a comprehensive comparison with the proposed Closed-Form approach. We used the implementation of these methods available in the 'igraph' package in R and run each of these methods at their default settings.
We used the same set of RG networks as in the previous experiments to have a comparison with the community detection techniques. Since we are considering the difference in the topology of networks A and B in network C, we remove all the similarity between the two networks and the module with the maximum internal volume (i.e. total weight of edges within the community) is the one capturing the maximum difference between the topologies of networks A and B. Hence, we consider the densest inferred module as the one comprising the differential sub-network and label all the nodes belonging to this cluster as differential while all the other modules are considered non-differential. Using this notion to label the inferred communities, we compare the results obtained for the 3 different community detection techniques w.r.t. the gold standard (i.e. the actual set of labeled nodes which either belong to the permuted subnetwork or belong to the denser sub-network) in a binary classification framework [53,54]. These results are integrated in Table 2 along with the results of dGHD technique and the proposed Closed-Form (CF) approach. We assess the results obtained from the 3 community detection methods w.r.t. several quality metrics commonly used for binary classification including Precision, Recall, Kappa, Accuracy, Specificity, AUC_ROC and computational time. From Table 2, we observe that the Louvain method clearly outperforms the Infomap and Spinglass techniques in correctly identifying the differential subnetwork as a module with respect to the various evaluation metrics.

Simulated result analysis
Finally, the summary Table 2 highlights the computational efficiency and better predictive capabilities of the proposed technique in comparison to dGHD algorithm. For this comparison, we report the results obtained on 100 random runs of RG networks with N = 1000, d = 0.15 and d = 0.3 respectively, where the first 100 nodes are permuted. We also report results when the first 100 nodes form the denser differential sub-networks i.e. in experiments where d = 0.15 use d = 0.3 to form denser sub-network and where d = 0.3 use d = 0.5 to form denser sub-network. We also conducted experiments on undirected Power Law (PL) graphs using N = 1000 and E = 10, 000 with power law exponents α = {2, 3} respectively. We permuted the first 100 nodes of each PL network (B) to form the permuted network (A). We performed 100 random runs and report the mean values for various evaluation metrics. Table 2 compares the Closed-Form, Louvain, Infomap, Spin-glass and dGHD techniques w.r.t. various standard evaluation metrics like AUC, Precision, Recall, Accuracy, Specificity, Kappa statistic and computational Time for all the simulation experiments. Higher values of these evaluation metrics represents better quality results. Here the time required by dGHD algorithm is normalized to 1 and the time required by the other algorithms is scaled by the same normalization factor.
We observe from Table 2 that the Closed-Form approach performs exceedingly well in case of experiments on denser RG networks (d = 0.3) and PL graphs. It emerges as the best method on these networks for various evaluation metrics. For this configuration, in case of both permuted and denser differential sub-networks, the mean AUC_ROC of Closed-Form approach is at least 10% higher than the dGHD algorithm. This is also reflected in higher values of Precision (0.714 and 0.771) and Recall (0.789 and 0.930) metrics for Closed-Form approach in comparison to low values of Precision (0.645 and 0.7) and Recall (0.577 and 0.731) for the dGHD algorithm in case of these experiments.
However, in case of sparse networks where its relatively easier to identify differential sub-networks ( [28]), both Closed-Form and dGHD method have similar predictive performance. For sparse networks, the Louvain method nearly outperforms all other methods for the task of identifying the differential sub-network as a module. From Table 2, we observe that the 3 community detection techniques have nearly perfect Recall scores but usually have relatively low Precision values. This indicates that these methods correctly identify all the nodes forming the differential sub-network but also detect a large quantity of false-positives in the densest module, thereby reducing the Precision values. The Louvain and Infomap methods are extremely fast and interestingly the Louvain method has highest Precision (0.887) which is at least 10% higher than dGHD algorithm and 5% higher than Closed-Form approach while identifying the dense differential subnetwork in a sparse network (d = 0.15, d = 0.3) as shown in Table 2.
We observe that among the community detection techniques the Louvain method is the most efficient and is highly competitive with the dGHD algorithm but cannot outperform the Closed-Form approach on denser networks and Power Law graphs.

Case study in glioma
As a case study, we performed the differential subnetworks analysis of two gene regulatory networks reconstructed from the glioma dataset available on the TCGA. It is well known that the majority of gliomas are divided into two main macro-categories according to the mutation of the gene IDH1 [15,17,55]. Therefore, an important biological question, that motivated the development of the reported methodology, was to identify the sub-networks of of transcription factors (TFs) having a different regulatory program in these two major conditions. We re-constructed two gene regulatory networks belonging to two different glioma subtypes: IDH-mutant and IDH-wild-type as reported in the "Methods" Section.
In our final networks we have 457 TFs and 4,085 targets. We observe that these networks consist of 13,683 unique connections for IDH-mutant and 14,158 for IDHwild-type between TF-TF and TF-target. Using these networks, we construct two unipartite topological graphs as described in the Methods section for the 457 TFs. We then perform the proposed differential sub-network analysis to identify the TFs which are part of differential sub-networks in these topological graphs. Figure 5 shows the significant differential sub-networks and Table 3 reports the topmost TFs which are part of differential sub-networks as detected by our algorithm. In this table, GHD and μ π represent the generalized Hamming Distance and its asymptotic mean between the subgraphs after removing the specific transcription factor in each row of the table. Additional file 1: Table S1 instead reports the results for all the 457 considered transcription factors.
In order to highlight the difference of Closed-Form approach with other standard network analysis methods, we also assembled a global glioma network using all the available transcriptional profiles using the same method described above and performed a master regulator analysis [38] with respect to the molecular phenotype under investigation, i.e. genes differentially expressed between IDH mutant and wild type. Master regulator analysis is extensively adopted to identify TFs that act as principal regulators in driving the phenotype from one condition to another.
Interestingly, among the topmost TFs (out of 457) forming the differential sub-networks, we found several genes known to have a central role in controlling specific glioma subtypes as well as novel candidates that deserve further biological validation. In particular, differential network analysis reveals that the sub-network of STAT3 is one the most different between IDH-mutant and IDH-wild-type networks and a particularly significant Master Regulator of this wild-type phenotype. Members of our group have previously shown that STAT3, together with C/EBPβ, is a key regulator of the mesenchymal differentiation and predicts the poor clinical outcome of IDH-wild-type gliomas [38]. Another key regulator of the IDH-wild-type gliomas was recently reported by using an integrative functional copy number analysis is the set of HOXA genes [17]. Moreover, another key network hub that the algorithm detects as different is SOX10 which appears to be an active master regulator of the IDH-mutant phenotype. We recently reported that the GCIMP-low subgroup in the IDH-mutant cohort can mediated by loss of CpG methylation and binding of SOX factors [17]. Furthermore, our algorithm identifies methyl-CpG-binding domain protein 2 (MBD2) as a differential network hub. In particular, MBD2 has no links in the IDH-wild-type network whereas it is highly connected in the IDH-mutant network where it is characterized by the CpG island methylator phenotype (GCIMP) [56]. Further investigation is needed to claim such a hypothesis as MBD2 is known also as a mediator of the epigenetic gene regulation and its role in Glioblastoma is being studied as its over-expression may drive tumor growth by suppressing the anti-angiogenic activity of key tumor suppressors [57].
The differential network method highlights several other TFs as hubs of differential sub-networks which are not detected with standard MRA. For example, ETV1 and ETV4 which are over-expressed in gliomas of the Codel subtype carrying the mutation of the CIC gene [58]. Another differential sub-network hub not detected by standard MRA is the tumor suppressor RFX1 whch has been identified as an important target/regulator of the malignancy of Glioblastoma [59], where as the cell cycle regulators such as E2F1 and E2F1, which play a role in progression of IDH-mutant glioma are also detected by the Closed-Form algorithm [60].
An important warning that we want to mention is the presence of potential confounding effects due to the adopted dataset obtained by merging the expression profiles from two different platforms. With the additional difficulty that the distribution between IDHwild-type tumors and IDH-mutated tumors is unequal between the two platforms (92% of microarray data are wild-type). We adopted this integrated dataset in order to build the two IDH networks and the global glioma network. The main computation in this case is the estimation of the mutual information between pairs of gene profiles (variables) in a set of observations (patients) and each individual pair of values is always extracted in the same platform. We used a robust knearest neighbor estimator proposed in [61] available in the PARMIGENE R package [62]. This estimator is not based on binning of values and is non parametric, working on the geometry of the scatterplot of each pair of gene expression values. Therefore, each observation (sample) can be seen as another evidence of dependency (or in-dependency) between the variables regardless to the platform. Although, we found this merged dataset useful for the estimation of dependencies between genes, its adoption for deriving conclusions in terms of sample groups and pathway analysis should be made with caution.
As a further independent experiment, we performed the same analysis using the REMBRANDT dataset with the network differential analysis on the two networks independently built with ARACNe and the Master Regulator Analysis on the global network. The Table 4 reports the results for the most different TF sub-networks detected by the Closed-Form algorithm on this dataset. Interestingly of the top nine differential nodes obtained in the TCGA Fig. 5 Differential sub-networks between IDH-mutant and IDH wild-type detected by the closed form approach. In red the connection present only in the IDH-mutant sub-network, while in green those present only in the IDH-wild-type sub-network. In black are represented common connections dataset five (FOXJ3, NFIA, CREB1, SOX10, KLF13) are also detected as significant in the REMBRANDT dataset suggesting that these TFs have a very different regulatory program in glioma subtypes. Moreover, differently from the TCGA experiment, we observe a significant overlap between the results of Closed-Form and that of the MRA. In particular 70 of the 75 nodes forming the differential sub-networks are also enriched in the MRA (p-value of the The columns reports the differential measures in terms of Z-score of the proposed differencing test (Eq. (2)), the GHD computed between the two networks, the mean of the null GHD distribution. The last column reports the False Discovery Rate of the GSEA enrichment obtained with a Master Regulator Analysis  The columns reports the differential measures in terms of Z-score of the proposed differencing test (Eq. (2)), the GHD computed between the two networks, the mean of the null GHD distribution. The last column reports the False Discovery Rate of the GSEA enrichment obtained with a Master Regulator Analysis Fisher exact test: 3.3810 −9 . However, in this case the number of significant master regulators is considerably higher than that obtained in the TCGA case (297 vs. 144).

Conclusion
The comparison of gene expression profiles across different phenotypes is enabling the discovery of novel biomarkers for prognosis or diagnosis. They hold the key to identify novel targets for therapeutical intervention.
In this paper, we proposed an improvement to the state-of-the-art for comparing two labeled/unlabeled graphs that are representative of two conditions (e.g. the macro-categories according to the mutation of the gene IDH1 in our case study) and identifying statistically significant differences in their topology. We used the centralized GHD (cGHD) metric [28] to calculate the distance between the two labeled networks. We proposed a Closed-Form approach, an improvement to the dGHD algorithm, to detect localized topological differences between paired networks. The Closed-Form approach calculates the closed-form contribution of each node in the cGHD metric and efficiently removes nodes with the smaller contributions in the cGHD value. From our experiments on scale free random geometric networks, we discovered that the Closed-Form approach was 10-15x faster than dGHD from a computational complexity point of view. For differential sub-network analysis in very sparse paired graphs, both the Closed-Form and dGHD methods had good predictive performance. They reached mean AUC values of ≈ 0.935 and ≈ 0.926 respectively for 100 random runs of simulation experiments. However, for relatively denser networks, the Closed-Form approach outperformed dGHD. The proposed method achieved a mean AUC of ≈ 0.877 while the dGHD technique reached a mean AUC of ≈ 0.724. The Closed-Form approach also achieved much higher Precision, Recall and Kappa values in comparison to the dGHD method for relatively denser networks. We applied our algorithm to detect the main differences between the networks of IDH-mutant and IDH-wild-type glioma tumors and show that it correctly selects subnetworks centered on important key regulators of these two different subtypes. The adopted dataset is the result of the merging of two different profiling platforms and, as reported in the Results section, its use for other purposes should be made with caution. We also report the results on the same data using standard Master Regulator Analysis on a global network, and show the overlap between the experiments. Indeed, it is known that MRA tends to have many false positives due to correlations between TF profiles which could eventually attenuated with synergy and shadow analysis. On the contrary, the Closed-Form algorithm for network differencing tends to be more conservative as also suggested by the fact that only the significantly different sub-networks are detected in both datasets.

Additional file
Additional file 1: Table S1. Description of data: GHD and MRA Results for all the 457 considered transcription factors on the TCGA and Rembrandt datasets. (XLSX 62.7 kb)