Information theoretic approaches for inference of biological networks from continuous-valued data
© The Author(s) 2016
Received: 25 February 2016
Accepted: 23 August 2016
Published: 6 September 2016
Characterising programs of gene regulation by studying individual protein-DNA and protein-protein interactions would require a large volume of high-resolution proteomics data, and such data are not yet available. Instead, many gene regulatory network (GRN) techniques have been developed that leverage the wealth of transcriptomic data generated by recent consortia to study indirect, gene-level relationships between transcriptional regulators. Despite their popularity, existing GRN inference methods exhibit limitations that we highlight and address through the lens of information theory.
We introduce new model-free and non-linear information theoretic measures for the inference of GRNs and other biological networks from continuous-valued data. Although previous tools have implemented mutual information as a means of inferring pairwise associations, they either introduce statistical bias through discretisation or are limited to modelling undirected relationships. Our approach overcomes both of these limitations, as demonstrated by a substantial improvement in empirical performance for a set of 160 GRNs of varying size and topology.
The information theoretic measures described in this study yield substantial improvements over previous approaches (e.g. ARACNE) and have been implemented in the latest release of NAIL (Network Analysis and Inference Library). However, despite the theoretical and empirical advantages of these new measures, they do not circumvent the fundamental limitation of indeterminacy exhibited across this class of biological networks. These methods have already found value in computational neurobiology, and will likely gain traction for GRN analysis as the volume and quality of temporal transcriptomics data continue to improve.
Keywords: Gene regulatory network; Transcriptional regulation; Gene expression
Although it is well-established that networks of molecular interactions underlie critical cellular functions including development, differentiation and homeostasis, accurate reconstruction of network topologies using only gene expression data is a difficult problem that has received much attention in recent years [1–3]. Gene regulatory networks (GRNs) assume that active regulatory interactions can be captured as weighted, pairwise associations between genes and, accordingly, that complex interactions (e.g. between RNAs and proteins) may be mapped onto this level.
In 2007, the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge was launched to promote and advance research in network-based analyses of biological data [4, 5]. Early DREAM challenges focused primarily on simulated data-sets, whereby a ‘true’ network topology was used to generate artificial gene expression data [6, 7]. Entrants developed algorithms to reconstruct this network from the expression data alone, with performance evaluated empirically against the supposed true network. Subsequent DREAM challenges have introduced experimental data, with mRNA transcript abundance quantified using qPCR (quantitative polymerase chain reaction), microarray and now high-throughput RNA-seq technologies. Although more biologically relevant than simulated data, ‘true’ network topologies for these systems remain fragmentary approximations of gene regulatory interactions and are thus inappropriate for benchmarking.
Although correlation is a straightforward method for assigning network edge weights, it has several fundamental limitations. Firstly, Pearson’s r assumes that X and Y are normally distributed and can identify only linear relationships, which can be unsuitable in the context of qPCR, microarray or RNA-seq-quantified transcript abundance. Rank-based correlation metrics such as Spearman’s ρ and Kendall’s τ coefficients are often applied to partially correct for this issue. Secondly, correlation is a symmetric measure ($r_{X,Y} = r_{Y,X}$) and thus cannot infer the directionality or causality of biological interactions, even when applied to the analysis of appropriate time-series or gene knock-down data.
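To make these two limitations concrete, the following minimal Python sketch compares Pearson’s r with a rank-based coefficient on a noise-free but non-linear monotonic relationship, and shows that swapping the arguments leaves the coefficient unchanged. It is an illustration only and not part of the analysis reported here: the helper names `pearson` and `spearman`, the exponential relationship and the random seed are all assumptions, and the rank transform assumes there are no tied values.

```python
import numpy as np


def pearson(x, y):
    """Pearson's r: strength of the *linear* association between x and y."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return pearson(rank_x, rank_y)


if __name__ == "__main__":
    rng = np.random.default_rng(1)        # hypothetical seed
    x = rng.uniform(0.0, 5.0, size=200)   # synthetic 'regulator' abundance
    y = np.exp(2.0 * x)                   # monotonic but strongly non-linear

    print("Pearson  r(x, y):", pearson(x, y))
    print("Spearman rho(x, y):", spearman(x, y))
    # Symmetry: neither coefficient can indicate direction of regulation.
    print("Pearson  r(y, x):", pearson(y, x))
```

The rank-based coefficient recovers the monotonic dependence that Pearson’s r understates, but both remain symmetric, so neither can distinguish a regulator from its target.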
Despite the widespread practical utility of GRN analysis, recent studies have made broad-stroked dismissals of the field on theoretical bases. Instead, we take a two-fold approach of (a) building upon previous inference methods by leveraging the latest information theoretic advancements, and (b) discussing alternative modelling approaches that better suit some scenarios. Our measures have been implemented in post-publication versions of NAIL [8, 9], and we refer the reader elsewhere for our general commentary on the conundrum of reconciling theoretical versus empirical performance bounds.
This section describes how recent innovations in information theory may be effectively applied to infer network connectivity from continuous-valued biological data. Although described in the context of GRN inference, the following techniques can be applied to many other domains (e.g. inference of neural, proteomic or metabolomic networks).
Information theoretic measures for biological network inference
The mutual information $I_D(X;Y)$ is interpreted as the symmetric quantity of information ‘shared’ by X and Y, i.e. $I_D(X;Y) = I_D(Y;X)$. Mutual information makes no assumptions regarding the distribution or linearity of relationships between transcript abundance values.
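For readers who prefer a computational view, the sketch below evaluates the standard textbook definition of mutual information for discrete variables, I(X;Y) = Σ p(x,y) log[p(x,y) / (p(x)p(y))], from a table of joint counts. We assume this is the definition intended here; the function name `discrete_mutual_information` and the example counts are hypothetical, and the result is expressed in nats.

```python
import numpy as np


def discrete_mutual_information(joint_counts):
    """Mutual information (in nats) from a 2-D table of joint counts.

    Rows index the states of X and columns the states of Y.
    """
    counts = np.array(joint_counts, dtype=float)
    p_xy = counts / counts.sum()               # joint distribution p(x, y)
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal p(x), column vector
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal p(y), row vector
    independent = p_x @ p_y                    # p(x) p(y) under independence
    nonzero = p_xy > 0                         # 0 log 0 is taken as 0
    return float(np.sum(p_xy[nonzero] * np.log(p_xy[nonzero] / independent[nonzero])))


if __name__ == "__main__":
    table = [[10, 2], [3, 15]]                 # hypothetical joint counts
    print(discrete_mutual_information(table))
    # Transposing the table swaps X and Y and leaves the value unchanged.
    print(discrete_mutual_information(np.transpose(table)))
```

Applying such an estimator to expression data first requires binning the continuous values, which is the discretisation step that the continuous-valued estimators discussed next are intended to avoid.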
Mutual information estimators for continuous-valued data
Unlike the Gaussian distribution model, kernel density estimation allows for the model-free identification of non-linear relationships between gene expression levels. However, it is both statistically biased and sensitive to the selection of kernel bandwidth [31, 32]. To provide a bias-corrected and robust method for continuous-valued MI estimation, we instead implement and evaluate two variants of the Kraskov, Stögbauer and Grassberger (KSG) algorithm.
which is more accurate for large values of M and thus more appropriate for large (genome-wide) GRN inference. Both of these algorithms correct for bias and have been empirically demonstrated to be robust to the selection of K.
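As an illustration of how a nearest-neighbour estimator of this kind operates, the following is a minimal brute-force sketch of the first KSG estimator as published by Kraskov et al., I ≈ ψ(k) + ψ(N) − ⟨ψ(n_x + 1) + ψ(n_y + 1)⟩, using the maximum norm in the joint space. It is not the NAIL implementation: the function names, the default `k=4`, the O(N²) distance computation and the restriction to two scalar variables are assumptions made for brevity. The argument `k` plays the role of K above, and the result is in nats.

```python
import numpy as np


def digamma_int(n):
    """Digamma at positive integers: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    n = np.asarray(n, dtype=int)
    harmonic = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, n.max()))))
    return -np.euler_gamma + harmonic[n - 1]


def ksg_mutual_information(x, y, k=4):
    """KSG (variant 1) estimate of I(X;Y) in nats for two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples = len(x)

    # Pairwise distances in each marginal space and in the joint space
    # (maximum norm), excluding each point's distance to itself.
    dx = np.abs(x[:, None] - x[None, :])
    dy = np.abs(y[:, None] - y[None, :])
    np.fill_diagonal(dx, np.inf)
    np.fill_diagonal(dy, np.inf)
    dz = np.maximum(dx, dy)

    # Distance from each point to its k-th nearest neighbour in joint space.
    eps = np.sort(dz, axis=1)[:, k - 1]

    # Number of neighbours strictly closer than eps in each marginal space.
    n_x = np.sum(dx < eps[:, None], axis=1)
    n_y = np.sum(dy < eps[:, None], axis=1)

    return float(
        digamma_int(k)
        + digamma_int(n_samples)
        - np.mean(digamma_int(n_x + 1) + digamma_int(n_y + 1))
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)             # hypothetical seed
    a = rng.normal(size=500)
    b = rng.normal(size=500)                   # independent of a
    c = 0.9 * a + np.sqrt(1 - 0.81) * rng.normal(size=500)  # correlated with a
    print("independent pair:", ksg_mutual_information(a, b))
    print("correlated pair: ", ksg_mutual_information(a, c))
```

No binning or bandwidth is required; the only free parameter is the neighbour count, which is why robustness to its selection matters in practice.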
Extensions to information theoretic network inference
Transfer entropy (TE) allows directional gene regulatory associations to be inferred, which can be interpreted as capturing evidence of causal relationships. Under the erroneous assumption that gene expression values are normally distributed, TE reduces to Granger Causality, which has previously been applied to sparse vector autoregressive (SVAR) inference of GRNs from microarray data.
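The asymmetry of TE is easiest to see in the Gaussian special case mentioned above, where it can be written as half the log-ratio of residual variances from two nested linear regressions. The sketch below uses that special case purely because it is compact; it is not the kernel or KSG-based TE estimator evaluated in this study. The history length of one time step, the function names and the simulated coupling coefficients are all assumptions.

```python
import numpy as np


def gaussian_transfer_entropy(source, target):
    """TE(source -> target) in nats under a linear-Gaussian model, lag 1.

    Compares predicting target[t] from target[t-1] alone (restricted model)
    with predicting it from target[t-1] and source[t-1] (full model).
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    future = target[1:]
    target_past = target[:-1]
    source_past = source[:-1]
    ones = np.ones_like(target_past)

    def residual_variance(design):
        coef, *_ = np.linalg.lstsq(design, future, rcond=None)
        return np.var(future - design @ coef)

    restricted = residual_variance(np.column_stack([ones, target_past]))
    full = residual_variance(np.column_stack([ones, target_past, source_past]))
    return float(0.5 * np.log(restricted / full))


def simulate_driven_pair(n_steps=2000, seed=0):
    """Hypothetical toy system in which x drives y with a one-step delay."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    for t in range(1, n_steps):
        x[t] = 0.5 * x[t - 1] + rng.normal()
        y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()
    return x, y


if __name__ == "__main__":
    x, y = simulate_driven_pair()
    print("TE(x -> y):", gaussian_transfer_entropy(x, y))
    print("TE(y -> x):", gaussian_transfer_entropy(y, x))
```

In this toy system the estimate in the driving direction is clearly larger than in the reverse direction, which is the property that a symmetric measure such as MI cannot provide.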
Data and evaluation
where $x_i$ is the ‘concentration’ of gene product i, $N_A$ and $N_I$ are the number of activators and inhibitors (with concentrations A and I) and K represents the concentration at which activating/inhibiting effects are half their saturated value. The efficiency of mRNA transcription and degradation for the i-th gene are parameterised by $\alpha_i$ and $\beta_i$ respectively, and n controls the sigmoidicity of the function.
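The symbols defined above are consistent with a multiplicative Hill-type rate law of the kind used for artificial gene networks. The sketch below assumes one such form, dx_i/dt = α_i ∏_j K^n / (I_j^n + K^n) ∏_k (1 + A_k^n / (A_k^n + K^n)) − β_i x_i. This exact expression, the use of a single K and n for all regulators, the default parameter values and the function name `hill_rate` are assumptions for illustration and should not be read as a reproduction of the benchmark equation.

```python
import numpy as np


def hill_rate(x_i, activators, inhibitors, alpha=1.0, beta=1.0, K=1.0, n=2.0):
    """Assumed rate of change of gene product i under Hill-type regulation.

    activators / inhibitors: concentrations of the N_A activating and
    N_I inhibiting regulators of gene i (either may be empty).
    """
    activators = np.asarray(activators, dtype=float)
    inhibitors = np.asarray(inhibitors, dtype=float)
    # Each inhibitor scales synthesis by a factor falling from 1 towards 0.
    inhibition = np.prod(K**n / (inhibitors**n + K**n))
    # Each activator scales synthesis by a factor rising from 1 towards 2.
    activation = np.prod(1.0 + activators**n / (activators**n + K**n))
    return float(alpha * inhibition * activation - beta * x_i)


if __name__ == "__main__":
    # With a regulator at concentration K, its effect is half-saturated.
    print(hill_rate(0.0, activators=[1.0], inhibitors=[]))
    print(hill_rate(0.0, activators=[], inhibitors=[1.0]))
```

Under this assumed form, an unregulated gene settles at x_i = α_i / β_i, and larger values of n make the response to each regulator more switch-like.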
Results and discussion
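Table 1 Column headings: AUC (Mutual Information); AUC (Transfer Entropy). Recovered row label: Kernel (ARACNE). The remaining rows and values are not available in this text.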
It is evident that the performance of MI-based inference of undirected GRNs is comparable to random guessing (Table 1; theoretical AUC = 0.5). Application of the DPI yielded no significant improvement for any MI estimator. These results are consistent with recent studies which found that the most sophisticated GRN algorithms perform no better than simple correlation-based inference, due to the fundamental limitation of considering only pairwise expression relationships. A detailed analysis by Maetschke et al. demonstrated that the utility of these techniques is limited to small networks with star-like topologies and that exclusively contain activating or inhibiting interactions. Several common regulatory network motifs have since been identified that are particularly difficult to infer. Moreover, Krishnan et al. have provided a theoretical explanation as to why many non-trivial GRNs are unable to be reverse-engineered from expression data alone; i.e. multiple dissimilar networks produce indistinguishable abundance profiles due to latent protein-mediated effects.
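For reference, an AUC of this kind can be computed directly from a matrix of inferred edge weights and the known adjacency matrix, as the probability that a randomly chosen true edge receives a higher weight than a randomly chosen non-edge. The sketch below is a generic illustration under our own assumptions (function name `edge_auc`, exclusion of self-edges, ties counted as one half) and is not the evaluation code used to produce Table 1.

```python
import numpy as np


def edge_auc(scores, truth):
    """Area under the ROC curve for predicted edge weights.

    scores: square matrix of inferred edge weights (higher = more confident).
    truth:  square boolean adjacency matrix of the reference network.
    Self-edges on the diagonal are ignored.
    """
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    off_diagonal = ~np.eye(truth.shape[0], dtype=bool)
    positives = scores[truth & off_diagonal]
    negatives = scores[~truth & off_diagonal]
    # Compare every true edge with every non-edge; ties count as one half.
    difference = positives[:, None] - negatives[None, :]
    return float(np.mean((difference > 0) + 0.5 * (difference == 0)))


if __name__ == "__main__":
    truth = np.array([[0, 1, 0],
                      [0, 0, 1],
                      [0, 0, 0]], dtype=bool)   # hypothetical 3-gene network
    rng = np.random.default_rng(0)
    print("random scores:", edge_auc(rng.random((3, 3)), truth))
    print("ideal scores: ", edge_auc(truth.astype(float), truth))
```

Uninformative scores give an expected value of 0.5, which is the baseline against which the MI-based results above should be read. For an undirected measure the score matrix is symmetric, so a directed reference network can never be recovered perfectly.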
The inclusion of directed information transfer to extend GRN inference yielded improved performance across all networks, with all TE-based methods performing significantly better than random (presumably because these measures are better able to capture activation and inhibition relationships, which are inherently directional). These methods outperformed other Mendes-benchmarked algorithms applying variants of correlation or MI estimation [11, 20, 30], and thus both kernel and KSG-estimated TE have been implemented for causal network inference in the latest version of NAIL. To our knowledge, this is the most comprehensive set of information theoretic tools available for biological network inference. NAIL is available to download from https://sourceforge.net/projects/nailsystemsbiology/.
Previous GRN inference frameworks have implemented mutual information as a means of inferring pairwise gene-level associations (e.g. minet, relevance networks, MRNET and ARACNE). However, these tools either introduce statistical bias through discretisation of expression data or are limited to modelling undirected relationships. In this article, we have proposed and evaluated new model-free and non-linear information theoretic measures that circumvent these limitations, leading to substantial improvement in empirical performance across a benchmark set of 160 synthetic GRNs.
Although NAIL is the first GRN toolkit to incorporate the measures described in this article, it does not overcome another fundamental limitation of previous models; i.e. unambiguous network reconstruction requires that the number of time samples be greater than the number of genes, and even the highest time resolution data-sets fall short by several orders-of-magnitude. To explore transcriptional regulation in the context of current data availability, we refer the reader to the emerging body of literature surrounding predictive gene expression modelling [53–55]. This class of top-down modelling leverages transcriptomic and epigenetic data as independent observations of an underlying regulatory function, thus circumventing the issue of indeterminacy inherent to GRN analysis.
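The sample-size requirement can be illustrated with a deliberately simplified linear model, in which a target gene’s expression is a weighted sum of the expression of its candidate regulators. When there are fewer samples than genes, the weights are not uniquely determined: any direction in the null space of the expression matrix can be added to them without changing the fitted values. The sketch below constructs two such weight vectors. The linear model, the function name and the dimensions are assumptions chosen for illustration, and real regulatory dynamics are non-linear.

```python
import numpy as np


def equivalent_regulator_weights(n_genes=10, n_samples=4, seed=0):
    """Build two different weight vectors that explain the same data.

    Returns (expression, weights, alternative), where expression has shape
    (n_samples, n_genes) and expression @ weights equals
    expression @ alternative up to floating-point error.
    """
    if n_samples >= n_genes:
        raise ValueError("the ambiguity shown here needs n_samples < n_genes")
    rng = np.random.default_rng(seed)
    expression = rng.normal(size=(n_samples, n_genes))
    weights = rng.normal(size=n_genes)

    # The trailing right-singular vectors span the null space of the
    # expression matrix, because its rank is at most n_samples.
    _, _, vt = np.linalg.svd(expression)
    null_direction = vt[-1]

    alternative = weights + 3.0 * null_direction
    return expression, weights, alternative


if __name__ == "__main__":
    expression, weights, alternative = equivalent_regulator_weights()
    print("largest weight difference:", np.max(np.abs(weights - alternative)))
    print("largest output difference:",
          np.max(np.abs(expression @ weights - expression @ alternative)))
```

The two weight vectors differ substantially yet produce the same target profile, so no inference method applied to these data alone could prefer one over the other.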
Despite conflicting reports of the utility of GRNs between theoretical and empirical studies, we believe that this class of network inference will continue to be of widespread value for exploring fundamental regulatory processes. Moreover, the methods described in this paper can be readily applied to computational neuroscience [56, 57] and other fields of complex systems theory [58, 59]. We encourage researchers to investigate how such network abstractions can be applied to their class of biological problems.
This work was supported by an Australian Postgraduate Award [DMB]; the Australian Federal and Victoria State Governments and the Australian Research Council through the ICT Centre of Excellence program, National ICT Australia (NICTA) [DMB]; and the Australian Research Council Centre of Excellence in Convergent Bio-Nano Science and Technology (project number CE140100036) [EJC]. The views expressed herein are those of the authors and are not necessarily those of NICTA or the Australian Research Council.
Analysis and interpretation of data: DMB and EJC. Study design and concept: DMB and EJC. Software development and data processing: DMB. Drafting the paper: DMB. Both authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Barabási AL, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011; 12(1):56–68.
- Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012; 8(1):565.
- Pe’er D, Hacohen N. Principles and strategies for developing network models in cancer. Cell. 2011; 144(6):864–73.
- Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, Allison KR, Kellis M, Collins JJ, Stolovitzky G, et al. Wisdom of crowds for robust gene network inference. Nat Methods. 2012; 9(8):796–804.
- Stolovitzky G, Monroe D, Califano A. Dialogue on reverse-engineering assessment and methods. Ann N Y Acad Sci. 2007; 1115(1):1–22.
- Prill RJ, Marbach D, Saez-Rodriguez J, Sorger PK, Alexopoulos LG, Xue X, Clarke ND, Altan-Bonnet G, Stolovitzky G. Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PloS ONE. 2010; 5(2):9202.
- Stolovitzky G, Prill RJ, Califano A. Lessons from the DREAM2 challenges. Ann N Y Acad Sci. 2009; 1158(1):159–95.
- Hurley D, Araki H, Tamada Y, Dunmore B, Sanders D, Humphreys S, Affara M, Imoto S, Yasuda K, Tomiyasu Y, et al. Gene network inference and visualization tools for biologists: application to new human transcriptome datasets. Nucleic Acids Res. 2012; 40(6):2377–98.
- Hurley DG, Cursons J, Wang YK, Budden DM, Crampin EJ, et al. NAIL, a software toolset for inferring, analyzing and visualizing regulatory networks. Bioinformatics. 2015; 31(2):277–8.
- Madhamshettiwar PB, Maetschke SR, Davis MJ, Reverter A, Ragan MA. Gene regulatory network inference: evaluation and application to ovarian cancer allows the prioritization of drug targets. Genome Med. 2012; 4(5):1–16.
- Maetschke SR, Madhamshettiwar PB, Davis MJ, Ragan MA. Supervised, semi-supervised and unsupervised inference of gene regulatory networks. Brief Bioinformatics. 2013; 15(2):195–211.
- Wang Y, Hurley D, Schnell S. Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks. PloS ONE. 2013; 8(8):72103.
- Wildenhain J, Crampin E. Reconstructing gene regulatory networks: from random to scale-free connectivity. IEE Proc Syst Biol. 2006; 153(4):247–56.
- Le Novère N. Quantitative and logic modelling of molecular and gene networks. Nat Rev Genet. 2015; 16(3):146–58.
- Krishnan A, Giuliani A, Tomita M. Indeterminacy of reverse engineering of gene regulatory networks: the curse of gene elasticity. PLoS ONE. 2007; 2(6):562.
- Budden DM, Jones M. Cautionary tales of inapproximability. J Comput Biol. (in press).
- Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948; 27(3):379–423.
- Lazo AC, Rathie PN. On the entropy of continuous probability distributions. Inf Theory IEEE Trans. 1978; 24(1):120–2.
- Meyer PE, Lafitte F, Bontempi G. minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics. 2008; 9(1):461.
- Butte AJ, Kohane IS. Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In: Pacific Symposium on Biocomputing, vol. 5. World Scientific; 2000. p. 415–26.
- Meyer PE, Kontos K, Lafitte F, Bontempi G. Information-theoretic inference of large transcriptional regulatory networks. EURASIP J Bioinform Syst Biol. 2007; 2007:8–8.
- Paninski L. Estimation of entropy and mutual information. Neural Comput. 2003; 15(6):1191–253.
- Schäfer J, Strimmer K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat Appl Genet Mol Biol. 2005; 4(1):1–30.
- Schürmann T, Grassberger P. Entropy estimation of symbol sequences. Chaos. 1996; 6(3):414–27.
- Cover TM, Thomas JA. Elements of information theory. John Wiley & Sons; 2012.
- Ross BC. Mutual information between discrete and continuous data sets. PloS ONE. 2014; 9(2):87357.
- Roulston MS. Estimating the errors on measured entropy and mutual information. Physica D Nonlinear Phenom. 1999; 125(3):285–94.
- Seok J, Kang YS. Mutual information between discrete variables with many categories using recursive adaptive partitioning. Sci Rep. 2015; 5:1–10.
- Bar-Joseph Z, Gerber GK, Gifford DK, Jaakkola TS, Simon I. Continuous representations of time-series gene expression data. J Comput Biol. 2003; 10(3-4):341–56.
- Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Favera RD, Califano A. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006; 7(Suppl 1):7.
- Kaiser A, Schreiber T. Information transfer in continuous processes. Physica D Nonlinear Phenom. 2002; 166(1):43–62.
- Schreiber T. Measuring information transfer. Phys Rev Lett. 2000; 85(2):461.
- Kraskov A, Stögbauer H, Grassberger P. Estimating mutual information. Phys Rev E. 2004; 69(6):066138.
- Beaudry NJ, Renner R. An intuitive proof of the data processing inequality. Quantum Inf Comput. 2012; 12(5-6):432–41.
- Guo X, Wang XF. Signaling cross-talk between TGF-β/BMP and other pathways. Cell Res. 2009; 19(1):71–88.
- Oeckinghaus A, Hayden MS, Ghosh S. Crosstalk in NF-κB signaling pathways. Nat Immunol. 2011; 12(8):695–708.
- Frenzel S, Pompe B. Partial mutual information for coupling analysis of multivariate time series. Phys Rev Lett. 2007; 99(20):204101.
- Gómez-Herrero G, Wu W, Rutanen K, Soriano MC, Pipa G, Vicente R. Assessing coupling dynamics from an ensemble of time series. 2010. https://arxiv.org/pdf/1008.0539.pdf.
- Prokopenko M, Lizier JT. Transfer entropy and transient limits of computation. Sci Rep. 2014; 4:1–7.
- Barnett L, Barrett AB, Seth AK. Granger causality and transfer entropy are equivalent for gaussian variables. Phys Rev Lett. 2009; 103(23):238701.
- Fujita A, Sato JR, Garay-Malpartida HM, Yamaguchi R, Miyano S, Sogayar MC, Ferreira CE. Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Syst Biol. 2007; 1(1):39.
- Mendes P, Sha W, Ye K. Artificial gene networks for objective comparison of analysis algorithms. Bioinformatics. 2003; 19(suppl 2):122–9.
- Hill AV. The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves. J Physiol (London). 1910; 40:4–7.
- Hofmeyr J-HS, Cornish-Bowden H. The reversible hill equation: how to incorporate cooperative enzymes into metabolic models. Comput Appl Biosci. 1997; 13(4):377–85.
- Mendes P. GEPASI: a software package for modelling the dynamics, steady states and control of biochemical and other systems. Comput Appl Biosci. 1993; 9(5):563–71.
- Rényi A, Erdős P. On random graphs. Publ Math. 1959; 6(290-297):5.
- Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998; 393(6684):440–2.
- Barabási AL, Albert R. Emergence of scaling in random networks. Science. 1999; 286(5439):509–12.
- Featherstone DE, Broadie K. Wrestling with pleiotropy: genomic and topological analysis of the yeast gene expression network. Bioessays. 2002; 24(3):267–74.
- Jeong H, Tombor B, Albert R, Oltvai ZN, Barabási AL. The large-scale organization of metabolic networks. Nature. 2000; 407(6804):651–4.
- Newman ME. The structure and function of complex networks. SIAM Rev. 2003; 45(2):167–256.
- Hand DJ, Till RJ. A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach Learn. 2001; 45(2):171–86.
- Budden DM, Hurley DG, Cursons J, Markham JF, Davis MJ, Crampin EJ. Predicting expression: the complementary power of histone modification and transcription factor binding data. Epigenetics Chromatin. 2014; 7(1):1–12.
- Budden DM, Hurley DG, Crampin EJ. Predictive modelling of gene expression from transcriptional regulatory elements. Brief Bioinform. 2014; 16(4):616–28.
- Budden DM, Hurley DG, Crampin E. Modelling the conditional regulatory activity of methylated and bivalent promoters. Epigenetics Chromatin. 2015; 8(1):1–10.
- Lindner M, Vicente R, Priesemann V, Wibral M. TRENTOOL: A MATLAB open source toolbox to analyse information flow in time series data with transfer entropy. BMC Neurosci. 2011; 12(1):119.
- Lizier JT, Heinzle J, Horstmann A, Haynes JD, Prokopenko M. Multivariate information-theoretic measures reveal directed information structure and task relevant changes in fmri connectivity. J Comput. 2011; 30(1):85–107.
- Barnett L, Bossomaier T. Transfer entropy as a log-likelihood ratio. Phys Rev Lett. 2012; 109(13):138105.
- Boedecker J, Obst O, Lizier JT, Mayer NM, Asada M. Information processing in echo state networks at the edge of chaos. Theory Biosci. 2012; 131(3):205–13.