DOI: 10.1145/3107411.3107418 (ACM BCB conference proceedings)

Differential Community Detection in Paired Biological Networks

ABSTRACT

Motivation: Biological networks reveal the inherent structure of molecular interactions, which can lead to the discovery of driver genes and meaningful pathways, especially in the context of cancer. Gene mutations often alter gene expression, and the corresponding gene regulatory network undergoes some localized re-wiring. The ability to identify significant changes in interaction patterns caused by disease progression can reveal novel, relevant signatures.

Methods: The task of identifying differential sub-networks in paired biological networks (A: control, B: case) can be rephrased as one of finding dense communities in a single noisy differential topological (DT) graph, constructed by taking the absolute difference between the topological graphs of A and B. In this paper, we propose a fast three-stage approach, Differential Community Detection (DCD), to identify differential sub-networks as differential communities in a de-noised version of the DT graph. In the first stage, we iteratively re-order the nodes of the DT graph, using the neighbourhood information of the nodes and Jaccard similarity, to determine the approximate block diagonals present in the DT adjacency matrix. In the second stage, the ordered DT adjacency matrix is traversed along the diagonal, and all edges associated with a node are removed if that node has no immediate edges within a window. Finally, we apply community detection methods to this de-noised DT graph to discover differential sub-networks as communities.
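The first two stages can be illustrated with a minimal sketch. It is not the authors' implementation: plain weighted adjacency matrices stand in for the topological graphs, the greedy ordering is only one hypothetical reading of the Jaccard-based re-ordering, and the names `differential_graph`, `greedy_jaccard_order`, `window_denoise`, `window` and `eps` are assumptions introduced for illustration.

```python
import numpy as np


def differential_graph(topo_a, topo_b):
    """Absolute element-wise difference of two same-sized weighted graphs."""
    dt = np.abs(np.asarray(topo_a, dtype=float) - np.asarray(topo_b, dtype=float))
    np.fill_diagonal(dt, 0.0)  # ignore self-loops
    return dt


def jaccard(set_a, set_b):
    """Jaccard similarity of two sets; taken as 0 when both are empty."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def greedy_jaccard_order(dt, eps=0.0):
    """Hypothetical stage 1: chain nodes by neighbourhood similarity."""
    n = dt.shape[0]
    # Closed neighbourhoods (node plus its neighbours), so linked nodes overlap.
    hoods = [set(np.flatnonzero(dt[i] > eps).tolist()) | {i} for i in range(n)]
    order = [int(np.argmax((dt > eps).sum(axis=1)))]  # start at the highest-degree node
    remaining = set(range(n)) - set(order)
    while remaining:
        last = order[-1]
        # Append the unplaced node most similar to the last placed node.
        nxt = max(sorted(remaining), key=lambda j: jaccard(hoods[last], hoods[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def window_denoise(dt_ordered, window=2, eps=0.0):
    """Stage 2 sketch: drop all edges of a node with no edge inside its window."""
    out = dt_ordered.copy()
    n = out.shape[0]
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Entries of row i near the diagonal, excluding the diagonal itself.
        local = np.delete(dt_ordered[i, lo:hi], i - lo)
        if not np.any(local > eps):
            out[i, :] = 0.0
            out[:, i] = 0.0
    return out


# Toy pair: the case network B gains a triangle on nodes {0, 2, 4}
# plus one stray edge (2, 5); the control network A has no edges.
A = np.zeros((6, 6))
B = np.zeros((6, 6))
for u, v in [(0, 2), (0, 4), (2, 4), (2, 5)]:
    B[u, v] = B[v, u] = 1.0

dt = differential_graph(A, B)
order = greedy_jaccard_order(dt)
dt_ordered = dt[np.ix_(order, order)]      # permute rows and columns together
cleaned = window_denoise(dt_ordered, window=2)
```

In this toy case the ordering places the triangle's nodes next to each other, so their edges fall near the diagonal and survive, while node 5 has no edge within its window and loses its stray edge. Any community detection method could then be run on `cleaned` for the third stage.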

Results: Our proposed DCD approach can effectively locate differential sub-networks in several simulated paired random-geometric networks and various paired scale-free graphs with different power-law exponents. In simulation studies, the DCD approach outperforms both community detection methods applied to the original noisy DT graph and recent statistical techniques. We applied the DCD method to two real datasets: a) an ovarian cancer dataset, to discover differential DNA co-methylation sub-networks in patients and controls; b) a glioma cancer dataset, to discover the difference between the regulatory networks of IDH-mutant and IDH-wild-type. We demonstrate the potential benefits of DCD for finding network-inferred bio-markers/pathways associated with a trait of interest.

Conclusion: The proposed DCD approach overcomes the limitations of previous statistical techniques and the issues that arise when differential sub-networks are identified by applying community detection methods to the noisy DT graph. This is reflected in the superior performance of the DCD method with respect to various metrics such as Precision, Accuracy, Kappa and Specificity. The code implementing the proposed DCD method is available at https://sites.google.com/site/raghvendramallmlresearcher/codes.
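As a rough illustration of the metrics named above, the sketch below computes them from a confusion matrix. It assumes that each node is labelled as differential (1) or not (0) and that Kappa refers to Cohen's kappa; neither detail is stated in the abstract, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(truth, predicted):
    """Precision, accuracy, specificity and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    nan = float("nan")
    precision = tp / (tp + fp) if (tp + fp) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    accuracy = (tp + tn) / n if n else nan

    # Cohen's kappa: observed agreement corrected for chance agreement.
    if n:
        p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
        kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else nan
    else:
        kappa = nan

    return {
        "precision": precision,
        "accuracy": accuracy,
        "specificity": specificity,
        "kappa": kappa,
    }


# Toy example: 3 truly differential nodes out of 8, one missed, one false alarm.
scores = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

Specificity and kappa are useful here because differential nodes are typically a small minority, so accuracy alone can look high even for a method that finds nothing.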
