research-article
Open Access

Bootstrapped Graph Diffusions: Exposing the Power of Nonlinearity

Published: 03 April 2018

Abstract

Graph-based semi-supervised learning (SSL) algorithms predict labels for all nodes from the labels provided for a small set of seed nodes. Classic methods capture the graph structure through an underlying diffusion process that propagates along the graph edges. Spectral diffusion, which includes personalized PageRank and label propagation, propagates through random walks. Social diffusion propagates through shortest paths. These diffusions are linear in the sense that they do not distinguish between the contributions of a few "strong" relations and those of many "weak" relations.

Recent methods such as node embeddings and graph convolutional networks (GCN) have attained significant gains in quality on SSL tasks. These methods vary in how they use the graph structure, seed label information, and other features, but they share a common thread of nonlinearity that suppresses weak relations and reinforces stronger ones.

Aiming for quality gains with more scalable methods, we revisit classic linear diffusion methods and place them in a self-training framework. The resulting bootstrapped diffusions are nonlinear in that they reinforce stronger relations, as the more complex methods do. Surprisingly, we observe that SSL with bootstrapped diffusions not only significantly improves over the respective non-bootstrapped baselines but also outperforms state-of-the-art SSL methods. Moreover, since the self-training wrapper retains the scalability of the base method, we obtain both higher quality and better scalability.
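The self-training idea can be sketched as a loop around any linear base diffusion: diffuse from the current seeds, promote the most confidently predicted unlabeled nodes to seeds, and repeat. The sketch below is only an illustration under stated assumptions. The abstract does not specify the base diffusion, the confidence measure, or how many nodes are promoted per round, so the PageRank-style `diffuse`, the score margin used as confidence, and the parameters `rounds` and `add_per_round` are all hypothetical choices.

```python
def diffuse(adj, seeds, num_classes, alpha=0.15, iters=50):
    """Linear base diffusion: one PageRank-style score vector per class.

    adj: dict node -> list of neighbours; seeds: dict node -> class index.
    This base method is an assumption; any linear diffusion could be used.
    """
    nodes = list(adj)
    scores = {v: [0.0] * num_classes for v in nodes}
    for c in range(num_classes):
        s = {v: 1.0 if seeds.get(v) == c else 0.0 for v in nodes}
        p = dict(s)
        for _ in range(iters):
            nxt = {v: alpha * s[v] for v in nodes}
            for u in nodes:
                share = (1.0 - alpha) * p[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            p = nxt
        for v in nodes:
            scores[v][c] = p[v]
    return scores


def bootstrapped_diffusion(adj, seeds, num_classes, rounds=3, add_per_round=2):
    """Self-training wrapper (hypothetical selection rule, num_classes >= 2).

    Each round runs the base diffusion once, then promotes the unlabeled
    nodes with the largest margin between their top two class scores.
    """
    labeled = dict(seeds)
    for _ in range(rounds):
        scores = diffuse(adj, labeled, num_classes)
        candidates = []
        for v in adj:
            if v in labeled:
                continue
            ranked = sorted(range(num_classes),
                            key=lambda c: scores[v][c], reverse=True)
            margin = scores[v][ranked[0]] - scores[v][ranked[1]]
            candidates.append((margin, v, ranked[0]))
        candidates.sort(key=lambda t: t[0], reverse=True)
        for margin, v, best in candidates[:add_per_round]:
            if margin > 0.0:
                labeled[v] = best  # promoted nodes act as seeds next round
    # Final pass: label any node that was never promoted by its top score.
    scores = diffuse(adj, labeled, num_classes)
    return {v: labeled[v] if v in labeled
            else max(range(num_classes), key=lambda c: scores[v][c])
            for v in adj}


# Two triangles joined by the edge 2 - 3, one seed per class (toy data).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(bootstrapped_diffusion(graph, {0: 0, 5: 1}, num_classes=2))
```

The promotion step is where nonlinearity enters: a node's margin is thresholded into a hard seed label, so strong relations are reinforced in later rounds while weak ones are not. Each round costs one run of the base diffusion, which is consistent with the abstract's point that the wrapper retains the scalability of the base method.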

Published in

Proceedings of the ACM on Measurement and Analysis of Computing Systems, Volume 2, Issue 1
March 2018
603 pages
EISSN: 2476-1249
DOI: 10.1145/3203302

Copyright © 2018 Owner/Author

Publisher: Association for Computing Machinery, New York, NY, United States
