DOI: 10.1145/3580305.3599414
research-article

Leveraging Relational Graph Neural Network for Transductive Model Ensemble

Published: 04 August 2023

ABSTRACT

Traditional pre-training, fine-tuning, and ensembling methods often overlook essential relational data and the interconnections between tasks. To address this gap, we present a relational graph-based approach to exploiting this relational information. We introduce the Relational grAph Model ensemBLE, abbreviated as RAMBLE. The model infers class labels simultaneously across all data nodes and task nodes, using the relational graph in a transductive manner. This fine-grained formulation lets us better model the interplay between data and tasks. We further incorporate a variational information bottleneck-guided scheme for embedding fusion and aggregation. The scheme produces an informative fused embedding by homing in on embeddings that benefit the target task while filtering out noise-laden ones. Our theoretical analysis, grounded in information theory, shows that using relational information for embedding fusion yields higher upper and lower bounds on the target task's accuracy. We evaluate the proposed model on eight diverse datasets, and the experimental results show that it makes effective use of the relational knowledge derived from all pre-trained models, improving performance on the target tasks.
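
The abstract does not spell out the fusion scheme, so the following is only a minimal sketch of the general idea behind information bottleneck-guided fusion, not the RAMBLE implementation. It assumes that embeddings from several pre-trained models are combined with softmax weights and that the fused representation is regularized with the usual Gaussian KL term from the variational information bottleneck literature. All function names, the scoring scheme, and the `beta` value are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_embeddings(embeddings, scores):
    """Weighted sum of per-model embeddings.

    `scores` are hypothetical relevance scores, one per pre-trained model;
    a low score shrinks the contribution of a noisy embedding.
    """
    weights = softmax(scores)
    dim = len(embeddings[0])
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return fused, weights


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), the compression term."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def vib_loss(cross_entropy, mu, log_var, beta=1e-3):
    """Prediction loss plus beta times the compression penalty.

    `beta` is an assumed trade-off value, not one taken from the paper.
    """
    return cross_entropy + beta * gaussian_kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    fused, weights = fuse_embeddings([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
    print("weights:", weights)
    print("fused:", fused)
    print("loss:", vib_loss(0.7, mu=fused, log_var=[0.0, 0.0]))
```

In this sketch the weights play the role of deciding which embeddings matter for the target task, while the KL term discourages the fused representation from carrying more information than the prediction needs.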

Supplemental Material

  • rtfp0341-2min-promo.mp4 (mp4, 3.9 MB)
  • meeting_02.mp4 (mp4, 3.9 MB)

    • Published in

      KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
      August 2023
      5996 pages
      ISBN: 9798400701030
      DOI: 10.1145/3580305

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
