
Learning the User’s Deeper Preferences for Multi-modal Recommendation Systems

Published: 24 February 2023

Abstract

Recommendation systems play an important role in the rapid development of micro-video sharing platforms. Micro-videos carry rich modal features, such as visual, audio, and textual content, so integrating these multi-modal features is important for personalized recommendation. However, most current multi-modal recommendation systems only enrich the feature representation on the item side, which leads to poor learning of user preferences. To solve this problem, we propose a novel module named Learning the User’s Deeper Preferences (LUDP), which constructs an item-item modal similarity graph and a user preference graph in each modality to improve the learning of item and user representations. Specifically, we construct the item-item modal similarity graph from the multi-modal features, and the item ID embeddings are propagated and aggregated on this graph to learn the latent structural information of items. The user preference graph is constructed from the historical interactions between users and items, and the multi-modal features are aggregated on it to represent each user’s preference in that modality. Finally, the two parts are combined as auxiliary information to enhance the user and item representations learned from collaborative signals, so that deeper user preferences can be learned. Extensive experiments on two public datasets (TikTok and MovieLens) show that our model outperforms state-of-the-art multi-modal recommendation methods.
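The two graph operations described in the abstract can be illustrated with a minimal NumPy sketch for a single modality. This is not the authors' implementation: the abstract does not specify how the graphs are built or how features are aggregated, so the cosine-similarity k-nearest-neighbour graph, the row normalization, the mean aggregation, and all function names below are assumptions chosen only for illustration. The final step that combines these signals with collaborative embeddings is omitted because the abstract gives no detail on it.

```python
import numpy as np


def knn_similarity_graph(features, k):
    """Build a row-normalized item-item graph for one modality.

    Assumption: each item is linked to its k most cosine-similar items.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / (norms + 1e-12)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # keep an item from selecting itself
    n = sim.shape[0]
    top = np.argsort(-sim, axis=1)[:, :k]  # k most similar items per row
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), top.ravel()] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)


def propagate(adj, item_id_emb, layers=1):
    """Propagate item ID embeddings over the similarity graph."""
    out = item_id_emb
    for _ in range(layers):
        out = adj @ out  # each item averages its neighbours' embeddings
    return out


def user_modal_preference(interactions, features):
    """Average the modal features of the items each user interacted with."""
    degree = interactions.sum(axis=1, keepdims=True)
    return (interactions @ features) / np.maximum(degree, 1.0)


# Toy data: 4 users, 6 items, one modality with 8-dimensional features.
rng = np.random.default_rng(0)
visual = rng.normal(size=(6, 8))
item_id_emb = rng.normal(size=(6, 5))
interactions = (rng.random((4, 6)) < 0.5).astype(float)

item_structure = propagate(knn_similarity_graph(visual, k=2), item_id_emb)
user_preference = user_modal_preference(interactions, visual)
```

In this sketch `item_structure` has one row per item and `user_preference` has one row per user; a full model would repeat both steps for every modality before fusing the results.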

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 3s
June 2023
270 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3582887
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 24 February 2023
• Online AM: 7 December 2022
• Accepted: 20 November 2022
• Revised: 13 June 2022
• Received: 28 January 2022
