research-article

Multi-View Graph Matching for 3D Model Retrieval

Published: 05 July 2020

Abstract

3D model retrieval is widely used in domains such as computer-aided design, digital entertainment, and virtual reality. Recently, many graph-based methods have been proposed to address this task by using multi-view information of 3D models. However, these methods are constrained by the many-to-many graph matching needed to measure the similarity between pairwise models. In this article, we propose a multi-view graph matching method (MVGM) for 3D model retrieval. The proposed method decomposes the complicated multi-view graph-based similarity measure into multiple single-view graph-based similarity measures followed by fusion. First, we present a method for single-view graph generation, and we further propose a novel similarity measure on a single-view graph that leverages both node-wise context and model-wise context. Then, we propose multi-view fusion with diffusion, which collaboratively integrates the single-view similarities w.r.t. different viewpoints and adaptively learns their weights, to compute the multi-view similarity between pairwise models. In this way, the proposed method avoids the difficulty of defining and computing the traditional high-order graph. Moreover, the method is unsupervised and does not require a large-scale 3D dataset for model learning. We evaluate it on four popular and challenging datasets. Extensive experiments demonstrate the superiority and effectiveness of the proposed method over the state of the art. In particular, this unsupervised method achieves performance competitive with the most recent supervised and deep learning methods.
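
The abstract does not give the formulas behind "multi-view fusion with diffusion", so the following is only a rough sketch of the general idea and not the MVGM algorithm itself. It assumes each view already yields a model-by-model similarity matrix, combines them with fixed uniform weights (the article learns the weights adaptively, which is not reproduced here), and then applies a generic diffusion update of the form A <- alpha * P A P^T + (1 - alpha) * I. The function names, the toy matrices, and the parameters `alpha` and `iterations` are all hypothetical.

```python
# Illustrative sketch only: NOT the MVGM algorithm from the article.
# Assumptions: uniform (not learned) view weights and a generic
# diffusion update A <- alpha * P A P^T + (1 - alpha) * I.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def row_normalize(sim):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([x / total for x in row] if total > 0 else list(row))
    return out


def fuse_views(view_sims, weights=None):
    """Weighted sum of per-view similarity matrices."""
    m, n = len(view_sims), len(view_sims[0])
    if weights is None:
        weights = [1.0 / m] * m  # assumption: every view counts equally
    return [[sum(w * s[i][j] for w, s in zip(weights, view_sims))
             for j in range(n)] for i in range(n)]


def diffuse(sim, alpha=0.5, iterations=10):
    """Propagate similarities through the graph defined by `sim`."""
    n = len(sim)
    p = row_normalize(sim)
    p_t = transpose(p)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    a = [row[:] for row in identity]  # start from the identity matrix
    for _ in range(iterations):
        pap = matmul(matmul(p, a), p_t)
        a = [[alpha * pap[i][j] + (1 - alpha) * identity[i][j]
              for j in range(n)] for i in range(n)]
    return a


# Toy data (made up): three models seen from two viewpoints.
view_1 = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
view_2 = [[1.0, 0.7, 0.3], [0.7, 1.0, 0.2], [0.3, 0.2, 1.0]]

fused = fuse_views([view_1, view_2])
diffused = diffuse(fused)

# Rank all models for query model 0, most similar first.
ranking = sorted(range(len(diffused)), key=lambda j: -diffused[0][j])
print(ranking)
```

In this toy setup, models 0 and 1 are similar in both views, so the diffused matrix keeps them closer to each other than either is to model 2. A learned weighting, as described in the abstract, would replace the uniform `weights` default.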

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 3
      August 2020
      364 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3409646
      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 5 July 2020
      • Online AM: 7 May 2020
      • Revised: 1 March 2020
      • Accepted: 1 March 2020
      • Received: 1 June 2019
      Qualifiers

      • research-article
      • Research
      • Refereed
