research-article

Discriminative Visual Similarity Search with Semantically Cycle-consistent Hashing Networks

Published: 06 October 2022

Abstract

Deep hashing has great potential in large-scale visual similarity search due to its efficiency in storage and computation. Technically, deep hashing for visual similarity search inherits the powerful representation capability of deep neural networks and encodes visual features into compact binary codes while preserving representative semantic visual features. Existing works in this field mainly focus on building the relationship between the visual and objective hash spaces, but they seldom study the triadic cross-domain semantic knowledge transfer among the visual, semantic, and hashing spaces, which leads to a serious semantic ignorance problem during space transformation. In this article, we propose a novel deep tripartite semantically interactive hashing framework, dubbed Semantically Cycle-consistent Hashing Networks (SCHNs), for discriminative hash code learning. In particular, we construct a flexible semantic space and a transitive latent space, in conjunction with the visual space, to jointly deduce the privileged discriminative hash space. Specifically, a new semantic space is conceived to strengthen the flexibility and completeness of categories in the semantic feature inference phase. At the same time, a transitive latent space is formulated to explore and uncover the shared semantic interactivity embedded in visual and semantic features. Moreover, to further ensure semantic consistency across multiple spaces, we propose a cyclic adversarial learning module that preserves their semantic concurrence during space transformation. Notably, our SCHN, for the first time, establishes the cyclic principle of deep semantic-preserving hashing by adaptive semantic parsing across different spaces in single-modal visual similarity search. In addition, the entire learning framework is jointly optimized in an end-to-end manner.
Extensive experiments performed on diverse large-scale datasets demonstrate the superiority of our method over other state-of-the-art deep hashing algorithms. The source code for this article is available at https://github.com/JalinWang/SCHN.
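
To make the retrieval setting concrete, the following is a minimal sketch of how compact binary codes can be used for similarity search in general. It is not the SCHN method: the sign-based binarization, the toy feature vectors, and the helper names `binarize`, `hamming`, and `rank_by_hamming` are all assumptions introduced here for illustration. The real-valued vectors stand in for the outputs of some hashing network.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack sign(features) into an integer: bit i is 1 when features[i] > 0."""
    code = 0
    for i, value in enumerate(features):
        if value > 0:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two packed binary codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.

    Ties are broken by the original index so the ordering is deterministic.
    """
    return sorted(
        range(len(database)),
        key=lambda i: (hamming(query, database[i]), i),
    )


# Hypothetical 4-dimensional real-valued outputs (illustrative values only).
database_features = [
    [0.9, -0.2, 0.4, -0.7],   # item 0
    [-0.8, 0.3, -0.5, 0.6],   # item 1
    [0.7, -0.1, -0.3, -0.9],  # item 2
]
query_features = [0.8, -0.4, 0.5, -0.6]

database_codes = [binarize(f) for f in database_features]
query_code = binarize(query_features)
ranking = rank_by_hamming(query_code, database_codes)
print(ranking)
```

Because each item is stored as a few bits and compared with an XOR followed by a bit count, this style of search is what gives hashing its storage and computation advantages; the quality of the ranking then depends entirely on how well the learned codes preserve semantic similarity.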

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 2s
      June 2022
      383 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3561949
      • Editor: Abdulmotaleb El Saddik

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 6 October 2022
      • Online AM: 20 April 2022
      • Accepted: 15 April 2022
      • Revised: 4 April 2022
      • Received: 12 February 2022
      Published in tomm Volume 18, Issue 2s

      Qualifiers

      • research-article
      • Refereed
