Simultaneous Image Reconstruction and Feature Learning with 3D-CNNs for Image Set–Based Classification

Published: 08 April 2021

Abstract

Image set–based classification has attracted substantial research interest because of its broad applications. Many methods based on feature learning or dictionary learning have recently been developed for this problem, and some have achieved promising results. However, most of them either transform the image set into a 2D matrix or use 2D convolutional neural networks (CNNs) for feature learning, so spatial and temporal information is lost. In addition, these methods extract features from the original images, which may exhibit large intra-class diversity. As a possible solution to these issues, we propose simultaneous image reconstruction with deep learning and feature learning with 3D-CNNs (SIRFL) for image set classification. SIRFL consists of a deep image reconstruction network and a 3D-CNN-based feature learning network. The reconstruction network reduces the diversity of images from the same set, and the feature learning network retains spatial and temporal information by using 3D-CNNs. Extensive experiments on five widely used datasets show that SIRFL is a strong competitor to state-of-the-art image set classification methods.
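To make the contrast between a 2D matrix and a 3D convolution concrete, the following is a minimal sketch in plain Python. It is not the SIRFL network: the toy image set, the averaging kernel, and the helper names `shape3d`, `conv3d_valid`, and `flatten_set` are all assumptions introduced for illustration. It only shows that a 3D convolution slides over the set axis as well as the two spatial axes, whereas flattening each image into a row discards the pixel layout.

```python
# Illustrative sketch only (not the SIRFL network): a single-channel,
# "valid" 3D convolution (cross-correlation, as used in CNN layers) over
# an image set stored as a nested list with shape (T, H, W).

def shape3d(x):
    """Return (depth, height, width) of a nested list."""
    return len(x), len(x[0]), len(x[0][0])


def conv3d_valid(volume, kernel):
    """Slide `kernel` over `volume` along the set axis and both spatial axes."""
    T, H, W = shape3d(volume)
    kt, kh, kw = shape3d(kernel)
    out = []
    for t in range(T - kt + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for di in range(kh):
                        for dj in range(kw):
                            acc += volume[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def flatten_set(volume):
    """The 2D-matrix alternative: one row per image, pixel layout discarded."""
    return [[p for row in img for p in row] for img in volume]


# Hypothetical toy image set: 4 images of 5x5 pixels, pixel value = image index.
image_set = [[[float(t)] * 5 for _ in range(5)] for t in range(4)]

# Hypothetical 2x3x3 averaging kernel spanning two neighbouring images.
kernel = [[[1.0 / 18.0] * 3 for _ in range(3)] for _ in range(2)]

features = conv3d_valid(image_set, kernel)
matrix = flatten_set(image_set)

print(shape3d(features))            # (3, 3, 3): set and spatial axes are kept
print(len(matrix), len(matrix[0]))  # 4 25: one flat row per image
```

Each output value mixes pixels from two neighbouring images and a 3x3 spatial window, so the feature map still has a (set, height, width) layout. The flattened matrix keeps one row per image but no longer records which pixels were adjacent.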

Published in

ACM/IMS Transactions on Data Science, Volume 2, Issue 2
May 2021
149 pages
ISSN: 2691-1922
DOI: 10.1145/3454114

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 8 April 2021
• Accepted: 1 August 2020
• Revised: 1 July 2020
• Received: 1 February 2020

Qualifiers

• research-article
• Refereed