Learning Streamed Attention Network from Descriptor Images for Cross-Resolution 3D Face Recognition

Published: 23 January 2023

Abstract

In this article, we propose a hybrid framework for cross-resolution 3D face recognition based on a Streamed Attention Network (SAN) that combines handcrafted features with Convolutional Neural Networks (CNNs). The framework consists of two main stages. First, we process the depth images to extract low-level surface descriptors and derive the corresponding Descriptor Images (DIs), represented as four-channel images. To build the DIs, we propose a variation of the 3D Local Binary Pattern (3DLBP) operator that encodes depth differences using a sigmoid function. Second, we design a CNN that learns from these DIs. The distinctive feature of our solution is that it processes each channel of the input image separately and fuses the contributions of the channels by means of both self- and cross-attention mechanisms. This strategy showed two main advantages over the direct application of deep CNNs to depth images of the face. On the one hand, the DIs can reduce the discrepancy between high- and low-resolution data by encoding surface properties that are robust to resolution differences. On the other hand, the strategy allows better exploitation of the richer information provided by low-level features, resulting in improved recognition. We evaluated the proposed architecture in a challenging cross-dataset, cross-resolution scenario. To this end, we first train the network on scanner-resolution 3D data. Next, we use the pre-trained network as a feature extractor on low-resolution data, where the output of the last fully connected layer is used as the face descriptor. In addition to standard benchmarks, we also perform experiments on a newly collected dataset of paired high- and low-resolution 3D faces. We use the high-resolution data as the gallery and the low-resolution faces as probes, allowing us to assess the real gap between these two types of data. Extensive experiments on low-resolution 3D face benchmarks show promising results with respect to state-of-the-art methods.
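
The gallery/probe protocol described in the abstract can be illustrated with a minimal sketch. It assumes that face descriptors (for example, the outputs of the last fully connected layer) have already been extracted as fixed-length vectors, and that matching is done by cosine similarity with rank-1 nearest-neighbor assignment. The abstract does not state the matching metric, so cosine similarity and the function names `l2_normalize` and `rank1_identification` are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against zero vectors)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)


def rank1_identification(gallery_feats, gallery_ids, probe_feats, probe_ids):
    """Assign each probe the identity of its most similar gallery descriptor.

    Assumption: similarity is cosine similarity between descriptors.
    Returns the predicted identities and the rank-1 accuracy.
    """
    gallery = l2_normalize(np.asarray(gallery_feats, dtype=np.float64))
    probes = l2_normalize(np.asarray(probe_feats, dtype=np.float64))

    # Cosine similarity matrix of shape (num_probes, num_gallery).
    similarities = probes @ gallery.T

    # Index of the best-matching gallery entry for every probe.
    best_match = np.argmax(similarities, axis=1)
    predicted = np.asarray(gallery_ids)[best_match]

    accuracy = float(np.mean(predicted == np.asarray(probe_ids)))
    return predicted, accuracy


if __name__ == "__main__":
    # Toy data: three high-resolution gallery descriptors and two
    # low-resolution probe descriptors (values are made up).
    gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    gallery_ids = [10, 20, 30]
    probes = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
    probe_ids = [10, 30]

    predicted, accuracy = rank1_identification(
        gallery, gallery_ids, probes, probe_ids
    )
    print(predicted, accuracy)
```

In this setup the high-resolution scans play the role of `gallery_feats` and the low-resolution scans play the role of `probe_feats`, so the rank-1 accuracy directly reflects how well descriptors transfer across resolutions.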


    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 1s
      February 2023
      504 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3572859
      • Editor: Abdulmotaleb El Saddik

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 23 January 2023
      • Online AM: 25 March 2022
      • Accepted: 14 March 2022
      • Revised: 1 February 2022
      • Received: 22 September 2021

      Qualifiers

      • research-article
      • Refereed