Abstract
In this article, we propose a hybrid framework for cross-resolution 3D face recognition based on a Streamed Attention Network (SAN) that combines handcrafted features with Convolutional Neural Networks (CNNs). The framework consists of two main stages. First, we process the depth images to extract low-level surface descriptors and derive the corresponding Descriptor Images (DIs), represented as four-channel images. To build the DIs, we propose a variation of the 3D Local Binary Pattern (3DLBP) operator that encodes depth differences using a sigmoid function. Second, we design a CNN that learns from these DIs. The distinguishing feature of our solution is that each channel of the input image is processed separately, and the contributions of the channels are fused by means of both self- and cross-attention mechanisms. This strategy shows two main advantages over the direct application of deep CNNs to depth images of the face. On the one hand, the DIs reduce the gap between high- and low-resolution data by encoding surface properties that are robust to resolution differences. On the other hand, it allows better exploitation of the richer information provided by low-level features, resulting in improved recognition. We evaluate the proposed architecture in a challenging cross-dataset, cross-resolution scenario. To this end, we first train the network on scanner-resolution 3D data. Next, we use the pre-trained network as a feature extractor on low-resolution data, where the output of the last fully connected layer is used as the face descriptor. In addition to standard benchmarks, we also perform experiments on a newly collected dataset of paired high- and low-resolution 3D faces. We use the high-resolution data as the gallery and the low-resolution faces as probes, which allows us to assess the real gap between these two types of data. Extensive experiments on low-resolution 3D face benchmarks show promising results with respect to state-of-the-art methods.
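To make the idea of encoding depth differences with a sigmoid more concrete, the following is a minimal sketch, not the operator defined in the article. It assumes a 3x3 neighbourhood and replaces the hard sign test of a classical LBP code with a logistic function, so that each neighbour contributes a soft value in (0, 1). The function name `soft_lbp`, the steepness parameter `alpha`, the neighbour ordering, and the single-channel output are all hypothetical choices made for illustration; the four-channel layout of the DIs is not reproduced here.

```python
import math

# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (the ordering is an assumption made for this sketch).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_lbp(depth, alpha=1.0):
    """Soft LBP-style map for the interior pixels of a 2D depth image.

    depth: list of equal-length rows of floats (at least 3x3).
    alpha: hypothetical steepness of the sigmoid; larger values
           approach the hard thresholding of a classical LBP.
    Returns a (rows - 2) x (cols - 2) map with values in [0, 1].
    """
    rows, cols = len(depth), len(depth[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            centre = depth[r][c]
            code = 0.0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                diff = depth[r + dr][c + dc] - centre
                # Soft "is the neighbour deeper than the centre?" vote,
                # weighted by the usual LBP power of two.
                code += sigmoid(alpha * diff) * (1 << bit)
            out_row.append(code / 255.0)  # 255 = sum of the 8 bit weights
        out.append(out_row)
    return out
```

On a perfectly flat patch every difference is zero, each sigmoid evaluates to 0.5, and the normalised code is 0.5. A smooth encoding of this kind varies gradually with small depth changes instead of flipping bits, which is one plausible reason a sigmoid could help when comparing data of different resolutions.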
Index Terms
Learning Streamed Attention Network from Descriptor Images for Cross-Resolution 3D Face Recognition