Abstract
Face recognition from two-dimensional (2D) still images and videos is now quite successful, even under “in the wild” conditions. By contrast, results are less consolidated when face data come from non-conventional cameras, such as infrared or depth sensors. In this article, we investigate the latter scenario, assuming that a low-resolution depth camera is used to perform face recognition in an uncooperative context. To this end, we first propose to automatically select from the camera's depth sequence a set of frames that provide a good view of the face in terms of pose and distance. Then, we design a progressive refinement approach that reconstructs a higher-resolution model from the selected low-resolution frames. This process accounts for the anisotropic error of both the existing points in the current 3D model and the points in a newly acquired frame, so that the refinement step can progressively adjust the point positions in the model using a Kalman-like estimation. The quality of the reconstruction is evaluated by measuring the error between the reconstructed models and the corresponding high-resolution scans, which are used as ground truth. In addition, we perform face recognition using the reconstructed models as probes against a gallery of reconstructed models and a gallery of high-resolution scans. The results confirm that the reconstructed models can be used effectively for the face recognition task.
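To make the idea of a Kalman-like refinement with anisotropic error more concrete, the following minimal sketch fuses one model point with one newly observed point. It is not the authors' implementation. It assumes, for illustration only, that the error of each point is described by independent per-axis variances in the camera frame (a diagonal covariance, typically larger along the depth axis), and that the correspondence between the two points is already known. The names `PointEstimate` and `fuse` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PointEstimate:
    """A 3D point with per-axis variance (assumed diagonal covariance)."""
    position: Vec3
    variance: Vec3  # must be strictly positive on every axis


def fuse(model: PointEstimate, observation: PointEstimate) -> PointEstimate:
    """Kalman-style update of a model point with a new observation.

    Each axis is treated independently: the gain weights the observation
    by how uncertain the model is relative to the observation.
    """
    new_position = []
    new_variance = []
    for x, p, z, r in zip(model.position, model.variance,
                          observation.position, observation.variance):
        gain = p / (p + r)                       # Kalman gain for this axis
        new_position.append(x + gain * (z - x))  # move toward the observation
        new_variance.append((1.0 - gain) * p)    # uncertainty shrinks
    return PointEstimate(tuple(new_position), tuple(new_variance))


if __name__ == "__main__":
    # Model point is confident in x and y, less so along depth (z).
    model_point = PointEstimate((0.0, 0.0, 1.0), (1.0, 1.0, 4.0))
    # New frame observes the same surface point with the same error profile.
    observed = PointEstimate((0.0, 0.0, 2.0), (1.0, 1.0, 4.0))
    print(fuse(model_point, observed))
```

Repeating this update for each selected frame progressively pulls the model point toward the observations while reducing its variance, with noisier axes trusted less. A full treatment would use complete 3x3 covariance matrices rather than the per-axis simplification assumed here.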
Index Terms
Reconstructing 3D Face Models by Incremental Aggregation and Refinement of Depth Frames