research-article

Features-Enhanced Multi-Attribute Estimation with Convolutional Tensor Correlation Fusion Network

Published: 15 October 2019

Abstract

To achieve robust facial attribute estimation, a hierarchical prediction system referred to as the tensor correlation fusion network (TCFN) is proposed. The system comprises feature extraction, correlation excavation among facial attribute features, score fusion, and multi-attribute prediction. Subnetworks (Age-Net, Gender-Net, Race-Net, and Smile-Net) extract the features for their respective attributes, while Main-Net extracts features not only from the input image but also from the corresponding pooling layers of the subnetworks. Dynamic tensor canonical correlation analysis (DTCCA) is proposed to explore the correlation among the features of the different targets in the F7 layers. For the binary classifications of gender, race, and smile, robust decisions are then obtained by fusing the results of the subnetworks with those of TCFN. For age prediction, the facial image is first assigned to one of the age groups, and an extreme learning machine (ELM) regressor then performs the final age estimation. Experimental results on benchmarks with multiple face attributes (MORPH-II, the Adience benchmark, LAP-2016, and CelebA) show that the proposed approach outperforms state-of-the-art methods.
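
The score-fusion and hierarchical age-estimation steps mentioned in the abstract can be illustrated with a small, generic sketch. The abstract does not specify the fusion rule, the ELM configuration, or how age groups are chosen, so everything below is an assumption made for illustration: `fuse_scores` uses a plain weighted average, `ELMRegressor` is a textbook extreme learning machine (random hidden layer, least-squares output weights), and `hierarchical_age` simply picks the most probable group and applies that group's regressor. None of these names come from the paper.

```python
import numpy as np


def fuse_scores(subnet_score, tcfn_score, weight=0.5):
    """Weighted average of two probability scores.

    Assumed fusion rule for illustration; the paper's actual rule is not
    given in the abstract.
    """
    return weight * np.asarray(subnet_score) + (1.0 - weight) * np.asarray(tcfn_score)


class ELMRegressor:
    """Textbook extreme learning machine for regression (hypothetical helper)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, untrained hidden layer with a tanh activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only the output weights are learned, via the pseudoinverse of H.
        self.beta = np.linalg.pinv(H) @ np.asarray(y, dtype=float)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta


def hierarchical_age(features, group_probs, regressors):
    """Pick the most probable age group, then apply that group's regressor."""
    group = int(np.argmax(group_probs))
    age = float(regressors[group].predict(features[None, :])[0])
    return group, age
```

In this sketch, one `ELMRegressor` would be fitted per age group on the features of the training faces in that group, and `fuse_scores` would be applied separately to the gender, race, and smile scores before thresholding.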

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 3s
        Special Issue on Face Analysis for Applications and Special Issue on Affective Computing for Large-Scale Heterogeneous Multimedia Data
        November 2019
        304 pages
        ISSN:1551-6857
        EISSN:1551-6865
        DOI:10.1145/3368027

        Copyright © 2019 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 15 October 2019
        • Accepted: 1 August 2019
        • Revised: 1 March 2019
        • Received: 1 October 2018

        Qualifiers

        • research-article
        • Research
        • Refereed
