Abstract
Unlike general face recognition, age-invariant face recognition (AIFR) aims to match faces across a large age gap. Previous discriminative methods usually focus on decomposing facial features into age-related and age-invariant components, an approach that tends to lose facial identity information. In this article, we propose a novel Multi-feature Fusion and Decomposition (MFD) framework for age-invariant face recognition, which learns more discriminative and robust features and reduces intra-class variations. Specifically, we first sample multiple face images of the same identity at different ages to form a face time sequence. Then, multi-head attention is employed to capture contextual information from the series of facial features extracted by the backbone network. Next, we combine feature decomposition with fusion based on the face time sequence, so that the final age-independent features effectively represent the identity information of the face and are more robust to the aging process. In addition, we mitigate the imbalanced age distribution in the training data with a re-weighted age loss. We evaluate the proposed MFD on the popular CACD and CACD-VS datasets, where our approach improves AIFR performance over previous state-of-the-art methods. We also report the performance of MFD on the LFW dataset.
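The abstract does not say how the age loss is re-weighted. As a rough illustration of the general idea only, the sketch below uses inverse-frequency weights on a cross-entropy loss over age groups. The weighting formula, the function names, and the toy data are all assumptions made for this example and are not taken from the MFD method.

```python
import math
from collections import Counter


def inverse_frequency_weights(age_labels):
    """Assumed scheme: weight = total / (num_groups * count), so rare groups count more."""
    counts = Counter(age_labels)
    total = len(age_labels)
    num_groups = len(counts)
    return {group: total / (num_groups * n) for group, n in counts.items()}


def reweighted_cross_entropy(probabilities, age_labels, weights):
    """Weighted mean of -log p(true group), normalised by the sum of the weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for probs, label in zip(probabilities, age_labels):
        w = weights[label]
        weighted_sum += -w * math.log(probs[label])
        weight_total += w
    return weighted_sum / weight_total


# Toy training set (hypothetical): group 0 dominates, group 2 is rare.
train_ages = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
weights = inverse_frequency_weights(train_ages)

# Hypothetical predicted distributions over three age groups for a mini-batch.
batch_probs = [
    {0: 0.8, 1: 0.1, 2: 0.1},
    {0: 0.2, 1: 0.7, 2: 0.1},
    {0: 0.5, 1: 0.3, 2: 0.2},
]
batch_ages = [0, 1, 2]
loss = reweighted_cross_entropy(batch_probs, batch_ages, weights)
```

In this toy batch the poorly predicted sample belongs to the rare group, so the re-weighted loss is larger than the plain average and pushes training harder on the under-represented ages. Other schemes, such as class-balanced or focal-style weighting, would plug into the same place.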
Index Terms
Age-Invariant Face Recognition by Multi-Feature Fusion and Decomposition with Self-attention