Abstract
Single-label facial expression recognition (FER), which aims to assign a single expression label to each facial image, usually suffers from noisy and incomplete labels: the manual annotations of some training images are wrong or incomplete, which degrades performance. Although prior work has attempted to leverage external sources or additional manual annotations to handle this problem, doing so usually incurs extra cost. This article explores a simple yet effective three-phase paradigm (“warm-up,” “selection,” and “relabeling”) for the FER task. First, the warm-up phase builds an initial recognition network on the noisy samples for discriminative feature extraction and facial expression prediction. Second, the selection phase defines several rules for choosing high-confidence samples according to their prediction scores. Third, the relabeling phase assigns two potential labels to those samples, and the network is updated with a composite two-label loss. Compared with previous studies, the three-phase learning can correct noisy labels in the ground truth without extra information and automatically assign two potential labels to single-label samples without manual annotation. As a result, the label information is purified and supplemented at little cost, yielding a significant performance improvement. Extensive experiments are conducted on three datasets, and the results demonstrate that our approach is robust to noisy training samples and outperforms several state-of-the-art methods.
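The selection and relabeling phases described above can be illustrated with a minimal sketch. The abstract does not specify the selection rules or the form of the composite loss, so everything here is an assumption made for illustration: the confidence threshold, the top-2 relabeling rule, the weight `alpha`, and all function names are hypothetical and are not taken from the article.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_confident(probs, threshold=0.7):
    """Selection phase (assumed rule): keep samples whose top score >= threshold."""
    return [i for i, p in enumerate(probs) if max(p) >= threshold]

def relabel_top2(p):
    """Relabeling phase (assumed rule): take the two highest-scoring classes."""
    order = sorted(range(len(p)), key=lambda c: p[c], reverse=True)
    return order[0], order[1]

def two_label_loss(p, primary, secondary, alpha=0.8):
    """Composite loss (assumed form): weighted cross-entropy over two labels."""
    eps = 1e-12  # guards against log(0)
    return -(alpha * math.log(p[primary] + eps)
             + (1 - alpha) * math.log(p[secondary] + eps))

# Toy scores standing in for the output of a warmed-up network:
# three images, four expression classes.
logits = [
    [4.0, 1.0, 0.5, 0.2],  # confident prediction
    [1.0, 1.1, 0.9, 1.0],  # ambiguous prediction, should be skipped
    [0.1, 0.3, 3.5, 2.0],  # confident, with a plausible second label
]
probs = [softmax(z) for z in logits]

selected = select_confident(probs)
relabeled = {i: relabel_top2(probs[i]) for i in selected}
losses = {i: two_label_loss(probs[i], *relabeled[i]) for i in selected}

for i in selected:
    print(i, relabeled[i], round(losses[i], 4))
```

In this toy setup the ambiguous second image is left out of the update, while the other two receive a primary and a secondary label. A real pipeline would compute the loss on network outputs inside a training loop and backpropagate through it.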
Index Terms
TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition