research-article

TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition

Published: 15 March 2023

Abstract

Single-label facial expression recognition (FER), which aims to assign a single expression class to each facial image, often suffers from noisy and incomplete labels: the manual annotations of some training images are wrong or missing, which degrades performance. Although prior work has attempted to address this problem by leveraging external sources or additional manual annotation, doing so usually incurs extra cost. This article explores a simple yet effective three-phase paradigm (“warm-up,” “selection,” and “relabeling”) for the FER task. First, the warm-up phase builds an initial recognition network on the noisy samples for discriminative feature extraction and facial expression prediction. The selection phase then applies several rules to choose highly confident samples according to their prediction scores, and the relabeling phase assigns two potential labels to those samples and updates the network with a composite two-label loss. Compared with previous studies, this three-phase learning effectively corrects noisy ground-truth labels without extra information and automatically assigns two potential labels to single-label samples without manual annotation. As a result, the label information is purified and supplemented at little cost, yielding significant performance improvement. Extensive experiments on three datasets demonstrate that our approach is robust to noisy training samples and outperforms several state-of-the-art methods.
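The selection and relabeling phases described above can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's exact formulation: the confidence threshold `tau_high`, the top-2 relabeling rule, and the mixing weight `alpha` are all assumptions made here for demonstration.

```python
import numpy as np

def select_and_relabel(probs, tau_high=0.9):
    """Selection + relabeling sketch: after warm-up, keep samples whose
    top-1 prediction is confident, and assign each its top-2 predicted
    classes as the two potential labels (tau_high is a hypothetical
    threshold, not a value from the paper)."""
    top2 = np.argsort(probs, axis=1)[:, -2:][:, ::-1]  # top-2 class indices
    conf = probs[np.arange(len(probs)), top2[:, 0]]    # top-1 confidence
    selected = conf >= tau_high                        # selection rule
    return selected, top2

def two_label_loss(probs, top2, alpha=0.7):
    """Composite two-label loss sketch: a weighted negative log-likelihood
    over the two potential labels (alpha is a hypothetical mixing weight)."""
    n = np.arange(len(probs))
    nll1 = -np.log(probs[n, top2[:, 0]] + 1e-12)
    nll2 = -np.log(probs[n, top2[:, 1]] + 1e-12)
    return (alpha * nll1 + (1 - alpha) * nll2).mean()
```

In a full training loop, only the `selected` samples would contribute the two-label loss when the network is updated, which is how corrected labels re-enter training without any extra annotation.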



Published in ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 3 (May 2023), 514 pages.
ISSN: 1551-6857; EISSN: 1551-6865
DOI: 10.1145/3582886
Editor: Abdulmotaleb El Saddik


Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 13 April 2022
• Revised: 22 August 2022
• Accepted: 22 October 2022
• Online AM: 17 November 2022
• Published: 15 March 2023
