Edge-AI-Driven Framework with Efficient Mobile Network Design for Facial Expression Recognition

Published: 19 April 2023

Abstract

Facial Expression Recognition (FER) in the wild poses significant challenges due to realistic occlusions, illumination, scale, and head pose variations of the facial images. In this article, we propose an Edge-AI-driven framework for FER. On the algorithms aspect, we propose two attention modules, Arbitrary-oriented Spatial Pooling (ASP) and Scalable Frequency Pooling (SFP), for effective feature extraction to improve classification accuracy. On the systems aspect, we propose an edge-cloud joint inference architecture for FER to achieve low-latency inference, consisting of a lightweight backbone network running on the edge device, and two optional attention modules partially offloaded to the cloud. Performance evaluation demonstrates that our approach achieves a good balance between classification accuracy and inference latency.

        Published in

        ACM Transactions on Embedded Computing Systems, Volume 22, Issue 3
        May 2023
        546 pages
        ISSN: 1539-9087
        EISSN: 1558-3465
        DOI: 10.1145/3592782
        Editor: Tulika Mitra

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 April 2023
        • Online AM: 6 March 2023
        • Accepted: 17 February 2023
        • Revised: 12 December 2022
        • Received: 6 May 2022