research-article

PhiNets: A Scalable Backbone for Low-power AI at the Edge

Published: 09 December 2022

Abstract

In the Internet of Things era, where we see many interconnected and heterogeneous mobile and fixed smart devices, distributing intelligence from the cloud to the edge has become a necessity. Due to limited computational and communication capabilities, low memory, and limited energy budgets, bringing artificial intelligence algorithms to peripheral devices, such as the end-nodes of a sensor network, is a challenging task and requires the design of innovative solutions. In this work, we present PhiNets, a new scalable backbone optimized for deep-learning-based image processing on resource-constrained platforms. PhiNets are based on inverted residual blocks specifically designed to decouple the computational cost, working memory, and parameter memory, thus exploiting all available resources for a given platform. With a YoloV2 detection head and Simple Online and Realtime Tracking (SORT), the proposed architecture achieves state-of-the-art results in (i) detection on the COCO and VOC2012 benchmarks, and (ii) tracking on the MOT15 benchmark. PhiNets obtain a reduction in parameter count of around 90% with respect to previous state-of-the-art models (EfficientNetv1, MobileNetv2) and achieve better performance with lower computational cost. Moreover, we demonstrate our approach on a prototype node based on an STM32H743 microcontroller (MCU) with 2 MB of internal Flash and 1 MB of RAM and achieve power requirements on the order of 10 mW. The code for PhiNets is publicly available on GitHub.
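
To make the three resources named in the abstract concrete, the following is a minimal sketch that counts them for a single generic, MobileNetV2-style inverted residual block. It is not the PhiNets block or its scaling rule, which this page does not give. The function names, the simplified peak-memory model (one layer's input and output tensors alive at a time, skip connection ignored), and the 8-bit storage assumption are all illustrative assumptions; only the 2 MB Flash and 1 MB RAM figures come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class BlockCost:
    params: int    # weights, i.e. parameter memory (typically stored in Flash)
    macs: int      # multiply-accumulate operations, i.e. computational cost
    peak_act: int  # largest input+output activation pair in elements (RAM)


def inverted_residual_cost(h, w, c_in, c_out, expansion, kernel=3, stride=1):
    """Rough cost of one generic inverted residual block.

    Assumptions (hypothetical, for illustration): 'same' padding, no biases,
    no batch-norm parameters, and the residual skip tensor is not counted.
    """
    c_mid = c_in * expansion
    # Ceil division gives the output size under 'same' padding.
    h_out, w_out = -(-h // stride), -(-w // stride)

    # Weights of: 1x1 expansion conv, kxk depthwise conv, 1x1 projection conv.
    p_expand = c_in * c_mid
    p_depthwise = kernel * kernel * c_mid
    p_project = c_mid * c_out

    # Each weight is used once per output pixel of its layer.
    m_expand = h * w * p_expand
    m_depthwise = h_out * w_out * p_depthwise
    m_project = h_out * w_out * p_project

    # Activation sizes between layers.
    a_in = h * w * c_in
    a_exp = h * w * c_mid
    a_dw = h_out * w_out * c_mid
    a_out = h_out * w_out * c_out
    peak = max(a_in + a_exp, a_exp + a_dw, a_dw + a_out)

    return BlockCost(
        params=p_expand + p_depthwise + p_project,
        macs=m_expand + m_depthwise + m_project,
        peak_act=peak,
    )


def fits(cost, flash_bytes, ram_bytes, bytes_per_value=1):
    """Check a block against Flash/RAM budgets, assuming 8-bit values by default."""
    return (cost.params * bytes_per_value <= flash_bytes
            and cost.peak_act * bytes_per_value <= ram_bytes)


if __name__ == "__main__":
    FLASH = 2 * 1024 * 1024  # 2 MB internal Flash, as stated in the abstract
    RAM = 1 * 1024 * 1024    # 1 MB RAM, as stated in the abstract
    for expansion in (3, 6):
        cost = inverted_residual_cost(32, 32, 16, 16, expansion)
        print(expansion, cost, fits(cost, FLASH, RAM))
```

In this simplified model the expansion factor scales all three quantities together, which is the coupling the abstract says PhiNets are designed to avoid.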


    • Published in

      ACM Transactions on Embedded Computing Systems, Volume 21, Issue 5
      September 2022
      526 pages
      ISSN: 1539-9087
      EISSN: 1558-3465
      DOI: 10.1145/3561947
      Editor: Tulika Mitra


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 December 2022
      • Online AM: 3 February 2022
      • Accepted: 7 January 2022
      • Revised: 26 November 2021
      • Received: 27 July 2021

      Qualifiers

      • research-article
      • Refereed
