research-article

A Fall Detection Network by 2D/3D Spatio-temporal Joint Models with Tensor Compression on Edge

Published: 12 December 2022

Abstract

Falls rank among the most serious threats in elderly healthcare, which has driven considerable interest in automatic fall detection systems. With the rapid development of the Internet of Things (IoT) and Artificial Intelligence (AI), camera vision-based solutions have drawn much attention, using Convolutional Neural Networks (CNNs) for single-frame prediction and 3D-CNNs for video understanding in elderly fall detection. However, these methods hardly supervise the intermediate features, and they struggle to deliver both accurate and efficient performance on edge devices, which makes them difficult to apply in practice. This work introduces a fast and lightweight video fall detection network based on a spatio-temporal joint-point model to overcome these hurdles. Instead of detecting fall motion with traditional CNNs, we propose a Long Short-Term Memory (LSTM) model based on time-series joint-point features extracted by a pose extractor. We also adopt the increasingly mature RGB-D camera and propose a 3D pose estimation network to further improve the accuracy of the system. We apply tensor train decomposition to the model to reduce storage and computational cost so that deployment on edge devices can be realized. Experiments are conducted to verify the proposed framework. For the fall detection task, the proposed video fall detection framework achieves a high sensitivity of 98.46% on Multiple Cameras Fall, 100% on UR Fall, and 98.01% on NTU RGB-D 120. For the pose estimation task, our 2D model attains 73.3 mAP in the COCO keypoint challenge, which outperforms OpenPose by 8%. Our 3D model attains 78.6% mAP on the NTU RGB-D dataset while running 3.6× faster than OpenPose.
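To make the storage argument behind tensor train decomposition concrete, the following is a minimal NumPy sketch of the generic TT-SVD procedure (sequential truncated SVDs). It is an illustration only and not necessarily how the authors compress their network: the function names `tt_svd` and `tt_reconstruct`, the tensor shape, and the rank cap are all assumptions chosen for the example.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split a d-way tensor into tensor-train cores of shape (r_prev, n_k, r_next)."""
    shape = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))  # truncate to the allowed TT rank
        cores.append(u[:, :new_rank].reshape(rank, shape[k], new_rank))
        # Carry the remaining factor (singular values times V^T) to the next mode.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
        mat = mat.reshape(rank * shape[k + 1], -1)
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the cores back into the full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.reshape([c.shape[1] for c in cores])


# Hypothetical example: a 32x32 weight matrix viewed as a 4x8x8x4 tensor
# that has TT rank 3 by construction (built from random cores).
rng = np.random.default_rng(0)
true_cores = [
    rng.standard_normal((1, 4, 3)),
    rng.standard_normal((3, 8, 3)),
    rng.standard_normal((3, 8, 3)),
    rng.standard_normal((3, 4, 1)),
]
weight = tt_reconstruct(true_cores)

cores = tt_svd(weight, max_rank=3)
tt_params = sum(c.size for c in cores)
print("dense parameters:", weight.size)
print("TT parameters:", tt_params)
print("max reconstruction error:", np.abs(tt_reconstruct(cores) - weight).max())
```

The cores together hold far fewer values than the dense tensor whenever the TT ranks are small, which is the property that makes the format attractive for edge deployment. A real weight tensor is generally not exactly low rank, so truncation trades some accuracy for the storage saving.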

Published in

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 6
November 2022
498 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3561948
Editor: Tulika Mitra

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 12 December 2022
• Online AM: 30 April 2022
• Accepted: 26 March 2022
• Revised: 23 March 2022
• Received: 11 July 2021

Qualifiers

• research-article
• Refereed
