research-article

Where Are They Going? Predicting Human Behaviors in Crowded Scenes

Published: 12 November 2021

Abstract

In this article, we propose a framework for crowd behavior prediction in complicated scenarios. The framework follows the standard encoder-decoder scheme and is built on long short-term memory (LSTM) modules to capture the temporal evolution of crowd behaviors. To model interactions among humans and environments, we embed both social and physical attention mechanisms into the LSTM. The social attention component models the interactions among different pedestrians, whereas the physical attention component helps to understand the spatial configuration of the scene. Since pedestrians’ behaviors are multi-modal, we use a generative model to produce multiple acceptable future paths. The proposed framework not only predicts an individual’s trajectory accurately but also forecasts ongoing group behaviors by leveraging the coherent filtering approach. Experiments on standard crowd benchmarks (namely, the ETH, UCY, CUHK Crowd, and CrowdFlow datasets) demonstrate that the proposed framework is effective in forecasting crowd behaviors in complex scenarios.
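
As a rough illustration of the social attention idea described in the abstract, the sketch below weights the hidden states of neighboring pedestrians by their relevance to a target pedestrian and pools them into a single context vector. It is not the authors' implementation: the scaled dot-product scoring, the function name `social_attention`, the hidden size of 8, and the random inputs are all assumptions made for this example, since the abstract does not specify how attention scores are computed.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def social_attention(query_hidden, neighbor_hiddens):
    """Pool neighbor hidden states into one context vector (hypothetical helper).

    query_hidden:     shape (d,),   hidden state of the target pedestrian
    neighbor_hiddens: shape (n, d), hidden states of n neighboring pedestrians
    Scaled dot-product scoring is an assumption of this sketch, not the paper's method.
    """
    d = query_hidden.shape[0]
    scores = neighbor_hiddens @ query_hidden / np.sqrt(d)  # one score per neighbor
    weights = softmax(scores)                              # non-negative, sums to 1
    context = weights @ neighbor_hiddens                   # weighted sum, shape (d,)
    return weights, context

# Toy inputs: one target pedestrian and three neighbors (random, for illustration only).
rng = np.random.default_rng(0)
h_target = rng.normal(size=8)
h_neighbors = rng.normal(size=(3, 8))
weights, context = social_attention(h_target, h_neighbors)
```

In a full encoder-decoder model, a context vector of this kind would typically be combined with the pedestrian's own hidden state before each decoding step; a physical attention component could follow the same pattern with scene features in place of neighbor states.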

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 17, Issue 4
      November 2021
      529 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3492437

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 November 2021
      • Accepted: 1 February 2021
      • Revised: 1 November 2020
      • Received: 1 February 2020

      Qualifiers

      • research-article
      • Refereed
