Abstract
In this article, we propose a framework for predicting crowd behavior in complex scenarios. The framework follows the standard encoder-decoder scheme and is built on long short-term memory (LSTM) modules that capture the temporal evolution of crowd behaviors. To model interactions between pedestrians and their environment, we embed both social and physical attention mechanisms into the LSTM. The social attention component models the interactions among different pedestrians, whereas the physical attention component captures the spatial configuration of the scene. Because pedestrian behavior is multi-modal, we use a generative model to produce multiple acceptable future paths. The proposed framework not only predicts an individual’s trajectory accurately but also forecasts ongoing group behaviors by leveraging the coherent filtering approach. Experiments on standard crowd benchmarks (namely, the ETH, UCY, CUHK Crowd, and CrowdFlow datasets) demonstrate that the proposed framework is effective in forecasting crowd behaviors in complex scenarios.
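To make the social attention idea concrete, the following is a minimal sketch rather than the authors' implementation. It pools the hidden states of neighboring pedestrians into a single context vector using softmax weights. The abstract does not specify how attention scores are computed, so scaled dot-product scoring is an assumption here, and the names `softmax` and `social_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def social_attention(query, neighbor_states):
    """Pool neighbor hidden states into one context vector.

    query: hidden state of the target pedestrian (list of floats).
    neighbor_states: hidden states of the other pedestrians,
        each the same length as query.
    Returns (context, weights). Scaled dot-product scoring is an
    assumption of this sketch, not taken from the abstract.
    """
    if not neighbor_states:
        # No neighbors: return a zero context and no weights.
        return [0.0] * len(query), []

    scale = math.sqrt(len(query))
    # One relevance score per neighbor.
    scores = [
        sum(q * h for q, h in zip(query, state)) / scale
        for state in neighbor_states
    ]
    weights = softmax(scores)
    # Weighted sum of neighbor states, one dimension at a time.
    context = [
        sum(w * state[d] for w, state in zip(weights, neighbor_states))
        for d in range(len(query))
    ]
    return context, weights


if __name__ == "__main__":
    target = [1.0, 0.0]
    neighbors = [[1.0, 0.0], [0.0, 1.0]]
    context, weights = social_attention(target, neighbors)
    print("weights:", weights)
    print("context:", context)
```

In a full model, the context vector would typically be concatenated with the pedestrian's own input before each LSTM step. A physical attention component could follow the same pattern, with scene feature vectors taking the place of neighbor states.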
Where Are They Going? Predicting Human Behaviors in Crowded Scenes