Abstract
Crowd video analysis has applications in crowd management, public space design, and visual surveillance. Tasks that automated analysis could aid include anomaly detection (such as a person walking against the grain of traffic, or the rapid assembly or dispersion of groups of people), population and density measurement, and the analysis of interactions between groups of people. This survey explores crowd analysis as it relates to two primary research areas: crowd statistics and behavior understanding. First, we survey methods for counting individuals and approximating the density of the crowd. Second, we showcase research efforts on behavior understanding as related to crowds. These works focus on identifying groups, characterizing interactions within small groups, and detecting abnormal activity such as riots and bottlenecks in large crowds. They also address tracking groups of individuals, either as a single entity or as a subset of individuals within the frame of reference. Finally, we summarize the datasets available for crowd activity video research.
Crowd Scene Understanding from Video: A Survey