
Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals

Published: 04 March 2022

Abstract

A recent line of research focuses on crowd density estimation from RGB images for applications such as surveillance and traffic flow control. However, the performance of these methods drops dramatically on low-quality images, for example, those with occlusion or poor lighting conditions. Meanwhile, people carry various wireless devices, so the signals they emit can be easily collected at a base station. Another line of research therefore uses received signals for crowd counting. Nevertheless, received signals provide information only about the number of people; an accurate density map cannot be derived from them. As unmanned aerial vehicles (UAVs) are now treated as flying base stations and are equipped with cameras, we make the first attempt to leverage both RGB images and received signals for crowd density estimation on UAVs. Specifically, we propose a novel network that effectively fuses RGB images with received signal strength (RSS) information. Moreover, we design a new loss function that accounts for the uncertainty of RSS and makes the prediction consistent with the received signals. Experimental results show that the proposed method helps push past the limits of traditional crowd density estimation methods and achieves state-of-the-art performance. The proposed dataset is publicly released for future research.
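To make the idea of a prediction that stays "consistent with the received signals" concrete, the following is a minimal sketch and not the authors' loss function, which the abstract does not specify. It relies on the standard convention that a density map integrates to the head count, and it assumes a hypothetical hinge-style penalty in which a `tolerance` band stands in for RSS uncertainty. The names `predicted_count`, `count_consistency_penalty`, `rss_count`, and `tolerance` are illustrative assumptions.

```python
def predicted_count(density_map):
    """Sum a 2-D density map (a list of rows) to get the estimated head count."""
    return sum(sum(row) for row in density_map)


def count_consistency_penalty(density_map, rss_count, tolerance):
    """Hypothetical penalty (an assumption, not the paper's formulation):
    zero while the predicted count lies within `tolerance` of the
    RSS-derived count, growing linearly with the mismatch beyond that."""
    gap = abs(predicted_count(density_map) - rss_count)
    return max(0.0, gap - tolerance)


# Toy 3x3 density map whose entries sum to 4 people.
density = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
]

print(predicted_count(density))
print(count_consistency_penalty(density, rss_count=6, tolerance=1.0))
print(count_consistency_penalty(density, rss_count=4, tolerance=1.0))
```

In this sketch, a wider tolerance would correspond to a noisier RSS-based count, so the image branch is corrected only when it disagrees with the radio evidence by more than the assumed uncertainty.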



    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 3
      August 2022
      478 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3505208

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 March 2022
      • Accepted: 1 October 2021
      • Revised: 1 September 2021
      • Received: 1 February 2021

      Qualifiers

      • research-article
      • Refereed
