research-article

ECCNAS: Efficient Crowd Counting Neural Architecture Search

Published: 25 January 2022

Abstract

Recent crowd counting methods achieve promising performance across various benchmarks. However, applying them in real-world applications remains challenging because they are computationally intensive and lack the flexibility to meet varying resource budgets. In this article, we propose an efficient crowd counting neural architecture search (ECCNAS) framework that searches for efficient crowd counting network structures and fills this research gap. A novel search-from-pre-trained strategy enables our cross-task NAS to explore a significantly large and flexible search space in less search time and to find more suitable network structures. Moreover, our well-designed search space intrinsically provides candidate network structures with high performance and efficiency. To search for network structures suited to hardware with different computational capabilities, we develop a novel latency cost estimation algorithm within ECCNAS. Experiments show that our searched models achieve an excellent trade-off between computational complexity and accuracy and have the potential to be deployed in practical scenarios with various resource budgets. We reduce the computational cost, measured in multiply-and-accumulate operations (MACs), by up to 96% with comparable accuracy. We further design experiments to validate the efficiency and stability improvements of the proposed search-from-pre-trained strategy.
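To make the MACs metric in the abstract concrete, the following is a minimal sketch of how MACs are commonly counted for a convolution layer and how a relative reduction is computed. It is not the ECCNAS search or its latency estimation algorithm; the function names and the layer shape (a 3x3 convolution with 64 input and 64 output channels on a 128x128 output map) are hypothetical values chosen only for illustration.

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k, groups=1):
    """Count multiply-accumulate operations of a k x k convolution.

    Each of the h_out * w_out * c_out output values needs
    (c_in / groups) * k * k multiply-accumulates (bias ignored).
    """
    return h_out * w_out * c_out * (c_in // groups) * k * k


def relative_reduction(baseline_macs, reduced_macs):
    """Fraction of the baseline MACs that the cheaper design saves."""
    return 1.0 - reduced_macs / baseline_macs


# Hypothetical layer, not taken from the article.
H, W, C_IN, C_OUT, K = 128, 128, 64, 64, 3

# Standard 3x3 convolution.
standard = conv2d_macs(H, W, C_IN, C_OUT, K)

# Depthwise 3x3 convolution followed by a pointwise 1x1 convolution,
# a common cheaper substitute (assumed here purely as an example).
separable = (
    conv2d_macs(H, W, C_IN, C_IN, K, groups=C_IN)
    + conv2d_macs(H, W, C_IN, C_OUT, 1)
)

saving = relative_reduction(standard, separable)
print(f"standard: {standard} MACs, separable: {separable} MACs, saving: {saving:.1%}")
```

Summing such per-layer counts over a whole network gives the total MACs against which a reduction like the one reported in the abstract would be measured.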

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 1s
        February 2022
        352 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/3505206

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 January 2022
        • Accepted: 1 May 2021
        • Revised: 1 April 2021
        • Received: 1 January 2021

        Qualifiers

        • research-article
        • Refereed
