Abstract
Recent crowd counting methods achieve promising performance across various benchmarks. However, applying them to real-world applications remains challenging, because they are computationally intensive and lack the flexibility to meet varying resource budgets. In this article, we propose an efficient crowd counting neural architecture search (ECCNAS) framework that searches for efficient crowd counting network structures and fills this research gap. A novel search-from-pre-trained strategy enables our cross-task NAS to explore a significantly large and flexible search space in less search time and to find more suitable network structures. Moreover, our well-designed search space intrinsically provides candidate network structures with high performance and efficiency. To search network structures for hardware with different computational capabilities, we develop a novel latency cost estimation algorithm in ECCNAS. Experiments show that our searched models achieve an excellent trade-off between computational complexity and accuracy and have the potential to be deployed in practical scenarios with various resource budgets. We reduce the computational cost, measured in multiply-and-accumulate operations (MACs), by up to 96% with comparable accuracy. We further design experiments to validate the efficiency and stability improvements of the proposed search-from-pre-trained strategy.
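To make the MACs metric concrete, the following minimal sketch counts the MACs of a single 2D convolution and computes the relative reduction between two designs. The function names, layer shapes, and the depthwise-separable comparison are illustrative assumptions, not taken from the article; the ECCNAS search space and its reported 96% figure are not reproduced here.

```python
def conv2d_macs(in_channels, out_channels, kernel_size,
                out_height, out_width, groups=1):
    """Multiply-accumulate count of one 2D convolution (bias ignored)."""
    # Each output element needs (in_channels / groups) * k * k MACs.
    macs_per_output = (in_channels // groups) * kernel_size * kernel_size
    return macs_per_output * out_channels * out_height * out_width


def relative_reduction(baseline_macs, candidate_macs):
    """Fraction of the baseline MACs saved by the candidate design."""
    return 1.0 - candidate_macs / baseline_macs


# Hypothetical layer shapes, chosen only for illustration.
standard = conv2d_macs(64, 64, 3, 128, 128)

# Depthwise-separable replacement: 3x3 depthwise plus 1x1 pointwise.
depthwise = conv2d_macs(64, 64, 3, 128, 128, groups=64)
pointwise = conv2d_macs(64, 64, 1, 128, 128)
separable = depthwise + pointwise

reduction = relative_reduction(standard, separable)
print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"reduction: {reduction:.1%}")
```

A whole-network figure would sum such per-layer counts over every layer at the evaluation input resolution, so a reported reduction depends on both the architecture and the input size.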
ECCNAS: Efficient Crowd Counting Neural Architecture Search