Multi-scale Supervised Attentive Encoder-Decoder Network for Crowd Counting

Published: 12 March 2020

Abstract

Crowd counting is a popular topic with widespread applications. Currently, the biggest challenge in crowd counting is the large variation in object scale. In this article, we address this challenge by proposing a novel Attentive Encoder-Decoder Network (AEDN), which is supervised on multiple feature scales to conduct crowd counting via density estimation. This work makes three main contributions. First, we augment the traditional encoder-decoder architecture with our proposed residual attention blocks, which, beyond skip-connected encoded features, further extend the decoded features with attentive features. As a result, AEDN is better at establishing long-range dependencies between the encoder and decoder, thereby promoting more effective fusion of multi-scale features for handling scale variations. Second, we design a new KL-divergence-based distribution loss to supervise the scale-aware structural differences between two density maps, which complements the pixel-isolated MSE loss and better optimizes AEDN to generate high-quality density maps. Third, we adopt a multi-scale supervision scheme in which KL-divergence and MSE losses are deployed at all decoding stages, providing more thorough supervision for different feature scales. Extensive experimental results on four public datasets (ShanghaiTech Part A, ShanghaiTech Part B, UCF-CC-50, and UCF-QNRF) demonstrate the efficacy of the proposed method, which outperforms most state-of-the-art competitors.
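
The abstract does not give the exact form of the losses, so the following is only a minimal sketch of how a KL-divergence term on normalized density maps could be combined with a pixel-wise MSE term and summed over several decoding scales. Everything named here is an assumption made for illustration, not the authors' implementation: the function names, the smoothing constant `EPS`, the direction of the divergence (ground truth relative to prediction), the weight `kl_weight`, and the representation of density maps as non-negative nested lists.

```python
import math

EPS = 1e-12  # hypothetical smoothing constant that avoids log(0)


def flatten(density_map):
    """Flatten a 2-D density map (list of rows) into a flat list of values."""
    return [value for row in density_map for value in row]


def mse_loss(pred, target):
    """Pixel-wise mean squared error between two same-sized density maps."""
    p, t = flatten(pred), flatten(target)
    return sum((a - b) ** 2 for a, b in zip(p, t)) / len(p)


def kl_distribution_loss(pred, target):
    """KL(target || pred) after normalizing each map to sum to 1.

    Assumes non-negative maps. Normalizing removes the total count, so this
    term compares only how the density is distributed over the image.
    """
    p, t = flatten(pred), flatten(target)
    p_sum = sum(p) + EPS * len(p)
    t_sum = sum(t) + EPS * len(t)
    kl = 0.0
    for a, b in zip(p, t):
        q = (a + EPS) / p_sum  # predicted probability mass at this pixel
        r = (b + EPS) / t_sum  # ground-truth probability mass at this pixel
        kl += r * math.log(r / q)
    return kl


def multi_scale_loss(preds, targets, kl_weight=1.0):
    """Sum MSE + weighted KL over (prediction, target) pairs, one per scale."""
    return sum(
        mse_loss(p, t) + kl_weight * kl_distribution_loss(p, t)
        for p, t in zip(preds, targets)
    )
```

In this sketch each entry of `preds` would be the density map produced at one decoding stage, and the matching entry of `targets` the ground-truth map resized to that stage's resolution. Because the KL term works on normalized maps, it is insensitive to the overall count and reacts only to where the density is placed, which is one plausible reading of how it complements the per-pixel MSE term.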

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 1s
Special Issue on Multimodal Machine Learning for Human Behavior Analysis and Special Issue on Computational Intelligence for Biomedical Data and Imaging
January 2020
376 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3388236

Copyright © 2020 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

  • Published: 12 March 2020
  • Accepted: 1 August 2019
  • Revised: 1 July 2019
  • Received: 1 April 2019

Qualifiers

  • research-article
  • Research
  • Refereed
