
Bottom-up and Layerwise Domain Adaptation for Pedestrian Detection in Thermal Images

Published: 16 April 2021

Abstract

Pedestrian detection is a canonical problem for safety and security applications, and it remains challenging due to the highly variable lighting conditions in which pedestrians must be detected. This article investigates several domain adaptation approaches to adapt RGB-trained detectors to the thermal domain. Building on our earlier work on domain adaptation for privacy-preserving pedestrian detection, we conduct an extensive experimental evaluation comparing top-down and bottom-up domain adaptation and also propose two new bottom-up domain adaptation strategies. For top-down domain adaptation, we leverage a detector pre-trained on RGB imagery and efficiently adapt it to perform pedestrian detection in the thermal domain. Our bottom-up domain adaptation approaches consist of two steps: first, we train an adapter segment, corresponding to the initial layers of the RGB-trained detector, to adapt to the new input distribution; then, we reconnect the adapter segment to the original RGB-trained detector for final adaptation with a top-down loss. To the best of our knowledge, our bottom-up domain adaptation approaches outperform the best-performing single-modality pedestrian detection results on KAIST and outperform the state of the art on FLIR.
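
The two-step bottom-up procedure described above can be pictured as a training schedule over the layers of the detector. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the layer names, the `adapter_depth` parameter, and the `Stage` and `bottom_up_schedule` helpers are hypothetical, and the choice to update all layers in the second step is an assumption, since the abstract does not specify which layers are trained there.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    loss: str              # which loss drives this stage
    trainable: List[str]   # layers updated in this stage
    frozen: List[str]      # layers kept at their RGB-trained weights


def bottom_up_schedule(layers: List[str], adapter_depth: int) -> List[Stage]:
    """Split an RGB-trained detector into an adapter segment and the rest,
    and describe the two training stages.

    `layers` is ordered from input to output; the first `adapter_depth`
    layers form the adapter segment. Both names are hypothetical.
    """
    if not 0 < adapter_depth < len(layers):
        raise ValueError("adapter_depth must leave layers on both sides of the split")
    adapter, rest = layers[:adapter_depth], layers[adapter_depth:]
    return [
        # Step 1: only the adapter segment (initial layers) is updated so that
        # it adapts to the thermal input distribution.
        Stage("adapt_segment", "bottom-up", trainable=adapter, frozen=rest),
        # Step 2: the adapter is reconnected to the detector and adaptation
        # finishes with the top-down (detection) loss. Updating every layer
        # here is an assumption, not something the abstract states.
        Stage("reconnect_and_finetune", "top-down", trainable=adapter + rest, frozen=[]),
    ]


# Toy layer list; a real detector would have many more layers.
schedule = bottom_up_schedule(["conv1", "conv2", "conv3", "head"], adapter_depth=2)
for stage in schedule:
    print(stage.name, stage.loss, stage.trainable)
```

In a real framework, each stage would translate into enabling or disabling gradient updates for the listed layers before running the corresponding training loop.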

