Abstract
Pedestrian detection is a canonical problem for safety and security applications, and it remains challenging because of the highly variable lighting conditions in which pedestrians must be detected. This article investigates several domain adaptation approaches for adapting RGB-trained detectors to the thermal domain. Building on our earlier work on domain adaptation for privacy-preserving pedestrian detection, we conduct an extensive experimental evaluation comparing top-down and bottom-up domain adaptation, and we propose two new bottom-up domain adaptation strategies. For top-down domain adaptation, we leverage a detector pre-trained on RGB imagery and efficiently adapt it to perform pedestrian detection in the thermal domain. Our bottom-up domain adaptation approaches consist of two steps: first, we train an adapter segment, corresponding to the initial layers of the RGB-trained detector, to adapt to the new input distribution; then, we reconnect the adapter segment to the original RGB-trained detector for final adaptation with a top-down loss. To the best of our knowledge, our bottom-up domain adaptation approaches outperform the best-performing single-modality pedestrian detection results on KAIST and outperform the state of the art on FLIR.
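The two-step bottom-up procedure described in the abstract can be sketched as a simple training schedule. This is only an illustrative sketch, not the authors' implementation: the layer names, the `adapter_depth` split point, the `Phase` structure, and the choice to fine-tune every layer in the second step are all assumptions made here, and the abstract does not say which loss is used to train the adapter segment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    """One stage of a (hypothetical) bottom-up adaptation schedule."""
    name: str
    trainable: List[str]  # layers updated in this phase
    inactive: List[str]   # layers not updated in this phase
    loss: str             # free-text description of the loss


def bottom_up_schedule(layers: List[str], adapter_depth: int) -> List[Phase]:
    """Split an RGB-trained detector into an adapter segment and the rest.

    `layers` lists layer names from input to output; `adapter_depth` is the
    number of initial layers forming the adapter segment (an assumed knob).
    """
    if not 0 < adapter_depth < len(layers):
        raise ValueError("adapter_depth must leave layers on both sides")

    adapter = layers[:adapter_depth]
    remainder = layers[adapter_depth:]

    # Step 1: train only the adapter segment on thermal input.
    # The loss used here is not given in the abstract.
    step1 = Phase(
        name="train adapter segment",
        trainable=list(adapter),
        inactive=list(remainder),
        loss="adapter-level loss (unspecified in the abstract)",
    )

    # Step 2: reconnect the adapter to the original detector and adapt
    # with a top-down loss. Updating all layers is an assumption.
    step2 = Phase(
        name="reconnect and adapt",
        trainable=list(layers),
        inactive=[],
        loss="top-down detection loss",
    )
    return [step1, step2]


if __name__ == "__main__":
    # Hypothetical layer names, for illustration only.
    for phase in bottom_up_schedule(["conv1", "conv2", "conv3", "head"], 2):
        print(phase.name, "->", phase.trainable)
```

In a real framework each phase would translate into enabling gradient updates only for the layers listed in `trainable` before running the corresponding training loop.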
Bottom-up and Layerwise Domain Adaptation for Pedestrian Detection in Thermal Images