research-article

Towards Accurate Oriented Object Detection in Aerial Images with Adaptive Multi-level Feature Fusion

Published: 05 January 2023

Abstract

Detecting objects in aerial images is a long-standing and challenging problem because the objects vary dramatically in size and orientation. Most existing neural-network-based methods are not robust enough to produce accurate oriented object detection results in aerial images because they do not consider the correlations between features at different levels and scales. In this paper, we propose AFF-Det, a novel two-stage detector with adaptive feature fusion for highly accurate oriented object detection in aerial images. First, a multi-scale feature fusion (MSFF) module is built on the top layer of the extracted feature pyramids to mitigate the loss of semantic information in the small-scale features. We also propose a cascaded oriented bounding box regression method that transforms horizontal proposals into oriented ones. The transformed proposals are then assigned to all feature pyramid network (FPN) levels and aggregated by the weighted RoI feature aggregation (WRFA) module. These modules use an attention mechanism to adaptively enhance the feature representations at different stages of the network. Finally, a rotated decoupled-RCNN head is introduced to obtain the classification and localization results. Extensive experiments on the DOTA and HRSC2016 datasets demonstrate the advantages of AFF-Det: its best results reach 80.73% mAP and 90.48% mAP on the two datasets, respectively, outperforming recent state-of-the-art methods.
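
The abstract does not say how the WRFA module computes its weights, so the following is only a minimal sketch of the general idea of attention-weighted aggregation of one RoI's features across FPN levels. It assumes each level yields a pooled feature vector, and that a hypothetical learned scoring vector (`score_weights`) followed by a softmax produces one weight per level. The function names and the scoring scheme are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_roi_features(level_feats, score_weights):
    """Fuse one RoI's features from several pyramid levels.

    level_feats:   list of L feature vectors (one per FPN level), each of length C.
    score_weights: hypothetical learned vector of length C used to score each
                   level (an assumption; the abstract gives no such detail).
    Returns the fused feature vector and the per-level attention weights.
    """
    # One scalar score per level: dot product of the feature with the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, feat)) for feat in level_feats]
    # Normalize the scores so the level weights are positive and sum to 1.
    attn = softmax(scores)
    channels = len(level_feats[0])
    # Weighted sum over levels, channel by channel.
    fused = [sum(a * feat[c] for a, feat in zip(attn, level_feats)) for c in range(channels)]
    return fused, attn


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # three levels, two channels
    fused, attn = aggregate_roi_features(feats, score_weights=[2.0, -1.0])
    print("level weights:", attn)
    print("fused feature:", fused)
```

With an all-zero scoring vector every level receives the same weight and the result reduces to a plain average over levels, which is a convenient sanity check for this kind of module.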

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 19, Issue 1
        January 2023
        505 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/3572858
        • Editor: Abdulmotaleb El Saddik

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 5 January 2023
        • Online AM: 18 February 2022
        • Accepted: 21 January 2022
        • Revised: 18 December 2021
        • Received: 28 June 2021

        Qualifiers

        • research-article
        • Refereed
