research-article

LogoDet-3K: A Large-scale Image Dataset for Logo Detection

Published: 27 January 2022

Abstract

Logo detection has been gaining considerable attention because of its wide range of applications in the multimedia field, such as copyright infringement detection, brand visibility monitoring, and product brand management on social media. In this article, we introduce LogoDet-3K, the largest logo detection dataset with full annotation, which has 3,000 logo categories, about 200,000 manually annotated logo objects, and 158,652 images. LogoDet-3K creates a more challenging benchmark for logo detection because of its more comprehensive coverage and wider variety in both logo categories and annotated objects compared with existing datasets. We describe the collection and annotation process of our dataset and analyze its scale and diversity in comparison to other datasets for logo detection. We further propose a strong baseline method, Logo-Yolo, which incorporates Focal loss and CIoU loss into the basic YOLOv3 framework for large-scale logo detection. On LogoDet-3K, it obtains about 4% improvement in average performance compared with YOLOv3, and greater improvements compared with several other reported deep detection models. We further evaluate on three other existing datasets, covering both logo detection and retrieval tasks, and the results demonstrate the better generalization ability of LogoDet-3K. The LogoDet-3K dataset is intended to promote large-scale logo-related research. The code and LogoDet-3K can be found at https://github.com/Wangjing1551/LogoDet-3K-Dataset.
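
To make the two loss terms named in the abstract concrete, the following is a minimal sketch of a binary focal loss and a CIoU loss, written from their commonly published definitions. It is not the authors' Logo-Yolo implementation: the function names `focal_loss` and `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the default values `alpha=0.25` and `gamma=2.0` are assumptions made for illustration, and how the terms are weighted inside YOLOv3 is not stated in the abstract.

```python
import math

EPS = 1e-12  # small constant to avoid log(0) and division by zero


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction (illustrative defaults).

    p: predicted probability of the positive class, y: label (0 or 1).
    The (1 - p_t) ** gamma factor down-weights well-classified examples.
    """
    p = min(max(p, EPS), 1.0 - EPS)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def ciou_loss(pred, target):
    """CIoU loss for two boxes given as (x1, y1, x2, y2) (assumed format).

    Loss = 1 - IoU + (center distance)^2 / (enclosing diagonal)^2 + alpha * v,
    where v measures the mismatch in aspect ratio.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + EPS)

    # Squared distance between the box centers
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + EPS

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + EPS)) - math.atan(pw / (ph + EPS))) ** 2
    alpha = v / (1.0 - iou + v + EPS)

    return 1.0 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # A confident correct prediction is penalized far less than a wrong one.
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
    # Overlapping boxes give a smaller loss than disjoint ones.
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)), ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, the CIoU term still gives a useful signal when boxes do not overlap, because the center-distance penalty grows as the boxes move apart.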


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 1
        January 2022
        517 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/3505205

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 27 January 2022
        • Revised: 1 May 2021
        • Accepted: 1 May 2021
        • Received: 1 September 2020

        Qualifiers

        • research-article
        • Refereed
