research-article

MFECN: Multi-level Feature Enhanced Cumulative Network for Scene Text Detection

Published: 22 July 2021

Abstract

Many recent scene text detection algorithms achieve impressive performance using convolutional neural networks. However, most of them do not fully exploit the context among hierarchical multi-level features to improve detection performance. In this article, we present an efficient multi-level feature enhanced cumulative framework based on instance segmentation for scene text detection. First, we adopt a Multi-Level Features Enhanced Cumulative (MFEC) module to capture features whose representational ability is cumulatively enhanced. Then, we design a Multi-Level Features Fusion (MFF) module that fully integrates high-level and low-level MFEC features and adaptively encodes scene text information. To verify the effectiveness of the proposed method, we perform experiments on six public datasets (CTW1500, Total-Text, MSRA-TD500, ICDAR2013, ICDAR2015, and MLT2017) and compare against other state-of-the-art methods. Experimental results demonstrate that the proposed Multi-Level Features Enhanced Cumulative Network (MFECN) detector handles scene text instances of varied shapes (curved, oriented, and horizontal) well and achieves better or comparable results.

          • Published in

            ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 3
            August 2021
            443 pages
            ISSN: 1551-6857
            EISSN: 1551-6865
            DOI: 10.1145/3476118

            Copyright © 2021 Association for Computing Machinery.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Accepted: 1 November 2021
            • Published: 22 July 2021
            • Revised: 1 September 2020
            • Received: 1 August 2019

            Qualifiers

            • research-article
            • Refereed
