research-article

Handwritten Annotation Spotting in Printed Documents Using Top-Down Visual Saliency Models

Published: 13 December 2021

Abstract

In this article, we address the problem of localizing text and symbolic annotations on the scanned image of a printed document. Previous approaches have treated annotation extraction as a binary classification of content into printed and handwritten text. In this work, we further subcategorize the annotations as underlines, encirclements, inline text, and marginal text. We have collected a new dataset of 300 documents covering all of these annotation classes, marked around or in between the printed text. Using the dataset as a benchmark, we report the results of two saliency formulations, CRF Saliency and Discriminant Saliency, for predicting salient patches that can correspond to different types of annotations. We also compare our work with recent semantic segmentation techniques based on deep models. Our analysis shows that Discriminant Saliency can be considered the preferred approach for fast localization of patches containing different types of annotations. Although the saliency models were learned on a small dataset, they give performance comparable to that of deep networks for pixel-level semantic segmentation. We show that, with limited annotated data, saliency-based methods give better outcomes than more sophisticated segmentation techniques that require a large training set to learn the model.

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 3
      May 2022
      413 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3505182

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 December 2021
      • Accepted: 1 September 2021
      • Revised: 1 August 2021
      • Received: 1 April 2020

      Qualifiers

      • research-article
      • Refereed