Abstract
In this article, we address the problem of localizing text and symbolic annotations on the scanned image of a printed document. Previous approaches have treated annotation extraction as a binary classification into printed and handwritten text. In this work, we further subcategorize the annotations as underlines, encirclements, inline text, and marginal text. We have collected a new dataset of 300 documents covering all classes of annotations marked around or in between printed text. Using the dataset as a benchmark, we report the results of two saliency formulations, CRF Saliency and Discriminant Saliency, for predicting salient patches, which can correspond to different types of annotations. We also compare our work with recent semantic segmentation techniques using deep models. Our analysis shows that Discriminant Saliency can be considered the preferred approach for fast localization of patches containing different types of annotations. The saliency models were learned on a small dataset, yet they give performance comparable to that of deep networks for pixel-level semantic segmentation. We show that saliency-based methods give better outcomes with limited annotated data than more sophisticated segmentation techniques that require a large training set to learn the model.
Index Terms
Handwritten Annotation Spotting in Printed Documents Using Top-Down Visual Saliency Models