research-article

Improving Handwritten Arabic Character Recognition by Modeling Human Handwriting Distortions

Published: 21 November 2015

Abstract

Handwritten Arabic character recognition systems face several challenges, including the unlimited variation in human handwriting and the unavailability of large public databases of handwritten characters and words. Using synthetic data to train and test handwritten character recognition systems is one possible way to provide more variation in these characters and to overcome the lack of large databases. While this can be done using arbitrary distortions, such as image noise and randomized affine transformations, such distortions are not realistic. In this work, we model real distortions in handwriting using real handwritten Arabic character examples and then use these distortion models to synthesize handwritten examples that are more realistic. We show that the use of our proposed approach leads to significant improvements across different machine-learning classification algorithms.
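
For orientation, the "arbitrary distortions" baseline that the abstract contrasts against (image noise plus randomized affine transformations) might look roughly like the sketch below. This is not the authors' method: it does not learn distortion models from real handwriting, and every function name, parameter range, and the binary list-of-lists image representation are assumptions made only for illustration.

```python
import math
import random


def random_affine_params(rng, max_rot_deg=10.0, max_shear=0.2,
                         max_scale=0.1, max_shift=1.0):
    """Draw one arbitrary (not data-driven) affine distortion.

    The ranges are hypothetical defaults, not values from the paper.
    """
    theta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    shear = rng.uniform(-max_shear, max_shear)
    sx = 1.0 + rng.uniform(-max_scale, max_scale)
    sy = 1.0 + rng.uniform(-max_scale, max_scale)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    # Linear part = rotation * shear * scale, written out as a 2x2 matrix.
    a = sx * math.cos(theta)
    b = sy * (shear * math.cos(theta) - math.sin(theta))
    c = sx * math.sin(theta)
    d = sy * (shear * math.sin(theta) + math.cos(theta))
    return (a, b, c, d, tx, ty)


def apply_affine(image, params):
    """Warp a binary image (list of rows of 0/1) about its centre.

    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    from outside the source image are set to 0 (background).
    """
    a, b, c, d, tx, ty = params
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    det = a * d - b * c
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Invert the forward map to find the source pixel for (x, y).
            u, v = x - cx - tx, y - cy - ty
            src_x = (d * u - b * v) / det + cx
            src_y = (-c * u + a * v) / det + cy
            ix, iy = int(round(src_x)), int(round(src_y))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def add_salt_pepper(image, rng, flip_prob=0.02):
    """Flip each pixel independently with probability flip_prob."""
    return [[1 - p if rng.random() < flip_prob else p for p in row]
            for row in image]


def synthesize(image, n, seed=0):
    """Generate n randomly distorted copies of one character image."""
    rng = random.Random(seed)
    return [add_salt_pepper(apply_affine(image, random_affine_params(rng)), rng)
            for _ in range(n)]
```

Calling `synthesize(char_image, 20)` on a single character image would yield 20 perturbed variants. Because the parameters are sampled from fixed ranges rather than estimated from real handwriting, the variants are exactly the kind of unrealistic distortion the abstract argues against.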

