Online Handwritten Gurmukhi Strokes Dataset Based on Minimal Set of Words

Published: 27 June 2016

Abstract

Online handwriting data are integral to data analysis and classification research, because collected handwritten data pose many challenges for grouping handwritten strokes into classes. The present work groups handwritten strokes from the Indic script Gurmukhi, the script of the widely spoken Punjabi language. It develops a dataset of Gurmukhi words for online handwriting recognition in real-life applications such as map navigation. We collected data from 100 writers from the largest cities in the Punjab region. Writer variations, such as writing skill level (beginner, moderate, and expert), gender, right- or left-handedness, and adaptability to digital handwriting, were considered in developing the dataset. We introduce a novel technique for forming handwritten stroke classes from a limited set of words. Words were selected with regard to covering all letters of the Gurmukhi script, including vowels. The developed dataset includes 39,411 strokes from handwritten words and forms 72 stroke classes after k-means clustering and manual verification by expert and moderate writers. Using a Hidden Markov Model, we achieved recognition rates of 87.10%, 85.43%, and 84.33% for middle-zone strokes when training on 66%, 50%, and 80% of the developed dataset, respectively. The present work is a step toward finding groups for unknown handwriting strokes with reasonably high accuracy.
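The clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which stroke features, distance measure, or initialization the authors used, so everything below is an assumption made for illustration: strokes are taken to be lists of (x, y) pen coordinates, each stroke is resampled to a fixed number of points and scaled to a unit box, and a plain Lloyd-style k-means with a simple farthest-point seeding groups the resulting vectors. The function names (`resample`, `features`, `kmeans`) are hypothetical, and the manual verification by writers is not modeled.

```python
import math

def resample(stroke, n=16):
    """Resample a stroke (list of (x, y) points) to n >= 2 points spaced evenly along its length."""
    dists = [0.0]  # cumulative arc length at each original point
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0.0:  # a dot: repeat the single position
        return [stroke[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while j < len(dists) - 2 and dists[j + 1] < target:
            j += 1
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0.0 else (target - dists[j]) / seg
        (x0, y0), (x1, y1) = stroke[j], stroke[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def features(stroke, n=16):
    """Flatten a resampled stroke into a position- and size-normalized vector (assumed feature set)."""
    pts = resample(stroke, n)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    vec = []
    for x, y in pts:
        vec.extend(((x - min_x) / scale, (y - min_y) / scale))
    return vec

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(vectors, k, iters=50):
    """Lloyd-style k-means with deterministic farthest-point seeding (an assumption, not the paper's setup)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data (invented for illustration): three horizontal and three vertical strokes.
strokes = [
    [(0, 0), (10, 0)], [(5, 5), (15, 5), (25, 5)], [(2, 1), (4, 1)],
    [(0, 0), (0, 10)], [(3, 2), (3, 8)], [(7, 7), (7, 9), (7, 12)],
]
labels, centroids = kmeans([features(s) for s in strokes], k=2)
print(labels)
```

On real data the number of clusters would be chosen much larger (the abstract reports 72 final stroke classes), and the cluster assignments would serve only as candidates for the manual verification the authors describe.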
