Abstract
Computer programs are now commonly used to breach web site security. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) and human interaction proofs are a cost-effective defence against these kinds of automated attacks on web sites. CAPTCHAs are available in many forms, including text-based, image-based and audio-based schemes. A CAPTCHA must be secure enough that a computer program cannot break it, yet usable enough that humans can solve it easily. The text-based scheme is the most popular. Most text-based CAPTCHAs are based on the English language and are not readily usable by native-language speakers in India. Research has shown that such users are more comfortable with CAPTCHAs based on their native language. Devanagari-based CAPTCHAs are available, but their security has not been tested. English language-based CAPTCHAs, by contrast, have been successfully broken, so it is important to test the security of Devanagari script-based CAPTCHAs as well. We picked five unique monochrome CAPTCHAs and five unique greyscale CAPTCHAs for security testing. We achieved segmentation rates of 88.13% to 97.6% on these schemes and generated six types of features for the segmented characters: raw pixels, zoning, projection, Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB). For comparative analysis, we used three classifiers: k-Nearest Neighbour (k-NN), Support Vector Machine (SVM) and Random Forest. All three achieved high recognition rates on both monochrome and greyscale schemes. For monochrome Devanagari CAPTCHAs, the recognition rate ranges from 64.78% to 82.39% for k-NN, from 76.46% to 91.34% for SVM and from 80.34% to 91.28% for Random Forest. For greyscale Devanagari CAPTCHAs, the recognition rate ranges from 67.52% to 85.47% for k-NN, from 76.9% to 91.71% for SVM and from 83.07% to 92.13% for Random Forest.
We achieved a breaking rate of 66% to 85% for monochrome schemes and 73% to 93% for greyscale schemes.
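
To make two of the feature types named in the abstract concrete, the following is a minimal sketch of zoning and projection features feeding a nearest-neighbour classifier. It is not the authors' implementation: the abstract gives no zone count, image size or k value, so the 2 x 2 grid, the 4 x 4 toy glyphs (simple bars standing in for segmented Devanagari characters) and all function names are assumptions made for illustration. Only k-NN is sketched here; the abstract also reports SVM and Random Forest.

```python
from collections import Counter
from math import dist


def zoning_features(img, zones=2):
    """Foreground-pixel density in each cell of a zones x zones grid.

    img is a binary character image (list of rows of 0/1); the grid size
    is an assumed parameter, not a value taken from the paper.
    """
    h, w = len(img), len(img[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            cell = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cell) / len(cell))
    return feats


def projection_features(img):
    """Normalised horizontal (per-row) and vertical (per-column) profiles."""
    h, w = len(img), len(img[0])
    rows = [sum(row) / w for row in img]
    cols = [sum(img[r][c] for r in range(h)) / h for c in range(w)]
    return rows + cols


def extract(img):
    """Concatenate zoning and projection features into one vector."""
    return zoning_features(img) + projection_features(img)


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy glyphs (hypothetical stand-ins for segmented characters).
VBAR = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
HBAR = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
NOISY_VBAR = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]]

train = [(extract(VBAR), "vbar"), (extract(HBAR), "hbar")]
prediction = knn_predict(train, extract(NOISY_VBAR))
print(prediction)
```

In a real attack the same feature vectors would be computed for every segmented character and passed to each classifier in turn, which is what makes the per-classifier recognition ranges above comparable.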
Index Terms
A Novel Attack on Monochrome and Greyscale Devanagari CAPTCHAs





