DOI: 10.1145/3128572.3140443
research-article

In (Cyber)Space Bots Can Hear You Speak: Breaking Audio CAPTCHAs Using OTS Speech Recognition

Published: 03 November 2017

ABSTRACT

Captchas have become almost ubiquitous, as websites commonly deploy them as part of their defenses against fraudsters. However, visual captchas pose a considerable obstacle to certain groups of users, such as the visually impaired, which has necessitated the inclusion of more accessible captcha schemes. As a result, many captcha services also offer audio challenges as an alternative.

In this paper we conduct an extensive exploration of the audio captcha ecosystem, and present effective low-cost attacks against the audio challenges offered by seven major captcha services. Motivated by the recent advancements in deep learning, we demonstrate how off-the-shelf (OTS) speech recognition services can be misused by attackers for trivially bypassing the most popular audio captchas. Our experimental evaluation highlights the effectiveness of our approach as our AudioBreaker system is able to break all captcha schemes, achieving accuracies of up to 98.3% against Google's ReCaptcha.

The broader implications of our study are twofold. First, we find that the wide availability of advanced speech recognition services has severely lowered the technical capabilities required by fraudsters for deploying effective attacks, as there is no longer a need to build sophisticated custom classifiers. Second, we find that the availability of audio captchas poses a significant risk to services, as our attacks against ReCaptcha's audio challenges are 13.1%-27.5% more accurate than state-of-the-art attacks against the corresponding image-based challenges. Overall, we argue that it is necessary to explore alternative captcha designs that fulfill the accessibility properties of audio captchas without undermining the security offered by their visual counterparts.

