skip to main content
research-article

Grammar-based whitebox fuzzing

Published:07 June 2008Publication History
Skip Abstract Section

Abstract

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.

In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.

References

  1. D. Aitel. The Advantages of Block-Based Protocol Analysis for Security Testing. Immunity Inc., February, 2002.Google ScholarGoogle Scholar
  2. S. Artzi, A. Kie?un, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding bugs in dynamic Web applications. Technical Report MIT-CSAIL-TR-2008-006, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, Feb. 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229--245, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. N. Borisov, D. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In NDSS, 2007.Google ScholarGoogle Scholar
  5. C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In ISSTA, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: automatically generating inputs of death. In CCS, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. Coppit and J. Lian. yagg: an easy-to-use generator for structured test inputs. In ASE, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated testing of refactoring engines. In FSE, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, Seattle, August 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. P. Godefroid. Compositional Dynamic Test Generation. In POPL, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. P. Godefroid, M. Levin, and D. Molnar. Active property checking. Technical Report MSR-TR-2007-91, Microsoft, 2007.Google ScholarGoogle Scholar
  16. P. Godefroid, M. Levin, and D. Molnar. Automated whitebox fuzz testing. In NDSS, 2008.Google ScholarGoogle Scholar
  17. K. Hanford. Automatic Generation of Test Cases. IBM Systems Journal, 9(4), 1970.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Hopcroft and J. Ullman. Introduction to automata theory, languages and computation. Addison-Wesley Series in Computer Science, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Khurshid and D. Marinov. TestEra: Specification-Based Testing of Java Programs Using SAT. In ASE, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. King. Symbolic execution and program testing. Communications of the ACM, 19(7):385--394, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. R. Lämmel and W. Schulte. Controllable combinatorial coverage in grammar-based testing. In TestCom, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. R. Majumdar and K. Sen. LATEST: Lazy dynamic test input generation. Technical Report UCB/EECS-2007-36, EECS Department, University of California, Berkeley, 2007.Google ScholarGoogle Scholar
  23. R. Majumdar and R.-G. Xu. Directed test generation using symbolic grammars. In ASE, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. B. Malloy and J. Power. An interpretation of Purdom?s algorithm for automatic generation of test cases. In ICIS, 2001.Google ScholarGoogle Scholar
  25. P. Maurer. Generating test data with enhanced context-free grammars. IEEE Software, 7(4), 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. B. McKenzie. Generating strings at random from a context free grammar. Technical Report TR-COSC 10/97, Department of Computer Science, University of Canterbury, 1997.Google ScholarGoogle Scholar
  27. D. Melski and T. Reps. Interconvertbility of set constraints and context-free language reachability. In PEPM, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12), 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. R. C. Moore. Removing left recursion from context-free grammars. In Proceedings of the first conference on North American chapter of the Association for Computational Linguistics, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedbackdirected random test generation. In ICSE, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. Pang, V. Paxson, R. Sommer, and L. Peterson. binpac: a yacc for writing application protocol parsers. In IMC, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. P. Purdom. A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 1972.Google ScholarGoogle Scholar
  33. D. J. Salomon and G. V. Cormack. Scannerless NSLR(1) parsing of programming languages. In PLDI, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In FSE, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. E. Sirer and B. Bershad. Using production grammars in software testing. In DSL, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software assurance by bounded exhaustive testing. In ISSTA, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. Sutton, A. Greene, and P. Amini. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Utting, A. Pretschner, and B. Legeard. A Taxonomy of Model-Based Testing. Department of Computer Science, The University of Waikato, New Zealand, Tech. Rep, 4, 2006.Google ScholarGoogle Scholar
  39. G. Wassermann and Z. Su. Sound and precise analysis of Web applications for injection vulnerabilities. In PLDI, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Grammar-based whitebox fuzzing

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image ACM SIGPLAN Notices
              ACM SIGPLAN Notices  Volume 43, Issue 6
              PLDI '08
              June 2008
              382 pages
              ISSN:0362-1340
              EISSN:1558-1160
              DOI:10.1145/1379022
              Issue’s Table of Contents
              • cover image ACM Conferences
                PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation
                June 2008
                396 pages
                ISBN:9781595938602
                DOI:10.1145/1375581
                • General Chair:
                • Rajiv Gupta,
                • Program Chair:
                • Saman Amarasinghe

              Copyright © 2008 ACM

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 7 June 2008

              Check for updates

              Qualifiers

              • research-article

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader
            About Cookies On This Site

            We use cookies to ensure that we give you the best experience on our website.

            Learn more

            Got it!