Abstract
Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.
In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.
- D. Aitel. The Advantages of Block-Based Protocol Analysis for Security Testing. Immunity Inc., February, 2002.Google Scholar
- S. Artzi, A. Kie?un, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding bugs in dynamic Web applications. Technical Report MIT-CSAIL-TR-2008-006, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, Feb. 2008.Google Scholar
Digital Library
- D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229--245, 1983. Google Scholar
Digital Library
- N. Borisov, D. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In NDSS, 2007.Google Scholar
- C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In ISSTA, 2002. Google Scholar
Digital Library
- C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: automatically generating inputs of death. In CCS, 2006. Google Scholar
Digital Library
- K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000. Google Scholar
Digital Library
- D. Coppit and J. Lian. yagg: an easy-to-use generator for structured test inputs. In ASE, 2005. Google Scholar
Digital Library
- W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 2007. Google Scholar
Digital Library
- B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated testing of refactoring engines. In FSE, 2007. Google Scholar
Digital Library
- M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007. Google Scholar
Digital Library
- J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, Seattle, August 2000. Google Scholar
Digital Library
- P. Godefroid. Compositional Dynamic Test Generation. In POPL, 2007. Google Scholar
Digital Library
- P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, 2005. Google Scholar
Digital Library
- P. Godefroid, M. Levin, and D. Molnar. Active property checking. Technical Report MSR-TR-2007-91, Microsoft, 2007.Google Scholar
- P. Godefroid, M. Levin, and D. Molnar. Automated whitebox fuzz testing. In NDSS, 2008.Google Scholar
- K. Hanford. Automatic Generation of Test Cases. IBM Systems Journal, 9(4), 1970.Google Scholar
Digital Library
- J. Hopcroft and J. Ullman. Introduction to automata theory, languages and computation. Addison-Wesley Series in Computer Science, 1979. Google Scholar
Digital Library
- S. Khurshid and D. Marinov. TestEra: Specification-Based Testing of Java Programs Using SAT. In ASE, 2004. Google Scholar
Digital Library
- J. King. Symbolic execution and program testing. Communications of the ACM, 19(7):385--394, 1976. Google Scholar
Digital Library
- R. Lämmel and W. Schulte. Controllable combinatorial coverage in grammar-based testing. In TestCom, 2006.Google Scholar
Digital Library
- R. Majumdar and K. Sen. LATEST: Lazy dynamic test input generation. Technical Report UCB/EECS-2007-36, EECS Department, University of California, Berkeley, 2007.Google Scholar
- R. Majumdar and R.-G. Xu. Directed test generation using symbolic grammars. In ASE, 2007. Google Scholar
Digital Library
- B. Malloy and J. Power. An interpretation of Purdom?s algorithm for automatic generation of test cases. In ICIS, 2001.Google Scholar
- P. Maurer. Generating test data with enhanced context-free grammars. IEEE Software, 7(4), 1990. Google Scholar
Digital Library
- B. McKenzie. Generating strings at random from a context free grammar. Technical Report TR-COSC 10/97, Department of Computer Science, University of Canterbury, 1997.Google Scholar
- D. Melski and T. Reps. Interconvertbility of set constraints and context-free language reachability. In PEPM, 1997. Google Scholar
Digital Library
- B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12), 1990. Google Scholar
Digital Library
- R. C. Moore. Removing left recursion from context-free grammars. In Proceedings of the first conference on North American chapter of the Association for Computational Linguistics, 2000. Google Scholar
Digital Library
- C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedbackdirected random test generation. In ICSE, 2007. Google Scholar
Digital Library
- R. Pang, V. Paxson, R. Sommer, and L. Peterson. binpac: a yacc for writing application protocol parsers. In IMC, 2006. Google Scholar
Digital Library
- P. Purdom. A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 1972.Google Scholar
- D. J. Salomon and G. V. Cormack. Scannerless NSLR(1) parsing of programming languages. In PLDI, 1989. Google Scholar
Digital Library
- K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In FSE, 2005. Google Scholar
Digital Library
- E. Sirer and B. Bershad. Using production grammars in software testing. In DSL, 1999. Google Scholar
Digital Library
- K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software assurance by bounded exhaustive testing. In ISSTA, 2004. Google Scholar
Digital Library
- M. Sutton, A. Greene, and P. Amini. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley, 2007. Google Scholar
Digital Library
- M. Utting, A. Pretschner, and B. Legeard. A Taxonomy of Model-Based Testing. Department of Computer Science, The University of Waikato, New Zealand, Tech. Rep, 4, 2006.Google Scholar
- G. Wassermann and Z. Su. Sound and precise analysis of Web applications for injection vulnerabilities. In PLDI, 2007. Google Scholar
Digital Library
Index Terms
Grammar-based whitebox fuzzing
Recommendations
Random testing for security: blackbox vs. whitebox fuzzing
RT '07: Proceedings of the 2nd international workshop on Random testing: co-located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2007)Fuzz testing is an effective technique for finding security vulnerabilities in software. Fuzz testing is a form of blackbox random testing which randomly mutates well-formed inputs and tests the program on the resulting data. In some cases, grammars are ...
Grammar-based whitebox fuzzing
PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and ImplementationWhitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when ...
Compositional dynamic test generation
POPL '07: Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesDynamic test generation is a form of dynamic program analysis that attempts to compute test inputs to drive a program along a specific program path. Directed Automated Random Testing, or DART for short, blends dynamic test generation with model checking ...







Comments