Abstract
Property-based random testing, exemplified by frameworks such as Haskell's QuickCheck, works by testing an executable predicate (a property) on a stream of randomly generated inputs. Property testing works very well in many cases, but not always. Some properties are conditioned on the input satisfying demanding semantic invariants that are not consequences of its syntactic structure, e.g., that an input list must be sorted or have no duplicates. Most randomly generated inputs fail to satisfy such sparse preconditions, and so are simply discarded. As a result, much of the target system may go untested.
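To make the discard problem concrete, here is a minimal Python sketch (not QuickCheck or QuickChick code) of a property guarded by a "list is sorted" precondition. The names `is_sorted`, `insert_sorted`, `prop_insert_keeps_sorted`, and `run_naive`, along with the element range and list length, are assumptions chosen only for illustration.

```python
import random

def is_sorted(xs):
    """Precondition: the list is in non-decreasing order."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def insert_sorted(x, xs):
    """Function under test: insert x into an already sorted list."""
    for i, y in enumerate(xs):
        if x <= y:
            return xs[:i] + [x] + xs[i:]
    return xs + [x]

def prop_insert_keeps_sorted(x, xs):
    """Conditional property: None means the input was discarded."""
    if not is_sorted(xs):
        return None
    return is_sorted(insert_sorted(x, xs))

def run_naive(trials, length, seed=0):
    """Test the property on purely random inputs and tally the outcomes."""
    rng = random.Random(seed)
    passed = failed = discarded = 0
    for _ in range(trials):
        xs = [rng.randint(0, 99) for _ in range(length)]
        x = rng.randint(0, 99)
        result = prop_insert_keeps_sorted(x, xs)
        if result is None:
            discarded += 1
        elif result:
            passed += 1
        else:
            failed += 1
    return passed, failed, discarded

if __name__ == "__main__":
    print(run_naive(trials=2000, length=8))
```

With lists of eight random elements, almost every trial is discarded, so `insert_sorted` is barely exercised even though the test run reports no failures.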
We address this issue with a novel technique called coverage guided, property based testing (CGPT). Our approach is inspired by the related area of coverage guided fuzzing, exemplified by tools like AFL. Rather than just generating a fresh random input at each iteration, CGPT can also produce new inputs by mutating previous ones using type-aware, generic mutation operators. The target program is instrumented to track which control-flow branches are executed during a run, and inputs whose runs expand control-flow coverage are retained for future mutations. This means that, when sparse conditions in the target are satisfied and new coverage is observed, the input that triggered them is retained and used as a springboard to go further.
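The loop described above can be sketched in a few lines of Python. This is an illustrative toy, not FuzzChick's implementation: the branch ids are recorded by hand in a `cov` set as a stand-in for real instrumentation, the mutator only handles fixed-length integer lists, and `LENGTH`, `MAX_VALUE`, the 0.9 mutation probability, and all function names are assumptions.

```python
import random

LENGTH = 10      # hypothetical fixed input size for this toy
MAX_VALUE = 9    # hypothetical element range 0..MAX_VALUE

def target(xs, cov):
    """Toy property whose precondition requires xs to be sorted.

    Returns None to discard the input, else the property outcome.
    Each loop step records its own id, as if the check were unrolled
    into nested branches."""
    for i in range(len(xs) - 1):
        if xs[i] > xs[i + 1]:
            cov.add("unsorted")
            return None
        cov.add(("sorted_upto", i))
    cov.add("precondition_ok")
    return False   # injected bug: the property fails on every valid input

def generate(rng):
    """Fresh random input, as in plain property testing."""
    return [rng.randint(0, MAX_VALUE) for _ in range(LENGTH)]

def mutate(xs, rng):
    """Simple mutator for integer lists: overwrite one element."""
    ys = list(xs)
    ys[rng.randrange(len(ys))] = rng.randint(0, MAX_VALUE)
    return ys

def fuzz(iterations, seed=0):
    """Return (counterexample, steps), or (None, iterations) if none found."""
    rng = random.Random(seed)
    seen = set()   # every branch id observed so far
    pool = []      # inputs whose runs expanded coverage
    for step in range(1, iterations + 1):
        if pool and rng.random() < 0.9:
            xs = mutate(rng.choice(pool), rng)   # build on a retained input
        else:
            xs = generate(rng)                   # fall back to a fresh input
        cov = set()
        result = target(xs, cov)
        if result is False:
            return xs, step
        if not cov <= seen:                      # new coverage: keep the input
            seen |= cov
            pool.append(xs)
    return None, iterations

if __name__ == "__main__":
    print(fuzz(200_000))
```

Inputs that get one comparison further through the sortedness check are retained even though they are still discarded by the property, which is what lets later mutations build toward a fully sorted list.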
We have implemented CGPT as an extension to the QuickChick property testing tool for Coq programs; we call our implementation FuzzChick. We evaluate FuzzChick on two Coq developments for abstract machines that aim to enforce flavors of noninterference, which has a (very) sparse precondition. We systematically inject bugs into the machines' checking rules and use FuzzChick to look for counterexamples to the claim that they satisfy a standard noninterference property. We find that vanilla QuickChick almost always fails to find any bugs, even after a long period of time, as does an earlier proposal for combining property testing and fuzzing. In contrast, FuzzChick often finds them within seconds to minutes. Moreover, FuzzChick is almost fully automatic; although highly tuned, hand-written generators can find the bugs faster, they require substantial amounts of insight and manual effort.