research-article
Open Access

Coverage guided, property based testing

Published: 10 October 2019

Abstract

Property-based random testing, exemplified by frameworks such as Haskell's QuickCheck, works by testing an executable predicate (a property) on a stream of randomly generated inputs. Property testing works very well in many cases, but not always. Some properties are conditioned on the input satisfying demanding semantic invariants that are not consequences of its syntactic structure (e.g., that an input list must be sorted or have no duplicates). Most randomly generated inputs fail to satisfy such sparse preconditions, and so are simply discarded. As a result, much of the target system may go untested.
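To make the discard problem concrete, here is a minimal Python sketch of naive property-based testing against a sortedness precondition. It is an illustration only and does not use QuickCheck or QuickChick; the names `is_sorted`, `insert_sorted`, `prop_insert_preserves_sorted`, and `run_naive`, as well as the input ranges, are hypothetical choices made for this example.

```python
import random

def is_sorted(xs):
    """Precondition: the list is in non-decreasing order."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def insert_sorted(x, xs):
    """Function under test: insert x into an already sorted list."""
    for i, y in enumerate(xs):
        if x <= y:
            return xs[:i] + [x] + xs[i:]
    return xs + [x]

def prop_insert_preserves_sorted(x, xs):
    """Return None if the precondition fails (input discarded),
    otherwise the verdict of the property."""
    if not is_sorted(xs):
        return None
    return is_sorted(insert_sorted(x, xs))

def run_naive(num_tests, seed=0):
    """Test the property on freshly generated random inputs and
    count how many were discarded, passed, or failed."""
    rng = random.Random(seed)
    discarded = passed = failed = 0
    for _ in range(num_tests):
        # Assumed input distribution: length 0..10, values 0..100.
        xs = [rng.randint(0, 100) for _ in range(rng.randint(0, 10))]
        x = rng.randint(0, 100)
        verdict = prop_insert_preserves_sorted(x, xs)
        if verdict is None:
            discarded += 1
        elif verdict:
            passed += 1
        else:
            failed += 1
    return discarded, passed, failed
```

Calling `run_naive(1000)` returns the three counters. Under the assumed distribution, unsorted lists dominate, so most of the testing budget goes to inputs that never reach `insert_sorted`.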

We address this issue with a novel technique called coverage guided, property based testing (CGPT). Our approach is inspired by the related area of coverage guided fuzzing, exemplified by tools like AFL. Rather than just generating a fresh random input at each iteration, CGPT can also produce new inputs by mutating previous ones using type-aware, generic mutator operators. The target program is instrumented to track which control-flow branches are executed during a run, and inputs whose runs expand control-flow coverage are retained for future mutations. This means that, when sparse conditions in the target are satisfied and new coverage is observed, the input that triggered them will be retained and used as a springboard to go further.
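The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not FuzzChick's implementation: the `target` function records branch identifiers by hand in place of real instrumentation, `mutate` is a hand-written mutator for lists of integers rather than a derived generic one, and the 80/20 split between mutation and fresh generation is an arbitrary choice. All names are hypothetical.

```python
import random

def target(xs, trace):
    """Toy property with nested, sparse conditions. `trace` collects
    branch ids, standing in for compiler-level instrumentation.
    Returns False when the property is violated."""
    trace.add("enter")
    if len(xs) >= 3:
        trace.add("len>=3")
        if xs[0] == 7:
            trace.add("xs[0]==7")
            if xs[1] == 42:
                trace.add("xs[1]==42")
                return False  # injected bug reached
    return True

def generate(rng):
    """Fresh random input: a short list of small integers."""
    return [rng.randint(0, 50) for _ in range(rng.randint(0, 5))]

def mutate(xs, rng):
    """Type-aware mutation for a list of ints: tweak, insert, or
    delete a single element, leaving the rest intact."""
    ys = list(xs)
    choice = rng.choice(["tweak", "insert", "delete"])
    if choice == "tweak" and ys:
        ys[rng.randrange(len(ys))] = rng.randint(0, 50)
    elif choice == "insert":
        ys.insert(rng.randint(0, len(ys)), rng.randint(0, 50))
    elif choice == "delete" and ys:
        del ys[rng.randrange(len(ys))]
    return ys

def cgpt_loop(max_iters, seed=0):
    """Return (counterexample, iterations used), or (None, max_iters)."""
    rng = random.Random(seed)
    seen = set()  # every branch id observed so far
    pool = []     # inputs whose runs expanded coverage
    for i in range(max_iters):
        # Assumed policy: mostly mutate retained inputs, sometimes
        # generate a fresh one.
        if pool and rng.random() < 0.8:
            candidate = mutate(rng.choice(pool), rng)
        else:
            candidate = generate(rng)
        trace = set()
        if not target(candidate, trace):
            return candidate, i + 1
        if not trace <= seen:  # new coverage: retain the input
            seen |= trace
            pool.append(candidate)
    return None, max_iters
```

A call such as `cgpt_loop(200_000)` returns the first input that violates the property together with the number of iterations used. The point of the design is that an input satisfying `xs[0] == 7` stays in `pool`, so later mutations only need to fix `xs[1]` instead of rediscovering both conditions at once.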

We have implemented CGPT as an extension to the QuickChick property testing tool for Coq programs; we call our implementation FuzzChick. We evaluate FuzzChick on two Coq developments for abstract machines that aim to enforce flavors of noninterference, which has a (very) sparse precondition. We systematically inject bugs in the machines' checking rules and use FuzzChick to look for counterexamples to the claim that they satisfy a standard noninterference property. We find that vanilla QuickChick almost always fails to find any bugs after a long period of time, as does an earlier proposal for combining property testing and fuzzing. In contrast, FuzzChick often finds them within seconds to minutes. Moreover, FuzzChick is almost fully automatic; although highly tuned, hand-written generators can find the bugs faster, they require substantial amounts of insight and manual effort.

Supplemental Material

a181-lampropoulos

Presentation at OOPSLA '19
