skip to main content

Accelerating Fuzzing through Prefix-Guided Execution

Published:06 April 2023Publication History
Skip Abstract Section

Abstract

Coverage-guided fuzzing is one of the most effective approaches for discovering software defects and vulnerabilities. It executes all mutated tests from seed inputs to expose coverage-increasing tests. However, executing all mutated tests incurs significant performance penalties---most of the mutated tests are discarded because they do not increase code coverage. Thus, determining if a test increases code coverage without actually executing it is beneficial, but a paradoxical challenge. In this paper, we introduce the notion of prefix-guided execution (PGE) to tackle this challenge. PGE leverages two key observations: (1) Only a tiny fraction of the mutated tests increase coverage, thus requiring full execution; and (2) whether a test increases coverage may be accurately inferred from its partial execution. PGE monitors the execution of a test and applies early termination when the execution prefix indicates that the test is unlikely to increase coverage.

To demonstrate the potential of PGE, we implement a prototype on top of AFL++, which we call AFL++-PGE. We evaluate AFL++-PGE on MAGMA, a ground-truth benchmark set that consists of 21 programs from nine popular real-world projects. Our results show that, after 48 hours of fuzzing, AFL++-PGE finds more bugs, discovers bugs faster, and achieves higher coverage. Prefix-guided execution is general and can benefit the AFL-based family of fuzzers.

References

  1. Mike Aizatsky, Kostya Serebryany, Oliver Chang, Abhishek Arya, and Meredith Whittaker. 2016. Announcing OSS-Fuzz: Continuous fuzzing for open source software. Google Testing Blog. Google ScholarGoogle Scholar
  2. Austin Appleby. 2016. MurmurHash3. https://github.com/aappleby/smhasher/wiki/MurmurHash3 Google ScholarGoogle Scholar
  3. Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence.. In Network and Distributed System Security. 19, 1–15. Google ScholarGoogle Scholar
  4. Marcel Böhme and Brandon Falk. 2020. Fuzzing: On the exponential cost of vulnerability discovery. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 713–724. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2329–2344. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering, 45, 5 (2017), 489–506. Google ScholarGoogle ScholarCross RefCross Ref
  7. Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, and Yang Liu. 2018. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2095–2108. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In IEEE Symposium on Security and Privacy (SP). 711–725. Google ScholarGoogle ScholarCross RefCross Ref
  9. Peng Chen, Jianzhong Liu, and Hao Chen. 2019. Matryoshka: fuzzing deeply nested branches. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 499–513. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Yaohui Chen, Mansour Ahmadi, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. In 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020). 77–92. Google ScholarGoogle Scholar
  11. Yaohui Chen, Peng Li, Jun Xu, Shengjian Guo, Rundong Zhou, Yulong Zhang, Tao Wei, and Long Lu. 2020. Savior: Towards bug-driven hybrid testing. In 2020 IEEE Symposium on Security and Privacy (SP). 1580–1596. Google ScholarGoogle ScholarCross RefCross Ref
  12. Andrea Fioraldi, Daniele Cono D’Elia, and Davide Balzarotti. 2021. The Use of Likely Invariants as Feedback for Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21). 2829–2846. Google ScholarGoogle Scholar
  13. Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2020. $AFL++$: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20). Google ScholarGoogle Scholar
  14. Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2022. American Fuzzy Lop plus plus (AFL++). https://github.com/AFLplusplus/AFLplusplus/releases/tag/4.01c Google ScholarGoogle Scholar
  15. Shuitao Gan, Chao Zhang, Peng Chen, Bodong Zhao, Xiaojun Qin, Dong Wu, and Zuoning Chen. 2020. GREYONE: Data flow sensitive fuzzing. In 29th USENIX Security Symposium (USENIX Security 20). 2577–2594. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Patrice Godefroid, Hila Peleg, and Rishabh Singh. 2017. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). 50–59. Google ScholarGoogle ScholarCross RefCross Ref
  17. Ahmad Hazimeh, Adrian Herrera, and Mathias Payer. 2020. Magma: A Ground-Truth Fuzzing Benchmark. Proc. ACM Meas. Anal. Comput. Syst., 4, 3 (2020), Article 49, Dec., 29 pages. https://doi.org/10.1145/3428334 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Adrian Herrera, Hendra Gunadi, Shane Magrath, Michael Norrish, Mathias Payer, and Antony L Hosking. 2021. Seed selection for successful fuzzing. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 230–243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Chin-Chia Hsu, Che-Yu Wu, Hsu-Chun Hsiao, and Shih-Kun Huang. 2018. Instrim: Lightweight instrumentation for coverage-guided fuzzing. In Symposium on Network and Distributed System Security, Workshop on Binary Analysis Research. Google ScholarGoogle ScholarCross RefCross Ref
  20. Heqing Huang, Peisen Yao, Rongxin Wu, Qingkai Shi, and Charles Zhang. 2020. Pangolin: Incremental hybrid fuzzing with polyhedral path abstraction. In 2020 IEEE Symposium on Security and Privacy (SP). 1613–1627. Google ScholarGoogle ScholarCross RefCross Ref
  21. Edward L Kaplan and Paul Meier. 1958. Nonparametric estimation from incomplete observations. Journal of the American statistical association, 53, 282 (1958), 457–481. Google ScholarGoogle ScholarCross RefCross Ref
  22. George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating fuzz testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2123–2138. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 475–485. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Shaohua Li and Zhendong Su. 2023. Accelerating Fuzzing through Prefix-Guided Execution. https://doi.org/10.5281/zenodo.7727577 Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu. 2017. Steelix: program-state based binary fuzzing. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. 627–637. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, and Peng Cheng. 2021. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21). Google ScholarGoogle Scholar
  27. LLVM team. 2021. SanitizerCoverage. https://releases.llvm.org/11.0.1/tools/clang/docs/SanitizerCoverage.html Google ScholarGoogle Scholar
  28. Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. $MOPT$: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966. Google ScholarGoogle Scholar
  29. Valentin JM Manès, Soomin Kim, and Sang Kil Cha. 2020. Ankou: Guiding grey-box fuzzing towards combinatorial difference. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 1024–1036. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Valentin Jean Marie Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2019. The art, science, and engineering of fuzzing: A survey. IEEE Transactions on Software Engineering. Google ScholarGoogle Scholar
  31. Nathan Mantel. 1966. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep, 50, 3 (1966), 163–170. Google ScholarGoogle Scholar
  32. Björn Mathis, Rahul Gopinath, and Andreas Zeller. 2020. Learning input tokens for effective fuzzing. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 27–37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Stefan Nagy and Matthew Hicks. 2019. Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing. In 2019 IEEE Symposium on Security and Privacy (SP). 787–802. Google ScholarGoogle ScholarCross RefCross Ref
  34. Stefan Nagy, Anh Nguyen-Tuong, Jason D Hiser, Jack W Davidson, and Matthew Hicks. 2021. Same Coverage, Less Bloat: Accelerating Binary-only Fuzzing with Coverage-preserving Coverage-guided Tracing. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 351–365. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Yannic Noller, Rody Kersten, and Corina S Păsăreanu. 2018. Badger: complexity analysis with fuzzing and symbolic execution. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 322–332. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Yannic Noller, Corina S Păsăreanu, Marcel Böhme, Youcheng Sun, Hoang Lam Nguyen, and Lars Grunske. 2020. HyDiff: Hybrid differential software analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 1273–1285. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Hui Peng, Yan Shoshitaishvili, and Mathias Payer. 2018. T-Fuzz: fuzzing by program transformation. In 2018 IEEE Symposium on Security and Privacy (SP). 697–710. Google ScholarGoogle ScholarCross RefCross Ref
  38. Van-Thuan Pham, Marcel Böhme, and Abhik Roychoudhury. 2020. AFLNet: a greybox fuzzer for network protocols. In 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). 460–465. Google ScholarGoogle ScholarCross RefCross Ref
  39. Mohit Rajpal, William Blum, and Rishabh Singh. 2017. Not all bytes are equal: Neural byte sieve for fuzzing. arXiv preprint arXiv:1711.04596. Google ScholarGoogle Scholar
  40. Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. 2017. VUzzer: Application-aware Evolutionary Fuzzing.. In Network and Distributed System Security. 17, 1–14. Google ScholarGoogle Scholar
  41. David A Schum. 2001. The evidential foundations of probabilistic reasoning. Northwestern University Press. Google ScholarGoogle Scholar
  42. Dongdong She, Rahul Krishna, Lu Yan, Suman Jana, and Baishakhi Ray. 2020. MTFuzz: fuzzing with a multi-task neural network. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 737–749. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. 2019. Neuzz: Efficient fuzzing with neural program smoothing. In 2019 IEEE Symposium on Security and Privacy (SP). 803–817. Google ScholarGoogle ScholarCross RefCross Ref
  44. Dokyung Song, Felicitas Hetzelt, Jonghwan Kim, Brent Byunghoon Kang, Jean-Pierre Seifert, and Michael Franz. 2020. Agamotto: Accelerating kernel driver fuzzing with lightweight virtual machine checkpoints. In 29th USENIX Security Symposium (USENIX Security 20). 2541–2557. Google ScholarGoogle Scholar
  45. Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting fuzzing through selective symbolic execution.. In Network and Distributed System Security. 16, 1–16. Google ScholarGoogle Scholar
  46. Jonas Benedict Wagner. 2017. Elastic program transformations: automatically optimizing the reliability/performance trade-off in systems software. Ph. D. Dissertation. Ecole Polytechnique Fédérale de Lausanne. Google ScholarGoogle Scholar
  47. Haijun Wang, Xiaofei Xie, Yi Li, Cheng Wen, Yuekang Li, Yang Liu, Shengchao Qin, Hongxu Chen, and Yulei Sui. 2020. Typestate-guided fuzzer for discovering use-after-free vulnerabilities. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 999–1010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2017. Skyfire: Data-driven seed generation for fuzzing. In 2017 IEEE Symposium on Security and Privacy (SP). 579–594. Google ScholarGoogle ScholarCross RefCross Ref
  49. Mingzhe Wang, Jie Liang, Yuanliang Chen, Yu Jiang, Xun Jiao, Han Liu, Xibin Zhao, and Jiaguang Sun. 2018. SAFL: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings. 61–64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Valentin Wüstholz and Maria Christakis. 2020. Targeted greybox fuzzing with static lookahead analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 789–800. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing new operating primitives to improve fuzzing performance. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2313–2328. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. $QSYM$: A practical concolic execution engine tailored for hybrid fuzzing. In 27th USENIX Security Symposium (USENIX Security 18). 745–761. Google ScholarGoogle Scholar
  53. Michał Zalewski. 2014. American fuzzy lop. https://lcamtuf.coredump.cx/afl/ Google ScholarGoogle Scholar
  54. Chijin Zhou, Mingzhe Wang, Jie Liang, Zhe Liu, and Yu Jiang. 2020. Zeror: Speed up fuzzing with coverage-sensitive tracing and scheduling. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 858–870. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Peiyuan Zong, Tao Lv, Dawei Wang, Zizhuang Deng, Ruigang Liang, and Kai Chen. 2020. Fuzzguard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning. In 29th USENIX Security Symposium (USENIX Security 20). Google ScholarGoogle Scholar

Index Terms

  1. Accelerating Fuzzing through Prefix-Guided Execution

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!