Compiler fuzzing: how much does it matter?

Published: 10 October 2019

Abstract

Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology in which the impact of a bug on a compiled application is evaluated based on (1) whether the bug appears to be triggered during compilation; (2) the extent to which the generated assembly code changes syntactically when the bug is triggered; and (3) whether such changes cause regression test suite failures, or whether we can manually find application inputs that trigger execution divergence due to such changes. The study covers the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical, now-fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or as a by-product of formal verification efforts. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for at least one package; in such cases, only a very small part of the binary is typically affected, yet the changes cause two failures when running the test suites of all the impacted packages. User-reported and formal-verification bugs do not exhibit a higher impact, with a lower rate of triggered bugs and one test failure. Our manual analysis of a selection of the syntactic changes caused by some of our bugs (fuzzer-found and not) in package assembly code shows that these changes either have no semantic impact or would require very specific runtime circumstances to trigger execution divergence.
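Step (2) of the methodology above, detecting whether a bug propagates syntactically into the generated code, can be approximated by diffing per-function assembly listings produced by a fixed and a buggy compiler. The sketch below is illustrative only and is not the paper's actual tool chain: the function names, the label-matching heuristic, and the toy inputs are all assumptions made for the example.

```python
import re

def split_functions(asm: str) -> dict:
    """Split an assembly listing into {label: body} chunks.
    Simplifying assumption: function labels start at column 0
    and end with ':' (as in typical compiler -S output)."""
    funcs, current, body = {}, None, []
    for line in asm.splitlines():
        m = re.match(r'^([A-Za-z_.$][\w.$]*):', line)
        if m:
            if current is not None:
                funcs[current] = "\n".join(body)
            current, body = m.group(1), []
        elif current is not None:
            body.append(line.strip())
    if current is not None:
        funcs[current] = "\n".join(body)
    return funcs

def changed_functions(asm_fixed: str, asm_buggy: str) -> list:
    """Return the labels of functions whose bodies differ
    between the two listings (added/removed functions count too)."""
    a, b = split_functions(asm_fixed), split_functions(asm_buggy)
    return sorted(f for f in a.keys() | b.keys() if a.get(f) != b.get(f))

# Toy listings: only 'bar' is compiled differently by the buggy compiler.
fixed = "foo:\n  mov eax, 1\n  ret\nbar:\n  xor eax, eax\n  ret\n"
buggy = "foo:\n  mov eax, 1\n  ret\nbar:\n  mov eax, 0\n  ret\n"
print(changed_functions(fixed, buggy))  # ['bar']
```

A non-empty result only shows that the bug altered the emitted code for some function; as the study's steps (3) and the manual analysis emphasise, such a syntactic change may still be semantically harmless.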


Supplemental Material

a155-marcozzi: Presentation at OOPSLA '19

