Abstract
Compiler testing is important and challenging. Equivalence Modulo Inputs (EMI) is a recent, promising approach to compiler validation. It mutates the statements of an existing program that are unexecuted under some inputs to produce new test programs that are equivalent w.r.t. those inputs. Orion is a simple realization of EMI that only randomly deletes unexecuted statements. Despite its success in finding many bugs in production compilers, Orion’s effectiveness is still limited by its simple, blind mutation strategy. To realize EMI more effectively, this paper introduces a guided, advanced mutation strategy based on Bayesian optimization. Our goal is to generate diverse programs that exercise compilers more thoroughly. We achieve this with two techniques: (1) support for both code deletions and insertions in the unexecuted regions, leading to a much larger test program space; and (2) an objective function that promotes control-flow-diverse programs, used to guide Markov Chain Monte Carlo (MCMC) optimization in exploring the search space. Our technique helps discover deep bugs that require elaborate mutations. Our realization, Athena, targets C compilers. In 19 months, Athena has found 72 new bugs in GCC and LLVM, many of them deep and important. Developers have confirmed all 72 bugs and fixed 68 of them.
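The guided search described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not Athena's implementation: a variant is modelled as a set of statement labels in the unexecuted region, `pool` is a hypothetical list of statements available for insertion, `score` is a stand-in objective (Jaccard distance from the seed) rather than the paper's control-flow-based objective, and the acceptance rule treats the proposal as symmetric, which is a simplification. Coverage profiling, compilation, and output comparison are left out entirely.

```python
import math
import random


def propose(variant, pool, rng):
    """Return a copy of `variant` with one statement deleted or one inserted."""
    new = set(variant)
    if new and rng.random() < 0.5:
        # Deletion: drop one statement from the unexecuted region.
        new.remove(rng.choice(sorted(new)))
    else:
        # Insertion: add one statement from the (hypothetical) donor pool.
        # This is a no-op when the chosen statement is already present.
        new.add(rng.choice(pool))
    return frozenset(new)


def score(variant, seed):
    """Stand-in objective: Jaccard distance between variant and seed, in [0, 1]."""
    union = variant | seed
    if not union:
        return 0.0
    return 1.0 - len(variant & seed) / len(union)


def mcmc_variants(seed, pool, steps, sigma=5.0, rng=None):
    """Run a Metropolis-style chain and return every visited variant."""
    rng = rng or random.Random(0)
    current = frozenset(seed)
    chain = [current]
    for _ in range(steps):
        candidate = propose(current, pool, rng)
        # Target density is taken to be proportional to exp(sigma * score).
        delta = score(candidate, seed) - score(current, seed)
        accept = min(1.0, math.exp(sigma * delta))
        if rng.random() < accept:
            current = candidate
        chain.append(current)
    return chain


if __name__ == "__main__":
    seed = frozenset({"s1", "s2", "s3"})
    chain = mcmc_variants(seed, ["s4", "s5", "s6"], steps=20)
    for variant in chain[-3:]:
        print(sorted(variant), round(score(variant, seed), 2))
```

In a real EMI workflow each visited variant would be compiled and run on the profiled inputs, and its output compared against the seed program's output. The sketch only shows how an objective can bias which mutations are kept, with worse-scoring candidates still accepted occasionally so the chain does not get stuck.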
Contents
- Introduction
- Illustrative Examples
- LLVM Crashing Bug 18615
- GCC Miscompilation Bug 61383
- Background
- Equivalence Modulo Inputs
- Markov Chain Monte Carlo
- MCMC Bug Finding
- Objective Function
- MCMC Sampling
- Variant Proposal
- Athena
- Extracting Statement Candidates
- Proposing Variants
- Discussion
- Evaluation
- Testing Setup
- Quantitative Results
- Effectiveness of MCMC Bug Finding
- Coverage Improvement
- Assorted Sample Bugs Found by Athena
- Related Work
- Conclusion
- Acknowledgments
Finding deep compiler bugs via guided stochastic program mutation