Abstract
We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce a natural collection of the original program's EMI variants, which can help differentially test any compiler and specifically target the difficult-to-find miscompilations.
To create a practical implementation of EMI for validating C compilers, we profile a program's test executions and stochastically prune its unexecuted code. Our extensive testing in eleven months has led to 147 confirmed, unique bug reports for GCC and LLVM alone. The majority of those bugs are miscompilations, and more than 100 have already been fixed.
Beyond testing compilers, EMI can be adapted to validate program transformation and analysis systems in general. This work opens up this exciting, new direction.
- ACE. SuperTest compiler test and validation suite. http://www.ace.nl/compiler/supertest.html.Google Scholar
- A. Balestrat. CCG: A random C code generator. https://github.com/Merkil/ccg/.Google Scholar
- S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 394--403, 2006. Google Scholar
Digital Library
- Y. Chen, A. Groce, C. Zhang, W.-K. Wong, X. Fern, E. Eide, and J. Regehr. Taming compiler fuzzers. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 197--208, 2013. Google Scholar
Digital Library
- P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In A. Goodloe and S. Person, editors, NASA Formal Methods, volume 7226 of Lecture Notes in Computer Science, pages 120--125. Springer Berlin Heidelberg, 2012. Google Scholar
Digital Library
- C. Ellison and G. Rosu. An executable formal semantics of C with applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 533--544, 2012. Google Scholar
Digital Library
- GNU Compiler Collection. gcov. http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.Google Scholar
- A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr. Swarm testing. In International Symposium on Software Testing and Analysis (ISSTA), pages 78--88, 2012. Google Scholar
Digital Library
- T. Hoare. The verifying compiler: A grand challenge for computing research. In Modular Programming Languages, pages 25--35. Springer, 2003.Google Scholar
Cross Ref
- C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium, 2012. Google Scholar
Digital Library
- R. Joshi, G. Nelson, and K. H. Randall. Denali: A goal-directed superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 304--314, 2002. Google Scholar
Digital Library
- X. Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 42--54. ACM Press, 2006. Google Scholar
Digital Library
- X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363--446, 2009. Google Scholar
Digital Library
- G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 237--248, 2010. Google Scholar
Digital Library
- H. Massalin. Superoptimizer -- A look at the smallest program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 122--126, 1987. Google Scholar
Digital Library
- W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10(1):100--107, 1998.Google Scholar
- S. McPeak, D. S. Wilkerson, and S. Goldsmith. Berkeley Delta. http://delta.tigris.org/.Google Scholar
- E. Nagai, H. Awazu, N. Ishiura, and N. Takeda. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012), pages 48--53, 2012.Google Scholar
- E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up size and number of expressions in random testing of arithmetic optimization of C compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013), pages 88--93, 2013.Google Scholar
- G. C. Necula. Translation validation for an optimizing compiler. In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83--94, 2000. Google Scholar
Digital Library
- Plum Hall, Inc. The Plum Hall Validation Suite for C. http://www.plumhall.com/stec.html.Google Scholar
- A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), pages 151--166, London, UK, UK, 1998. Springer-Verlag. Google Scholar
Digital Library
- J. Regehr. Embedded in academia. http://blog.regehr.org/.Google Scholar
- J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Test-case reduction for C compiler bugs. In Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 335--346, 2012. Google Scholar
Digital Library
- R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In Proceedings of the 36th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, pages 264--276, 2009. Google Scholar
Digital Library
- The Clang Team. Clang 3.4 documentation: LibTooling. http://clang.llvm.org/docs/LibTooling.html.Google Scholar
- J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 295--305, 2011. Google Scholar
Digital Library
- X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 283--294, 2011. Google Scholar
Digital Library
- C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In ICSE Workshop on Automation of Software Test (AST), pages 36--43, 2009.Google Scholar
- J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 175--186, 2013. Google Scholar
Digital Library
Index Terms
Compiler validation via equivalence modulo inputs
Recommendations
Finding compiler bugs via live code mutation
OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and ApplicationsValidating optimizing compilers is challenging because it is hard to generate valid test programs (i.e., those that do not expose any undefined behavior). Equivalence Modulo Inputs (EMI) is an effective, promising methodology to tackle this problem. ...
Finding deep compiler bugs via guided stochastic program mutation
OOPSLA '15Compiler testing is important and challenging. Equivalence Modulo Inputs (EMI) is a recent promising approach for compiler validation. It is based on mutating the unexecuted statements of an existing program under some inputs to produce new equivalent ...
Finding and understanding bugs in C compilers
PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and ImplementationCompilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to ...







Comments