Abstract
A program can be viewed as a syntactic structure P (the syntactic skeleton) parameterized by a collection of identifiers V (the variable names). This paper introduces the skeletal program enumeration (SPE) problem: given a syntactic skeleton P and a set of variables V, enumerate a set of programs 𝒫 exhibiting all possible variable usage patterns within P. It proposes an effective realization of SPE for systematic, rigorous compiler testing by leveraging three important observations: (1) programs with different variable usage patterns exhibit diverse control and data dependences and help exercise different compiler optimizations; (2) most real compiler bugs were revealed by small tests (i.e., small-sized P), and this “small-scope” observation makes SPE practical for compiler validation; and (3) SPE is exhaustive w.r.t. a given syntactic skeleton and variable set, offering a level of guarantee absent from all existing compiler testing techniques.
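To make the problem statement concrete, the following is a minimal sketch of naive enumeration, not the paper's implementation. It assumes a hypothetical skeleton representation in which each variable occurrence is a `{}` hole in a string, and it ignores scoping and types; the names `naive_fill`, `skeleton`, and `programs` are illustrative only.

```python
from itertools import product


def naive_fill(skeleton, variables):
    """Yield every program obtained by filling each hole with any variable.

    Assumption: holes are written as "{}" and the skeleton contains no
    other braces, so str.format can be used to fill them.
    """
    holes = skeleton.count("{}")
    for choice in product(variables, repeat=holes):
        yield skeleton.format(*choice)


# A three-hole skeleton over two variables gives 2**3 = 8 programs.
skeleton = "{} = {} + {};"
programs = list(naive_fill(skeleton, ["a", "b"]))
```

With n holes and k variables this produces k^n programs, many of which differ only by a renaming of variables (for example `a = a + b;` and `b = b + a;`).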
The key challenge of SPE is eliminating the enormous number of programs that are equivalent under α-conversion. Our main technical contribution is a novel algorithm for computing the canonical (and smallest) set of all non-α-equivalent programs. To demonstrate its practical utility, we applied the SPE technique to test C/C++ compilers using syntactic skeletons derived from their own regression test suites. The evaluation results are extremely encouraging: in less than six months, our approach led to 217 confirmed GCC/Clang bug reports, 119 of which have already been fixed, and the majority were long latent despite extensive prior testing efforts. Our SPE algorithm also reduces the number of enumerated programs by six orders of magnitude. Moreover, in three weeks, our technique found 29 CompCert crashing bugs and 42 bugs in two Scala optimizing compilers. These results demonstrate the generality of our SPE technique and further illustrate its effectiveness.
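The idea of keeping one representative per renaming class can be illustrated with restricted growth strings, a standard combinatorial encoding of set partitions. This is a simplified sketch under strong assumptions (a single scope, untyped and interchangeable variables, `{}` holes in a string skeleton); it is not the algorithm from the paper, and `canonical_patterns` and `fill` are hypothetical helper names.

```python
def canonical_patterns(holes, max_vars):
    """Yield restricted growth strings of length `holes`.

    Each string assigns a variable index to every hole such that a new
    index appears only after all smaller indices have been used. This
    picks exactly one representative per class of fillings that differ
    only by a renaming of variables.
    """
    def extend(prefix, used):
        if len(prefix) == holes:
            yield tuple(prefix)
            return
        # Reuse any index seen so far, or introduce the next fresh one
        # if fewer than max_vars variables are in use.
        for index in range(min(used + 1, max_vars)):
            yield from extend(prefix + [index], max(used, index + 1))

    yield from extend([], 0)


def fill(skeleton, pattern, variables):
    """Instantiate a "{}"-hole skeleton with the variables chosen by pattern."""
    return skeleton.format(*(variables[i] for i in pattern))


patterns = list(canonical_patterns(3, 2))
canonical_programs = [fill("{} = {} + {};", p, ["a", "b"]) for p in patterns]
```

For three holes and two variables this yields 4 patterns where filling every hole independently would yield 8. The gap widens quickly as the number of holes and variables grows.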
Skeletal program enumeration for rigorous compiler testing
PLDI 2017: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation