research-article

Finding deep compiler bugs via guided stochastic program mutation

Published: 23 October 2015

Abstract

Compiler testing is important and challenging. Equivalence Modulo Inputs (EMI) is a recent, promising approach to compiler validation. It mutates the statements of an existing program that are unexecuted under some inputs, producing new test programs that are equivalent to the original w.r.t. these inputs. Orion is a simple realization of EMI that only randomly deletes unexecuted statements. Despite its success in finding many bugs in production compilers, Orion’s effectiveness is still limited by its simple, blind mutation strategy. To realize EMI more effectively, this paper introduces a guided, advanced mutation strategy based on Bayesian optimization. Our goal is to generate diverse programs that exercise compilers more thoroughly. We achieve this with two techniques: (1) support for both code deletions and insertions in the unexecuted regions, leading to a much larger test program space; and (2) an objective function that promotes control-flow-diverse programs, used to guide Markov Chain Monte Carlo (MCMC) optimization in exploring the search space. Our technique helps discover deep bugs that require elaborate mutations. Our realization, Athena, targets C compilers. In 19 months, Athena has found 72 new bugs in GCC and LLVM, many of them deep and important. Developers have confirmed all 72 bugs and fixed 68 of them.
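
The abstract does not spell out the objective function or the acceptance rule, so the following is only a minimal sketch of how MCMC-guided deletion and insertion could look in general, not Athena's implementation. It models the unexecuted region of a program as a list of statement strings. The names `propose`, `score`, and `mcmc_sample`, the parameter `sigma`, and the toy objective (counting control-flow statements) are all hypothetical stand-ins, and the acceptance test is a simplified Metropolis-style rule that ignores proposal asymmetry.

```python
import math
import random
import re

# Hypothetical stand-in for "control-flow diversity": count statements
# that begin with a control-flow keyword. The paper's real objective
# is not given in the abstract.
CONTROL_FLOW = re.compile(r"\s*(if|while|for)\b")


def score(variant):
    """Toy objective: number of control-flow statements in the variant."""
    return sum(1 for stmt in variant if CONTROL_FLOW.match(stmt))


def propose(variant, pool, rng):
    """Return a copy of `variant` with one statement deleted or inserted.

    Only the unexecuted region is modelled, so every variant is assumed
    to stay equivalent to the seed program under the profiled inputs.
    """
    new = list(variant)
    if new and rng.random() < 0.5:
        del new[rng.randrange(len(new))]          # deletion
    else:
        position = rng.randrange(len(new) + 1)    # insertion
        new.insert(position, rng.choice(pool))
    return new


def mcmc_sample(seed_variant, pool, steps, sigma=1.0, seed=0):
    """Run a simplified Metropolis-style chain over program variants."""
    rng = random.Random(seed)
    current = list(seed_variant)
    chain = [current]
    for _ in range(steps):
        candidate = propose(current, pool, rng)
        delta = score(candidate) - score(current)
        # Always accept improvements; accept worse candidates with
        # probability exp(sigma * delta), where delta is negative.
        accept = 1.0 if delta >= 0 else math.exp(sigma * delta)
        if rng.random() < accept:
            current = candidate
        chain.append(current)
    return chain


if __name__ == "__main__":
    pool = ["if (a) b = 1;", "while (c) c--;", "x = 0;"]
    chain = mcmc_sample(["x = 0;"], pool, steps=20, seed=1)
    print(len(chain), "variants; final score:", score(chain[-1]))
```

In a real tool each sampled variant would be compiled and run on the profiled inputs, and any difference from the seed program's output would be reported as a potential compiler bug. Occasionally accepting lower-scoring candidates is what lets the chain move away from a local optimum.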



Published in

ACM SIGPLAN Notices, Volume 50, Issue 10 (OOPSLA '15)
October 2015
953 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858965
Editor: Andy Gill

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN: 9781450336895
DOI: 10.1145/2814270

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
