Abstract
Despite substantial progress in compiler testing, research has mainly focused on detecting compiler crashes and subtle miscompilations caused by bugs in the implementation of compiler optimizations. Surprisingly, this growing body of work neglects other compiler components, most notably the front-end. In statically-typed programming languages with rich and expressive type systems and modern features, such as type inference or a mix of object-oriented and functional programming features, the process of static typing in compiler front-ends is complicated by a high density of bugs. Such bugs can lead to the acceptance of incorrect programs (breaking code portability or the type system's soundness), the rejection of correct (e.g., well-typed) programs, and the reporting of misleading errors and warnings.
We conduct what is, to the best of our knowledge, the first empirical study for understanding and characterizing typing-related compiler bugs. To do so, we manually study 320 typing-related bugs (along with their fixes and test cases) that are randomly sampled from four mainstream JVM languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several aspects, including its symptom, its root cause, the size of its fix, and the characteristics of the bug-revealing test cases. Some representative observations indicate that: (1) more than half of the typing-related bugs manifest as unexpected compile-time errors: the buggy compiler wrongly rejects semantically correct programs, (2) the majority of typing-related bugs lie in the implementations of the underlying type systems and in other core components related to operations on types, (3) parametric polymorphism is the most pervasive feature in the corresponding test cases, and (4) one-third of typing-related bugs are triggered by non-compilable programs.
We believe that our study opens up a new research direction by driving future researchers to build appropriate methods and techniques for a more holistic testing of compilers.
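To make the symptom categories above concrete, the following is a minimal Java sketch of the kind of program that exercises parametric polymorphism and type inference. It is not taken from the studied bug reports; the class name `TypingExample` and the methods `mapAll` and `max` are hypothetical and chosen only for illustration. A correct compiler must accept the program as written, so rejecting it would be an unexpected compile-time error. The commented-out call is ill-typed, so accepting it would be the opposite symptom.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical example: names are illustrative and not drawn from the study.
public class TypingExample {

    // Generic method: T and R have to be inferred at every call site.
    public static <T, R> List<R> mapAll(List<T> items,
                                        Function<? super T, ? extends R> f) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(f.apply(item));
        }
        return out;
    }

    // Bounded type parameter: T must be comparable to itself.
    public static <T extends Comparable<T>> T max(List<T> items, T fallback) {
        T best = fallback;
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Well-typed: T = String and R = Integer are inferred from the
        // arguments and the target type. A compiler that rejects this line
        // would show an "unexpected compile-time error".
        List<Integer> lengths = mapAll(List.of("a", "bb", "ccc"), String::length);
        System.out.println(max(lengths, 0));

        // Ill-typed: the lambda yields an int, but the target type demands
        // String. A compiler that accepts this line would be wrongly
        // accepting an incorrect program.
        // List<String> bad = mapAll(List.of(1, 2), n -> n + 1);
    }
}
```

Test cases of this shape, both the compilable call and the non-compilable one, correspond to the two kinds of bug-revealing programs the abstract distinguishes.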
Index Terms
Well-typed programs can go wrong: a study of typing-related bugs in JVM compilers