Abstract
This article studies undefined behavior in systems programming languages such as C and C++. Undefined behavior bugs lead to unpredictable and subtle system behavior, and compiler optimizations can amplify their effects. Such bugs are present in many systems, including the Linux kernel and the Postgres database, and their consequences range from incorrect functionality to missing security checks. This article proposes a formal and practical approach that finds undefined behavior bugs by detecting “unstable code”, defined in terms of optimizations that leverage undefined behavior. Using this approach, we introduce a new static checker called Stack that precisely identifies undefined behavior bugs. Applying Stack to widely used systems has uncovered 161 new bugs that have been confirmed and fixed by developers.
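As a rough illustration of the kind of code at issue, the sketch below contrasts an overflow check that itself relies on undefined behavior with one that does not. It assumes that "unstable code" can be read as a check an optimizer is entitled to delete because the check can only succeed after undefined behavior has already occurred. The function names are hypothetical and are not taken from the article or from Stack.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example of an unstable check. Signed overflow is
 * undefined in C, so a compiler may assume x + 100 never wraps and
 * fold the whole comparison to false, silently removing the check. */
bool add100_overflows_unstable(int x)
{
    return x + 100 < x;
}

/* A stable rewrite: the comparison is done against INT_MAX before
 * any addition happens, so no undefined behavior is involved and the
 * optimizer has no license to discard it. */
bool add100_overflows_stable(int x)
{
    return x > INT_MAX - 100;
}
```

For inputs where no overflow occurs, both functions agree. They can differ only on inputs where the first one has already invoked undefined behavior, which is why such a check may behave differently across compilers and optimization levels.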
A Differential Approach to Undefined Behavior Detection