research-article
Open Access

A Differential Approach to Undefined Behavior Detection

Published: 11 March 2015

Abstract

This article studies undefined behavior arising in systems programming languages such as C/C++. Undefined behavior bugs lead to unpredictable and subtle system behavior, and their effects can be further amplified by compiler optimizations. Undefined behavior bugs are present in many systems, including the Linux kernel and the Postgres database. The consequences range from incorrect functionality to missing security checks. This article proposes a formal and practical approach that detects undefined behavior bugs by identifying “unstable code” in terms of optimizations that leverage undefined behavior. Using this approach, we introduce a new static checker called Stack that precisely identifies undefined behavior bugs. Applying Stack to widely used systems has uncovered 161 new bugs that have been confirmed and fixed by developers.
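
To make the idea of a check that an optimizer may discard more concrete, the following is a minimal sketch in C. It illustrates a well-known pattern and is not taken from the article. The function names `overflows_unstable` and `overflows_stable` are hypothetical. The first function relies on signed integer overflow, which is undefined in C, so a compiler is allowed to assume the overflow never happens and fold the comparison to false. The second expresses the same intent without ever overflowing.

```c
#include <limits.h>
#include <stdbool.h>

/* Unstable (hypothetical example): if x + 100 overflows, the behavior is
 * undefined, so an optimizer may assume it never overflows and reduce
 * this check to "return false". */
bool overflows_unstable(int x) {
    return x + 100 < x;
}

/* Stable (hypothetical example): INT_MAX - 100 cannot overflow, so the
 * comparison is well defined for every int and is not subject to the
 * same optimization. */
bool overflows_stable(int x) {
    return x > INT_MAX - 100;
}
```

Whether a particular compiler actually removes the first check depends on the compiler, its version, and the optimization level. The second form does not rely on undefined behavior and so behaves the same in every case.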



      • Published in

        ACM Transactions on Computer Systems, Volume 33, Issue 1
        March 2015
        114 pages
        ISSN: 0734-2071
        EISSN: 1557-7333
        DOI: 10.1145/2745713

        Copyright © 2015 Owner/Author

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 11 March 2015
        • Accepted: 1 October 2014
        • Received: 1 September 2014


        Qualifiers

        • research-article
        • Research
        • Refereed
