SAFECode: enforcing alias analysis for weakly typed languages

Published: 11 June 2006

Abstract

Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors due to dangling pointer references, uninitialized pointers, and array bounds overflow. We describe a compilation strategy for standard C programs that guarantees that aggressive interprocedural pointer analysis (or less precise ones), a call graph, and type information for a subset of memory, are never invalidated by any possible memory errors. We formalize our approach as a new type system with the necessary run-time checks in operational semantics and prove the correctness of our approach for a subset of C. Our semantics provide the foundation for other sophisticated static analyses to be applied to C programs with a guarantee of soundness. Our work builds on a previously published transformation called Automatic Pool Allocation to ensure that hard-to-detect memory errors (dangling pointer references and certain array bounds errors) cannot invalidate the call graph, points-to information or type information. The key insight behind our approach is that pool allocation can be used to create a run-time partitioning of memory that matches the compile-time memory partitioning in a points-to graph, and efficient checks can be used to isolate the run-time partitions. Furthermore, we show that the sound analysis information enables static checking techniques that eliminate many run-time checks. Our approach requires no source code changes, allows memory to be managed explicitly, and does not use meta-data on pointers or individual tag bits for memory. Using several benchmarks and system codes, we show experimentally that the run-time overheads are low (less than 10% in nearly all cases and 30% in the worst case we have seen). We also show the effectiveness of static analyses in eliminating run-time checks.
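
The partitioning idea in the abstract can be illustrated with a small C sketch. It is not SAFECode's implementation: the `Pool` type, the fixed `POOL_CELLS` capacity, and the functions `pool_init`, `pool_alloc`, `pool_contains`, and `checked_store` are hypothetical names chosen for this example, and each pool is assumed to hold only `int` cells to keep the code short. Each pool stands in for one node of a points-to graph, and the run-time check rejects any store whose target lies outside the pool the compiler would have associated with that pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; a real pool allocator would grow on demand. */
#define POOL_CELLS 16

/* One pool models one compile-time memory partition (assumption: ints only). */
typedef struct {
    int cells[POOL_CELLS];
    size_t used; /* number of cells handed out so far */
} Pool;

static void pool_init(Pool *p) {
    assert(p != NULL);
    p->used = 0;
}

/* Bump-allocate `count` cells from the pool; returns NULL when it is full. */
static int *pool_alloc(Pool *p, size_t count) {
    assert(p != NULL);
    if (count == 0 || count > POOL_CELLS - p->used) {
        return NULL;
    }
    int *out = p->cells + p->used;
    p->used += count;
    return out;
}

/* Run-time check: does ptr address an allocated, aligned cell of this pool? */
static bool pool_contains(const Pool *p, const int *ptr) {
    uintptr_t lo = (uintptr_t)p->cells;
    uintptr_t hi = (uintptr_t)(p->cells + p->used);
    uintptr_t a = (uintptr_t)ptr;
    return a >= lo && a < hi && (a - lo) % sizeof(int) == 0;
}

/* Checked store: performs the write only if it stays inside the given pool. */
static bool checked_store(Pool *p, int *dst, int value) {
    if (!pool_contains(p, dst)) {
        return false; /* the access would cross a partition boundary */
    }
    *dst = value;
    return true;
}
```

With two pools `a` and `b`, a store through a pointer allocated from `a` succeeds under `checked_store(&a, ...)`, while a pointer obtained from `b`, or one that has run past the allocated cells of `a`, is rejected. The sketch only shows how a check can isolate partitions; it says nothing about dangling pointers within a pool or about which checks a static analysis could remove.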
