Automatic generation of library bindings using static analysis

Published: 15 June 2009

Abstract

High-level languages are growing in popularity. However, decades of C software development have produced large libraries of fast, time-tested, meritorious code that are impractical to recreate from scratch. Cross-language bindings can expose low-level C code to high-level languages. Unfortunately, writing bindings by hand is tedious and error-prone, while mainstream binding generators require extensive manual annotation or fail to offer the language features that users of modern languages have come to expect.

We present an improved binding-generation strategy based on static analysis of unannotated library source code. We characterize three high-level idioms that are not uniquely expressible in C's low-level type system: array parameters, resource managers, and multiple return values. We describe a suite of interprocedural analyses that recover this high-level information, and we show how the results can be used in a binding generator for the Python programming language. In experiments with four large C libraries, we find that our approach avoids the mistakes characteristic of hand-written bindings while offering a level of Python integration unmatched by prior automated approaches. Among the thousands of functions in the public interfaces of these libraries, roughly 40% exhibit the behaviors detected by our static analyses.

