ABSTRACT
Automatic object inlining [19, 20] transforms heap data structures by fusing parent and child objects together. It can improve runtime by reducing object allocation and pointer dereference costs. We report continuing work studying object inlining optimizations. In particular, we present a new semantic derivation of the correctness conditions for object inlining, and a program analysis that extends our previous work. We also present an object inlining transformation, focusing on a new algorithm that optimizes class field layout to minimize code expansion. Finally, we detail a fuller evaluation on eleven programs and libraries (including Xpdf, a 25,000-line Portable Document Format (PDF) file browser) that uses hardware measures of the impact on the memory system.
We show that our analysis scales effectively to large programs, finding many inlinable fields (45 in Xpdf) at acceptable cost. On some programs it finds nearly all fields for which object inlining is correct, and it averages 40% of such fields across our benchmarks. We implement our analyses in an advanced analysis infrastructure and show that, compared to traditional 1-CFA, that infrastructure provides better results at lower and more scalable cost. Across all programs, the analysis identified about 30% of objects as inlinable on average. Our transformation increases code size by only 20% while inlining this 30% of fields. Inlining these objects eliminated on average 28% of field reads, 58% of object creations, and 12% of all loads. Further, the optimized programs have significantly improved memory reference behavior, producing 25% fewer L1 data cache misses and 25% fewer read stalls. On average, runtime improved by 14%.
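As a rough illustration of what fusing parent and child objects means, the following C++ sketch contrasts a parent that reaches its children through pointers with a hand-inlined version of the same class. The names `Point`, `RectBefore`, and `RectAfter` are hypothetical and do not come from the paper or its benchmarks. The sketch shows only the shape of the resulting layout, not the analysis or the field layout algorithm, and it assumes the children are never shared outside the parent.

```cpp
// Hypothetical illustration only: Point, RectBefore, and RectAfter are
// made-up names, not classes from the paper or its benchmarks.
struct Point {
    double x;
    double y;
};

// Before inlining: each child is a separate heap object, so building a
// rectangle costs two extra allocations and every access to a corner
// goes through a pointer dereference.
struct RectBefore {
    Point* lower_left;
    Point* upper_right;

    RectBefore(double x0, double y0, double x1, double y1)
        : lower_left(new Point{x0, y0}), upper_right(new Point{x1, y1}) {}
    ~RectBefore() {
        delete lower_left;
        delete upper_right;
    }
    // Copying is disabled so the sketch cannot double-delete a child.
    RectBefore(const RectBefore&) = delete;
    RectBefore& operator=(const RectBefore&) = delete;

    double width() const { return upper_right->x - lower_left->x; }
};

// After inlining (done by hand here): the children's fields live directly
// in the parent, so no child allocation or pointer dereference remains.
// This is assumed to be valid only because no other reference to the
// children exists in this sketch.
struct RectAfter {
    double lower_left_x, lower_left_y;
    double upper_right_x, upper_right_y;

    RectAfter(double x0, double y0, double x1, double y1)
        : lower_left_x(x0), lower_left_y(y0),
          upper_right_x(x1), upper_right_y(y1) {}

    double width() const { return upper_right_x - lower_left_x; }
};
```

Both versions compute the same `width()`, but the inlined one replaces two child allocations and two dereferences with direct field accesses inside a single object.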
References
- 1. O. Agesen, J. Palsberg, and M. Schwartzbach. Type inference of SELF: Analysis of objects with dynamic and multiple inheritance. In Proceedings of ECOOP '93, 1993.
- 2. J. M. Anderson, S. P. Amarasinghe, and M. S. Lam. Data and computation transformations for multiprocessors. In Proceedings of Fifth Symposium on Principles and Practice of Parallel Programming, 1995.
- 3. A. Ayers, R. Gottlieb, and R. Schooler. Aggressive inlining. In Proceedings of the 1997 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 134-145, Las Vegas, Nevada, June 1997.
- 4. H. Baker. The Treadmill: Real-time garbage collection without motion sickness. ACM SIGPLAN Notices, 27(3):66-70, March 1992.
- 5. T. Ball. What's in a region? Computing control dependence regions in linear time and space. Technical report, University of Wisconsin-Madison, 1992.
- 6. A. Black, N. Hutchinson, E. Jul, and H. Levy. Object structure in the Emerald system. In Proceedings of OOPSLA '86, pages 78-86. ACM, September 1986.
- 7. B. Blanchet. Escape analysis: Correctness proof, implementation and experimental results. In Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 25-37, San Diego, CA, January 1998.
- 8. Z. Budimlic and K. Kennedy. Optimizing Java: Theory and practice. Concurrency: Practice and Experience, 9(6), June 1997.
- 9. C. Chambers and D. Ungar. Iterative type analysis and extended message splitting. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 150-60, 1990.
- 10. A. Chien, J. Dolby, B. Ganguly, V. Karamcheti, and X. Zhang. Supporting high level programming with high performance: The Illinois Concert system. In Proceedings of the Second International Workshop on High-level Parallel Programming Models and Supportive Environments, pages 15-24, April 1997.
- 11. A. A. Chien, U. S. Reddy, J. Plevyak, and J. Dolby. ICC++ - a C++ dialect for high-performance parallel computation. In Proceedings of the 2nd International Symposium on Object Technologies for Advanced Software, pages 76-95, March 1996.
- 12. T. M. Chilimbi, B. Davidson, and J. R. Larus. Cache-conscious structure definition. In Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, May 1999.
- 13. T. M. Chilimbi, M. D. Hill, and J. R. Larus. Cache-conscious structure layout. In Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, May 1999.
- 14. J.-D. Choi, M. Burke, and P. Carini. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In Twentieth Symposium on Principles of Programming Languages, pages 232-245. ACM SIGPLAN, 1993.
- 15. R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and F. Zadeck. An efficient method of computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, 13(4):451-490, October 1991.
- 16. J. Dean, C. Chambers, and D. Grove. Selective specialization for object-oriented languages. In Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 93-102, La Jolla, CA, June 1995.
- 17. A. Deutsch. Interprocedural may-alias analysis for pointers: Beyond k-limiting. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 230-241, 1994.
- 18. A. Diwan, K. S. McKinley, and J. E. B. Moss. Type-based alias analysis. In Proceedings of the 1998 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 106-117, Montreal, Canada, June 1998.
- 19. J. Dolby. Automatic inline allocation of objects. In Proceedings of the 1997 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 7-17, Las Vegas, Nevada, June 1997.
- 20. J. Dolby and A. A. Chien. An evaluation of automatic object inline allocation techniques. In Proceedings of the Thirteenth Annual Conference on Object-Oriented Programming Languages, Systems and Applications (OOPSLA), Vancouver, British Columbia, October 1998. Available at http://www-csag.cs.uiuc.edu/papers/oopsla-98.ps.
- 21. J. R. Ellis and D. L. Detlefs. Safe, efficient garbage collection for C++. Technical report, Xerox Palo Alto Research Center, June 1993.
- 22. M. A. Ellis and B. Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley, 1990.
- 23. R. Ghiya and L. J. Hendren. Connection analysis: A practical interprocedural heap analysis for C. In Proceedings of the Workshop for Languages and Compilers for Parallel Computing, 1995.
- 24. K. E. Gorlen, S. M. Orlow, and P. S. Plexico. Data Abstraction and Object-Oriented Programming in C++. John Wiley and Sons, 1991.
- 25. C. Hall, S. L. Peyton-Jones, and P. M. Sansom. Functional Programming, Glasgow 1995, chapter Unboxing Using Specialization. Workshops in Computing Science. Springer-Verlag, 1995.
- 26. M. W. Hall. Managing Interprocedural Optimization. PhD thesis, Rice University, 1991.
- 27. G. Hamilton, editor. JavaBeans 1.01 Specification. Sun Microsystems, Mountain View, CA, 1997. Published online at http://www.javasoft.com/beans/docs/beans.101.ps.
- 28. U. Hölzle, C. Chambers, and D. Ungar. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. In ECOOP '91 Conference Proceedings. Springer-Verlag, 1991. Lecture Notes in Computer Science 512.
- 29. U. Hölzle and D. Ungar. Optimizing dynamically-dispatched calls with run-time type feedback. In Proceedings of the 1994 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 326-336, June 1994.
- 30. W. Landi and B. Ryder. A safe approximate algorithm for interprocedural pointer aliasing. In ACM SIGPLAN Symposium on Programming Language Design and Implementation, pages 235-249, 1992.
- 31. J. Palsberg and M. Schwartzbach. Object-oriented type inference. In Proceedings of OOPSLA '91, pages 146-61, 1991.
- 32. H. D. Pande and B. G. Ryder. Static type determination and aliasing in C++. Technical Report LCSR-TR-250, Laboratory of Computer Science Research, July 1995.
- 33. J. Philbin, J. Edler, O. J. Anshus, C. C. Douglas, and K. Li. Thread scheduling for cache locality. In Proceedings of the Seventh Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), pages 60-71, 1996.
- 34. J. Plevyak. Optimization of Object-Oriented and Concurrent Programs. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1996.
- 35. J. Plevyak and A. A. Chien. Precise concrete type inference of object-oriented programs. In Proceedings of OOPSLA '94, Object-Oriented Programming Systems, Languages and Architectures, pages 324-340, 1994.
- 36. J. Plevyak and A. A. Chien. Type directed cloning for object-oriented programs. In Proceedings of the Workshop for Languages and Compilers for Parallel Computing, pages 566-580, 1995.
- 37. E. Ruf. Context-insensitive alias analysis reconsidered. In Proceedings of the 1995 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 13-22, June 1995.
- 38. Z. Shao, J. H. Reppy, and A. W. Appel. Unrolling lists. In ACM Conference on Lisp and Functional Programming, June 1994.
- 39. A. A. Stepanov and M. Lee. The standard template library. ISO programming language C++ project. Technical Report X3J16/94-0095, WG21/N0482, Hewlett-Packard, May 1994.
- 40. D. Stoutamire and S. Omohundro. Sather 1.1, draft. Available online from http://www.icsi.berkeley.edu/~sather/Sather-1.1.ps, August 1995.
- 41. Sun Microsystems Computer Corporation. The Java Language Specification, March 1995. Available at http://java.sun.com/1.0alpha2/doc/java-whitepaper.ps.
- 42. M. Tofte and L. Birkedal. A region inference algorithm. Transactions on Programming Languages and Systems (TOPLAS), 20(4):734-767, July 1998.
- 43. N. Wirth and J. Gutknecht. Project Oberon: The Design of an Operating System and Compiler. Addison Wesley, 1992.
- 44. M. E. Wolf and M. S. Lam. A data locality optimizing algorithm. In Proceedings of the 1991 ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1991.
An automatic object inlining optimization and its evaluation


Julian Dolby

