Abstract
Programmers hoping to achieve performance improvements often use custom memory allocators. This in-depth study examines eight applications that use custom allocators. Surprisingly, for six of these applications, a state-of-the-art general-purpose allocator (the Lea allocator) performs as well as or better than the custom allocators. The two exceptions use regions, which deliver higher performance (improvements of up to 44%). Regions also reduce programmer burden and eliminate a source of memory leaks. However, we show that the inability of programmers to free individual objects within regions can lead to a substantial increase in memory consumption. Worse, this limitation precludes the use of regions for common programming idioms, reducing their usefulness.
We present a generalization of general-purpose and region-based allocators that we call reaps. Reaps are a combination of regions and heaps, providing a full range of region semantics with the addition of individual object deletion. We show that our implementation of reaps provides high performance, outperforming other allocators with region-like semantics. We then use a case study to demonstrate the space advantages and software engineering benefits of reaps in practice. Our results indicate that programmers needing fast regions should use reaps, and that most programmers considering custom allocators should instead use the Lea allocator.
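To make the idea concrete, here is a minimal sketch in C of an allocator with region-like semantics plus individual object deletion. It is an illustration only, not the implementation evaluated in the paper. Every name in it (`Reap`, `reap_init`, `reap_alloc`, `reap_free`, `reap_clear`, `reap_destroy`) is hypothetical, and the design choices (a single fixed-size chunk, a first-fit free list, no splitting or coalescing) are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative sketch only; all names are hypothetical and this is not
 * the paper's implementation. */

/* Per-object header; the union pads it to the platform's maximum
 * alignment so that every payload is suitably aligned. */
typedef union Header {
    size_t size;        /* payload size in bytes */
    max_align_t align;  /* forces size and alignment of the header */
} Header;

/* A freed payload is reused to hold a link in the free list. */
typedef struct FreeNode { struct FreeNode *next; } FreeNode;

typedef struct Reap {
    char *base;          /* one chunk obtained from malloc */
    size_t capacity;     /* size of the chunk in bytes */
    size_t used;         /* bump offset: bytes handed out so far */
    FreeNode *free_list; /* individually freed objects */
} Reap;

/* Round a request up to a multiple of the header size, and make sure a
 * freed payload can hold a FreeNode. */
static size_t round_up(size_t n) {
    size_t a = sizeof(Header);
    if (n < sizeof(FreeNode)) n = sizeof(FreeNode);
    return (n + a - 1) / a * a;
}

/* Returns 1 on success, 0 if the chunk could not be allocated. */
int reap_init(Reap *r, size_t capacity) {
    r->base = malloc(capacity);
    r->capacity = r->base ? capacity : 0;
    r->used = 0;
    r->free_list = NULL;
    return r->base != NULL;
}

void *reap_alloc(Reap *r, size_t size) {
    size = round_up(size);
    /* Heap-like path: first fit among individually freed objects. */
    for (FreeNode **link = &r->free_list; *link; link = &(*link)->next) {
        Header *h = (Header *)*link - 1;
        if (h->size >= size) {
            void *p = *link;
            *link = (*link)->next;
            return p;
        }
    }
    /* Region-like path: bump allocation from the chunk. */
    if (size > r->capacity || sizeof(Header) + size > r->capacity - r->used)
        return NULL;
    Header *h = (Header *)(r->base + r->used);
    h->size = size;
    r->used += sizeof(Header) + size;
    return h + 1;
}

/* Individual deletion: the operation a plain region does not offer. */
void reap_free(Reap *r, void *p) {
    if (!p) return;
    assert((char *)p >= r->base && (char *)p < r->base + r->used);
    FreeNode *n = p;
    n->next = r->free_list;
    r->free_list = n;
}

/* Region semantics: release every object at once, keep the chunk. */
void reap_clear(Reap *r) {
    r->used = 0;
    r->free_list = NULL;
}

void reap_destroy(Reap *r) {
    free(r->base);
    r->base = NULL;
    r->capacity = 0;
    r->used = 0;
    r->free_list = NULL;
}
```

In this sketch, calling `reap_free` on an object and then allocating again reuses the freed slot instead of advancing the bump offset, which is the space saving that a region without per-object deletion cannot provide. `reap_clear` still discards everything in one step, as a region would.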
OOPSLA 2002: Reconsidering custom memory allocation