skip to main content
research-article

Prudent Memory Reclamation in Procrastination-Based Synchronization

Published:25 March 2016Publication History
Skip Abstract Section

Abstract

Procrastination is the fundamental technique used in synchronization mechanisms such as Read-Copy-Update (RCU) where writers, in order to synchronize with readers, defer the freeing of an object until there are no readers referring to the object. The synchronization mechanism determines when the deferred object is safe to reclaim and when it is actually reclaimed. Hence, such memory reclamations are completely oblivious of the memory allocator state. This induces poor memory allocator performance, for instance, when the reclamations are ill-timed. Furthermore, deferred objects provide hints about the future that inform memory regions that are about to be freed. Although useful, hints are not exploited as deferred objects are not visible to memory allocators. We introduce Prudence, a dynamic memory allocator, that is tightly integrated with the synchronization mechanism to ensure visibility of deferred objects to the memory allocator. Such an integration enables Prudence to (i) identify the safe time to reclaim deferred objects' memory, (ii) have an inclusive view of the allocated, free and about-to-be-freed objects, and (iii) exploit optimizations based on the hints about the future during important state transitions. Our evaluation in the Linux kernel shows that Prudence integrated with RCU performs 3.9X to 28X better in micro-benchmarks compared to SLUB, a recent memory allocator in the Linux kernel. It also improves the overall performance perceptibly (4%-18%) for a mix of widely used synthetic and application benchmarks. Further, it performs better (up to 98%) in terms of object hits in caches, object cache churns, slab churns, peak memory usage and total fragmentation, when compared with the SLUB allocator.

References

  1. Maya Arbel and Hagit Attiya. Concurrent updates with rcu: Search tree as an example. In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC '14, pages 196--205, New York, NY, USA, 2014. ACM.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Andrea Arcangeli, Mingming Cao, Paul E McKenney, and Dipankar Sarma. Using Read-Copy-Update Techniques for System V IPC in the Linux 2.5 Kernel. In USENIX Annual Technical Conference, FREENIX Track, pages 297--309, 2003.Google ScholarGoogle Scholar
  3. David A Barrett and Benjamin G Zorn. Using lifetime predictors to improve memory allocation performance. In ACM SIGPLAN Notices, volume 28, pages 187--196. ACM, 1993.Google ScholarGoogle Scholar
  4. Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schüpbach, and Akhilesh Singhania. The Multikernel: A New OS Architecture for Scalable Multicore Systems. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 29--44, New York, NY, USA, 2009. ACM.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Andrew Baumann, Jeremy Kerr, Jonathan Appavoo, Dilma Da Silva, Orran Krieger, and Robert W Wisniewski. Module hot-swapping for dynamic update and reconfiguration in K42. In 6th Linux. Conf. Au, 2005.Google ScholarGoogle Scholar
  6. Emery D Berger, Kathryn S McKinley, Robert D Blumofe, and Paul R Wilson. Hoard: A scalable memory allocator for multithreaded applications. ACM Sigplan Notices, 35(11):117--128, 2000.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Emery D Berger, Benjamin G Zorn, and Kathryn S McKinley. Oopsla 2002: reconsidering custom memory allocation. ACM SIGPLAN Notices, 48(4S):46--57, 2013.Google ScholarGoogle Scholar
  8. Jeff Bonwick. The Slab Allocator: An Object-Caching Kernel Memory Allocator. In USENIX summer, volume 16. Boston, MA, USA, 1994.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Austin T. Clements, M. Frans Kaashoek, and Nickolai Zeldovich. Scalable address spaces using rcu balanced trees. In Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVII, pages 199--210, New York, NY, USA, 2012. ACM.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Jonathan Corbet. The SLUB allocator. http://lwn.net/Articles/229984/, 2007.Google ScholarGoogle Scholar
  11. Jonathan Corbet. Relocating rcu callbacks. http://lwn.net/Articles/522262/, 2012.Google ScholarGoogle Scholar
  12. Jonathan Corbet. Epoll evolving. https://lwn.net/Articles/633422/, 2015.Google ScholarGoogle Scholar
  13. Jason Evans. A scalable concurrent malloc (3) implementation for freebsd. In Proc. of the BSDCan Conference, Ottawa, Canada, 2006.Google ScholarGoogle Scholar
  14. The Apache Software Foundation. Apache HTTP server benchmarking tool. https://httpd.apache.org/docs/2.2/programs/ab.html, 2015.Google ScholarGoogle Scholar
  15. The Apache Software Foundation. Apache HTTP Server Project. http://httpd.apache.org/, 2015.Google ScholarGoogle Scholar
  16. Benjamin Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm. Tornado: Maximizing locality and concurrency in a shared memory multiprocessor operating system. In OSDI, volume 99, pages 87--100, 1999.Google ScholarGoogle Scholar
  17. Richard Golding, Peter Bosch, John Wilkes, USENIX Association, et al. Idleness is not sloth. In USENIX, pages 201--212, 1995.Google ScholarGoogle Scholar
  18. Mel Gorman. Understanding the Linux virtual memory manager. Prentice Hall, 2004.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. The PostgreSQL Global Development Group. pgbench. http://www.postgresql.org/docs/devel/static/pgbench.html, 2015.Google ScholarGoogle Scholar
  20. The PostgreSQL Global Development Group. PostgreSQL. http://www.postgresql.org/, 2015.Google ScholarGoogle Scholar
  21. Dinakar Guniguntala, Paul E McKenney, Josh Triplett, and Jonathan Walpole. The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with Linux. IBM Systems Journal, 47(2):221--236, 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Thomas E Hart, Paul E McKenney, Angela Demke Brown, and Jonathan Walpole. Performance of memory reclamation for lockless synchronization. Journal of Parallel and Distributed Computing, 67(12):1270--1285, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Philip W Howard and Jonathan Walpole. Relativistic red-black trees. Concurrency and Computation: Practice and Experience, 2013.Google ScholarGoogle Scholar
  24. Hajime Inoue, Darko Stefanović, and Stephanie Forrest. On the prediction of java object lifetimes. Computers, IEEE Transactions on, 55(7):880--892, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Rick Jones. NetPerf. http://www.netperf.org/, 2012.Google ScholarGoogle Scholar
  26. Jeffrey Katcher. Postmark: A new file system benchmark. Technical report, Technical Report TR3022, Network Appliance, 1997. www.netapp.com/tech_library/3022.html, 1997.Google ScholarGoogle Scholar
  27. H. T. Kung and Philip L. Lehman. Concurrent manipulation of binary search trees. ACM Trans. Database Syst., 5(3):354--382, September 1980.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Christoph Lameter. SLUB: The unqueued slab allocator. http://lwn.net/Articles/229096/, 2007.Google ScholarGoogle Scholar
  29. Ran Liu, Heng Zhang, and Haibo Chen. Scalable read-mostly synchronization using passive reader-writer locks. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), pages 219--230, Philadelphia, PA, June 2014. USENIX Association.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Yandong Mao, Eddie Kohler, and Robert Tappan Morris. Cache craftiness for fast multicore key-value storage. In Proceedings of the 7th ACM European conference on Computer Systems, pages 183--196. ACM, 2012.Google ScholarGoogle ScholarCross RefCross Ref
  31. Paul E McKenney. Exploiting Deferred Destructions: An Analysis of Read-Copy-Update Techniques in Operating System kernels. PhD thesis, Oregon Health & Science University, 2004.Google ScholarGoogle Scholar
  32. Paul E McKenney. Structured deferral: synchronization via procrastination. Communications of the ACM, 56(7):40--49, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Paul E McKenney. RCU Linux usage. www.rdrop.com/users/paulmck/RCU/linuxusage.html, 2014.Google ScholarGoogle Scholar
  34. Paul E McKenney, Jonathan Appavoo, Andi Kleen, Orran Krieger, Rusty Russell, Dipankar Sarma, and Maneesh Soni. Read-copy update. In AUUG Conference Proceedings, page 175. AUUG, Inc., 2001.Google ScholarGoogle Scholar
  35. Paul E McKenney, Dipankar Sarma, Ingo Molnar, and Suparna Bhattacharya. Extending RCU for realtime and embedded workloads. In Ottawa Linux Symposium, pages v2, pages 123--138, 2006.Google ScholarGoogle Scholar
  36. Paul E McKenney, Dipankar Sarma, and Maneesh Soni. Scaling dcache with RCU. Linux Journal, 2004(117):3, 2004.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Paul E McKenney and John D Slingwine. Read-copy update: Using execution history to solve concurrency problems. In Parallel and Distributed Computing and Systems, pages 509--518, 1998.Google ScholarGoogle Scholar
  38. James Morris. Have You Driven an SELinux Lately? In Linux Symposium Proceedings, 2008.Google ScholarGoogle Scholar
  39. Robert Olsson and Stefan Nilsson. Trash a dynamic lc-trie and hash data structure. In High Performance Switching and Routing, 2007. HPSR'07. Workshop on, pages 1--6. IEEE, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  40. Dipankar Sarma and Paul E McKenney. Making RCU safe for deep sub-millisecond response realtime applications. In Proceedings of the 2004 USENIX Annual Technical Conference (FREENIX Track), pages 182--191, 2004.Google ScholarGoogle Scholar
  41. Josh Triplett, Paul E McKenney, and Jonathan Walpole. Scalable concurrent hash tables via relativistic programming. ACM SIGOPS Operating Systems Review, 44(3):102--109, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Josh Triplett, Paul E McKenney, and Jonathan Walpole. Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming. In USENIX Annual Technical Conference, page 11, 2011.Google ScholarGoogle Scholar
  43. Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. Speedy transactions in multicore in-memory databases. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 18--32. ACM, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. David Wentzlaff and Anant Agarwal. Factored operating systems (fos): the case for a scalable operating system for multicores. SIGOPS Oper. Syst. Rev., 43:76--85, April 2009.Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Prudent Memory Reclamation in Procrastination-Based Synchronization

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 51, Issue 4
    ASPLOS '16
    April 2016
    774 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2954679
    • Editor:
    • Andy Gill
    Issue’s Table of Contents
    • cover image ACM Conferences
      ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2016
      824 pages
      ISBN:9781450340915
      DOI:10.1145/2872362
      • General Chair:
      • Tom Conte,
      • Program Chair:
      • Yuanyuan Zhou

    Copyright © 2016 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 25 March 2016

    Check for updates

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader
About Cookies On This Site

We use cookies to ensure that we give you the best experience on our website.

Learn more

Got it!