research-article

NUMA-aware reader-writer locks

Published: 23 February 2013

Abstract

Non-Uniform Memory Access (NUMA) architectures are gaining importance in mainstream computing systems due to the rapid growth of multi-core multi-chip machines. Extracting the best possible performance from these new machines will require us to revisit the design of the concurrent algorithms and synchronization primitives which form the building blocks of many of today's applications. This paper revisits one such critical synchronization primitive -- the reader-writer lock.

We present what is, to the best of our knowledge, the first family of reader-writer lock algorithms tailored to NUMA architectures. We present several variations which trade fairness between readers and writers for higher concurrency among readers and better back-to-back batching of writers from the same NUMA node. Our algorithms leverage the lock cohorting technique to manage synchronization between writers in a NUMA-friendly fashion, binary flags to coordinate readers and writers, and simple distributed reader counter implementations to enable NUMA-friendly concurrency among readers. The end result is a collection of surprisingly simple NUMA-aware algorithms that outperform the state-of-the-art reader-writer locks by up to a factor of 10 in our microbenchmark experiments. To evaluate our algorithms in a realistic setting we also present performance results of the kccachetest benchmark of the Kyoto-Cabinet distribution, an open-source database which makes heavy use of pthread reader-writer locks. Our locks boost the performance of kccachetest by up to 40% over the best prior alternatives.
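The three ingredients named above (a writer lock, a binary flag between readers and writers, and per-node reader counters) can be illustrated with a minimal C++ sketch. It is not the paper's algorithm: `NumaRWLockSketch`, `kNodes`, and `run_demo` are hypothetical names, a plain `std::mutex` stands in for the cohort lock (so writer handoff here is not NUMA-aware), and the caller is assumed to supply its own NUMA node index. The sketch roughly corresponds to a variation in which an arriving writer blocks new readers.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

constexpr int kNodes = 4;  // assumed number of NUMA nodes (illustrative)

// One counter per node, each on its own cache line to limit false sharing.
struct alignas(64) PaddedCounter {
    std::atomic<int> readers{0};
};

class NumaRWLockSketch {
public:
    void lock_shared(int node) {
        for (;;) {
            counters_[node].readers.fetch_add(1);   // announce on the local counter
            if (!writer_active_.load()) return;      // no writer: read access granted
            counters_[node].readers.fetch_sub(1);   // retract and let the writer go first
            while (writer_active_.load()) std::this_thread::yield();
        }
    }
    void unlock_shared(int node) { counters_[node].readers.fetch_sub(1); }

    void lock() {
        writer_mutex_.lock();                 // stand-in for a cohort lock (assumption)
        writer_active_.store(true);           // binary flag: block newly arriving readers
        for (auto& c : counters_)             // wait for in-flight readers on every node
            while (c.readers.load() != 0) std::this_thread::yield();
    }
    void unlock() {
        writer_active_.store(false);
        writer_mutex_.unlock();
    }

private:
    std::array<PaddedCounter, kNodes> counters_;
    std::atomic<bool> writer_active_{false};
    std::mutex writer_mutex_;
};

// Hypothetical driver: writers keep a == b; readers count any violation they see.
// Returns the final value of a, or -1 if a reader observed a != b.
long run_demo(int reader_threads, int writer_threads, int iters) {
    NumaRWLockSketch lock;
    long a = 0, b = 0;
    std::atomic<long> violations{0};
    std::vector<std::thread> threads;
    for (int w = 0; w < writer_threads; ++w)
        threads.emplace_back([&] {
            for (int i = 0; i < iters; ++i) { lock.lock(); ++a; ++b; lock.unlock(); }
        });
    for (int r = 0; r < reader_threads; ++r)
        threads.emplace_back([&, r] {
            int node = r % kNodes;            // pretend thread r runs on this node
            for (int i = 0; i < iters; ++i) {
                lock.lock_shared(node);
                if (a != b) violations.fetch_add(1);
                lock.unlock_shared(node);
            }
        });
    for (auto& t : threads) t.join();
    return violations.load() == 0 ? a : -1;
}
```

Readers touch only their own node's counter on the fast path, which is what keeps read-side traffic local. The cost moves to writers, which must scan every node's counter before entering. All atomics use the default sequentially consistent ordering so that the reader's increment-then-check and the writer's set-then-scan cannot both miss each other.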


Published in

ACM SIGPLAN Notices, Volume 48, Issue 8 (PPoPP '13), August 2013, 309 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2517327

PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 2013, 332 pages
ISBN: 9781450319225
DOI: 10.1145/2442516

Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
