research-article

Layout Lock: A Scalable Locking Paradigm for Concurrent Data Layout Modifications

Published: 26 January 2017

Abstract

Data structures can benefit from dynamic data layout modifications when the size or shape of the data structure changes during execution, or when different phases of the program execute different workloads. However, in a modern multi-core environment, layout modifications incur costly synchronization overhead. In this paper we propose a novel layout lock that incurs negligible overhead for reads and a small overhead for updates of the data structure. We then demonstrate the benefits of layout changes, and the advantages of the layout lock as their supporting synchronization mechanism, for two data structures. In particular, we propose a concurrent binary search tree and a concurrent array set that benefit from concurrent layout modifications using the proposed layout lock. Experience demonstrates performance advantages and integration simplicity.
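To make the three roles in the abstract concrete (cheap reads, slightly costlier updates, rare layout changes), the following is a minimal illustrative sketch and not the paper's implementation. It assumes a seqlock-style version stamp for readers, a shared mode for updaters, and an exclusive mode for layout changers. The class `LayoutLockSketch`, its method names, and the helper `optimistic_read` are hypothetical names chosen for this example. The lock-free fast path in `read_begin` relies on CPython's atomic attribute reads; another runtime would need explicit atomics and memory fences.

```python
import threading


class LayoutLockSketch:
    """Illustrative three-mode lock (hypothetical API, not the paper's code).

    read mode:          optimistic, no lock taken on the fast path
    update mode:        shared among updaters, excludes layout changes
    layout-change mode: exclusive, invalidates in-flight optimistic reads
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0       # odd while a layout change is in progress
        self._updaters = 0      # number of threads currently in update mode
        self._changing = False  # True while a layout changer holds the lock

    def read_begin(self):
        """Return a version stamp taken while no layout change is running."""
        stamp = self._version
        if stamp % 2 == 0:      # fast path: no lock, no shared write
            return stamp
        with self._cond:        # slow path: wait for the layout change to end
            self._cond.wait_for(lambda: self._version % 2 == 0)
            return self._version

    def read_validate(self, stamp):
        """True if no layout change started since read_begin returned stamp."""
        return self._version == stamp

    def update_begin(self):
        with self._cond:
            self._cond.wait_for(lambda: not self._changing)
            self._updaters += 1

    def update_end(self):
        with self._cond:
            self._updaters -= 1
            self._cond.notify_all()

    def layout_change_begin(self):
        with self._cond:
            self._cond.wait_for(lambda: not self._changing)
            self._changing = True
            self._version += 1  # odd: earlier optimistic reads will fail validation
            self._cond.wait_for(lambda: self._updaters == 0)

    def layout_change_end(self):
        with self._cond:
            self._version += 1  # even again: new optimistic reads may proceed
            self._changing = False
            self._cond.notify_all()


def optimistic_read(lock, read_fn):
    """Retry read_fn until it runs without an overlapping layout change.

    read_fn must tolerate observing a layout that is mid-change, since its
    result is only trusted after validation succeeds.
    """
    while True:
        stamp = lock.read_begin()
        result = read_fn()
        if lock.read_validate(stamp):
            return result
```

In this sketch a reader writes no shared state on its fast path, an updater pays for one short critical section on entry and exit, and a layout changer waits for active updaters to drain before it reorganizes the data. Reads and updates that run concurrently with each other are not synchronized by this lock; the data structure's own mechanism is assumed to handle them.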



Published in

ACM SIGPLAN Notices, Volume 52, Issue 8
PPoPP '17
August 2017
442 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3155284

PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2017
476 pages
ISBN: 9781450344937
DOI: 10.1145/3018743

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
