Abstract
Scalable locking is a key building block for scalable multi-threaded software. Its performance is especially critical on multi-socket, multi-core machines with non-uniform memory access (NUMA). Previous schemes such as local locking and remote locking perform well only within a certain range of contention, and often require non-trivial tuning for a particular configuration. Moreover, on large NUMA systems, current distance-first NUMA policies do not perform satisfactorily because the nomination of lock servers is unmanaged.
In this work, we propose SANL, a locking scheme that delivers high performance across contention levels by adaptively switching between the local and the remote lock schemes. Furthermore, we introduce a new NUMA policy for the remote lock that jointly considers node distance and server utilization when choosing lock servers. A comparison with seven representative locking schemes shows that SANL outperforms the others in most contention situations. In one test group, SANL is 3.7 times faster than RCL and 17 times faster than a POSIX mutex.
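The two ideas in this paragraph, switching between lock schemes by contention and choosing a lock server by distance and utilization together, can be illustrated with a minimal sketch in C. This is not SANL's actual algorithm: the thresholds, the linear cost formula, and the names `next_mode`, `choose_server`, `node_info`, and `util_weight` are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical modes: the abstract names a local and a remote lock scheme. */
typedef enum { MODE_LOCAL, MODE_REMOTE } lock_mode;

/* Assumed thresholds (waiters seen in one sampling window); not from the paper. */
#define TO_REMOTE_WAITERS 8u
#define TO_LOCAL_WAITERS  2u

/* Pick the next mode with hysteresis, so the lock does not flap
 * back and forth when contention hovers around a single threshold. */
lock_mode next_mode(lock_mode current, unsigned waiters)
{
    if (current == MODE_LOCAL && waiters >= TO_REMOTE_WAITERS)
        return MODE_REMOTE;
    if (current == MODE_REMOTE && waiters <= TO_LOCAL_WAITERS)
        return MODE_LOCAL;
    return current;
}

/* Per-node information consulted when nominating a lock server. */
typedef struct {
    unsigned distance;    /* NUMA distance from the client's node */
    unsigned utilization; /* share of server capacity in use, 0..100 */
} node_info;

/* Assumed cost model: distance + util_weight * utilization. */
static unsigned long node_cost(const node_info *n, unsigned util_weight)
{
    return n->distance + (unsigned long)util_weight * n->utilization;
}

/* Return the index of the cheapest node; ties go to the lowest index. */
size_t choose_server(const node_info *nodes, size_t n_nodes,
                     unsigned util_weight)
{
    assert(nodes != NULL && n_nodes > 0);
    size_t best = 0;
    unsigned long best_cost = node_cost(&nodes[0], util_weight);
    for (size_t i = 1; i < n_nodes; i++) {
        unsigned long cost = node_cost(&nodes[i], util_weight);
        if (cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

With `util_weight` set to 0, this sketch reduces to a distance-first policy. A positive weight lets a farther but idle node win over a nearby, heavily loaded one, which is the kind of trade-off a joint policy is meant to capture.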





