Abstract
Chase and Lev's concurrent deque is a key data structure in shared-memory parallel programming and plays an essential role in work-stealing schedulers. We provide the first correctness proof of an optimized implementation of Chase and Lev's deque on top of the POWER and ARM architectures: these provide very relaxed memory models, which we exploit to improve performance but which considerably complicate the reasoning. We also study an optimized x86 implementation and a portable C11 implementation, conducting systematic experiments to evaluate the impact of memory barrier optimizations. Our results demonstrate the benefits of hand-tuning the deque code when running on top of relaxed memory models.
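For readers unfamiliar with the data structure, the following is a minimal C11 sketch of a Chase-Lev-style deque. It is an illustration under stated assumptions, not the implementation studied or proved correct here: all names (`deque_t`, `deque_push`, `deque_take`, `deque_steal`, `DEQUE_CAP`) are hypothetical, the buffer has a fixed capacity instead of growing dynamically, and every atomic operation uses the default sequentially consistent ordering. Relaxing those orderings and placing barriers by hand is the kind of optimization the abstract refers to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed capacity (assumption: the real deque resizes). */
#define DEQUE_CAP 64
static_assert((DEQUE_CAP & (DEQUE_CAP - 1)) == 0,
              "capacity must be a power of two");

typedef struct {
    atomic_long top;               /* index thieves steal from */
    atomic_long bottom;            /* index the owner pushes to */
    _Atomic(int) buf[DEQUE_CAP];   /* circular buffer of tasks */
} deque_t;

static void deque_init(deque_t *q) {
    atomic_init(&q->top, 0);
    atomic_init(&q->bottom, 0);
    for (size_t i = 0; i < DEQUE_CAP; i++) atomic_init(&q->buf[i], 0);
}

/* Owner only: push at the bottom. Returns false if the buffer is full. */
static bool deque_push(deque_t *q, int x) {
    long b = atomic_load(&q->bottom);
    long t = atomic_load(&q->top);
    if (b - t >= DEQUE_CAP) return false;
    atomic_store(&q->buf[(size_t)b & (DEQUE_CAP - 1)], x);
    atomic_store(&q->bottom, b + 1);   /* publish the new element */
    return true;
}

/* Owner only: take from the bottom (LIFO). Returns false if empty. */
static bool deque_take(deque_t *q, int *out) {
    long b = atomic_load(&q->bottom) - 1;
    atomic_store(&q->bottom, b);       /* reserve the bottom slot */
    long t = atomic_load(&q->top);
    if (t > b) {                       /* deque was already empty */
        atomic_store(&q->bottom, b + 1);
        return false;
    }
    int x = atomic_load(&q->buf[(size_t)b & (DEQUE_CAP - 1)]);
    if (t == b) {                      /* last element: race with thieves */
        long next = t + 1;
        bool won = atomic_compare_exchange_strong(&q->top, &t, next);
        atomic_store(&q->bottom, b + 1);
        if (!won) return false;        /* a thief got it first */
    }
    *out = x;
    return true;
}

/* Any thread: steal from the top (FIFO). Returns false if empty
   or if another thread won the race for the same element. */
static bool deque_steal(deque_t *q, int *out) {
    long t = atomic_load(&q->top);
    long b = atomic_load(&q->bottom);
    if (t >= b) return false;
    int x = atomic_load(&q->buf[(size_t)t & (DEQUE_CAP - 1)]);
    long next = t + 1;
    if (!atomic_compare_exchange_strong(&q->top, &t, next)) return false;
    *out = x;
    return true;
}
```

In this sketch the owner thread calls `deque_push` and `deque_take`, while idle workers call `deque_steal`. Sequential consistency everywhere keeps the sketch easy to reason about at the cost of the barriers that a tuned version would try to remove.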
Correct and efficient work-stealing for weak memory models
PPoPP '13: Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming