Abstract
Shared memory concurrency relies on synchronisation primitives: compare-and-swap, load-reserve/store-conditional (aka LL/SC), language-level mutexes, and so on. In a sequentially consistent setting, or even in the TSO setting of x86 and Sparc, these have well-understood semantics. But in the very relaxed settings of IBM® POWER®, ARM, or C/C++, it remains surprisingly unclear exactly what the programmer can depend on.
This paper studies relaxed-memory synchronisation. On the hardware side, we give a clear semantic characterisation of the load-reserve/store-conditional primitives as provided by POWER multiprocessors, for the first time since they were introduced 20 years ago; we cover their interaction with relaxed loads, stores, barriers, and dependencies. Our model, while not officially sanctioned by the vendor, is validated by extensive testing, comparing actual implementation behaviour against an oracle generated from the model, and by detailed discussion with IBM staff. We believe the ARM semantics to be similar.
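As a reader's illustration of the primitives named above (not code from the paper), the following minimal C++11 sketch builds an atomic increment from a compare-and-swap retry loop. The names `fetch_add_cas` and `run_counter_demo` are hypothetical. On load-reserve/store-conditional machines such as POWER, compilers typically implement `compare_exchange_weak` with a load-reserve/store-conditional pair, which is why the weak form is allowed to fail spuriously and must sit inside a loop.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomically add `delta` to `cell` with a CAS retry loop.
// Returns the value observed immediately before the successful update.
int fetch_add_cas(std::atomic<int>& cell, int delta) {
    int seen = cell.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail spuriously (for example when a
    // store-conditional loses its reservation), so it is retried.
    // On failure, `seen` is refreshed with the current value of `cell`.
    while (!cell.compare_exchange_weak(seen, seen + delta,
                                       std::memory_order_relaxed,
                                       std::memory_order_relaxed)) {
        // retry with the refreshed `seen`
    }
    return seen;
}

// Hypothetical driver: `threads` workers each increment a shared counter
// `iters` times; returns the final count.
int run_counter_demo(int threads, int iters) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i) {
                fetch_add_cas(counter, 1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();  // join() orders each worker's updates before the read below
    }
    int total = counter.load(std::memory_order_relaxed);
    assert(total == threads * iters);  // atomicity: no increment is lost
    return total;
}
```

Relaxed ordering suffices here because only the atomicity of the counter matters; code that publishes other data through the atomic would need stronger orderings.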
On the software side, we prove sound a proposed compilation scheme of the C/C++ synchronisation constructs to POWER, including C/C++ spinlock mutexes, fences, and read-modify-write operations, together with the simpler atomic operations for which soundness is already known from our previous work; this is a first step in verifying concurrent algorithms that use load-reserve/store-conditional with respect to a realistic semantics. We also build confidence in the C/C++ model in its own terms, fixing some omissions and contributing to the C standards committee's adoption of the C++11 concurrency model.
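To make the source-level constructs concrete, here is a minimal sketch of a C++11 spinlock built from an atomic read-modify-write with acquire ordering and a store with release ordering. It is an illustration under stated assumptions, not the mutex or the compilation scheme analysed in the paper; `SpinLock` and `run_spinlock_demo` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock: `locked_` is true while some thread holds the lock.
class SpinLock {
    std::atomic<bool> locked_{false};

public:
    void lock() {
        // Atomic read-modify-write: set the flag and inspect the old value.
        // Acquire ordering keeps the critical section after the lock is taken.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // lock was held; back off and retry
        }
    }

    void unlock() {
        // Release ordering makes the critical section's writes visible to
        // the next thread that acquires the lock.
        locked_.store(false, std::memory_order_release);
    }
};

// Hypothetical driver: `threads` workers each add 1 to a plain (non-atomic)
// variable `iters` times under the lock; returns the final value.
long run_spinlock_demo(int threads, int iters) {
    SpinLock lock;
    long shared = 0;  // protected only by `lock`
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &shared, iters] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++shared;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    assert(shared == static_cast<long>(threads) * iters);
    return shared;
}
```

The increment of `shared` is an ordinary non-atomic access, so the program is race-free only if the lock provides mutual exclusion and the acquire/release pairing orders successive critical sections. A compiler targeting POWER has to preserve exactly this kind of guarantee with hardware barriers and load-reserve/store-conditional sequences.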
Synchronising C/C++ and POWER
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation