Abstract
ARM has a relaxed memory model, previously specified in informal prose for ARMv7 and ARMv8. Over time, and partly due to work building formal semantics for ARM concurrency, it has become clear that some of the complexity of the model is not justified by the potential benefits. In particular, the model was originally non-multicopy-atomic: writes could become visible to some other threads before becoming visible to all. However, this has not been exploited in production implementations, the corresponding potential hardware optimisations are thought to have insufficient benefits in the ARM context, and it gives rise to subtle complications when combined with other ARMv8 features. The ARMv8 architecture has therefore been revised: it now has a multicopy-atomic model. It has also been simplified in other respects, including more straightforward notions of dependency, and the architecture now includes a formal concurrency model.
In this paper we detail these changes and discuss their motivation. We define two formal concurrency models: an operational one, simplifying the Flowing model of Flur et al., and the axiomatic model of the revised ARMv8 specification. The models were developed by an academic group and by ARM staff, respectively, and this extended collaboration partly motivated the above changes. We prove the equivalence of the two models. The operational model is integrated into an executable exploration tool with a new web interface, demonstrated by exhaustively checking the possible behaviours of a loop-unrolled version of a Linux kernel lock implementation, a previously known bug due to unprevented speculation, and a fixed version.
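The difference between a multicopy-atomic and a non-multicopy-atomic model, as described in the abstract, can be illustrated with a toy enumeration of the classic IRIW (independent reads of independent writes) litmus test. This is only a sketch under strong simplifying assumptions: it is neither the operational nor the axiomatic model of the paper, reads are assumed to execute in program order, and the names `READERS`, `outcomes`, and `IRIW_WITNESS` are hypothetical. Two writer threads set `x` and `y` to 1; thread 2 reads `x` then `y`, and thread 3 reads `y` then `x`.

```python
from itertools import permutations

# Reader thread id -> variables it reads, in program order (hypothetical encoding).
READERS = {2: ("x", "y"), 3: ("y", "x")}

def outcomes(multicopy_atomic):
    """Enumerate (T2.r0, T2.r1, T3.r0, T3.r1) results of IRIW in a toy model."""
    if multicopy_atomic:
        # One propagation event per write: visible to every reader at once.
        props = [("prop", var, (2, 3)) for var in ("x", "y")]
    else:
        # One propagation event per (write, reader) pair: visibility may differ.
        props = [("prop", var, (tid,)) for var in ("x", "y") for tid in (2, 3)]
    reads = [("read", tid, i) for tid in READERS for i in (0, 1)]

    results = set()
    for order in permutations(props + reads):
        # Assumption: each thread's two reads happen in program order.
        if any(order.index(("read", t, 0)) > order.index(("read", t, 1))
               for t in READERS):
            continue
        visible = {t: {"x": 0, "y": 0} for t in READERS}
        regs = {}
        for event in order:
            if event[0] == "prop":
                _, var, tids = event
                for t in tids:
                    visible[t][var] = 1
            else:
                _, t, i = event
                regs[(t, i)] = visible[t][READERS[t][i]]
        results.add((regs[(2, 0)], regs[(2, 1)], regs[(3, 0)], regs[(3, 1)]))
    return results

# The two readers disagree on the order in which the writes became visible.
IRIW_WITNESS = (1, 0, 1, 0)

if __name__ == "__main__":
    print("multicopy-atomic allows witness:    ", IRIW_WITNESS in outcomes(True))
    print("non-multicopy-atomic allows witness:", IRIW_WITNESS in outcomes(False))
```

In this toy setting the witness outcome is reachable only when each write may propagate to the two readers at different times, which is the behaviour the revised architecture rules out.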
References
- Allon Adir, Hagit Attiya, and Gil Shurek. 2003. Information-Flow Models for Shared Memory with an Application to the PowerPC Architecture. IEEE Trans. Parallel Distrib. Syst. 14, 5 (2003), 502–515.
- Mustaque Ahamad, Gil Neiger, James E. Burns, Prince Kohli, and Phillip W. Hutto. 1995. Causal memory: definitions, implementation, and programming. Distributed Computing 9, 1 (1995), 37–49.
- Jade Alglave, Anthony Fox, Samin Ishtiaq, Magnus O. Myreen, Susmit Sarkar, Peter Sewell, and Francesco Zappa Nardelli. 2009. The Semantics of Power and ARM Multiprocessor Machine Code. In Proc. DAMP 2009.
- Jade Alglave and Luc Maranget. 2017. http://diy.inria.fr/doc/index.html . (April 2017).
- Jade Alglave, Luc Maranget, Susmit Sarkar, and Peter Sewell. 2010. Fences in Weak Memory Models. In Proc. CAV.
- Jade Alglave, Luc Maranget, Susmit Sarkar, and Peter Sewell. 2011. Litmus: running tests against hardware. In Proceedings of TACAS 2011: the 17th international conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer-Verlag, Berlin, Heidelberg, 41–44. http://dl.acm.org/citation.cfm?id=1987389.1987395
- Jade Alglave, Luc Maranget, and Michael Tautschnig. 2014. Herding Cats: Modelling, Simulation, Testing, and Data Mining for Weak Memory. ACM TOPLAS 36, 2, Article 7 (July 2014), 74 pages.
- ARM Ltd. 2016. ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd. ARM DDI 0487A.k_iss10775 (ID092916).
- ARM Ltd. 2017. ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd. ARM DDI 0487B.a (ID033117).
- ARM Ltd. 2017. ARM Processor Cortex-A53 MPCore Product Revision r0 Software Developers Errata Notice. http://infocenter.arm.com/help/topic/com.arm.doc.epm048406/Cortex_A53_MPCore_Software_Developers_Errata_Notice.pdf . (May 2017).
- Arvind and Jan-Willem Maessen. 2006. Memory Model = Instruction Reordering + Store Atomicity. SIGARCH Comput. Archit. News 34, 2 (May 2006), 29–40.
- Mark Batty, Kayvan Memarian, Kyndylan Nienhuis, Jean Pichon-Pharabod, and Peter Sewell. 2015. The Problem of Programming Language Concurrency Semantics. In Programming Languages and Systems - 24th European Symposium on Programming, ESOP 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18, 2015. Proceedings. 283–307.
- Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. 2011. Mathematizing C++ Concurrency. In Proc. POPL.
- Pete Becker (Ed.). 2011. Programming Languages — C++. ISO/IEC 14882:2011. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf .
- James Bornholt and Emina Torlak. 2017. Synthesizing Memory Models from Framework Sketches and Litmus Tests. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA, 467–481.
- Sebastian Burckhardt and Madanlal Musuvathi. 2008. Effective Program Verification for Relaxed Memory Models. Springer Berlin Heidelberg, Berlin, Heidelberg, 107–120.
- Nathan Chong and Samin Ishtiaq. 2008. Reasoning about the ARM weakly consistent memory model. In MSPC.
- William W. Collier. 1992. Reasoning about parallel architectures. Prentice Hall, Englewood Cliffs. http://opac.inria.fr/record=b1105256
- F. Corella, J. M. Stone, and C. M. Barton. 1993. A formal specification of the PowerPC shared memory architecture. Technical Report RC18638. IBM.
- Mike Daines. 2017. Viz.js, a hack to put Graphviz on the web. https://github.com/mdaines/viz.js/ . (2017).
- Will Deacon. 2015. Linux commit ‘arm64: spinlock: serialise spin_unlock_wait against concurrent lockers’. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d86b8da04dfa . (2015).
- Will Deacon. 2016. The ARMv8 Application Level Memory Model. https://github.com/herd/herdtools7/blob/master/herd/libdir/aarch64.cat . (2016).
- Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell. 2016a. The Flowing and POP Models (supplementary material for Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA). (2016). http://www.cl.cam.ac.uk/~sf502/popl16/model_full.pdf .
- Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell. 2016b. Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA. In Proceedings of POPL: the 43rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages.
- Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan Nienhuis, Luc Maranget, Kathryn E. Gray, Ali Sezgin, Mark Batty, and Peter Sewell. 2017. Mixed-size Concurrency: ARM, POWER, C/C++11, and SC. In POPL 2017: The 44th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Paris, France.
- Emden R. Gansner and Stephen C. North. 2000. An open graph visualization system and its applications to software engineering. Software: Practice and Experience 30, 11 (2000), 1203–1233. http://www.graphviz.org/ .
- Kathryn E. Gray, Gabriel Kerneis, Dominic Mulligan, Christopher Pulte, Susmit Sarkar, and Peter Sewell. 2015. An integrated concurrency and core-ISA architectural envelope definition, and test oracle, for IBM POWER multiprocessors. In Proc. MICRO-48, the 48th Annual IEEE/ACM International Symposium on Microarchitecture.
- Lisa Higham, Jalal Kawash, and Nathaly Verwaal. 1998. Weak Memory Consistency Models. Part I: Definitions and Comparisons. Technical Report. Department of Computer Science, The University of Calgary.
- David Howells, Paul E. McKenney, Will Deacon, and Peter Zijlstra. 2016. Documentation/memory-barriers.txt. https://www.kernel.org/doc/Documentation/memory-barriers.txt . (2016).
- Jeehoon Kang, Chung-Kil Hur, Ori Lahav, Viktor Vafeiadis, and Derek Dreyer. 2017. A Promising Semantics for Relaxed-memory Concurrency. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, New York, NY, USA, 175–189.
- Linux contributors. 2014. Documentation/locking/spinlocks.txt. https://www.kernel.org/doc/Documentation/locking/spinlocks.txt . (2014).
- Daniel Lustig, Andrew Wright, Alexandros Papakonstantinou, and Olivier Giroux. 2017. Automated Synthesis of Comprehensive Memory Model Litmus Test Suites. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’17). ACM, New York, NY, USA, 661–675.
- Sela Mador-Haim, Rajeev Alur, and Milo M. K. Martin. 2010. Generating Litmus Tests for Contrasting Memory Consistency Models. In Computer Aided Verification, 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings. 273–287.
- Sela Mador-Haim, Luc Maranget, Susmit Sarkar, Kayvan Memarian, Jade Alglave, Scott Owens, Rajeev Alur, Milo M. K. Martin, Peter Sewell, and Derek Williams. 2012. An Axiomatic Memory Model for POWER Multiprocessors. In Proceedings of CAV 2012: the 24th International Conference on Computer Aided Verification. 495–512.
- Yatin A. Manerkar, Daniel Lustig, Michael Pellauer, and Margaret Martonosi. 2015. CCICheck: Using µHb Graphs to Verify the Coherence-consistency Interface. In Proceedings of the 48th International Symposium on Microarchitecture (MICRO-48). ACM, New York, NY, USA, 26–37.
- Luc Maranget, Susmit Sarkar, and Peter Sewell. 2012. A Tutorial Introduction to the ARM and POWER Relaxed Memory Models. Draft available from http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf . (2012).
- Paul E McKenney. 2017. Remove spin_unlock_wait(). (Jun 2017). https://lkml.org/lkml/2017/6/29/967
- Dominic P. Mulligan, Scott Owens, Kathryn E. Gray, Tom Ridge, and Peter Sewell. 2014. Lem: reusable engineering of real-world semantics. In Proceedings of ICFP 2014: the 19th ACM SIGPLAN International Conference on Functional Programming. 175–188.
- Scott Owens, Susmit Sarkar, and Peter Sewell. 2009. A better x86 memory model: x86-TSO. In Proceedings of TPHOLs 2009: Theorem Proving in Higher Order Logics, LNCS 5674. 391–407.
- Jean Pichon-Pharabod and Peter Sewell. 2016. A concurrency semantics for relaxed atomics that permits optimisation and avoids thin-air executions. In Proceedings of POPL.
- Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave, and Derek Williams. 2012. Synchronising C/C++ and POWER. In Proceedings of PLDI 2012, the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation (Beijing). 311–322.
- Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, and Derek Williams. 2011. Understanding POWER Multiprocessors. In Proceedings of PLDI 2011: the 32nd ACM SIGPLAN conference on Programming Language Design and Implementation. 175–186.
- Susmit Sarkar, Peter Sewell, Francesco Zappa Nardelli, Scott Owens, Tom Ridge, Thomas Braibant, Magnus Myreen, and Jade Alglave. 2009. The Semantics of x86-CC Multiprocessor Machine Code. In Proceedings of POPL 2009: the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages. 379–391.
- Jérôme Vouillon and Vincent Balat. 2014. From bytecode to JavaScript: the Js_of_ocaml compiler. Software: Practice and Experience 44, 8 (2014), 951–972.
- John Wickerson, Mark Batty, Tyler Sorensen, and George A. Constantinides. 2017. Automatically Comparing Memory Consistency Models. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2017). ACM, New York, NY, USA, 190–204.
- Alon Zakai. 2011. Emscripten: An LLVM-to-JavaScript Compiler. In Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion (OOPSLA ’11). ACM, New York, NY, USA, 301–312.
Index Terms
Simplifying ARM concurrency: multicopy-atomic axiomatic and operational models for ARMv8