research-article

Declarative fence insertion

Published: 23 October 2015

Abstract

Previous work has shown how to insert fences that enforce sequential consistency. However, for many concurrent algorithms, sequential consistency is unnecessarily strong and can lead to high execution overhead. The reason is that, often, correctness relies on the execution order of a few specific pairs of instructions. Algorithm designers can declare those execution orders and thereby enable memory-model-independent reasoning about correctness and also ease implementation of algorithms on multiple platforms. The literature has examples of such reasoning, while tool support for enforcing the orders has been lacking until now. In this paper we present a declarative approach to specify and enforce execution orders. Our fence insertion algorithm first identifies the execution orders that a given memory model enforces automatically, and then inserts fences that enforce the rest. Our benchmarks include three off-the-shelf transactional memory algorithms written in C/C++ for which we specify suitable execution orders. For those benchmarks, our experiments with the x86 and ARMv7 memory models show that our tool inserts fences that are competitive with those inserted by the original authors. Our tool is the first to insert fences into transactional memory algorithms and it solves the long-standing problem of how to easily port such algorithms to a novel memory model.
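The two-step idea in the abstract (first find the declared orders that the memory model already enforces, then fence the rest) can be illustrated with a toy model. The sketch below is not the paper's algorithm or its tool. It assumes each declared order is a pair of memory accesses to different addresses in the same thread, classified only as a load or a store, and it uses a deliberately simplified view of the two models: x86-TSO preserves every pair except store-then-load, and ARMv7 preserves none when there are no dependencies between the accesses. The names `Op`, `PRESERVED`, and `orders_needing_fences` are hypothetical.

```python
from enum import Enum


class Op(Enum):
    LOAD = "load"
    STORE = "store"


# Simplified assumption: program orders between two accesses to different
# addresses that each model keeps without any fence. Dependencies, atomic
# read-modify-write instructions, and same-address accesses are ignored.
PRESERVED = {
    "x86-TSO": {
        (Op.LOAD, Op.LOAD),
        (Op.LOAD, Op.STORE),
        (Op.STORE, Op.STORE),
    },
    "ARMv7": set(),
}


def orders_needing_fences(model, declared):
    """Return the declared orders that the model does not enforce by itself.

    `declared` is a list of (label, first_op, second_op) tuples, where the
    designer requires first_op to execute before second_op.
    """
    preserved = PRESERVED[model]
    return [
        (label, first, second)
        for (label, first, second) in declared
        if (first, second) not in preserved
    ]
```

Under these assumptions, a declared store-then-load order is reported as needing a fence on both models, while a declared store-then-store order is reported only for ARMv7. A real tool would also have to decide where to place each fence so that one fence can cover several declared orders; this sketch does not attempt that.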




Published in

ACM SIGPLAN Notices, Volume 50, Issue 10 (OOPSLA '15)
October 2015, 953 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858965
Editor: Andy Gill

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015, 953 pages
ISBN: 9781450336895
DOI: 10.1145/2814270

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
