Abstract
Atomic blocks allow programmers to delimit sections of code as 'atomic', leaving the language's implementation to enforce atomicity. Existing work has shown how to implement atomic blocks over word-based transactional memory that provides scalable multi-processor performance without requiring changes to the basic structure of objects in the heap. However, these implementations perform poorly because they interpose on all accesses to shared memory in the atomic block, redirecting updates to a thread-private log which must be searched by reads in the block and later reconciled with the heap when leaving the block.

This paper takes a four-pronged approach to improving performance: (1) we introduce a new 'direct access' implementation that avoids searching thread-private logs, (2) we develop compiler optimizations to reduce the amount of logging (e.g. when a thread accesses the same data repeatedly in an atomic block), (3) we use runtime filtering to detect duplicate log entries that are missed statically, and (4) we present a series of GC-time techniques to compact the logs generated by long-running atomic blocks.

Our implementation supports short-running scalable concurrent benchmarks with less than 50% overhead over a non-thread-safe baseline. We support long atomic blocks containing millions of shared memory accesses with a 2.5-4.5x slowdown.
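
The contrast between the log-searching design and the 'direct access' design can be illustrated with a minimal single-threaded sketch. It is not the paper's implementation: the class names `BufferedTx` and `DirectTx`, the dictionary standing in for the heap, the use of an undo log for rollback, and the set used as a duplicate filter are all assumptions made for illustration, and conflict detection between threads is omitted entirely.

```python
class BufferedTx:
    """Buffered updates (illustrative): writes go to a private log,
    so every read must consult that log before the heap."""

    def __init__(self, heap):
        self.heap = heap
        self.write_log = {}  # thread-private log of pending updates

    def read(self, key):
        # Reads search the private log first, then fall back to the heap.
        if key in self.write_log:
            return self.write_log[key]
        return self.heap[key]

    def write(self, key, value):
        self.write_log[key] = value

    def commit(self):
        # Reconcile the private log with the heap when leaving the block.
        self.heap.update(self.write_log)
        self.write_log.clear()

    def abort(self):
        self.write_log.clear()


class DirectTx:
    """Direct access (illustrative): update the heap in place and keep
    an undo log of overwritten values, assumed here for rollback."""

    def __init__(self, heap):
        self.heap = heap
        self.undo_log = []   # (key, old value) pairs
        self.logged = set()  # assumed runtime filter for duplicate entries

    def read(self, key):
        # No log search: the heap already holds the latest value.
        return self.heap[key]

    def write(self, key, value):
        # Log the old value only the first time a location is written.
        if key not in self.logged:
            self.logged.add(key)
            self.undo_log.append((key, self.heap[key]))
        self.heap[key] = value

    def commit(self):
        self.undo_log.clear()
        self.logged.clear()

    def abort(self):
        # Restore old values in reverse order of logging.
        for key, old in reversed(self.undo_log):
            self.heap[key] = old
        self.undo_log.clear()
        self.logged.clear()
```

In this sketch, repeated writes to the same location inside `DirectTx` add a single undo-log entry, which is the kind of duplicate that the abstract's runtime filtering is meant to catch when static analysis misses it.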
Optimizing memory transactions
PLDI '06: Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation