Abstract
Concurrent data-structures, such as stacks, queues, and deques, often implicitly enforce a total order over elements in their underlying memory layout. However, much of this order is unnecessary: linearizability only requires that elements are ordered if the insert methods ran in sequence. We propose a new approach which uses timestamping to avoid unnecessary ordering. Pairs of elements can be left unordered if their associated insert operations ran concurrently, and order imposed as necessary at the eventual removal.
We realise our approach in a new non-blocking data-structure, the TS (timestamped) stack. Using the same approach, we can define corresponding queue and deque data-structures. In experiments on x86, the TS stack outperforms and outscales all its competitors -- for example, it outperforms the elimination-backoff stack by factor of two. In our approach, more concurrency translates into less ordering, giving less-contended removal and thus higher performance and scalability. Despite this, the TS stack is linearizable with respect to stack semantics.
The weak internal ordering in the TS stack presents a challenge when establishing linearizability: standard techniques such as linearization points work well when there exists a total internal order. We present a new stack theorem, mechanised in Isabelle, which characterises the orderings sufficient to establish stack semantics. By applying our stack theorem, we show that the TS stack is indeed linearizable. Our theorem constitutes a new, generic proof technique for concurrent stacks, and it paves the way for future weakly ordered data-structure designs.
Supplemental Material
Available for Download
A Scalable, Correct Time-Stamped Stack We provide an appendix and two artifacts: * A proof in Isabelle/HOL of the Stack Theorem (Theorem 1 in our paper). * The TS stack, a concurrent stack written in C, along with implementations of several other algorithms for comparison purposes.
- Y. Afek and A. Morrison. Fast concurrent queues for x86 processors. In phPPoPP. ACM, 2013. Google Scholar
Digital Library
- H. Attiya, R. Guerraoui, D. Hendler, P. Kuznetsov, M. Michael, and M. Vechev. Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated. In phPOPL, 2011. Google Scholar
Digital Library
- G. Bar-Nissan, D. Hendler, and A. Suissa. A dynamic elimination-combining stack algorithm. In phOPODIS, 2011. Google Scholar
Digital Library
- M. Batty, M. Dodds, and A. Gotsman. Library abstraction for C/CGoogle Scholar
- concurrency. In phPOPL, 2013.Google Scholar
- }scalComputational Systems Group, University of Salzburg. Scal framework. URL http://scal.cs.uni-salzburg.at.Google Scholar
- M. Gorelik and D. Hendler. Brief announcement: an asymmetric flat-combining based queue algorithm. In phPODC, 2013. Google Scholar
Digital Library
- A. Haas, C. Kirsch, M. Lippautz, and H. Payer. How FIFO is your concurrent FIFO queue? In phRACES. ACM, 2012. Google Scholar
Digital Library
- A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova. Distributed queues in shared memory--multicore performance and scalability through quantitative relaxation. In phCF. ACM, 2013. Google Scholar
Digital Library
- D. Hendler, N. Shavit, and L. Yerushalmi. A scalable lock-free stack algorithm. In phSPAA. ACM, 2004. Google Scholar
Digital Library
- D. Hendler, I. Incze, N. Shavit, and M. Tzafrir. Flat combining and the synchronization-parallelism tradeoff. In phSPAA, 2010. Google Scholar
Digital Library
- Henzinger, Payer, and Sezgin}Henzinger13bT. Henzinger, H. Payer, and A. Sezgin. Replacing competition with cooperation to achieve scalable lock-free FIFO queues. Technical Report IST-2013--124-v1Google Scholar
- 1, IST Austria, 2013\natexlaba.Google Scholar
- Henzinger, Sezgin, and Vafeiadis}Henzinger13T. A. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented linearizability proofs. In phCONCUR, 2013\natexlabb. Google Scholar
Digital Library
- M. Herlihy and N. Shavit. phThe Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., 2008. Google Scholar
Digital Library
- M. Herlihy and J. Wing. Linearizability: a correctness condition for concurrent objects. phTOPLAS, 12 (3), 1990. Google Scholar
Digital Library
- M. Hoffman, O. Shalev, and N. Shavit. The baskets queue. In phOPODIS. Springer, 2007. Google Scholar
Digital Library
- Intel. Intel 64 and ia-32 architectures software developer's manual, volume 3b: System programming guide, part 2, 2013. URL http://download.intel.com/products/processor/manual/253669.pdf.Google Scholar
- L. Lamport. Time, clocks, and the ordering of events in a distributed system. phCommunications ACM, 21, July 1978. Google Scholar
Digital Library
- M. Michael and M. Scott. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms. In phPODC. ACM, 1996. Google Scholar
Digital Library
- H. H. Nguyen and M. Rinard. Detecting and eliminating memory leaks using cyclic memory allocation. In phISMM. ACM, 2007. Google Scholar
Digital Library
- W. Ruan, Y. Liu, and M. Spear. Boosting timestamp-based transactional memory by exploiting hardware cycle counters. In phTRANSACT, 2013. Google Scholar
Digital Library
- P. Sewell, S. Sarkar, S. Owens, F. Z. Nardelli, and M. O. Myreen. x86-TSO: a rigorous and usable programmer's model for x86 multiprocessors. phCommun. ACM, 53 (7), 2010. Google Scholar
Digital Library
- R. Treiber. Systems programming: Coping with parallelism. Technical Report RJ5118, IBM Almaden Research Center, April 1986.Google Scholar
Index Terms
A Scalable, Correct Time-Stamped Stack
Recommendations
A scalable lock-free stack algorithm
SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architecturesThe literature describes two high performance concurrent stack algorithms based on combining funnels and elimination trees. Unfortunately, the funnels are linearizable but blocking, and the elimination trees are non-blocking but not linearizable. ...
Linearizability: a correctness condition for concurrent objects
A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits ...
A Scalable, Correct Time-Stamped Stack
POPL '15: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesConcurrent data-structures, such as stacks, queues, and deques, often implicitly enforce a total order over elements in their underlying memory layout. However, much of this order is unnecessary: linearizability only requires that elements are ordered ...







Comments