Abstract
We propose a design for a fine-grained lock-based skiplist optimized for Graphics Processing Units (GPUs). While GPUs are often used to accelerate streaming parallel computations, efficiently offloading concurrent computations with irregular data access and fine-grained synchronization remains a significant challenge. Concurrent data structures such as skiplists, which are widely used in general-purpose computations, are natural building blocks for such computations. Our design uses array-based nodes that are accessed and updated by warp-cooperative functions, taking advantage of the fact that GPUs are most efficient when memory accesses are coalesced and execution divergence is minimized. We have implemented the proposed design, and measurements show a speedup of up to 2.6x over existing skiplist designs for the GPU.
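To make the idea of a warp-cooperative access to an array-based node more concrete, the following is a minimal host-side sketch, not the authors' implementation. It assumes a node holds one sorted chunk of up to 32 keys, one slot per lane of a 32-thread warp, and it emulates the warp sequentially on the CPU. The names `ChunkNode`, `EMPTY_KEY`, `emulated_ballot`, `cooperative_lower_bound`, and `make_node` are hypothetical, and the node layout is an assumption, since the abstract does not specify it. On an actual GPU the per-lane comparison would run in lockstep, and the ballot and first-set-bit steps would likely map to CUDA intrinsics such as `__ballot_sync` and `__ffs`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <limits>

// Assumption: one node stores a sorted chunk of keys, one slot per warp lane.
constexpr int WARP_SIZE = 32;
// Assumption: unused slots hold a sentinel larger than every real key.
constexpr std::uint32_t EMPTY_KEY = std::numeric_limits<std::uint32_t>::max();

struct ChunkNode {
    std::array<std::uint32_t, WARP_SIZE> keys;
};

// CPU stand-in for a warp-wide ballot: evaluates the predicate once per lane
// and packs the results into a 32-bit mask (bit i corresponds to lane i).
template <typename Pred>
std::uint32_t emulated_ballot(Pred pred) {
    std::uint32_t mask = 0;
    for (int lane = 0; lane < WARP_SIZE; ++lane) {
        if (pred(lane)) {
            mask |= (1u << lane);
        }
    }
    return mask;
}

// Each lane compares its own slot against the target; the lowest lane whose
// key is >= target gives the position of the target within the node.
// Returns WARP_SIZE if every slot holds a key smaller than the target.
int cooperative_lower_bound(const ChunkNode& node, std::uint32_t target) {
    std::uint32_t mask = emulated_ballot(
        [&](int lane) { return node.keys[lane] >= target; });
    if (mask == 0) {
        return WARP_SIZE;
    }
    int lane = 0;
    while ((mask & 1u) == 0) {  // find the lowest set bit
        mask >>= 1;
        ++lane;
    }
    return lane;
}

// Helper for building a node from already sorted keys (all below EMPTY_KEY).
ChunkNode make_node(std::initializer_list<std::uint32_t> sorted_keys) {
    assert(sorted_keys.size() <= static_cast<std::size_t>(WARP_SIZE));
    ChunkNode node;
    node.keys.fill(EMPTY_KEY);
    std::copy(sorted_keys.begin(), sorted_keys.end(), node.keys.begin());
    return node;
}
```

For a node built with `make_node({3, 7, 11, 20})`, calling `cooperative_lower_bound(node, 8)` yields slot 2, the position of key 11. Because the 32 keys sit contiguously in one array, a real warp could load them with a single coalesced access, and every lane executes the same comparison, which avoids divergence.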
POSTER: A GPU-Friendly Skiplist Algorithm
PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming