ABSTRACT
Today, Graphics Processing Units (GPUs) are a largely underexploited resource on existing desktops and a possible cost-effective enhancement to high-performance systems. To date, most applications that exploit GPUs have been specialized scientific applications. Little attention has been paid to harnessing these highly parallel devices to support more generic functionality at the operating system or middleware level. This study starts from the hypothesis that generic middleware-level techniques that improve distributed system reliability or performance (such as content addressing, erasure coding, or data similarity detection) can be significantly accelerated using GPU support.
We take a first step towards validating this hypothesis, focusing on distributed storage systems. As a proof of concept, we design StoreGPU, a library that accelerates a number of hashing-based primitives popular in distributed storage system implementations. Our evaluation shows that StoreGPU enables up to eightfold performance gains both on synthetic benchmarks and on a high-level application: online similarity detection between large data files.
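To make the hashing-based primitives mentioned above more concrete, the following is a minimal CPU-only sketch of block-level content hashing and hash-based similarity detection. It is not the StoreGPU API: the function names `block_hashes` and `similarity`, the 4 KiB fixed block size, and the choice of SHA-1 are all assumptions made for illustration. The point it shows is that each block is hashed independently of the others, which is the kind of data parallelism a GPU could plausibly exploit.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """Split data into fixed-size blocks and hash each block independently.

    Hypothetical helper: the block size and SHA-1 are illustrative choices.
    """
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(a: bytes, b: bytes, block_size: int = 4096) -> float:
    """Return the fraction of b's blocks whose hash also occurs among a's blocks."""
    known = set(block_hashes(a, block_size))
    hashes_b = block_hashes(b, block_size)
    if not hashes_b:
        return 0.0
    shared = sum(1 for h in hashes_b if h in known)
    return shared / len(hashes_b)


# Four distinct 4 KiB blocks, then a second version with the last two changed.
old = b"".join(bytes([i]) * 4096 for i in range(4))
new = old[:8192] + b"\xff" * 8192

print(similarity(old, new))  # 0.5: two of the four blocks are unchanged
```

Because no block hash depends on another, the list comprehension in `block_hashes` could in principle be replaced by one parallel task per block; how StoreGPU actually maps this work onto the GPU is not described in this abstract.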
Index Terms
StoreGPU: exploiting graphics processing units to accelerate distributed storage systems