DOI: 10.1145/1383422.1383443
research-article

StoreGPU: exploiting graphics processing units to accelerate distributed storage systems

Published: 23 June 2008

ABSTRACT

Today, Graphics Processing Units (GPUs) are a largely underexploited resource on existing desktops and a possible cost-effective enhancement to high-performance systems. To date, most applications that exploit GPUs are specialized scientific applications. Little attention has been paid to harnessing these highly parallel devices to support more generic functionality at the operating system or middleware level. This study starts from the hypothesis that generic middleware-level techniques that improve distributed system reliability or performance (such as content addressing, erasure coding, or data similarity detection) can be significantly accelerated using GPU support.

We take a first step towards validating this hypothesis, focusing on distributed storage systems. As a proof of concept, we design StoreGPU, a library that accelerates a number of hashing-based primitives popular in distributed storage system implementations. Our evaluation shows that StoreGPU enables up to eight-fold performance gains on synthetic benchmarks as well as on a high-level application: the online similarity detection between large data files.
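To make the hashing-based primitives concrete, the following is a minimal CPU-only sketch of block-level hashing and a simple similarity measure built on it. It is not StoreGPU's API: the abstract does not name the hash function, the chunking scheme, or the similarity metric, so the fixed 4 KiB block size, the use of SHA-1, and the function names `block_hashes` and `similarity` are all assumptions made for illustration. What the sketch does show is that each block is hashed independently of the others, which is the kind of data-parallel work a GPU could plausibly take over.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """Hash each fixed-size block of `data` independently.

    Assumption: fixed-size blocks and SHA-1; the abstract specifies neither.
    """
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(old: bytes, new: bytes, block_size: int = 4096) -> float:
    """Return the fraction of blocks in `new` whose hash also occurs in `old`."""
    new_hashes = block_hashes(new, block_size)
    if not new_hashes:
        return 0.0
    known = set(block_hashes(old, block_size))
    shared = sum(1 for h in new_hashes if h in known)
    return shared / len(new_hashes)


if __name__ == "__main__":
    v1 = bytes(range(256)) * 64        # 16 KiB, i.e. four 4 KiB blocks
    v2 = v1[:8192] + b"\x00" * 8192    # first two blocks kept, last two replaced
    print(similarity(v1, v2))
```

In a storage system, a high similarity score would mean that only the blocks with unseen hashes need to be transferred or stored.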


Published in

HPDC '08: Proceedings of the 17th international symposium on High performance distributed computing
June 2008
252 pages
ISBN: 9781595939975
DOI: 10.1145/1383422

Copyright © 2008 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 166 of 966 submissions (17%)
