research-article

GPUfs: integrating a file system with GPUs

Published: 16 March 2013
Abstract

GPU hardware is becoming increasingly general purpose, quickly outgrowing the traditional but constrained GPU-as-coprocessor programming model. To make GPUs easier to program and easier to integrate with existing systems, we propose making the host's file system directly accessible from GPU code. GPUfs provides a POSIX-like API for GPU programs, exploits GPU parallelism for efficiency, and optimizes GPU file access by extending the buffer cache into GPU memory. Our experiments, based on a set of real benchmarks adapted to use our file system, demonstrate the feasibility and benefits of our approach. For example, we demonstrate a simple self-contained GPU program that searches for a set of strings in the entire tree of Linux kernel source files over seven times faster than an eight-core CPU run.



Published in

ACM SIGPLAN Notices, Volume 48, Issue 4
ASPLOS '13
April 2013
540 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2499368

ASPLOS '13: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
March 2013
574 pages
ISBN: 9781450318709
DOI: 10.1145/2451116

Copyright © 2013 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

