research-article

Firepile: run-time compilation for GPUs in Scala

Published: 22 October 2011

Abstract

Recent advances have enabled GPUs to be used as general-purpose parallel processors on commodity hardware at little cost. However, the ability to program these devices has not kept up with their performance. The GPU programming model imposes a number of restrictions that make these devices difficult to program. For example, software running on the GPU cannot perform dynamic memory allocation, so the programmer must pre-allocate all memory the GPU might use. To achieve good performance, GPU programmers must also be aware of how data moves between host and GPU memory and between the different levels of the GPU memory hierarchy.

We describe Firepile, a library for GPU programming in Scala. The library enables a subset of Scala to be executed on the GPU. Code trees can be created from run-time function values, which can then be analyzed and transformed to generate GPU code. A key property of this mechanism is that it is modular: unlike with other meta-programming constructs, the use of code trees need not be exposed in the library interface. Code trees are general and can be used by library writers in other application domains. Our experiments show Firepile users can achieve performance comparable to C code targeted to the GPU with shorter, simpler, and easier-to-understand code.
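
To make the idea of turning a code tree into GPU code concrete, here is a minimal sketch in Scala. It is not Firepile's API: the `Expr` tree, the `KernelGen` object, and all method names are hypothetical and chosen only for illustration. The tree is also built by hand here, whereas the abstract describes deriving code trees from run-time function values. The sketch walks a small expression tree and emits the source text of an element-wise OpenCL C kernel.

```scala
// Hypothetical mini code tree. None of these names come from Firepile.
sealed trait Expr
final case class Arg(name: String) extends Expr     // input array, read at the work-item index
final case class Const(value: Float) extends Expr   // floating-point literal
final case class Add(l: Expr, r: Expr) extends Expr
final case class Mul(l: Expr, r: Expr) extends Expr

object KernelGen {
  // Analysis step: collect the input array names in first-use order.
  def inputs(e: Expr): List[String] = e match {
    case Arg(n)    => List(n)
    case Const(_)  => Nil
    case Add(l, r) => (inputs(l) ++ inputs(r)).distinct
    case Mul(l, r) => (inputs(l) ++ inputs(r)).distinct
  }

  // Translation step: render the tree as an OpenCL C expression.
  def emit(e: Expr): String = e match {
    case Arg(n)    => s"$n[i]"
    case Const(v)  => s"${v}f"
    case Add(l, r) => s"(${emit(l)} + ${emit(r)})"
    case Mul(l, r) => s"(${emit(l)} * ${emit(r)})"
  }

  // Wrap the expression in a kernel that writes one output element per work-item.
  // Assumes no input array is named "i" or "out".
  def kernel(name: String, body: Expr): String = {
    val params =
      inputs(body).map(n => s"__global const float* $n") :+ "__global float* out"
    s"""__kernel void $name(${params.mkString(", ")}) {
       |  int i = get_global_id(0);
       |  out[i] = ${emit(body)};
       |}""".stripMargin
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // 2 * x + y, applied element-wise.
    val saxpy = Add(Mul(Const(2.0f), Arg("x")), Arg("y"))
    println(KernelGen.kernel("saxpy", saxpy))
  }
}
```

The generated string would still have to be compiled and launched through an OpenCL binding, which this sketch leaves out. Keeping the tree type and the generator inside the library is one way to obtain the modularity the abstract mentions: callers would only ever see ordinary Scala values.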



Published in

ACM SIGPLAN Notices, Volume 47, Issue 3
GPCE '11
March 2012
179 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2189751

GPCE '11: Proceedings of the 10th ACM international conference on Generative programming and component engineering
October 2011
194 pages
ISBN: 9781450306898
DOI: 10.1145/2047862

Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
