Abstract
We describe Nikola, a first-order language of array computations embedded in Haskell that compiles to GPUs via CUDA using a new set of type-directed techniques to support reusable computations. Nikola automatically handles a range of low-level details for Haskell programmers, such as marshaling data to and from the GPU, size inference for buffers, memory management, and automatic loop parallelization. Additionally, Nikola supports both compile-time and run-time code generation, making it possible for programmers to choose when and where to specialize embedded programs.
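To make the idea of a first-order language embedded in Haskell and compiled to CUDA more concrete, the following is a minimal sketch of the general technique (a deep embedding plus a code generator). It is not Nikola's API: the names `Exp`, `reify`, `eval`, `toC`, and `mapKernel` are hypothetical, the language is restricted to a single `Float` array argument, and the output is only the text of an elementwise CUDA kernel. Marshaling, size inference, and memory management, which the abstract says Nikola automates, are left out.

```haskell
-- Hypothetical sketch of a deep embedding; none of these names come from Nikola.

-- First-order scalar expressions over one implicit array element.
data Exp
  = Var            -- the current input element
  | Lit Float      -- a floating-point constant
  | Add Exp Exp
  | Mul Exp Exp
  deriving (Show, Eq)

-- A Num instance lets ordinary Haskell arithmetic build Exp trees.
instance Num Exp where
  (+)         = Add
  (*)         = Mul
  fromInteger = Lit . fromInteger
  negate e    = Mul (Lit (-1)) e
  abs         = error "abs: not supported in this sketch"
  signum      = error "signum: not supported in this sketch"

-- Reify a Haskell function by applying it to the symbolic variable.
reify :: (Exp -> Exp) -> Exp
reify f = f Var

-- Reference interpreter, useful for checking generated code on the host.
eval :: Float -> Exp -> Float
eval x Var       = x
eval _ (Lit c)   = c
eval x (Add a b) = eval x a + eval x b
eval x (Mul a b) = eval x a * eval x b

-- Render an expression as a C expression reading from xs[i].
toC :: Exp -> String
toC Var       = "xs[i]"
toC (Lit c)   = show c ++ "f"
toC (Add a b) = "(" ++ toC a ++ " + " ++ toC b ++ ")"
toC (Mul a b) = "(" ++ toC a ++ " * " ++ toC b ++ ")"

-- Emit the source text of a CUDA kernel that maps f over an array,
-- with one thread per element and a bounds check.
mapKernel :: String -> (Exp -> Exp) -> String
mapKernel name f = unlines
  [ "extern \"C\" __global__ void " ++ name
      ++ "(const float *xs, float *ys, int n)"
  , "{"
  , "    int i = blockIdx.x * blockDim.x + threadIdx.x;"
  , "    if (i < n)"
  , "        ys[i] = " ++ toC (reify f) ++ ";"
  , "}"
  ]
```

In GHCi, `putStr (mapKernel "axpb" (\x -> 2 * x + 1))` prints a kernel whose body assigns `((2.0f * xs[i]) + 1.0f)` to `ys[i]`. The same function could be reified either while the Haskell program is being compiled or while it is running, which is the choice between compile-time and run-time code generation that the abstract refers to.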
Nikola: embedding compiled GPU functions in Haskell
Published in Haskell '10: Proceedings of the third ACM Haskell symposium on Haskell