research-article

Efficient parallel stencil convolution in Haskell

Published: 22 September 2011

Abstract

Stencil convolution is a fundamental building block of many scientific and image processing algorithms. We present a declarative approach to writing such convolutions in Haskell that is both efficient at runtime and implicitly parallel. To achieve this we extend our prior work on the Repa array library with two new features: partitioned and cursored arrays. Combined with careful management of the interaction between GHC and its back-end code generator LLVM, we achieve performance comparable to the standard OpenCV library.

        • Published in

          ACM SIGPLAN Notices, Volume 46, Issue 12
          Haskell '11
          December 2011
          129 pages
          ISSN:0362-1340
          EISSN:1558-1160
          DOI:10.1145/2096148
          • ACM Conferences
            Haskell '11: Proceedings of the 4th ACM symposium on Haskell
            September 2011
            136 pages
            ISBN:9781450308601
            DOI:10.1145/2034675

          Copyright © 2011 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
