Abstract
Existing approaches to array fusion can deal with straight-line producer-consumer pipelines, but cannot fuse branching data flows where a generated array is consumed by several different consumers. Branching data flows are common and natural to write, but a lack of fusion leads to the creation of an intermediate array at every branch point. We present a new array fusion system that handles branches, based on Waters's series expression framework, but extended to work in a functional setting. Our system also solves a related problem in stream fusion, namely the introduction of duplicate loop counters. We demonstrate speedup over existing fusion systems for several key examples.
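To make the notion of a branching data flow concrete, here is a minimal sketch that is not taken from the paper. It assumes plain Haskell lists as a stand-in for arrays, and the names `branching` and `fused` are hypothetical. In `branching`, the intermediate `ys` is a branch point because both a filter and a fold consume it. `fused` is a hand-written single traversal of the kind a branch-aware fusion system would aim to produce; it is not the output of any particular system.

```haskell
import Data.List (foldl')

-- Branching data flow: ys is consumed by two different consumers,
-- so without fusion it has to exist as an intermediate structure.
branching :: [Int] -> ([Int], Int)
branching xs =
  let ys = map (+ 1) xs        -- producer
      zs = filter (> 0) ys     -- consumer 1
      n  = foldl' max 0 ys     -- consumer 2
  in  (zs, n)

-- Hand-fused equivalent: one pass over xs, no intermediate ys.
-- The filtered elements are accumulated in reverse and flipped at the end.
fused :: [Int] -> ([Int], Int)
fused xs = (reverse revZs, n)
  where
    (revZs, n) = foldl' step ([], 0) xs
    step (acc, m) x =
      let y    = x + 1
          acc' = if y > 0 then y : acc else acc
          m'   = max m y
      in  acc' `seq` m' `seq` (acc', m')

main :: IO ()
main = do
  let xs = [-3, 0, 4, -1, 2]
  print (branching xs)   -- ([1,5,3],5)
  print (fused xs)       -- ([1,5,3],5)
```

Both functions compute the same result. The difference is that `fused` visits each input element once, with a single loop state, where the unfused version materialises `ys` and walks it twice.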
Data flow fusion with series expressions in Haskell
Haskell '13: Proceedings of the 2013 ACM SIGPLAN Symposium on Haskell