Abstract
Parallel reduction is a major component of parallel programming and widely used for summarization and aggregation. It is not well understood, however, what sorts of nontrivial summarizations can be implemented as parallel reductions. This paper develops a calculus named λas, a simply typed lambda calculus with algebraic simplification. This calculus provides a foundation for studying parallelization of complex reductions by equational reasoning. Its key feature is δ abstraction. A δ abstraction is observationally equivalent to the standard λ abstraction, but its body is simplified before the arrival of its arguments by using algebraic properties such as associativity and commutativity. In addition, the type system of λas guarantees that simplifications due to δ abstractions do not lead to serious overheads. The usefulness of λas is demonstrated on examples of developing complex parallel reductions, including those containing more than one reduction operator, loops with jumps, prefix-sum patterns, and even tree manipulations.
Supplemental Material
- Guy E. Blelloch. 1993. Prefix Sums and Their Applications. In Synthesis of Parallel Algorithms, John H. Reif (Ed.). Morgan Kaufmann Publishers, Chapter 1. Google Scholar
Digital Library
- Peter Buneman, Gao Cong, Wenfei Fan, and Anastasios Kementsietsidis. 2006. Using Partial Evaluation in Distributed Query Evaluation. In Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 12-15, 2006. ACM, 211–222. Google Scholar
Digital Library
- David Callahan. 1992. Recognizing and Parallelizing Bounded Recurrences. In Languages and Compilers for Parallel Computing, Fourth International Workshop, Santa Clara, California, USA, August 7-9, 1991, Proceedings (Lecture Notes in Computer Science), Vol. 589. Springer, 169–185. Google Scholar
Digital Library
- David Castro, Kevin Hammond, and Susmit Sarkar. 2016. Farms, pipes, streams and reforestation: reasoning about structured parallel processes using types and hylomorphisms. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016. ACM, 4–17. Google Scholar
Digital Library
- David Castro, Kevin Hammond, Susmit Sarkar, and Yasir Alguwaifli. 2018. Automatically deriving cost models for structured parallel processes using hylomorphisms. Future Generation Comp. Syst. 79 (2018), 653–668. Google Scholar
Digital Library
- Yun-Yan Chi and Shin-Cheng Mu. 2011. Constructing List Homomorphisms from Proofs. In Programming Languages and Systems - 9th Asian Symposium, APLAS 2011, Kenting, Taiwan, December 5-7, 2011. Proceedings (Lecture Notes in Computer Science), Vol. 7078. Springer, 74–88. Google Scholar
Digital Library
- Wei-Ngan Chin, Akihiko Takano, and Zhenjiang Hu. 1998. Parallelization via Context Preservation. In Proceedings of the 1998 International Conference on Computer Languages, ICCL ’98, May 14-16, 1998, Chicago, IL, USA. IEEE Computer Society, 153–162. Google Scholar
Digital Library
- Murray I. Cole. 1989. Algorithmic Skeletons: Structural Management of Parallel Computation. MIT Press. Google Scholar
Digital Library
- Gao Cong, Wenfei Fan, and Anastasios Kementsietsidis. 2007. Distributed query evaluation with performance guarantees. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12-14, 2007. ACM, 509–520. Google Scholar
Digital Library
- Gao Cong, Wenfei Fan, Anastasios Kementsietsidis, Jianzhong Li, and Xianmin Liu. 2012. Partial Evaluation for Distributed XPath Query Processing and Beyond. ACM Trans. Database Syst. 37, 4 (2012), 32:1–32:43. Google Scholar
Digital Library
- Charles Consel and Olivier Danvy. 1992. Partial Evaluation in Parallel. Lisp and Symbolic Computation 5, 4 (1992), 327–342. Google Scholar
Digital Library
- Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified Data Processing on Large Clusters. In 6th Symposium on Operating System Design and Implementation (OSDI 2004), December 6-8, 2004, San Francisco, California, USA. 137–150. Google Scholar
Digital Library
- Steven J. Deitz, David Callahan, Bradford L. Chamberlain, and Lawrence Snyder. 2006. Global-view abstractions for user-defined reductions and scans. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2006, New York, New York, USA, March 29-31, 2006. ACM, 40–47. Google Scholar
Digital Library
- Kento Emoto, Sebastian Fischer, and Zhenjiang Hu. 2012. Filter-embedding semiring fusion for programming with MapReduce. Formal Asp. Comput. 24, 4-6 (2012), 623–645.Google Scholar
Cross Ref
- Kento Emoto, Zhenjiang Hu, Kazuhiko Kakehi, Kiminori Matsuzaki, and Masato Takeichi. 2010. Generators-of-Generators Library with Optimization Capabilities in Fortress. In Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31 - September 3, 2010, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 6272. Springer, 26–37. Google Scholar
Digital Library
- Azadeh Farzan and Victor Nicolet. 2017. Synthesis of divide and conquer parallelism for loops. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 540–555. Google Scholar
Digital Library
- Grigory Fedyukovich, Maaz Bin Safeer Ahmad, and Rastislav Bodík. 2017. Gradual synthesis for static parallelization of single-pass array-processing programs. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 572–585. Google Scholar
Digital Library
- Allan L. Fisher and Anwar M. Ghuloum. 1994. Parallelizing Complex Scans and Reductions. In Proceedings of the ACM SIGPLAN’94 Conference on Programming Language Design and Implementation (PLDI), Orlando, Florida, June 20-24, 1994. ACM, 135–146. Google Scholar
Digital Library
- Matthew Fluet and Riccardo Pucella. 2006. Phantom types and subtyping. J. Funct. Program. 16, 6 (2006), 751–791. Google Scholar
Digital Library
- Matthew Fluet, Mike Rainey, John H. Reppy, and Adam Shaw. 2008. Implicitly threaded parallelism in Manticore. J. Funct. Program. 20, 5-6 (2008), 537–576. Google Scholar
Digital Library
- Sergei Gorlatch. 1999. Extracting and Implementing List Homomorphisms in Parallel Program Development. Science of Computer Programming 33, 1 (1999), 1–27. Google Scholar
Digital Library
- Troels Henriksen, Niels G. W. Serup, Martin Elsman, Fritz Henglein, and Cosmin E. Oancea. 2017. Futhark: purely functional GP U-programming with nested parallelism and in-place array updates. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 556–571. Google Scholar
Digital Library
- Zhenjiang Hu, Hideya Iwasaki, and Masato Takechi. 1997. Formal derivation of efficient parallel programs by construction of list homomorphisms. ACM Transactions on Programming Langauges and Systems 19, 3 (1997), 444–461. Google Scholar
Digital Library
- Zhenjiang Hu, Masato Takeichi, and Wei-Ngan Chin. 1998. Parallelization in Calculational Forms. In POPL ’98: Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 19-21, 1998, San Diego, CA, USA. ACM, 316–328. Google Scholar
Digital Library
- Peng Jiang, Linchuan Chen, and Gagan Agrawal. 2018. Revealing parallel scans and reductions in recurrences through function reconstruction. In Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018, Limassol, Cyprus, November 01-04, 2018. ACM, 10:1–10:13. Google Scholar
Digital Library
- Neil D. Jones. 1996. An Introduction to Partial Evaluation. Comput. Surveys 28, 3 (1996), 480–503. Google Scholar
Digital Library
- Gabriele Keller, Manuel M. T. Chakravarty, Roman Leshchinskiy, Simon L. Peyton Jones, and Ben Lippmeier. 2010. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceeding of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP 2010, Baltimore, Maryland, USA, September 27-29, 2010. ACM, 261–272. Google Scholar
Digital Library
- Naoki Kobayashi, Kazutaka Matsuda, Ayumi Shinohara, and Kazuya Yaguchi. 2012. Functional programs as compressed data. Higher-Order and Symbolic Computation 25, 1 (2012), 39–84. Google Scholar
Digital Library
- Richard E. Ladner and Michael J. Fischer. 1980. Parallel Prefix Computation. J. ACM 27, 4 (1980), 831–838. Google Scholar
Digital Library
- John Launchbury. 1993. A Natural Semantics for Lazy Evaluation. In POPL ’93: Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, South Carolina, USA. ACM, 144–154. Google Scholar
Digital Library
- Jean-Jacques Lévy. 1976. An Algebraic Interpretation of the λβK-Calculus; and an Application of a Labelled λ-Calculus. Theor. Comput. Sci. 2, 1 (1976), 97–114.Google Scholar
Cross Ref
- Simon Marlow, Patrick Maier, Hans-Wolfgang Loidl, Mustafa Aswad, and Philip W. Trinder. 2010. Seq no more: better strategies for parallel Haskell. In Proceedings of the 3rd ACM SIGPLAN Symposium on Haskell, Haskell 2010, Baltimore, MD, USA, 30 September 2010. ACM, 91–102. Google Scholar
Digital Library
- Kiminori Matsuzaki, Zhenjiang Hu, Kazuhiko Kakehi, and Masato Takeichi. 2005. Systematic Derivation of Tree Contraction Algorithms. Parallel Processing Letters 15, 3 (2005), 321–336.Google Scholar
Cross Ref
- Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2006. Towards automatic parallelization of tree reductions in dynamic programming. In SPAA 2006: Proceedings of the 18th Annual ACM Symposium on Parallel Algorithms and Architectures, Cambridge, Massachusetts, USA, July 30 - August 2, 2006. ACM, 39–48. Google Scholar
Digital Library
- Yasuhiko Minamide. 1998. A Functional Representation of Data Structures with a Hole. In POPL ’98, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Diego, CA, USA, January 19-21, 1998. ACM, 75–84. Google Scholar
Digital Library
- Akimasa Morihata and Kiminori Matsuzaki. 2010. Automatic Parallelization of Recursive Functions Using Quantifier Elimination. In Functional and Logic Programming, 10th International Symposium, FLOPS 2010, Sendai, Japan, April 19-21, 2010. Proceedings (Lecture Notes in Computer Science), Vol. 6009. Springer, 321–336. Google Scholar
Digital Library
- Akimasa Morihata and Kiminori Matsuzaki. 2011. Balanced trees inhabiting functional parallel programming. In Proceeding of the 16th ACM SIGPLAN International Conference on Functional Programming, ICFP 2011, Tokyo, Japan, September 19-21, 2011. ACM, 117–128. Google Scholar
Digital Library
- Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2009. The Third Homomorphism Theorem on Trees: Downward & Upward Lead to Divide-and-Conquer. In Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2009, Savannah, Georgia, USA, January 21-23, 2009. ACM, 177–185. Google Scholar
Digital Library
- Kazutaka Morita, Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2007. Automatic inversion generates divide-and-conquer parallel programs. In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, San Diego, California, USA, June 10-13, 2007. ACM, 146–155. Google Scholar
Digital Library
- Susumu Nishimura and Atsushi Ohori. 1999. Parallel Functional Programming on Recursively Defined Data via Data-Parallel Recursion. J. Funct. Program. 9, 4 (1999), 427–462. Google Scholar
Digital Library
- Veselin Raychev, Madanlal Musuvathi, and Todd Mytkowicz. 2015. Parallelizing user-defined aggregations using symbolic execution. In Proceedings of the 25th Symposium on Operating Systems Principles, SOSP 2015, Monterey, CA, USA, October 4-7, 2015. ACM, 153–167. Google Scholar
Digital Library
- Shigeyuki Sato and Hideya Iwasaki. 2011. Automatic parallelization via matrix multiplication. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. ACM, 470–479. Google Scholar
Digital Library
- Toshio Suganuma, Hideaki Komatsu, and Toshio Nakatani. 1996. Detection and Global Optimization of Reduction Operations for Distributed Parallel Machines. In ICS ’96: Proceedings of the 1996 International Conference on Supercomputing, May 25-28, 1996, Philadelphia, PA, USA. ACM, 18–25. Google Scholar
Digital Library
- Val Tannen. 1988. Combining Algebra and Higher-Order Types. In Proceedings of the Third Annual Symposium on Logic in Computer Science (LICS ’88), Edinburgh, Scotland, UK, July 5-8, 1988. IEEE Computer Society, 82–90.Google Scholar
Cross Ref
- Val Tannen and Jean H. Gallier. 1991. Polymorphic Rewriting Conserves Algebraic Strong Normalization. Theor. Comput. Sci. 83, 1 (1991), 3–28. Google Scholar
Digital Library
- Kazushige Terui. 2012. Semantic Evaluation, Intersection Types and Complexity of Simply Typed Lambda Calculus. In 23rd International Conference on Rewriting Techniques and Applications (RTA’12) , RTA 2012, May 28 - June 2, 2012, Nagoya, Japan (LIPIcs), Vol. 15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 323–338.Google Scholar
- Dana N. Xu, Siau-Cheng Khoo, and Zhenjiang Hu. 2004. PType System: A Featherweight Parallelizability Detector. In Programming Languages and Systems: Second Asian Symposium, APLAS 2004, Taipei, Taiwan, November 4-6, 2004. Proceedings (Lecture Notes in Computer Science), Vol. 3302. Springer, 197–212.Google Scholar
Cross Ref
Index Terms
Lambda calculus with algebraic simplification for reduction parallelization by equational reasoning
Recommendations
Lambda Abstraction Algebras: Coordinatizing Models of Lambda Calculus
Lambda abstraction algebras are designed to algebraize the untyped lambda calculus in the same way cylindric and polyadic algebras algebraize the first-order logic; they are intended as an alternative to combinatory algebras in this regard. Like ...
Lambda Abstraction Algebras: Coordinatizing Models of Lambda Calculus
Lambda abstraction algebras are designed to algebraize the untyped lambda calculus in the same way cylindric and polyadic algebras algebraize the first-order logic; they are intended as an alternative to combinatory algebras in this regard. Like ...
Nonmodularity Results for Lambda Calculus
The variety (equational class) of lambda abstraction algebras was introduced to algebraize the untyped lambda calculus in the same way cylindric and polyadic algebras algebraize the first-order predicate logic. In this paper we prove that the lattice of ...






Comments