skip to main content
research-article
Open Access

Lambda calculus with algebraic simplification for reduction parallelization by equational reasoning

Published:26 July 2019Publication History
Skip Abstract Section

Abstract

Parallel reduction is a major component of parallel programming and widely used for summarization and aggregation. It is not well understood, however, what sorts of nontrivial summarizations can be implemented as parallel reductions. This paper develops a calculus named λas, a simply typed lambda calculus with algebraic simplification. This calculus provides a foundation for studying parallelization of complex reductions by equational reasoning. Its key feature is δ abstraction. A δ abstraction is observationally equivalent to the standard λ abstraction, but its body is simplified before the arrival of its arguments by using algebraic properties such as associativity and commutativity. In addition, the type system of λas guarantees that simplifications due to δ abstractions do not lead to serious overheads. The usefulness of λas is demonstrated on examples of developing complex parallel reductions, including those containing more than one reduction operator, loops with jumps, prefix-sum patterns, and even tree manipulations.

Skip Supplemental Material Section

Supplemental Material

a80-morihata.webm

References

  1. Guy E. Blelloch. 1993. Prefix Sums and Their Applications. In Synthesis of Parallel Algorithms, John H. Reif (Ed.). Morgan Kaufmann Publishers, Chapter 1. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Peter Buneman, Gao Cong, Wenfei Fan, and Anastasios Kementsietsidis. 2006. Using Partial Evaluation in Distributed Query Evaluation. In Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 12-15, 2006. ACM, 211–222. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. David Callahan. 1992. Recognizing and Parallelizing Bounded Recurrences. In Languages and Compilers for Parallel Computing, Fourth International Workshop, Santa Clara, California, USA, August 7-9, 1991, Proceedings (Lecture Notes in Computer Science), Vol. 589. Springer, 169–185. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. David Castro, Kevin Hammond, and Susmit Sarkar. 2016. Farms, pipes, streams and reforestation: reasoning about structured parallel processes using types and hylomorphisms. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, Nara, Japan, September 18-22, 2016. ACM, 4–17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. David Castro, Kevin Hammond, Susmit Sarkar, and Yasir Alguwaifli. 2018. Automatically deriving cost models for structured parallel processes using hylomorphisms. Future Generation Comp. Syst. 79 (2018), 653–668. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Yun-Yan Chi and Shin-Cheng Mu. 2011. Constructing List Homomorphisms from Proofs. In Programming Languages and Systems - 9th Asian Symposium, APLAS 2011, Kenting, Taiwan, December 5-7, 2011. Proceedings (Lecture Notes in Computer Science), Vol. 7078. Springer, 74–88. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Wei-Ngan Chin, Akihiko Takano, and Zhenjiang Hu. 1998. Parallelization via Context Preservation. In Proceedings of the 1998 International Conference on Computer Languages, ICCL ’98, May 14-16, 1998, Chicago, IL, USA. IEEE Computer Society, 153–162. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Murray I. Cole. 1989. Algorithmic Skeletons: Structural Management of Parallel Computation. MIT Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Gao Cong, Wenfei Fan, and Anastasios Kementsietsidis. 2007. Distributed query evaluation with performance guarantees. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12-14, 2007. ACM, 509–520. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Gao Cong, Wenfei Fan, Anastasios Kementsietsidis, Jianzhong Li, and Xianmin Liu. 2012. Partial Evaluation for Distributed XPath Query Processing and Beyond. ACM Trans. Database Syst. 37, 4 (2012), 32:1–32:43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Charles Consel and Olivier Danvy. 1992. Partial Evaluation in Parallel. Lisp and Symbolic Computation 5, 4 (1992), 327–342. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified Data Processing on Large Clusters. In 6th Symposium on Operating System Design and Implementation (OSDI 2004), December 6-8, 2004, San Francisco, California, USA. 137–150. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Steven J. Deitz, David Callahan, Bradford L. Chamberlain, and Lawrence Snyder. 2006. Global-view abstractions for user-defined reductions and scans. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2006, New York, New York, USA, March 29-31, 2006. ACM, 40–47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Kento Emoto, Sebastian Fischer, and Zhenjiang Hu. 2012. Filter-embedding semiring fusion for programming with MapReduce. Formal Asp. Comput. 24, 4-6 (2012), 623–645.Google ScholarGoogle ScholarCross RefCross Ref
  15. Kento Emoto, Zhenjiang Hu, Kazuhiko Kakehi, Kiminori Matsuzaki, and Masato Takeichi. 2010. Generators-of-Generators Library with Optimization Capabilities in Fortress. In Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31 - September 3, 2010, Proceedings, Part II (Lecture Notes in Computer Science), Vol. 6272. Springer, 26–37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Azadeh Farzan and Victor Nicolet. 2017. Synthesis of divide and conquer parallelism for loops. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 540–555. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Grigory Fedyukovich, Maaz Bin Safeer Ahmad, and Rastislav Bodík. 2017. Gradual synthesis for static parallelization of single-pass array-processing programs. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 572–585. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Allan L. Fisher and Anwar M. Ghuloum. 1994. Parallelizing Complex Scans and Reductions. In Proceedings of the ACM SIGPLAN’94 Conference on Programming Language Design and Implementation (PLDI), Orlando, Florida, June 20-24, 1994. ACM, 135–146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Matthew Fluet and Riccardo Pucella. 2006. Phantom types and subtyping. J. Funct. Program. 16, 6 (2006), 751–791. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Matthew Fluet, Mike Rainey, John H. Reppy, and Adam Shaw. 2008. Implicitly threaded parallelism in Manticore. J. Funct. Program. 20, 5-6 (2008), 537–576. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Sergei Gorlatch. 1999. Extracting and Implementing List Homomorphisms in Parallel Program Development. Science of Computer Programming 33, 1 (1999), 1–27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Troels Henriksen, Niels G. W. Serup, Martin Elsman, Fritz Henglein, and Cosmin E. Oancea. 2017. Futhark: purely functional GP U-programming with nested parallelism and in-place array updates. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. ACM, 556–571. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Zhenjiang Hu, Hideya Iwasaki, and Masato Takechi. 1997. Formal derivation of efficient parallel programs by construction of list homomorphisms. ACM Transactions on Programming Langauges and Systems 19, 3 (1997), 444–461. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Zhenjiang Hu, Masato Takeichi, and Wei-Ngan Chin. 1998. Parallelization in Calculational Forms. In POPL ’98: Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 19-21, 1998, San Diego, CA, USA. ACM, 316–328. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Peng Jiang, Linchuan Chen, and Gagan Agrawal. 2018. Revealing parallel scans and reductions in recurrences through function reconstruction. In Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018, Limassol, Cyprus, November 01-04, 2018. ACM, 10:1–10:13. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Neil D. Jones. 1996. An Introduction to Partial Evaluation. Comput. Surveys 28, 3 (1996), 480–503. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Gabriele Keller, Manuel M. T. Chakravarty, Roman Leshchinskiy, Simon L. Peyton Jones, and Ben Lippmeier. 2010. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceeding of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP 2010, Baltimore, Maryland, USA, September 27-29, 2010. ACM, 261–272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Naoki Kobayashi, Kazutaka Matsuda, Ayumi Shinohara, and Kazuya Yaguchi. 2012. Functional programs as compressed data. Higher-Order and Symbolic Computation 25, 1 (2012), 39–84. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Richard E. Ladner and Michael J. Fischer. 1980. Parallel Prefix Computation. J. ACM 27, 4 (1980), 831–838. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. John Launchbury. 1993. A Natural Semantics for Lazy Evaluation. In POPL ’93: Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, South Carolina, USA. ACM, 144–154. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Jean-Jacques Lévy. 1976. An Algebraic Interpretation of the λβK-Calculus; and an Application of a Labelled λ-Calculus. Theor. Comput. Sci. 2, 1 (1976), 97–114.Google ScholarGoogle ScholarCross RefCross Ref
  32. Simon Marlow, Patrick Maier, Hans-Wolfgang Loidl, Mustafa Aswad, and Philip W. Trinder. 2010. Seq no more: better strategies for parallel Haskell. In Proceedings of the 3rd ACM SIGPLAN Symposium on Haskell, Haskell 2010, Baltimore, MD, USA, 30 September 2010. ACM, 91–102. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Kiminori Matsuzaki, Zhenjiang Hu, Kazuhiko Kakehi, and Masato Takeichi. 2005. Systematic Derivation of Tree Contraction Algorithms. Parallel Processing Letters 15, 3 (2005), 321–336.Google ScholarGoogle ScholarCross RefCross Ref
  34. Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2006. Towards automatic parallelization of tree reductions in dynamic programming. In SPAA 2006: Proceedings of the 18th Annual ACM Symposium on Parallel Algorithms and Architectures, Cambridge, Massachusetts, USA, July 30 - August 2, 2006. ACM, 39–48. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Yasuhiko Minamide. 1998. A Functional Representation of Data Structures with a Hole. In POPL ’98, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Diego, CA, USA, January 19-21, 1998. ACM, 75–84. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Akimasa Morihata and Kiminori Matsuzaki. 2010. Automatic Parallelization of Recursive Functions Using Quantifier Elimination. In Functional and Logic Programming, 10th International Symposium, FLOPS 2010, Sendai, Japan, April 19-21, 2010. Proceedings (Lecture Notes in Computer Science), Vol. 6009. Springer, 321–336. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Akimasa Morihata and Kiminori Matsuzaki. 2011. Balanced trees inhabiting functional parallel programming. In Proceeding of the 16th ACM SIGPLAN International Conference on Functional Programming, ICFP 2011, Tokyo, Japan, September 19-21, 2011. ACM, 117–128. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2009. The Third Homomorphism Theorem on Trees: Downward & Upward Lead to Divide-and-Conquer. In Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2009, Savannah, Georgia, USA, January 21-23, 2009. ACM, 177–185. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Kazutaka Morita, Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi. 2007. Automatic inversion generates divide-and-conquer parallel programs. In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, San Diego, California, USA, June 10-13, 2007. ACM, 146–155. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Susumu Nishimura and Atsushi Ohori. 1999. Parallel Functional Programming on Recursively Defined Data via Data-Parallel Recursion. J. Funct. Program. 9, 4 (1999), 427–462. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Veselin Raychev, Madanlal Musuvathi, and Todd Mytkowicz. 2015. Parallelizing user-defined aggregations using symbolic execution. In Proceedings of the 25th Symposium on Operating Systems Principles, SOSP 2015, Monterey, CA, USA, October 4-7, 2015. ACM, 153–167. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Shigeyuki Sato and Hideya Iwasaki. 2011. Automatic parallelization via matrix multiplication. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. ACM, 470–479. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Toshio Suganuma, Hideaki Komatsu, and Toshio Nakatani. 1996. Detection and Global Optimization of Reduction Operations for Distributed Parallel Machines. In ICS ’96: Proceedings of the 1996 International Conference on Supercomputing, May 25-28, 1996, Philadelphia, PA, USA. ACM, 18–25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Val Tannen. 1988. Combining Algebra and Higher-Order Types. In Proceedings of the Third Annual Symposium on Logic in Computer Science (LICS ’88), Edinburgh, Scotland, UK, July 5-8, 1988. IEEE Computer Society, 82–90.Google ScholarGoogle ScholarCross RefCross Ref
  45. Val Tannen and Jean H. Gallier. 1991. Polymorphic Rewriting Conserves Algebraic Strong Normalization. Theor. Comput. Sci. 83, 1 (1991), 3–28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Kazushige Terui. 2012. Semantic Evaluation, Intersection Types and Complexity of Simply Typed Lambda Calculus. In 23rd International Conference on Rewriting Techniques and Applications (RTA’12) , RTA 2012, May 28 - June 2, 2012, Nagoya, Japan (LIPIcs), Vol. 15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 323–338.Google ScholarGoogle Scholar
  47. Dana N. Xu, Siau-Cheng Khoo, and Zhenjiang Hu. 2004. PType System: A Featherweight Parallelizability Detector. In Programming Languages and Systems: Second Asian Symposium, APLAS 2004, Taipei, Taiwan, November 4-6, 2004. Proceedings (Lecture Notes in Computer Science), Vol. 3302. Springer, 197–212.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Lambda calculus with algebraic simplification for reduction parallelization by equational reasoning

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!