Abstract
We present a novel programming language design that attempts to combine the clarity and safety of high-level functional languages with the efficiency and parallelism of low-level numerical languages. We treat arrays as eagerly-memoized functions on typed index sets, allowing abstract function manipulations, such as currying, to work on arrays. In contrast to composing primitive bulk-array operations, we argue for an explicit nested indexing style that mirrors application of functions to arguments. We also introduce a fine-grained typed effects system which affords concise and automatically-parallelized in-place updates. Specifically, an associative accumulation effect allows reverse-mode automatic differentiation of in-place updates in a way that preserves parallelism. Empirically, we benchmark against the Futhark array programming language, and demonstrate that aggressive inlining and type-driven compilation allow array programs to be written in an expressive, "pointful" style with little performance penalty.
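As a rough illustration of the "arrays as eagerly-memoized functions on index sets" idea, the following Python sketch models an array as a table built by evaluating a function once at every index. It is not the paper's language or syntax: the names `Table`, `for_`, and `curry` are hypothetical, and Python checks none of the index types statically. It only shows how currying and an explicit, pointful transpose fall out of treating indexing as function application.

```python
from itertools import product


class Table:
    """An array modelled as an eagerly memoized function on a finite index set."""

    def __init__(self, index_set, fn):
        self.index_set = tuple(index_set)
        # Eager memoization: evaluate fn once at every index, up front.
        self._values = {i: fn(i) for i in self.index_set}

    def __call__(self, i):
        # Indexing is function application; an index outside the set raises KeyError.
        return self._values[i]


def for_(index_set, fn):
    """Tabulate fn over index_set (hypothetical stand-in for an array builder)."""
    return Table(index_set, fn)


def curry(table, rows, cols):
    """Turn a table indexed by (row, col) pairs into a table of tables."""
    return for_(rows, lambda r: for_(cols, lambda c: table((r, c))))


rows, cols = range(2), range(3)

# A table indexed by pairs, then its curried (nested) form.
flat = for_(product(rows, cols), lambda rc: 10 * rc[0] + rc[1])
nested = curry(flat, rows, cols)

# Pointful transpose: written with explicit indices, not a bulk primitive.
transposed = for_(cols, lambda c: for_(rows, lambda r: nested(r)(c)))
```

Here `nested(1)(2)` and `transposed(2)(1)` both look up the same element as `flat((1, 2))`; the abstract's claim is that a compiler with aggressive inlining can make this style cost little compared with bulk operations, which this interpreted sketch does not attempt to demonstrate.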
Index Terms
Getting to the point: index sets and parallelism-preserving autodiff for pointful array programming