research-article
Open Access

Compilation of sparse array programming models

Published: 15 October 2021

Abstract

This paper shows how to compile sparse array programming languages. A sparse array programming language is an array programming language that supports element-wise application, reduction, and broadcasting of arbitrary functions over dense and sparse arrays with any fill value. Such a language has great expressive power and can express sparse and dense linear and tensor algebra, functions over images, exclusion and inclusion filters, and even graph algorithms.
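As a rough illustration of what "element-wise application of arbitrary functions over sparse arrays with any fill value" can mean, the sketch below models a one-dimensional sparse array as a dictionary of explicit entries plus a fill value. The names `SparseVec` and `apply_elementwise` are hypothetical and are not taken from the paper or from any library; the code always walks the union of explicit coordinates, which is the simple, unoptimized strategy rather than the compiled code the paper describes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SparseVec:
    """Hypothetical 1-D sparse array: explicit entries plus a fill value."""
    size: int
    fill: float
    data: Dict[int, float] = field(default_factory=dict)

    def __getitem__(self, i: int) -> float:
        # Any coordinate that is not stored explicitly holds the fill value.
        return self.data.get(i, self.fill)


def apply_elementwise(f: Callable[[float, float], float],
                      a: SparseVec, b: SparseVec) -> SparseVec:
    """Apply a pure binary function f element-wise to a and b."""
    assert a.size == b.size
    # Where both inputs hold their fill value, the result is f(fill_a, fill_b),
    # so that value becomes the fill value of the output.
    out = SparseVec(a.size, f(a.fill, b.fill))
    # Only coordinates stored in at least one input can differ from the fill.
    for i in sorted(a.data.keys() | b.data.keys()):
        v = f(a[i], b[i])
        if v != out.fill:
            out.data[i] = v
    return out
```

For example, applying `max` to a vector with fill 0 and a vector with fill 1 yields a result whose fill value is 1, with explicit entries only where the maximum differs from 1.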

Our compiler strategy generalizes prior work in the literature on sparse tensor algebra compilation to support any function applied to sparse arrays, instead of only addition and multiplication. To achieve this, we generalize the notion of sparse iteration spaces beyond intersections and unions. These iteration spaces are automatically derived by considering how algebraic properties annotated onto functions interact with the fill values of the arrays. We then show how to compile these iteration spaces to efficient code.
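The interaction between algebraic properties and fill values can be sketched with a single property, the annihilator (a value `z` such that `f(z, x) = f(x, z) = z`). The function below is a simplified illustration under that assumption, not the paper's derivation: `coords_to_visit` and its parameters are hypothetical names, and other properties mentioned in the literature (such as identities) are deliberately left out.

```python
from typing import Optional, Set


def coords_to_visit(a_coords: Set[int], b_coords: Set[int],
                    a_fill: float, b_fill: float,
                    annihilator: Optional[float]) -> Set[int]:
    """Pick the coordinates where f(a[i], b[i]) may differ from f(a_fill, b_fill).

    a_coords and b_coords are the explicitly stored coordinates of each operand.
    annihilator is a value z with f(z, x) == f(x, z) == z, or None if f has none.
    """
    a_annihilates = annihilator is not None and a_fill == annihilator
    b_annihilates = annihilator is not None and b_fill == annihilator
    if a_annihilates and b_annihilates:
        # A fill value on either side forces the result to z: intersection.
        return a_coords & b_coords
    if a_annihilates:
        # Wherever a holds its fill value the result is z: iterate over a only.
        return set(a_coords)
    if b_annihilates:
        # Symmetric case: iterate over b only.
        return set(b_coords)
    # No simplification is known, so fall back to the union.
    return a_coords | b_coords
```

With multiplication (annihilator 0) and both fill values equal to 0, this yields the familiar intersection; with addition, which has no annihilator, it yields the union. A fill value of 0 on only one operand of a multiplication gives an iteration space that is neither, which is the kind of case the generalization targets.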

When compared with two widely-used Python sparse array packages, our evaluation shows that we generate built-in sparse array library features with a performance of 1.4× to 53.7× when measured against PyData/Sparse for user-defined functions and between 0.98× and 5.53× when measured against SciPy/Sparse for sparse array slicing. Our technique outperforms PyData/Sparse by 6.58× to 70.3×, and (where applicable) performs between 0.96× and 28.9× that of a dense NumPy implementation, on end-to-end sparse array applications. We also implement graph linear algebra kernels in our system with a performance of between 0.56× and 3.50× compared to that of the hand-optimized SuiteSparse:GraphBLAS library.

Supplemental Material

Auxiliary Presentation Video

Supplemental materials for the article Compilation of Sparse Array Programming Models in OOPSLA 2021. The supplemental materials consist of a PDF that contains the appendix to the article.

Published in

Proceedings of the ACM on Programming Languages, Volume 5, Issue OOPSLA
October 2021
2001 pages
EISSN: 2475-1421
DOI: 10.1145/3492349

Copyright © 2021 Owner/Author

Publisher

Association for Computing Machinery
New York, NY, United States
