Abstract
This paper shows how to compile sparse array programming languages. A sparse array programming language is an array programming language that supports element-wise application, reduction, and broadcasting of arbitrary functions over dense and sparse arrays with any fill value. Such a language is highly expressive: it can express sparse and dense linear and tensor algebra, functions over images, exclusion and inclusion filters, and even graph algorithms.
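To make the role of the fill value concrete, here is a minimal sketch of the semantics described above. It is an illustration only: the `SparseVec` class and the `elementwise` and `reduce_all` helpers are hypothetical names, not the paper's data structures or API, and `reduce_all` is a deliberately naive reference that visits every position.

```python
from dataclasses import dataclass, field


@dataclass
class SparseVec:
    """Toy 1-D sparse array: explicit entries plus a fill value for the rest.

    Hypothetical illustration, not the representation used in the paper.
    """
    size: int
    fill: float
    data: dict = field(default_factory=dict)  # index -> explicitly stored value

    def __getitem__(self, i):
        # Positions that are not stored implicitly hold the fill value.
        return self.data.get(i, self.fill)

    def to_dense(self):
        return [self[i] for i in range(self.size)]


def elementwise(f, a, b):
    """Apply an arbitrary binary function f element-wise to two SparseVecs."""
    assert a.size == b.size
    # Where neither operand stores a value, the result is f(a.fill, b.fill),
    # so that value becomes the fill value of the output.
    out = SparseVec(a.size, f(a.fill, b.fill))
    for i in sorted(a.data.keys() | b.data.keys()):
        v = f(a[i], b[i])
        if v != out.fill:  # store only entries that differ from the fill
            out.data[i] = v
    return out


def reduce_all(f, a, init):
    """Reference reduction: folds f over every position, stored or not."""
    acc = init
    for i in range(a.size):
        acc = f(acc, a[i])
    return acc


a = SparseVec(6, 0.0, {1: 2.0, 4: 5.0})   # fill value 0.0
b = SparseVec(6, 1.0, {1: 3.0, 2: 0.5})   # fill value 1.0
c = elementwise(max, a, b)                # arbitrary function, here max
total = reduce_all(lambda x, y: x + y, c, 0.0)
```

In this sketch the output's fill value is computed once from the input fill values, which is why a function such as `max` can be applied without materializing the dense arrays.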
Our compilation strategy generalizes prior work on sparse tensor algebra compilation to support any function applied to sparse arrays, rather than only addition and multiplication. To achieve this, we generalize the notion of sparse iteration spaces beyond intersections and unions. These iteration spaces are derived automatically by considering how algebraic properties annotated onto functions interact with the fill values of the arrays. We then show how to compile these iteration spaces to efficient code.
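As a rough illustration of how an algebraic property can interact with fill values, the sketch below picks an iteration space for a binary function from a single annotated property, a two-sided annihilator. This is a simplified assumption for exposition, not the paper's derivation: the function name `iteration_space` and its parameters are hypothetical, and the abstract states that the actual iteration spaces go beyond the intersection and union cases shown here.

```python
def iteration_space(a_coords, a_fill, b_coords, b_fill, annihilator=None):
    """Coordinates where f(a, b) may differ from the output fill f(a_fill, b_fill).

    Assumes (hypothetically) that f is annotated with at most one property:
    a two-sided annihilator z, meaning f(z, x) == f(x, z) == z for every x.
    """
    a_coords, b_coords = set(a_coords), set(b_coords)
    a_ann = annihilator is not None and a_fill == annihilator
    b_ann = annihilator is not None and b_fill == annihilator
    if a_ann and b_ann:
        # A missing entry in either operand forces the result to z.
        return a_coords & b_coords
    if a_ann:
        # Wherever a is not stored the result is z, regardless of b.
        return a_coords
    if b_ann:
        return b_coords
    # No usable property: any stored entry may change the result.
    return a_coords | b_coords


a_coords, b_coords = {0, 2}, {2, 3}

# Multiplication with both fills 0: 0 annihilates, so intersect.
mul_space = iteration_space(a_coords, 0, b_coords, 0, annihilator=0)

# Addition has no annihilator: fall back to the union.
add_space = iteration_space(a_coords, 0, b_coords, 0)

# Multiplication where only a's fill is 0 and b's fill is 1: iterate over a.
mixed_space = iteration_space(a_coords, 0, b_coords, 1, annihilator=0)
```

The union is always a safe choice, because positions stored in neither operand evaluate to the output fill; the property only lets the sketch shrink that space.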
We evaluate our approach against two widely used Python sparse array packages. For built-in sparse array library features, our generated code runs at 1.4× to 53.7× the performance of PyData/Sparse for user-defined functions and at 0.98× to 5.53× the performance of SciPy/Sparse for sparse array slicing. On end-to-end sparse array applications, our technique outperforms PyData/Sparse by 6.58× to 70.3× and, where applicable, runs at 0.96× to 28.9× the performance of a dense NumPy implementation. We also implement graph linear algebra kernels in our system, with performance between 0.56× and 3.50× that of the hand-optimized SuiteSparse:GraphBLAS library.
Supplemental Material
Supplemental materials for the article Compilation of Sparse Array Programming Models in OOPSLA 2021. The supplemental materials consist of a PDF that contains the appendix to the article.