Automatic locality-friendly interface extension of numerical functions

Published: 15 September 2014

Abstract

Raising the level of abstraction is a key concern of software engineering, and libraries (whether used directly or as the target of a program generation system) are a successful technique for raising programmer productivity and improving software quality. Unfortunately, even successful libraries may contain functions that are not general enough. For example, many numeric performance libraries contain functions that work on one- or higher-dimensional arrays. A problem arises when a program needs to invoke such a function on a non-contiguous subarray (e.g., in C, a column of a matrix or a subarray of an image). If the library developer did not foresee this scenario, the client program must include explicit copy steps before and after the library function call, incurring a possibly high performance penalty. A better solution is an enhanced library function that supports the desired access pattern directly. Exposing the access pattern allows the compiler to optimize for the intended usage scenario(s). Because we do not want the library developer to write all interesting versions by hand, we present a tool that takes a library function written in C and generates such a customized function for typical access patterns. We describe the approach, discuss its limitations, and report on performance. As example access patterns, we consider those most common in numerical applications: striding and block striding, general permutations, and scaling. We evaluate the tool on various library functions, including filters, scans, reductions, sorting, FFTs, and linear algebra operations. The automatically generated custom version is in most cases significantly faster than performing the copy steps and the library call as individual steps, offering speed-ups that are typically in the range of 1.2-1.8x.
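To make the copy overhead concrete, the following is a minimal sketch of the scenario for a strided access, using an in-place prefix sum (a scan) as the library function. It is an illustration only, not output of the tool or code from the paper: the names `scan`, `scan_with_copies`, and `scan_strided` are hypothetical, and the strided variant is written by hand here to show what an extended interface could look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical library function (an assumption for this sketch):
   in-place inclusive prefix sum over a contiguous array of n doubles. */
void scan(double *x, size_t n) {
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}

/* Client-side workaround when only scan() is available: gather the
   strided elements into a temporary buffer, call the library function,
   then scatter the results back.
   Returns 0 on success, -1 if the buffer cannot be allocated. */
int scan_with_copies(double *x, size_t n, size_t stride) {
    assert(stride >= 1);
    if (n == 0)
        return 0;
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)   /* copy-in (gather) */
        tmp[i] = x[i * stride];
    scan(tmp, n);
    for (size_t i = 0; i < n; i++)   /* copy-out (scatter) */
        x[i * stride] = tmp[i];
    free(tmp);
    return 0;
}

/* Extended interface (hand-written for illustration): the same
   computation with the stride exposed as a parameter, so no temporary
   buffer and no extra passes over the data are needed. */
void scan_strided(double *x, size_t n, size_t stride) {
    assert(stride >= 1);
    for (size_t i = 1; i < n; i++)
        x[i * stride] += x[(i - 1) * stride];
}
```

For a row-major matrix `A` with `rows` rows and `cols` columns stored in one flat array, column `j` would be processed with `scan_strided(A + j, rows, cols)`. Both variants compute the same result; the copy-based one adds two extra passes over the data and a heap allocation.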
