Abstract
We present VOBLA, a domain-specific language designed for programming linear algebra libraries. VOBLA is compiled to PENCIL, a domain-independent intermediate language designed for efficient mapping to accelerator architectures such as GPGPUs. PENCIL is in turn compiled to efficient, platform-specific OpenCL code using techniques based on the polyhedral model. This approach addresses both the programmer productivity and the performance portability concerns associated with accelerator programming.
We demonstrate our approach by using VOBLA to implement a BLAS library. We evaluated the performance of the OpenCL code generated by our compilation flow on ARM Mali, AMD Radeon, and AMD Opteron platforms. The generated code is currently on average 1.9x slower than highly hand-optimized OpenCL code, but on average 8.1x faster than straightforward OpenCL code. Given that coding in VOBLA takes significantly less effort than hand-optimizing OpenCL code, we believe our approach leads to improved productivity and performance portability.
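As a rough illustration of the kind of routine a BLAS library contains, the sketch below writes a matrix-vector update as a plain C loop nest. It is not VOBLA or PENCIL syntax and is not taken from the paper; the name `gemv_ref`, its signature, and the row-major layout are assumptions made only for this example. Loop nests of this shape, with affine bounds and array subscripts, are the usual target of polyhedral techniques.

```c
#include <stddef.h>

/* Hypothetical reference routine (name, signature, and row-major layout are
   illustrative assumptions, not the paper's code).
   Computes y = alpha * A * x + beta * y for an m x n matrix A. */
void gemv_ref(int m, int n, float alpha, const float *A,
              const float *x, float beta, float *y)
{
    for (int i = 0; i < m; i++) {
        float acc = 0.0f;
        /* Dot product of row i of A with x. */
        for (int j = 0; j < n; j++)
            acc += A[(size_t)i * n + j] * x[j];
        /* Scale the product and the previous value of y[i]. */
        y[i] = alpha * acc + beta * y[i];
    }
}
```

Calling `gemv_ref(2, 3, 2.0f, A, x, 0.5f, y)` with `A = {1, 2, 3, 4, 5, 6}`, `x = {1, 1, 1}`, and `y = {1, 1}` updates `y` in place to `{12.5, 30.5}`.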
VOBLA: a vehicle for optimized basic linear algebra
LCTES '14: Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems