research-article

Extending a C-like language for portable SIMD programming

Published: 25 February 2012

Abstract

SIMD instructions have been common in CPUs for years. Using them effectively requires not only vectorizing the code but also modifying the data layout. However, automatic vectorization techniques are often not powerful enough and apply only in a restricted set of situations; hence, programmers often vectorize their programs manually using intrinsics: compiler-known functions that expand directly to machine instructions. Intrinsics significantly decrease programmer productivity by enforcing an error-prone, hard-to-read, assembly-like programming style. Furthermore, they are not portable because they are tied to a specific instruction set.

In this paper, we show how a C-like language can be extended to allow for portable and efficient SIMD programming. Our extension puts the programmer in total control over where and how control-flow vectorization is triggered. We present a type system and a formal semantics of our extension and prove the soundness of the type system. Using our prototype implementation IVL, which targets Intel's MIC architecture and the SSE instruction set, we show that the generated code is roughly on par with handwritten intrinsic code.
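The abstract's remark that effective SIMD use also requires changes to the data layout can be illustrated with a small sketch. The abstract does not say which layout changes are meant; the sketch assumes the common case of moving from an array of structures (AoS) to a structure of arrays (SoA). It is plain, portable C and does not use the paper's language extension or any intrinsics. All names (`PointAoS`, `PointsSoA`, `sum_sq_aos`, `sum_sq_soa`, `aos_to_soa`) and the size `N` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N 8 /* hypothetical number of points, chosen for illustration */

/* Array of structures (AoS): the x, y, z of one point are adjacent. */
struct PointAoS { float x, y, z; };

/* Structure of arrays (SoA): all x values are contiguous, then y, then z. */
struct PointsSoA { float x[N], y[N], z[N]; };

/* AoS traversal: consecutive x values lie three floats apart, so a
 * vector load of several x values would need a strided or gather access. */
float sum_sq_aos(const struct PointAoS *p, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z;
    return sum;
}

/* SoA traversal: each array is read with unit stride, which is the
 * access pattern that maps directly onto contiguous vector loads. */
float sum_sq_soa(const struct PointsSoA *p, size_t n) {
    assert(n <= N);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += p->x[i] * p->x[i] + p->y[i] * p->y[i] + p->z[i] * p->z[i];
    return sum;
}

/* Layout conversion: copy each component into its own array. */
void aos_to_soa(const struct PointAoS *in, struct PointsSoA *out, size_t n) {
    assert(n <= N);
    for (size_t i = 0; i < n; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}
```

Both traversals compute the same value; only the memory access pattern differs. Whether a given compiler actually vectorizes either loop depends on the compiler and its flags, which is the limitation of automatic vectorization that the abstract points to.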

Published in

ACM SIGPLAN Notices, Volume 47, Issue 8 (PPOPP '12), August 2012, 334 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2370036

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming, February 2012, 352 pages
ISBN: 9781450311601
DOI: 10.1145/2145816

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
