research-article

Data-only flattening for nested data parallelism

Published: 23 February 2013

Abstract

Data parallelism has proven to be an effective technique for high-level programming of a certain class of parallel applications, but it is not well suited to irregular parallel computations. Blelloch and others proposed nested data parallelism (NDP) as a language mechanism for programming irregular parallel applications in a declarative data-parallel style. The key to this approach is a compiler transformation that flattens the NDP computation and data structures into a form that can be executed efficiently on a wide-vector SIMD architecture. Unfortunately, this technique is ill suited to execution on today's multicore machines. We present a new technique, called data-only flattening, for the compilation of NDP, which is suitable for multicore architectures. Data-only flattening transforms nested data structures in order to expose programs to various optimizations while leaving control structures intact. We present a formal semantics of data-only flattening in a core language with a rewriting system. We demonstrate the effectiveness of this technique in the Parallel ML implementation and we report encouraging experimental results across various benchmark applications.
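As a rough illustration of the idea that the data representation can be flattened while the control structure stays put, the sketch below stores a nested sequence as one flat list of values plus a list of segment lengths, and then computes per-segment sums with the per-segment loop left in place. This is only an illustrative sketch in Python, not the paper's rewriting system or the Parallel ML implementation; the names `flatten_data` and `segment_sums` and the lengths-based representation are assumptions made for this example.

```python
from itertools import accumulate


def flatten_data(nested):
    """Split a list of lists into flat values plus per-segment lengths.

    Hypothetical helper: the representation (flat data + lengths) is an
    assumption for illustration, not taken from the paper.
    """
    data = [x for segment in nested for x in segment]
    lengths = [len(segment) for segment in nested]
    return data, lengths


def segment_sums(data, lengths):
    """Sum each segment of the flat data.

    The loop over segments (the control structure) is kept as written;
    only the data it reads has been flattened.
    """
    # offsets[i] is the index in `data` where segment i begins.
    offsets = [0] + list(accumulate(lengths))
    return [
        sum(data[offsets[i]:offsets[i + 1]])
        for i in range(len(lengths))
    ]


nested = [[1, 2], [3], [], [4, 5, 6]]
data, lengths = flatten_data(nested)
sums = segment_sums(data, lengths)
```

Here the irregular shape of `nested` survives only in `lengths`, while `data` is a contiguous sequence, which is the kind of layout that can make later optimizations easier to apply.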


Published in

ACM SIGPLAN Notices, Volume 48, Issue 8 (PPoPP '13), August 2013, 309 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2517327

PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 2013, 332 pages
ISBN: 9781450319225
DOI: 10.1145/2442516

Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
