research-article

A comparison of programming models for multiprocessors with explicitly managed memory hierarchies

Published: 14 February 2009

Abstract

On multiprocessors with explicitly managed memory hierarchies (EMM), software is responsible for moving data in and out of fast local memories. This task can be complex and error-prone even for expert programmers. Before we can allow compilers to handle this complexity for us, we must identify the abstractions that are general enough to allow us to write applications with reasonable effort, yet specific enough to exploit the vast on-chip memory bandwidth of EMM multiprocessors. To this end, we compare two programming models against hand-tuned codes on the STI Cell, paying attention to programmability and performance. The first programming model, Sequoia, abstracts the memory hierarchy as private address spaces, each corresponding to a parallel task. The second, Cellgen, is a new framework which provides OpenMP-like semantics and the abstraction of a shared address space divided into private and shared data. We compare three applications programmed using these models against their hand-optimized counterparts in terms of abstractions, programming complexity, and performance.
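To make the burden described in the abstract concrete, the following is a minimal sketch of explicit data staging, written in plain C. It is not code from the paper, and it is neither Sequoia nor Cellgen. The names `TILE`, `local_get`, `local_put`, and `scale_tiled` are hypothetical, and `memcpy` stands in for the DMA transfers that a real EMM platform such as the Cell would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tile size in elements; a real local store imposes its own limit. */
#define TILE 64

/* Stand-in for a DMA "get": copy n floats from main memory into a local buffer. */
static void local_get(float *local, const float *global, size_t n) {
    memcpy(local, global, n * sizeof *local);
}

/* Stand-in for a DMA "put": copy n floats from the local buffer back to main memory. */
static void local_put(float *global, const float *local, size_t n) {
    memcpy(global, local, n * sizeof *local);
}

/* Scale data[0..n) by factor, one tile at a time.
 * The computation only ever touches the small local buffer;
 * every byte it needs must first be moved there by hand. */
void scale_tiled(float *data, size_t n, float factor) {
    float local[TILE];
    assert(data != NULL || n == 0);
    for (size_t base = 0; base < n; base += TILE) {
        /* The last tile may be shorter than TILE. */
        size_t len = (n - base < TILE) ? (n - base) : TILE;
        local_get(local, data + base, len);   /* stage in  */
        for (size_t i = 0; i < len; ++i) {
            local[i] *= factor;               /* compute   */
        }
        local_put(data + base, local, len);   /* stage out */
    }
}
```

Even in this simplified form, the programmer has to pick a tile size, handle the partial last tile, and pair every transfer in with a transfer out. These are the kinds of details a higher-level programming model would aim to hide.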



Published in

ACM SIGPLAN Notices, Volume 44, Issue 4 (PPoPP '09), April 2009, 294 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1594835

PPoPP '09: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 2009, 322 pages
ISBN: 9781605583976
DOI: 10.1145/1504176

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
