
SpiceC: scalable parallelism via implicit copying and explicit commit

Published: 12 February 2011

Abstract

In this paper we present an approach to parallel programming called SpiceC. SpiceC simplifies parallel programming through the combination of an intuitive computation model and compiler directives. In the SpiceC computation model, every thread has a private space for its data, and all threads share data via a shared space. Each thread performs its computation in its private space; this isolation enables speculative computation. Programmers express different forms of parallelism using easy-to-use SpiceC directives. Developers state high-level constraints on data transfers between spaces, while the tedious task of generating the data-transfer code is performed by the compiler. SpiceC also supports data transfers involving dynamic data structures without help from developers, and it allows developers to create clusters of data to enable parallel data transfers. SpiceC programs are portable across modern chip-multiprocessor-based machines that may or may not support cache coherence, and we have developed implementations of SpiceC for shared-memory systems both with and without cache coherence. We evaluate our implementation using seven benchmarks, four of which are parallelized speculatively. Our compiler-generated implementations achieve speedups ranging from 2x to 18x on a 24-core system.


Published in:

ACM SIGPLAN Notices, Volume 46, Issue 8 (PPoPP '11), August 2011, 300 pages. ISSN: 0362-1340, EISSN: 1558-1160, DOI: 10.1145/2038037.

PPoPP '11: Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, February 2011, 326 pages. ISBN: 9781450301190, DOI: 10.1145/1941553. General Chair: Calin Cascaval; Program Chair: Pen-Chung Yew.

      Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

