
ALTER: exploiting breakable dependences for parallelization

Published: 04 June 2011

Abstract

For decades, compilers have relied on dependence analysis to determine the legality of their transformations. While this conservative approach has enabled many robust optimizations, it forgoes parallelization opportunities that can only be exploited by changing or reordering the dependences in the program.

This paper presents Alter: a system for identifying and enforcing parallelism that violates certain dependences while preserving overall program functionality. Based on programmer annotations, Alter exploits new parallelism in loops by reordering iterations or allowing stale reads. Alter can also infer which annotations are likely to benefit the program by using a test-driven framework.

Our evaluation of Alter demonstrates that it uncovers parallelism that is beyond the reach of existing static and dynamic tools. Across a selection of 12 performance-intensive loops, 9 of which have loop-carried dependences, Alter obtains an average speedup of 2.0x on 4 cores.

