ABSTRACT
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has provided a systematic approach for parallelizing irregular applications based on the idea of optimistic or speculative execution of programs. However, the overhead of optimistic parallel execution can be substantial. In this paper, we show that many irregular algorithms have structure that can be exploited and present three key optimizations that take advantage of algorithmic structure to reduce speculative overheads. We describe the implementation of these optimizations in the Galois system and present experimental results to demonstrate their benefits. To the best of our knowledge, this is the first system to exploit algorithmic structure to optimize the execution of irregular programs.
Index Terms
Structure-driven optimizations for amorphous data-parallel programs
PPoPP '10