Abstract
Many sequential loops are actually scans or reductions and can be parallelized across iterations despite the loop-carried dependences. In this work, we consider the parallelization of such scan/reduction loops, and propose a practical runtime approach called sampling-and-reconstruction to extract the hidden scan/reduction patterns in these loops.
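To make the idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the loop-carried update is affine in the carried value (`s = a*s + b`), so the effect of any chunk of iterations is itself an affine function `f(s) = A*s + B`. Under that assumption each chunk can be run as a black box on two sample inputs (0 and 1), and `A` and `B` can be reconstructed from the outputs. The names `run_chunk`, `sample_and_reconstruct`, and `parallel_reduce` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_chunk(s, chunk):
    """The original sequential loop over one chunk; s is the loop-carried value."""
    for a, b in chunk:
        s = a * s + b  # loop-carried dependence on s
    return s

def sample_and_reconstruct(chunk):
    """Probe the chunk at s=0 and s=1 and rebuild f(s) = A*s + B.

    Assumption: the chunk's effect on s is affine, so two samples suffice.
    """
    f0 = run_chunk(0, chunk)
    f1 = run_chunk(1, chunk)
    return (f1 - f0, f0)  # (A, B)

def parallel_reduce(s0, data, n_chunks=4):
    """Reconstruct each chunk's function independently, then compose in order."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Chunks do not depend on each other, so they can be sampled concurrently.
    with ThreadPoolExecutor() as pool:
        coeffs = list(pool.map(sample_and_reconstruct, chunks))
    # Cheap sequential pass: apply the reconstructed functions to the real input.
    s = s0
    for A, B in coeffs:
        s = A * s + B
    return s

if __name__ == "__main__":
    data = [(2, 1), (3, -1), (1, 4), (-2, 5), (1, 1), (0, 7), (5, 2)]
    assert parallel_reduce(3, data, n_chunks=3) == run_chunk(3, data)
```

The thread pool only illustrates that the per-chunk work is independent; in CPython it does not give a real speedup for pure-Python arithmetic. Keeping the intermediate value after each chunk, rather than only the final one, would turn the same composition into a chunk-level scan.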
Revealing parallel scans and reductions in sequential loops through function reconstruction
PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming






