Abstract
Divide-and-conquer is a common parallel programming skeleton supported by many cross-platform multithreaded libraries, and most commonly used by programmers for parallelization. The challenges of producing (manually or automatically) a correct divide-and-conquer parallel program from a given sequential code are two-fold: (1) assuming that a good solution exists where individual worker threads execute code identical to the sequential one, the programmer has to provide the extra code for dividing the tasks and combining the partial results (i.e., joins), and (2) the sequential code may not be suitable for divide-and-conquer parallelization as is, and may need to be modified to become part of a good solution. We address both challenges in this paper. We present an automated technique for synthesizing correct joins and an algorithm for modifying the sequential code to make it suitable for parallelization when necessary. This paper focuses on a class of loops that traverse a read-only collection and compute a scalar function over that collection. We present theoretical results for when the necessary modifications to sequential code are possible, theoretical guarantees for the algorithmic solutions presented here, and an experimental evaluation of the approach's success in practice and the quality of the produced parallel programs.
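To make the two challenges concrete, here is a minimal Python sketch built around maximum tail sum, a standard textbook example chosen for illustration and not necessarily one used in the paper. The original loop tracks only `mts`, and no join exists on that value alone. Adding a running `total` as an auxiliary accumulator (the "modification" step) makes a join possible. All function names are hypothetical, and the join is written by hand here, not synthesized.

```python
from typing import Sequence, Tuple


def mts_sequential(xs: Sequence[int]) -> int:
    """Original loop: maximum sum over all suffixes (the empty suffix counts as 0)."""
    mts = 0
    for x in xs:
        mts = max(mts + x, 0)
    return mts


def mts_lifted(xs: Sequence[int]) -> Tuple[int, int]:
    """Modified loop: same traversal, plus an auxiliary running total.

    The extra accumulator is needed because mts alone cannot be joined:
    [3] ++ [-1] and [3] ++ [-5] have identical per-half mts values (3 and 0)
    but different overall results (2 and 0).
    """
    total = 0
    mts = 0
    for x in xs:
        total += x
        mts = max(mts + x, 0)
    return total, mts


def join(left: Tuple[int, int], right: Tuple[int, int]) -> Tuple[int, int]:
    """Combine the results of two adjacent chunks (left chunk comes first)."""
    total_l, mts_l = left
    total_r, mts_r = right
    # The best suffix lies entirely in the right chunk, or it spans the
    # whole right chunk and extends into a suffix of the left chunk.
    return total_l + total_r, max(mts_r, mts_l + total_r)


def mts_divide_and_conquer(xs: Sequence[int], cutoff: int = 4) -> Tuple[int, int]:
    """Divide, run the lifted loop on small chunks, and join the partial results.

    The two recursive calls are independent, so a parallel runtime could
    execute them concurrently; this sketch runs them one after the other.
    """
    if len(xs) <= cutoff:
        return mts_lifted(xs)
    mid = len(xs) // 2
    return join(
        mts_divide_and_conquer(xs[:mid], cutoff),
        mts_divide_and_conquer(xs[mid:], cutoff),
    )
```

Each leaf runs the same loop body as the sequential code, extended by one accumulator, and the second component of the divide-and-conquer result should agree with `mts_sequential` on any input.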
Synthesis of divide and conquer parallelism for loops
PLDI 2017: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation