Synthesis of divide and conquer parallelism for loops

Published: 14 June 2017

Abstract

Divide-and-conquer is a common parallel programming skeleton supported by many cross-platform multithreaded libraries, and most commonly used by programmers for parallelization. The challenges of producing (manually or automatically) a correct divide-and-conquer parallel program from a given sequential code are two-fold: (1) assuming that a good solution exists where individual worker threads execute code identical to the sequential one, the programmer has to provide the extra code for dividing the tasks and combining the partial results (i.e., joins), and (2) the sequential code may not be suitable for divide-and-conquer parallelization as is, and may need to be modified to become a part of a good solution. We address both challenges in this paper. We present an automated synthesis technique to synthesize correct joins and an algorithm for modifying the sequential code to make it suitable for parallelization when necessary. This paper focuses on a class of loops that traverse a read-only collection and compute a scalar function over that collection. We present theoretical results for when the necessary modifications to sequential code are possible, theoretical guarantees for the algorithmic solutions presented here, and an experimental evaluation of the approach's success in practice and the quality of the produced parallel programs.
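
The two challenges named in the abstract can be illustrated with a small hand-written sketch. It is not taken from the paper and is not the output of its synthesis technique; the example problem (maximum tail sum) and all function names are assumptions chosen for illustration. The loop `mts_sequential` traverses a read-only list and computes a scalar, but its result alone is not enough to combine two partial results. Adding an auxiliary accumulator (the running sum) makes a join possible. The recursion below runs sequentially; in a real setting the two halves would be handed to worker threads.

```python
def mts_sequential(xs):
    """Original loop: maximum sum over all (possibly empty) suffixes of xs."""
    mts = 0
    for x in xs:
        mts = max(mts + x, 0)
    return mts


def lifted_sequential(xs):
    """Modified loop (hypothetical): same traversal, plus an auxiliary sum.

    The extra accumulator `total` does not change `mts`; it only records
    information that the join needs.
    """
    total, mts = 0, 0
    for x in xs:
        total += x
        mts = max(mts + x, 0)
    return total, mts


def join(left, right):
    """Combine the partial results of two adjacent chunks (left before right).

    A best suffix of the concatenation either lies entirely in the right
    chunk, or covers the whole right chunk plus a suffix of the left chunk.
    """
    sum_l, mts_l = left
    sum_r, mts_r = right
    return sum_l + sum_r, max(mts_r, mts_l + sum_r)


def mts_divide_and_conquer(xs, threshold=2):
    """Divide: split the list in half. Conquer: run the lifted loop on small
    chunks. Combine: apply the join. Returns (total, mts)."""
    if len(xs) <= threshold:
        return lifted_sequential(xs)
    mid = len(xs) // 2
    left = mts_divide_and_conquer(xs[:mid], threshold)
    right = mts_divide_and_conquer(xs[mid:], threshold)
    return join(left, right)


if __name__ == "__main__":
    data = [1, -2, 3, -1, 2, -5, 4, 1]
    total, mts = mts_divide_and_conquer(data)
    print(total, mts)            # prints "3 5"
    print(mts_sequential(data))  # prints "5"
```

Without `total`, no function of `mts_l` and `mts_r` alone can give the right answer in general: the chunks `[5]` and `[-5]` produce the same pair of partial results (5 and 0) as `[5]` and `[-1]`, yet the correct combined answers are 0 and 4. This is the sense in which a sequential loop may need to be modified before it fits the divide-and-conquer skeleton.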

Published in

ACM SIGPLAN Notices, Volume 52, Issue 6 (PLDI '17)
June 2017, 708 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3140587

PLDI 2017: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2017, 708 pages
ISBN: 9781450349888
DOI: 10.1145/3062341

Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader
        About Cookies On This Site

        We use cookies to ensure that we give you the best experience on our website.

        Learn more

        Got it!