
Dynamic parallelization of recursive code: part 1: managing control flow interactions with the continuator

Published: 17 October 2010

Abstract

While most work on automatic parallelization focuses on compile-time techniques for parallelizing loop iterations, we advocate the need for new virtual machines that can parallelize the execution of recursive programs. In this paper, we show that recursive programs can be effectively parallelized when arguments to procedures are evaluated concurrently and branches of conditional statements are speculatively executed in parallel. We introduce the continuator concept, a runtime structure that tracks and manages the control dependences between such concurrently spawned tasks, ensuring adherence to the sequential semantics of the parallelized program. As a proof of concept, we discuss the details of a parallel interpreter for Scheme (implemented in Common Lisp) based on these ideas, and show the results from executing the Clinger benchmark suite for Scheme.
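The core idea behind the continuator can be sketched in a few lines. The following is an illustrative sketch only, not the paper's implementation: it speculatively evaluates both branches of a conditional in parallel with the test, then commits only the branch the test actually selects, so the observable result matches sequential semantics. The `Continuator` name is taken from the abstract; its API here is an assumption, and the sketch presumes side-effect-free thunks (real speculation must also buffer or roll back effects).

```python
# Sketch (assumed API, pure thunks only): parallel speculative
# evaluation of a conditional, committed according to the test's value.
from concurrent.futures import ThreadPoolExecutor

class Continuator:
    """Tracks the control dependence between speculatively spawned
    branch tasks so the final result matches sequential
    (if test then-branch else-branch) semantics."""
    def __init__(self, executor, test, then_thunk, else_thunk):
        # Spawn all three tasks concurrently; both branches run
        # speculatively before the test's outcome is known.
        self.test_f = executor.submit(test)
        self.then_f = executor.submit(then_thunk)
        self.else_f = executor.submit(else_thunk)

    def resolve(self):
        # The control dependence is honored at commit time: wait for
        # the test, keep the selected branch, discard the other.
        if self.test_f.result():
            self.else_f.cancel()  # best-effort abort of the mispeculated branch
            return self.then_f.result()
        self.then_f.cancel()
        return self.else_f.result()

with ThreadPoolExecutor(max_workers=3) as ex:
    c = Continuator(ex,
                    test=lambda: 21 * 2 == 42,
                    then_thunk=lambda: "then branch",
                    else_thunk=lambda: "else branch")
    print(c.resolve())  # prints "then branch"
```

The same commit-time discipline generalizes to concurrently evaluated procedure arguments: each spawned task is joined (or abandoned) in the order the sequential evaluator would have demanded its value.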

