Abstract
While most approaches to automatic parallelization rely on compilation techniques for parallelizing loop iterations, we argue for new virtual machines that can parallelize the execution of recursive programs. In this paper, we show that recursive programs can be effectively parallelized when arguments to procedures are evaluated concurrently and branches of conditional statements are speculatively executed in parallel. We introduce the continuator, a runtime structure that tracks and manages the control dependences between such concurrently spawned tasks, ensuring adherence to the sequential semantics of the parallelized program. As a proof of concept, we discuss the details of a parallel interpreter for Scheme (implemented in Common Lisp) based on these ideas, and show the results of executing the Clinger benchmark suite for Scheme.
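The two sources of parallelism named in the abstract can be illustrated with a small sketch. It is written in Python purely for illustration (the interpreter described in the paper is implemented in Common Lisp), and it is not the paper's continuator: the helper names `par_call` and `spec_if` are hypothetical, the thunks are assumed to be free of side effects, and the bookkeeping that keeps arbitrary programs faithful to sequential semantics is omitted.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helpers for illustration only; not the paper's API.
_pool = ThreadPoolExecutor(max_workers=8)

def par_call(fn, *arg_thunks):
    """Evaluate every argument thunk concurrently, then apply fn
    to the results in their original left-to-right order."""
    futures = [_pool.submit(thunk) for thunk in arg_thunks]
    return fn(*[f.result() for f in futures])

def spec_if(cond_thunk, then_thunk, else_thunk):
    """Start both branches before the condition is known and keep
    only the result of the branch the condition selects."""
    then_f = _pool.submit(then_thunk)
    else_f = _pool.submit(else_thunk)
    if cond_thunk():
        chosen, discarded = then_f, else_f
    else:
        chosen, discarded = else_f, then_f
    # Best effort: cancel() only succeeds if the task has not started.
    # Any result or exception of the discarded branch is never observed.
    discarded.cancel()
    return chosen.result()

def fib(n):
    """Plain sequential Fibonacci, used as the work inside each thunk."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# The two arguments of the addition are evaluated concurrently.
total = par_call(lambda a, b: a + b, lambda: fib(15), lambda: fib(14))

# Both branches run speculatively; only the selected one is returned.
picked = spec_if(lambda: total > 900, lambda: "large", lambda: "small")
```

The helpers are only called from the main thread here. Calling them recursively from inside worker tasks could exhaust the bounded pool and block, which is one reason a real system needs a dedicated runtime structure rather than a plain thread pool.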
Dynamic parallelization of recursive code: part 1: managing control flow interactions with the continuator
OOPSLA '10: Proceedings of the ACM international conference on Object oriented programming systems languages and applications