Abstract
In this paper we present an automated way of using spare CPU resources within a shared-memory multi-processor or multi-core machine. Our approach is (i) to profile the execution of a program, (ii) from this to identify pieces of work which are promising sources of parallelism, (iii) to recompile the program so that this work is performed speculatively via a work-stealing system, and then (iv) to detect at run-time any attempt to perform operations that would reveal the presence of speculation.
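To give a flavour of step (iii), the sketch below is a hand-written analogue using the standard 'par' and 'pseq' combinators from GHC's 'parallel' package: a sub-computation is sparked so that an idle core can steal and evaluate it while the main thread continues. This is only an illustration under assumed names ('speculate', 'fib' and the choice of what to spark are invented here); in the approach described above, the speculation sites are selected automatically from execution profiles and serviced by the work-stealing machinery rather than being written by the programmer.

    import Control.Parallel (par, pseq)

    -- Illustration only: suppose profiling has identified 'spec' as a
    -- promising piece of work; sparking it with 'par' lets a spare core
    -- steal and evaluate it while the main thread computes 'rest'.
    speculate :: Int -> Int
    speculate n =
      let spec = fib (n + 10)   -- work to be performed speculatively
          rest = fib n          -- work the main thread does meanwhile
      in spec `par` (rest `pseq` spec + rest)

    -- A deliberately naive workload, purely for illustration.
    fib :: Int -> Int
    fib k
      | k < 2     = k
      | otherwise = fib (k - 1) + fib (k - 2)

    main :: IO ()
    main = print (speculate 25)

Compiled with 'ghc -threaded' and run with '+RTS -N2', the sparked expression may be evaluated in parallel with the main thread. The contribution of this paper is to insert and manage such speculation automatically, including detecting at run-time any operation that would reveal the presence of speculation.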
We assess the practicality of the approach through an implementation based on GHC 6.6, along with a limit study based on the execution profiles we gathered. We support the full Concurrent Haskell language compiled with traditional optimizations, covering I/O operations and synchronization as well as pure computation. We use 20 of the larger programs from the 'nofib' benchmark suite. The limit study shows that programs vary greatly in the parallelism we can identify: some have none, 16 have a potential 2x speed-up, and 4 have a potential 32x speed-up. In practice, on a 4-core processor, we achieve 10-80% speed-ups on 7 programs. Most of this gain comes from the addition of a second core rather than from further cores beyond that.
This approach is therefore not a replacement for manual parallelization, but rather a way of squeezing extra performance out of the threads of an already-parallel program or out of a program that has not yet been parallelized.