DOI: 10.1145/1411204.1411239 (ICFP conference proceedings, research article)

A scheduling framework for general-purpose parallel languages

Published: 20 September 2008

ABSTRACT

The trend in microprocessor design toward multicore and manycore processors means that future performance gains in software will largely come from harnessing parallelism. To realize such gains, we need languages and implementations that can enable parallelism at many different levels. For example, an application might use both explicit threads to implement coarse-grain parallelism for independent tasks and implicit threads for fine-grain data-parallel computation over a large array. An important aspect of this requirement is supporting a wide range of different scheduling mechanisms for parallel computation.

In this paper, we describe the scheduling framework that we have designed and implemented for Manticore, a strict parallel functional language. We take a micro-kernel approach in our design: the compiler and runtime support a small collection of scheduling primitives upon which complex scheduling policies can be implemented. This framework is flexible enough to support a wide range of scheduling policies. It also supports the nesting of schedulers, which is key both to supporting multiple scheduling policies in the same application and to managing hierarchies of speculative parallel computations.
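To give a flavor of the micro-kernel idea, here is a minimal sketch of a cooperative round-robin scheduler built from a tiny set of primitives. All names here (`spawn`, `schedule`, the generator-based fibers) are illustrative assumptions for this sketch, not Manticore's actual primitives; the point is only that a dispatch loop plus an enqueue primitive is enough to express a complete scheduling policy.

```python
# Sketch: a cooperative scheduler in the spirit of a micro-kernel design.
# Fibers are modeled as Python generators; yielding suspends a fiber and
# returns control to the scheduler. Names are hypothetical, not Manticore's.
from collections import deque

run_queue = deque()

def spawn(fiber):
    """Primitive: hand a suspended computation (a generator) to the scheduler."""
    run_queue.append(fiber)

def schedule(trace):
    """Dispatch loop: run the next ready fiber until it yields or finishes.
    A fiber that yields is re-enqueued, giving round-robin order."""
    while run_queue:
        fiber = run_queue.popleft()
        try:
            trace.append(next(fiber))  # resume the fiber until its next yield
            run_queue.append(fiber)    # fiber yielded: requeue it
        except StopIteration:
            pass                       # fiber finished: drop it

def worker(name, steps):
    for i in range(steps):
        yield f"{name}{i + 1}"

trace = []
spawn(worker("a", 2))
spawn(worker("b", 2))
schedule(trace)
print(trace)  # ['a1', 'b1', 'a2', 'b2']
```

A different policy (e.g. LIFO for depth-first work stealing) needs only a different queue discipline in `schedule`; the fibers themselves are unchanged, which is the flexibility the micro-kernel design is after.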

In addition to describing our framework, we illustrate its expressiveness with several popular scheduling techniques. We present a (mostly) modular approach to extending our schedulers to support cancellation, a mechanism that is essential for implementing eager and speculative parallelism. Finally, we evaluate our framework with a series of benchmarks and an analysis.
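The role cancellation plays for speculative parallelism can be pictured as a tree of cancelable tasks: when a speculation is abandoned, the whole subtree of work it spawned must be stopped. The sketch below, with hypothetical names not drawn from the paper, shows the propagation structure under the assumption that tasks poll a flag cooperatively.

```python
# Sketch: hierarchical cancellation for speculative computation.
# Cancelling a parent marks it and, transitively, every child it spawned;
# cooperatively scheduled tasks would poll `canceled` and stop themselves.
# Class and method names are hypothetical, not the paper's interface.
class Cancelable:
    def __init__(self, parent=None):
        self.canceled = False
        self.children = []
        if parent is not None:
            parent.children.append(self)  # register under the parent

    def cancel(self):
        """Cancel this task and, transitively, every speculative child."""
        self.canceled = True
        for child in self.children:
            child.cancel()

root = Cancelable()
spec = Cancelable(parent=root)    # a speculative subcomputation
nested = Cancelable(parent=spec)  # speculation nested inside speculation

root.cancel()
print(nested.canceled)  # True: cancellation propagates down the hierarchy
```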

Supplemental Material

Video

