research-article
Open Access

Fairness in responsive parallelism

Published: 26 July 2019

Abstract

Research on parallel computing has historically revolved around compute-intensive applications drawn from traditional areas such as high-performance computing. With the growing availability and usage of multicore chips, applications of parallel computing now include interactive parallel applications that mix compute-intensive tasks with interaction, e.g., with the user or more generally with the external world. Recent theoretical work on responsive parallelism presents abstract cost models and type systems for ensuring and reasoning about responsiveness and throughput of such interactive parallel programs.

In this paper, we extend prior work by considering a crucial metric: fairness. To express rich interactive parallel programs, we allow programmers to assign priorities to threads and instruct the scheduler to obey a notion of fairness. We then propose the fairly prompt scheduling principle for executing such programs; the principle specifies the schedule for multithreaded programs on multiple processors. For such schedules, we prove theoretical bounds on the execution and response times of jobs of various priorities. In particular, we bound the amount, i.e., stretch, by which a low-priority job can be delayed by higher-priority work. We also present an algorithm designed to approximate the fairly prompt scheduling principle on multicore computers, implement the algorithm by extending the Standard ML language, and present an empirical evaluation.



Published in: Proceedings of the ACM on Programming Languages, Volume 3, Issue ICFP, August 2019, 1054 pages
EISSN: 2475-1421
DOI: 10.1145/3352468

Copyright © 2019 Owner/Author

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher: Association for Computing Machinery, New York, NY, United States
