
Towards Optimality in Parallel Scheduling

Published: 19 December 2017

Abstract

To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single-core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively leverage these multi-core chips, one must address the question of how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is an obvious tradeoff: allocating more cores to an individual job reduces that job's runtime, but in turn decreases the efficiency of the overall system. We ask how the system should schedule jobs across cores so as to minimize the mean response time over a stream of incoming jobs.
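
The tradeoff can be made concrete with a small worked example. The sketch below is illustrative only: the speedup curve s(k) = k^0.5, the two unit-size jobs, and the four cores are assumptions chosen for the arithmetic, not values taken from the article. It compares running the jobs one after another on all cores with splitting the cores evenly between them.

```python
def speedup(k: float, p: float = 0.5) -> float:
    """Hypothetical sublinear speedup curve s(k) = k**p with 0 < p < 1 (an assumption)."""
    return k ** p


def mean_response_sequential(sizes, n_cores):
    """Run jobs one at a time, each on all n_cores; return the mean completion time."""
    clock = 0.0
    total = 0.0
    for size in sizes:
        clock += size / speedup(n_cores)  # runtime of this job on all cores
        total += clock                    # its response time is the current clock
    return total / len(sizes)


def mean_response_even_split(sizes, n_cores):
    """Give each job n_cores / len(sizes) cores for its whole run.

    Simplification: cores freed by a finished job are not reallocated.
    """
    share = n_cores / len(sizes)
    return sum(size / speedup(share) for size in sizes) / len(sizes)


sizes = [1.0, 1.0]  # two jobs of equal size (assumed)
seq = mean_response_sequential(sizes, n_cores=4)
split = mean_response_even_split(sizes, n_cores=4)
print(f"sequential: {seq:.4f}, even split: {split:.4f}")
```

Under these assumptions the even split gives the lower mean response time (about 0.707 versus 0.75): each job runs more slowly than it would on all four cores, but the cores are used more efficiently overall.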

To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. EQUI requires jobs to change their level of parallelization while they run. Since this is not possible for all workloads, we consider a class of "fixed-width" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, it is possible to achieve EQUI's performance without requiring jobs to change their levels of parallelization by using the optimal fixed level of parallelization, k*. We also show how to analytically derive the optimal k* as a function of the system load, the speedup curve, and the job size distribution.
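
The difference between the two kinds of policy can be sketched as a pair of allocation functions. This is a minimal illustration, not the article's model: the speedup curve k^0.5 and the function names are hypothetical, and the abstract does not say what a fixed-width policy does with jobs beyond the n/k it can run at once, so the sketch simply assumes they wait with zero service rate.

```python
def speedup(k: float, p: float = 0.5) -> float:
    """Hypothetical sublinear speedup curve s(k) = k**p (an assumption)."""
    return k ** p


def equi_rates(n_cores: int, n_jobs: int, s=speedup) -> list:
    """EQUI: divide the cores evenly across all jobs currently in the system.

    Returns each job's service rate. Fractional core shares are allowed here.
    """
    if n_jobs == 0:
        return []
    share = n_cores / n_jobs
    return [s(share)] * n_jobs


def fixed_width_rates(n_cores: int, n_jobs: int, k: int, s=speedup) -> list:
    """Fixed-width: every running job gets exactly k cores.

    Assumption: at most n_cores // k jobs run at once and the rest wait.
    """
    slots = n_cores // k
    running = min(n_jobs, slots)
    return [s(k)] * running + [0.0] * (n_jobs - running)


# With 16 cores and 4 jobs, EQUI and a width of k = 4 coincide.
print(equi_rates(16, 4))
print(fixed_width_rates(16, 4, k=4))
# With 6 jobs, EQUI shrinks every share while fixed-width keeps k = 4.
print(equi_rates(16, 6))
print(fixed_width_rates(16, 6, k=4))
```

EQUI changes every job's share whenever the number of jobs changes, which is why it needs jobs that can alter their parallelization mid-run; the fixed-width allocation never changes a running job's core count.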

In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. In particular, we find that policies like EQUI, which performed well in the case of a single speedup function, now perform poorly. We propose a very simple policy, GREEDY*, which performs near-optimally when compared to the numerically derived optimal policy.

