Execution templates: caching control plane decisions for strong scaling of data analytics

Published: 12 July 2017

ABSTRACT

Control planes of cloud frameworks trade off between scheduling granularity and performance. Centralized systems schedule at task granularity, but only schedule a few thousand tasks per second. Distributed systems schedule hundreds of thousands of tasks per second but changing the schedule is costly.

We present execution templates, a control plane abstraction that can schedule hundreds of thousands of tasks per second while supporting fine-grained, per-task scheduling decisions. Execution templates leverage a program's repetitive control flow to cache blocks of frequently-executed tasks. Executing a task in a template requires sending a single message. Large-scale scheduling changes install new templates, while small changes apply edits to existing templates.

Evaluations of execution templates in Nimbus, a data analytics framework, find that they provide the fine-grained scheduling flexibility of centralized control planes while matching the strong scaling of distributed ones. Execution templates support complex, real-world applications, such as a fluid simulation with a triply nested loop and data dependent branches.
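The caching mechanism described above can be illustrated with a small sketch. This is not the Nimbus API; all class and method names here are hypothetical and only model the message-count argument: installing a template costs one message per task once, after which each iteration of the cached block needs a single instantiation message, and small scheduling changes patch the template in place rather than reinstalling it.

```python
# Hypothetical sketch of the execution-template idea (illustrative names,
# not the Nimbus API). A template is a cached block of tasks; the controller
# installs it once, then triggers whole iterations with one message each.

class Template:
    """A cached block of tasks; each task is (task_id, worker, dependencies)."""
    def __init__(self, tasks):
        self.tasks = list(tasks)

    def edit(self, task_id, new_worker):
        """Small scheduling change: patch one task's placement in place."""
        self.tasks = [(tid, new_worker if tid == task_id else w, deps)
                      for tid, w, deps in self.tasks]


class ControlPlane:
    def __init__(self):
        self.templates = {}      # template_id -> Template
        self.messages_sent = 0   # counts controller -> worker messages

    def install(self, template_id, tasks):
        """Large scheduling change: install a new template.
        One-time cost of one message per task in the block."""
        self.templates[template_id] = Template(tasks)
        self.messages_sent += len(tasks)

    def instantiate(self, template_id, params):
        """Steady state: execute every task in the cached block with a
        single message carrying the template id and per-iteration params."""
        self.messages_sent += 1
        tmpl = self.templates[template_id]
        return [(tid, w, params) for tid, w, _ in tmpl.tasks]


cp = ControlPlane()
# Install a 1000-task block spread over 4 workers, then run it 100 times.
cp.install("inner_loop", [("t%d" % i, "worker%d" % (i % 4), [])
                          for i in range(1000)])
for step in range(100):
    cp.instantiate("inner_loop", {"step": step})
print(cp.messages_sent)  # prints 1100: 1000 install messages + 100 instantiations
```

Without the cache, 100 iterations of the 1000-task block would cost 100,000 per-task messages; with it, the control plane sends 1,100, which is the source of the throughput gain the abstract claims.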


Published in

USENIX ATC '17: Proceedings of the 2017 USENIX Annual Technical Conference, July 2017, 811 pages. ISBN 9781931971386.

Publisher: USENIX Association, United States.