Abstract
We consider a large distributed service system consisting of n homogeneous servers with infinite-capacity FIFO queues. Jobs arrive as a Poisson process of rate λn/k_n (for some positive constant λ and integer k_n). Each incoming job consists of k_n identical tasks that can be executed in parallel and that can be encoded, by introducing redundancy, into at least k_n "replicas" of the same size, so that the job is considered complete when any k_n replicas associated with it finish their service. Moreover, we assume that servers can experience random slowdowns in their processing rate, so that the service time of a replica is the product of its size and a random slowdown. First, we assume that the server slowdowns are shifted exponential and independent of the replica sizes. In this setting we show that the delay of a typical job is asymptotically minimized (as $n\to\infty$) when the number of replicas per task is a constant that depends only on the arrival rate λ and on the expected slowdown of servers. Second, we introduce a new model for the server slowdowns in which larger tasks experience less variable slowdowns than smaller tasks. In this setting we show that, within the class of policies where all replicas start their service at the same time, the delay of a typical job is asymptotically minimized (as $n\to\infty$) when the number of replicas per task depends on the actual size of the tasks being replicated, with smaller tasks being replicated more than larger tasks.
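To make the service-time model concrete, the following is a minimal sketch of a single task replicated d times under shifted exponential slowdowns. It is an illustration only, not the paper's analysis: it assumes all d replicas start at once on idle servers, so it ignores queueing and the extra load that replication creates, which is the trade-off the results above address. The names `shift`, `rate`, `d`, and all function names are hypothetical.

```python
import random


def sample_slowdown(shift, rate, rng):
    """Shifted exponential slowdown: a constant shift plus an Exp(rate) term."""
    return shift + rng.expovariate(rate)


def replicated_task_time(size, d, shift, rate, rng):
    """Time until the fastest of d replicas finishes.

    Assumption (not from the source): all replicas start together on idle
    servers. Each replica takes size * slowdown, with independent slowdowns.
    """
    return min(size * sample_slowdown(shift, rate, rng) for _ in range(d))


def mean_task_time(size, d, shift, rate, trials=20_000, seed=0):
    """Monte Carlo estimate of the expected completion time of one task."""
    rng = random.Random(seed)
    total = sum(
        replicated_task_time(size, d, shift, rate, rng) for _ in range(trials)
    )
    return total / trials


def exact_mean(size, d, shift, rate):
    """Closed form: the minimum of d i.i.d. Exp(rate) variables is Exp(d * rate)."""
    return size * (shift + 1.0 / (d * rate))


if __name__ == "__main__":
    for d in (1, 2, 3, 4):
        estimate = mean_task_time(size=1.0, d=d, shift=1.0, rate=2.0)
        print(d, round(estimate, 3), round(exact_mean(1.0, d, 1.0, 2.0), 3))
```

In this idealized setting, adding replicas shrinks only the exponential part of the service time, since the shift is paid regardless of d. Each replica also occupies a server, so the benefit has to be weighed against the added load on the queues.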
Delay-optimal Policies in Partial Fork-Join Systems with Redundancy and Random Slowdowns