Abstract
Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load balancing mechanisms, have emerged as major concerns. Motivated by these issues, we introduce and analyze a novel class of load balancing schemes where the various servers provide occasional queue updates to guide the load assignment.
We show that the proposed schemes strongly outperform JSQ(d) strategies with comparable communication overhead per job, and can achieve a vanishing waiting time in the many-server limit with just one message per job, just like the popular JIQ scheme. The proposed schemes are, however, particularly geared towards the sparse feedback regime with fewer than one message per job, where they outperform corresponding sparsified JIQ versions.
We investigate fluid limits for synchronous updates as well as asynchronous exponential update intervals. The fixed point of the fluid limit is identified in the latter case, and used to derive the queue length distribution. We also demonstrate that in the ultra-low feedback regime the mean stationary waiting time tends to a constant in the synchronous case, but grows without bound in the asynchronous case.
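The abstract does not spell out the mechanics of the schemes, so the following is only a toy sketch of one plausible reading of the synchronous case, not the authors' algorithm. It assumes a single dispatcher that keeps an estimate of every queue length, sends each job to the server with the smallest estimate, increments that estimate, and overwrites all estimates with the true queue lengths whenever the servers report simultaneously. Poisson arrivals, unit-rate exponential service, and every name here (`simulate`, `update_interval`, and so on) are illustrative assumptions.

```python
import random


def simulate(n=20, load=0.7, update_interval=2.0, horizon=500.0, seed=0):
    """Toy simulation of routing on stale queue estimates (assumed model).

    Returns (time-average jobs per server, update messages per job).
    """
    rng = random.Random(seed)
    queues = [0] * n       # true queue lengths, known only to the servers
    estimates = [0] * n    # dispatcher's view, refreshed only at updates
    t, next_update = 0.0, update_interval
    jobs = messages = 0
    area = 0.0             # time integral of the total number of jobs

    while t < horizon:
        busy = [i for i, q in enumerate(queues) if q > 0]
        rate = n * load + len(busy)  # arrival rate + total departure rate
        dt = rng.expovariate(rate)

        if t + dt >= next_update:
            # Synchronous update: all n servers report their queue length.
            # Discarding dt and resampling is valid by memorylessness.
            area += sum(queues) * (next_update - t)
            t = next_update
            estimates = list(queues)
            messages += n
            next_update += update_interval
            continue

        area += sum(queues) * dt
        t += dt
        if rng.random() < n * load / rate:
            # Arrival: join the queue with the smallest estimate.
            target = min(range(n), key=estimates.__getitem__)
            queues[target] += 1
            estimates[target] += 1
            jobs += 1
        else:
            # Departure from a uniformly chosen busy server; the dispatcher
            # does not learn about it until the next update.
            queues[rng.choice(busy)] -= 1

    return area / (n * t), messages / max(jobs, 1)


if __name__ == "__main__":
    mean_jobs, msgs_per_job = simulate()
    print(f"mean jobs per server: {mean_jobs:.2f}")
    print(f"messages per job:     {msgs_per_job:.2f}")
```

Under these assumptions the message rate is roughly `1 / (load * update_interval)` per job, so any `update_interval` above `1 / load` falls in the sparse feedback regime of fewer than one message per job.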
Hyper-Scalable JSQ with Sparse Feedback
SIGMETRICS '19: Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems