Abstract
Motivated by applications in machine learning and statistics, we study distributed optimization problems over a network of processors, where the goal is to optimize a global objective composed of a sum of local functions. Because the data sets in these problems are large, the data and computation must be distributed across processors, which calls for distributed algorithms. In this paper, we consider a popular distributed gradient-based consensus algorithm, which requires only local computation and communication. An important problem in this area is to analyze the convergence rate of such algorithms in the presence of communication delays, which are inevitable in distributed systems. We prove the convergence of the gradient-based consensus algorithm in the presence of uniform, but possibly arbitrarily large, communication delays between the processors. Moreover, we obtain an upper bound on the rate of convergence of the algorithm as a function of the network size, topology, and the inter-processor communication delays.
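To make the setting concrete, the following is a minimal sketch of one common form of a gradient-based consensus update with a uniform communication delay. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name `delayed_consensus_gradient`, the ring topology, the mixing weights (1/2 on a node's own value, 1/4 on each neighbor), the quadratic local functions f_i(x) = 0.5 (x - b_i)^2, and the constant step size `alpha` are all hypothetical choices made for this example. Each node mixes its own current value with neighbor values that are `tau` steps old, then takes a step along its local gradient.

```python
from collections import deque


def delayed_consensus_gradient(b, tau, alpha, num_iters):
    """Illustrative sketch (assumed setup, not the paper's exact method).

    Nodes sit on a ring; node i holds f_i(x) = 0.5 * (x - b[i])**2, so the
    minimizer of the sum of all f_i is the mean of b. Neighbor values arrive
    with a uniform delay of `tau` iterations.
    """
    n = len(b)  # assumes n >= 3 so that each node has two distinct neighbors
    # history[0] is the newest iterate x(k); history[-1] is x(k - tau).
    history = deque([[0.0] * n for _ in range(tau + 1)], maxlen=tau + 1)
    for _ in range(num_iters):
        current = history[0]
        delayed = history[-1]
        new = []
        for i in range(n):
            left, right = (i - 1) % n, (i + 1) % n
            # Consensus step: own value is fresh, neighbor values are delayed.
            mixed = 0.5 * current[i] + 0.25 * (delayed[left] + delayed[right])
            # Local gradient of f_i evaluated at the node's own current value.
            grad = current[i] - b[i]
            new.append(mixed - alpha * grad)
        # With maxlen set, appendleft drops the oldest iterate automatically.
        history.appendleft(new)
    return history[0]


# Five nodes, delay of three iterations, small constant step size.
x = delayed_consensus_gradient([1.0, 2.0, 3.0, 4.0, 5.0], tau=3, alpha=0.01, num_iters=5000)
```

In this toy instance the network average of the returned values matches the minimizer (the mean of `b`, here 3.0), while individual nodes settle in a small neighborhood of it whose size shrinks with `alpha`. That residual disagreement is a property of the constant step size assumed here, and says nothing about the step-size rule analyzed in the paper.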
On the Convergence Rate of Distributed Gradient Methods for Finite-Sum Optimization under Communication Delays
SIGMETRICS '18: Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems