Online Primal-Dual Mirror Descent under Stochastic Constraints

Abstract
We consider online convex optimization with stochastic constraints, where the objective functions are arbitrarily time-varying and the constraint functions are independent and identically distributed (i.i.d.) over time. Both the objective and constraint functions are revealed only after the decision is made at each time slot. The best known expected regret for this problem is $\mathcal{O}(\sqrt{T})$, with a coefficient that is polynomial in the dimension of the decision variable and relies on the Slater condition (i.e., the assumption that an interior point exists), which is restrictive and in particular precludes equality constraints. In this paper, we show that the Slater condition is in fact not needed. We propose a new primal-dual mirror descent algorithm and show that it attains $\mathcal{O}(\sqrt{T})$ regret and constraint violation under a much weaker Lagrange multiplier assumption, allowing general equality constraints and significantly relaxing the previous Slater conditions. Along the way, for the case where decisions lie in a probability simplex, we reduce the coefficient to have only a logarithmic dependence on the dimension of the decision variable. Such a dependence has long been known in the mirror descent literature but appears to be new in this constrained online learning setting. Simulation experiments on a data center server provisioning problem with real electricity price traces further demonstrate the performance of the proposed algorithm.
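To make the setting concrete, below is a minimal Python sketch of one plausible online primal-dual mirror descent loop on the probability simplex, assuming an entropic mirror map for the primal step and a virtual-queue update for the dual variable. The function names (`entropic_md_step`, `online_primal_dual_md`), the step sizes `eta` and `gamma`, and the exact queue update rule are illustrative assumptions, not the paper's algorithm as stated.

```python
import numpy as np

def entropic_md_step(x, grad, eta):
    """One entropic mirror descent step on the probability simplex:
    x_next proportional to x * exp(-eta * grad)."""
    w = x * np.exp(-eta * (grad - grad.max()))  # shift exponent for numerical stability
    return w / w.sum()

def online_primal_dual_md(f_grads, g_funcs, g_grads, d, T, eta, gamma):
    """Hypothetical sketch of online primal-dual mirror descent.

    f_grads[t](x): gradient of the (adversarial) objective at round t.
    g_funcs[t](x), g_grads[t](x): the i.i.d. stochastic constraint value
        and its gradient at round t, revealed only after x is played.
    Returns the sequence of decisions x_1, ..., x_T.
    """
    x = np.full(d, 1.0 / d)  # start at the uniform distribution
    Q = 0.0                  # dual variable / virtual queue for the constraint
    decisions = []
    for t in range(T):
        decisions.append(x.copy())
        # After playing x, the round-t objective and constraint are revealed.
        # Primal step: mirror descent on the instantaneous Lagrangian gradient.
        lagrangian_grad = f_grads[t](x) + Q * g_grads[t](x)
        x_next = entropic_md_step(x, lagrangian_grad, eta)
        # Dual step: accumulate observed constraint violation in the queue.
        Q = max(0.0, Q + gamma * g_funcs[t](x))
        x = x_next
    return decisions
```

Because the primal update is multiplicative over the simplex, its analysis naturally yields the logarithmic (rather than polynomial) dependence on the dimension mentioned in the abstract; the dual queue tracks cumulative constraint violation and plays the role of a Lagrange multiplier estimate.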