Abstract
We study a variant of online optimization in which the learner receives k-rounddelayed feedback about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner. Our main result shows that a novel Iterative Regularized Online Balanced Descent (iROBD) algorithm has a constant, dimension-free competitive ratio that is $O(L^2k )$, where L is the Lipschitz constant of the switching cost. Additionally, we provide lower bounds that illustrate the Lipschitz condition is required and the dependencies on k and L are tight. Finally, via reductions, we show that this setting is closely related to online control problems with delay, nonlinear dynamics, and adversarial disturbances, where iROBD directly offers constant-competitive online policies.
- Naman Agarwal, Brian Bullins, Elad Hazan, Sham M Kakade, and Karan Singh. 2019. Online control with adversarial disturbances. In International Conference on Machine Learning (ICML).Google Scholar
- Naman Agarwal, Elad Hazan, and Karan Singh. 2019. Logarithmic Regret for Online Control. Advances in Neural Information Processing Systems 32 (2019), 10175--10184.Google Scholar
- Oren Anava, Elad Hazan, and Shie Mannor. 2015. Online learning for adversaries with memory: price of past mistakes. In Advances in Neural Information Processing Systems. 784--792.Google Scholar
- Antonios Antoniadis and Kevin Schewior. 2018. A Tight Lower Bound for Online Convex Optimization with Switching Costs. In Approximation and Online Algorithms, Roberto Solis-Oba and Rudolf Fleischer (Eds.). Springer International Publishing, Cham, 164--175.Google Scholar
- CJ Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang. 2020. Chasing convex bodies with linear competitive ratio. In SIAM Symposium on Discrete Algorithms (SODA).Google Scholar
Cross Ref
- Masoud Badiei, Na Li, and Adam Wierman. 2015. Online convex optimization with ramp constraints. In IEEE Conference on Decision and Control (CDC).Google Scholar
Cross Ref
- Nikhil Bansal, Anupam Gupta, Ravishankar Krishnaswamy, Kirk Pruhs, Kevin Schewior, and Cliff Stein. 2015. A 2-Competitive Algorithm For Online Convex Optimization With Switching Costs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 40), Naveen Garg, Klaus Jansen, Anup Rao, and José D. P. Rolim (Eds.). Schloss Dagstuhl-- Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 96--109.Google Scholar
- Nataly Brukhim, Xinyi Chen, Elad Hazan, and Shay Moran. 2020. Online agnostic boosting via regret minimization. arXiv preprint arXiv:2003.01150 33 (2020), 644--654.Google Scholar
- Sébastien Bubeck, Bo'az Klartag, Yin Tat Lee, Yuanzhi Li, and Mark Sellke. 2020. Chasing nested convex bodies nearly optimally. In SIAM Symposium on Discrete Algorithms (SODA).Google Scholar
Cross Ref
- Sébastien Bubeck, Yin Tat Lee, Yuanzhi Li, and Mark Sellke. 2019. Competitively chasing convex bodies. In ACM Symposium on Theory of Computing (STOC).Google Scholar
Digital Library
- Niangjun Chen, Anish Agarwal, Adam Wierman, Siddharth Barman, and Lachlan L.H. Andrew. 2015. Online Convex Optimization Using Predictions. SIGMETRICS Perform. Eval. Rev. 43, 1 (June 2015), 191--204.Google Scholar
Digital Library
- Niangjun Chen, Joshua Comden, Zhenhua Liu, Anshul Gandhi, and Adam Wierman. 2016. Using predictions in online optimization: Looking forward with an eye on the past. ACM SIGMETRICS Performance Evaluation Review 44, 1 (2016), 193--206.Google Scholar
Digital Library
- Niangjun Chen, Gautam Goel, and Adam Wierman. 2018. Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent. In Proceedings of Conference On Learning Theory (COLT). 1574--1594.Google Scholar
- Shiyao Chen and Lang Tong. 2012. iEMS for large scale charging of electric vehicles: Architecture and optimal online scheduling. In 2012 IEEE Third International Conference on Smart Grid Communications (SmartGridComm). IEEE, 629--634.Google Scholar
Cross Ref
- Gautam Goel, Yiheng Lin, Haoyuan Sun, and Adam Wierman. 2019. Beyond online balanced descent: An optimal algorithm for smoothed online optimization. Advances in Neural Information Processing Systems 32 (2019), 1875--1885.Google Scholar
- Gautam Goel and Adam Wierman. 2019. An online algorithm for smoothed regression and lqr control. In The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 2504--2513.Google Scholar
- Paula Gradu, Elad Hazan, and Edgar Minasyan. 2020. Adaptive regret for control of time-varying dynamics. arXiv preprint arXiv:2007.04393 (2020).Google Scholar
- Pooria Joulani, Andras Gyorgy, and Csaba Szepesvári. 2013. Online learning under delayed feedback. In International Conference on Machine Learning. PMLR, 1453--1461.Google Scholar
- Seung-Jun Kim and Geogios B Giannakis. 2016. An online convex optimization approach to real-time energy pricing for demand response. IEEE Transactions on Smart Grid 8, 6 (2016), 2784--2793.Google Scholar
Cross Ref
- Yingying Li, Xin Chen, and Na Li. 2019. Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis. In Advances in Neural Information Processing Systems. 14858--14870.Google Scholar
- Yingying Li, Guannan Qu, and Na Li. 2018. Using predictions in online optimization with switching costs: A fast algorithm and a fundamental limit. In 2018 Annual American Control Conference (ACC). IEEE, 3008--3013.Google Scholar
Cross Ref
- Yingying Li, Guannan Qu, and Na Li. 2020. Online optimization with predictions and switching costs: Fast algorithms and the fundamental limit. IEEE Trans. Automat. Control (2020).Google Scholar
- Minghong Lin, Zhenhua Liu, Adam Wierman, and Lachlan LH Andrew. 2012. Online algorithms for geographical load balancing. In Proceedings of the International Green Computing Conference (IGCC). 1--10. Proc. ACM Meas. Anal. Comput. Syst., Vol. 6, No. 1, Article 17. Publication date: March 2022. 17:22 Weici Pan et al.Google Scholar
Digital Library
- Yiheng Lin, Yang Hu, Haoyuan Sun, Guanya Shi, Guannan Qu, and Adam Wierman. 2021. Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems. arXiv preprint arXiv:2106.10497 (2021).Google Scholar
- Zhenhua Liu, Iris Liu, Steven Low, and Adam Wierman. 2014. Pricing data center demand response. ACM SIGMETRICS Performance Evaluation Review 42, 1 (2014), 111--123.Google Scholar
Digital Library
- David Luenberger. 1967. Canonical forms for linear multivariable systems. IEEE Trans. Automat. Control 12, 3 (1967), 290--293.Google Scholar
Cross Ref
- Mark Sellke. 2020. Chasing convex bodies optimally. In SIAM Symposium on Discrete Algorithms (SODA).Google Scholar
Cross Ref
- Ohad Shamir and Liran Szlak. 2017. Online learning with local permutations and delayed feedback. In International Conference on Machine Learning. PMLR, 3086--3094.Google Scholar
- Guanya Shi, Kamyar Azizzadenesheli, Soon-Jo Chung, and Yisong Yue. 2021. Meta-Adaptive Nonlinear Control: Theory and Algorithms. Thirty-fifth Conference on Neural Information Processing Systems (2021).Google Scholar
- Guanya Shi,Wolfgang Hönig, Xichen Shi, Yisong Yue, and Soon-Jo Chung. 2021. Neural-swarm2: Planning and control of heterogeneous multirotor swarms using learned interactions. IEEE Transactions on Robotics (2021).Google Scholar
- Guanya Shi, Yiheng Lin, Soon-Jo Chung, Yisong Yue, and Adam Wierman. 2020. Online Optimization with Memory and Competitive Control. In Thirty-fourth Conference on Neural Information Processing Systems.Google Scholar
- Guanya Shi, Xichen Shi, Michael O'Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, and Soon-Jo Chung. 2019. Neural lander: Stable drone landing control using learned dynamics. In 2019 International Conference on Robotics and Automation (ICRA). IEEE, 9784--9790.Google Scholar
Digital Library
- Max Simchowitz, Karan Singh, and Elad Hazan. 2020. Improper learning for non-stochastic control. In Conference on Learning Theory. PMLR, 3320--3436.Google Scholar
- Jean-Jacques E Slotine, Weiping Li, et al. 1991. Applied nonlinear control. Vol. 199. Prentice hall Englewood Cliffs, NJ.Google Scholar
- Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, and Adam Wierman. 2020. Competitive Control with Delayed Imperfect Information. arXiv preprint arXiv:2010.11637 (2020).Google Scholar
- Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, and Adam Wierman. 2020. The Power of Predictions in Online Control. Advances in Neural Information Processing Systems 33 (2020).Google Scholar
- Kemin Zhou, John Comstock Doyle, Keith Glover, et al. 1996. Robust and optimal control. Vol. 40. Prentice hall New Jersey.Google Scholar
Index Terms
Online Optimization with Feedback Delay and Nonlinear Switching Cost
Recommendations
Online Optimization with Feedback Delay and Nonlinear Switching Cost
SIGMETRICS/PERFORMANCE '22: Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer SystemsWe study a variant of online optimization in which the learner receives k-round delayed feedback about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner. Our main ...
Online Optimization with Feedback Delay and Nonlinear Switching Cost
SIGMETRICS '22We study a variant of online optimization in which the learner receives k-round delayed feedback about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner. Our main ...
Online linear optimization and adaptive routing
This paper studies an online linear optimization problem generalizing the multi-armed bandit problem. Motivated primarily by the task of designing adaptive routing algorithms for overlay networks, we present two randomized online algorithms for ...






Comments