Information Aggregation for Constrained Online Control

Published: 04 June 2021

Abstract

This paper considers an online control problem involving two controllers. A central controller chooses an action from a feasible set that is determined by time-varying and coupling constraints, which depend on all past actions and states. The central controller's goal is to minimize the cumulative cost; however, it has direct access to neither the feasible set nor the dynamics, both of which are determined by a remote local controller. Instead, the central controller receives only an aggregate summary of the feasibility information from the local controller, which does not know the system costs. We show that an online algorithm using this aggregated feasibility information can nearly match the dynamic regret of an online algorithm using perfect information whenever the feasible sets satisfy a causal invariance criterion and the prediction window is sufficiently large. To do so, we use a form of feasibility aggregation based on entropic maximization in combination with a novel online algorithm, named Penalized Predictive Control (PPC), and demonstrate that the aggregated information can be efficiently learned using reinforcement learning algorithms. The effectiveness of our approach for closed-loop coordination between central and local controllers is validated via an electric vehicle charging application in power systems.
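
The abstract does not spell out the aggregation rule or the PPC objective, so the following is only a toy sketch of the two-controller loop under stated assumptions, not the paper's algorithm. It assumes a single charging session with discrete rates, a hard requirement that the delivered energy equals the demand by the final step, a feasibility summary in which each action's weight is proportional to the number of feasible futures it leaves open (the marginal of a uniform, hence maximum-entropy, distribution over feasible trajectories), and a central objective of the form cost minus `beta` times the log of that weight. All names (`ACTIONS`, `feasibility_summary`, `penalized_action`, `run`) are hypothetical.

```python
import math
from functools import lru_cache

ACTIONS = (0, 1, 2)  # hypothetical discrete charging rates (energy units per step)


@lru_cache(maxsize=None)
def count_completions(remaining_energy, remaining_steps):
    """Count action sequences of the given length that deliver exactly the remaining energy."""
    if remaining_energy < 0:
        return 0
    if remaining_steps == 0:
        return 1 if remaining_energy == 0 else 0
    return sum(
        count_completions(remaining_energy - u, remaining_steps - 1) for u in ACTIONS
    )


def feasibility_summary(remaining_energy, remaining_steps):
    """Local controller (assumed rule): weight each current action by its share of feasible futures."""
    counts = [
        count_completions(remaining_energy - u, remaining_steps - 1) for u in ACTIONS
    ]
    total = sum(counts)
    if total == 0:
        raise ValueError("no feasible trajectory remains")
    return [c / total for c in counts]


def penalized_action(price, summary, beta):
    """Central controller (assumed objective): minimize price * u - beta * log p(u)."""
    best_u, best_value = None, math.inf
    for u, p in zip(ACTIONS, summary):
        if p == 0:
            continue  # zero weight means the action leaves no feasible future
        value = price * u - beta * math.log(p)
        if value < best_value:
            best_u, best_value = u, value
    return best_u


def run(prices, demand, beta):
    """Closed loop: the central side sees prices and summaries, never the constraint itself."""
    remaining, schedule = demand, []
    for t, price in enumerate(prices):
        summary = feasibility_summary(remaining, len(prices) - t)
        u = penalized_action(price, summary, beta)
        schedule.append(u)
        remaining -= u
    return schedule


if __name__ == "__main__":
    print(run(prices=[3.0, 1.0, 2.0, 1.0], demand=4, beta=0.1))
```

In this sketch the central controller only ever picks actions with positive weight, so the schedule always meets the demand even though the constraint is known only to the local side; `beta` trades off immediate cost against keeping many feasible futures open.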
