Worst-case Satisfaction of STL Specifications Using Feedforward Neural Network Controllers: A Lagrange Multipliers Approach

Published: 08 October 2019

Abstract

This paper proposes a reinforcement learning approach for designing feedback neural network controllers for nonlinear systems. Given a Signal Temporal Logic (STL) specification that the system must satisfy over a set of initial conditions, the neural network parameters are tuned to maximize the satisfaction of the STL formula. The framework is based on a max-min formulation of the robustness of the STL formula: the maximization is solved through a Lagrange multipliers method, while the minimization corresponds to a falsification problem. We present results on a vehicle model and a quadrotor model and demonstrate that our approach reduces training time by more than 50 percent compared to the baseline approach.
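
To make the max-min structure concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical scalar system, a proportional gain standing in for the neural network parameters, and a grid search in place of both the Lagrange multipliers step (outer maximization) and the falsifier (inner minimization). All function names, the dynamics, and the specification "eventually |x| < 0.1" are illustrative assumptions.

```python
def simulate(x0, gain, steps=20):
    """Roll out the assumed toy system x[t+1] = x[t] + 0.1 * u with u = -gain * x[t]."""
    traj = [x0]
    for _ in range(steps):
        u = -gain * traj[-1]
        traj.append(traj[-1] + 0.1 * u)
    return traj


def robustness_eventually_near_zero(traj, eps=0.1):
    """Robustness of 'eventually |x| < eps': the best margin eps - |x| along the trajectory.

    A positive value means the specification is satisfied; a negative value means it is violated.
    """
    return max(eps - abs(x) for x in traj)


def worst_case_robustness(gain, initial_conditions):
    """Inner minimization: the lowest robustness over the sampled initial conditions.

    A falsifier would search the initial set for this minimum; here it is a plain scan.
    """
    return min(
        robustness_eventually_near_zero(simulate(x0, gain))
        for x0 in initial_conditions
    )


# Sampled initial conditions and candidate controller parameters (both assumed).
initial_conditions = [-1.0, -0.5, 0.0, 0.5, 1.0]
candidate_gains = [0.5, 1.0, 2.0]

# Outer maximization: pick the parameter whose worst case is best.
best_gain = max(
    candidate_gains,
    key=lambda g: worst_case_robustness(g, initial_conditions),
)
print(best_gain, worst_case_robustness(best_gain, initial_conditions))
```

In this toy setting the controller parameter is chosen by its worst-case robustness rather than its average, which is the point of the max-min formulation: a parameter is acceptable only if no initial condition in the set violates the specification.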
