Safety Verification of Cyber-Physical Systems with Reinforcement Learning Control

Published: 08 October 2019

Abstract

This paper proposes a new forward reachability analysis approach for verifying the safety of cyber-physical systems (CPS) with reinforcement learning controllers. The approach rests on two efficient reachability algorithms for neural network control systems, one exact and one over-approximate, both built on star sets, an efficient representation of polyhedra. Using these algorithms, we determine the initial conditions under which a safety-critical system with a neural network controller is safe by incrementally searching for a critical initial condition at which the safety of the system can no longer be established. The approach yields a tight over-approximation and is computationally efficient, which makes it applicable to practical CPS with learning-enabled components (LECs). We implement the approach in NNV, a recent verification tool for neural networks and neural network control systems, and evaluate its advantages and applicability by verifying the safety of a practical Advanced Emergency Braking System (AEBS) with a reinforcement learning (RL) controller trained using the deep deterministic policy gradient (DDPG) method. The experimental results show that the new reachability algorithms are much less conservative than existing polyhedra-based approaches. We determine the entire region of initial conditions of the AEBS with the RL controller for which safety is guaranteed, whereas a polyhedra-based approach cannot prove the safety properties of the system.
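
The incremental search described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: plain intervals stand in for star sets, a constant braking rate stands in for the RL controller, and the dynamics, function names (`reach_is_safe`, `critical_speed`) and all numbers are assumptions chosen only for illustration. The sketch grows the set of initial speeds until the over-approximate reachable set can no longer be shown to avoid a collision.

```python
# Toy illustration only: intervals replace star sets, and a constant
# braking rate replaces the neural network controller (both assumptions).

def reach_is_safe(d_lo, d_hi, v_lo, v_hi, brake=6.0, dt=0.1, steps=200):
    """Over-approximate the reachable distance/speed intervals step by step.

    Returns True if the lower bound on distance stays positive until the
    vehicle has stopped (or the horizon ends), False otherwise.
    """
    for _ in range(steps):
        # Euler step for distance: the worst case for the lower bound
        # uses the highest possible speed, and vice versa.
        d_lo, d_hi = d_lo - v_hi * dt, d_hi - v_lo * dt
        if d_lo <= 0.0:
            return False  # safety cannot be established
        # Speed decreases at the (hypothetical) braking rate, floored at 0.
        v_lo = max(0.0, v_lo - brake * dt)
        v_hi = max(0.0, v_hi - brake * dt)
        if v_hi == 0.0:
            return True  # every trajectory has stopped before a collision
    return True  # no violation found within the horizon


def critical_speed(d_lo, d_hi, v_min, v_max, step=0.5):
    """Grow the initial speed range [v_min, v] until safety is no longer proven.

    Returns the largest tested upper speed bound that was proven safe,
    or None if even the smallest range fails.
    """
    last_safe = None
    v = v_min
    while v <= v_max:
        if not reach_is_safe(d_lo, d_hi, v_min, v):
            break  # first initial condition where the proof fails
        last_safe = v
        v += step
    return last_safe


if __name__ == "__main__":
    # Initial distance in [50, 60] m, initial speeds starting from 10 m/s.
    print(critical_speed(50.0, 60.0, 10.0, 40.0))
```

Because the check is an over-approximation, a `False` result here only means that safety could not be proven, not that a collision necessarily occurs. A tighter set representation would shrink that gap, which is the role the abstract assigns to star sets.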
