Abstract
This paper proposes a new forward reachability analysis approach to verify the safety of cyber-physical systems (CPS) with reinforcement learning controllers. Our approach is founded on two efficient reachability algorithms, one exact and one over-approximate, for neural network control systems. Both use star sets, an efficient representation of polyhedra. Using these algorithms, we determine the initial conditions for which a safety-critical system with a neural network controller is safe by incrementally searching for a critical initial condition at which the safety of the system cannot be established. Our approach yields tight over-approximations and is computationally efficient, which allows it to be applied to practical CPS with learning-enabled components (LECs). We implement our approach in NNV, a recent verification tool for neural networks and neural network control systems, and evaluate its advantages and applicability by verifying the safety of a practical Advanced Emergency Braking System (AEBS) with a reinforcement learning (RL) controller trained using the deep deterministic policy gradient (DDPG) method. The experimental results show that our new reachability algorithms are much less conservative than existing polyhedra-based approaches. We successfully determine the entire region of initial conditions of the AEBS with the RL controller for which the safety of the system is guaranteed, whereas a polyhedra-based approach cannot prove the safety properties of the system.
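The incremental search described above can be illustrated with a minimal sketch. It is not the paper's algorithm or NNV's API: it assumes a toy one-dimensional braking model with constant deceleration in place of the RL controller, and it uses plain interval bounds in place of star sets. All names and parameter values (`is_provably_safe`, `find_critical_cell`, `decel`, `dt`, `cell`) are hypothetical.

```python
def is_provably_safe(d_lo, v_hi, decel=5.0, dt=0.1, max_steps=10_000):
    """Over-approximate forward reachability for a toy braking model.

    d_lo: lower bound on the initial gap to the obstacle (m).
    v_hi: upper bound on the initial speed (m/s).
    Returns True only if the worst-case gap stays positive until the
    worst-case speed reaches zero. False means "safety not established".
    """
    for _ in range(max_steps):
        if d_lo <= 0.0:
            return False            # worst-case trajectory may hit the obstacle
        if v_hi <= 0.0:
            return True             # every trajectory has stopped with gap > 0
        d_lo -= v_hi * dt           # the fastest car closes the gap the most
        v_hi = max(0.0, v_hi - decel * dt)
    return False                    # no conclusion within the step budget


def find_critical_cell(d0, v_max, cell=1.0):
    """Sweep initial-speed cells [v, v + cell] from low to high speed.

    Returns the lower edge of the first cell whose safety cannot be
    established, or None if every cell up to v_max is provably safe.
    """
    for i in range(int(v_max / cell)):
        v_lo = i * cell
        if not is_provably_safe(d0, v_lo + cell):
            return v_lo
    return None


if __name__ == "__main__":
    critical = find_critical_cell(d0=50.0, v_max=40.0)
    print("first initial-speed cell that cannot be proven safe starts at:", critical)
```

Every initial speed below the returned cell is provably safe under this toy model. The returned cell itself is only "not proven safe", because interval bounds over-approximate the reachable set. That conservativeness is what tighter set representations such as star sets aim to reduce.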
Index Terms
Safety Verification of Cyber-Physical Systems with Reinforcement Learning Control