Abstract
We present a novel abstraction for bounding the Clarke Jacobian of a Lipschitz continuous, but not necessarily differentiable, function over a local input region. To do so, we leverage an abstract domain built upon dual numbers, adapted to soundly over-approximate all first derivatives needed to compute the Clarke Jacobian. We formally prove that our forward-mode dual interval evaluation produces a sound, interval domain-based over-approximation of the true Clarke Jacobian for a given input region.
Due to the generality of our formalism, we can compute and analyze interval Clarke Jacobians for a broader class of functions than previous works supported: specifically, arbitrary compositions of neural networks with Lipschitz, but non-differentiable, perturbations. We implement our technique in a tool called DeepJ and evaluate it on multiple deep neural networks and non-differentiable input perturbations to showcase both the generality and scalability of our analysis. Concretely, we obtain interval Clarke Jacobians to analyze the Lipschitz robustness and local optimization landscapes of both fully-connected and convolutional neural networks under rotation, contrast variation, and haze perturbations, as well as their compositions.
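The forward-mode dual interval evaluation described in the abstract can be sketched in a few lines. The code below is an illustrative reconstruction under our own assumptions, not the DeepJ implementation: the `DualInterval` class, its ReLU rule, and the example function are hypothetical. Each abstract value pairs an interval bounding the function's outputs with an interval bounding all Clarke subgradients over the input region; at ReLU's kink, where the Clarke subdifferential is [0, 1], the derivative interval joins both branches.

```python
# A minimal sketch (illustrative, not the authors' implementation) of
# forward-mode dual-interval evaluation: each value carries an interval
# for the function's range and an interval over-approximating its
# Clarke subgradients on the input region.

class DualInterval:
    def __init__(self, lo, hi, dlo, dhi):
        self.lo, self.hi = lo, hi      # bounds on the value
        self.dlo, self.dhi = dlo, dhi  # bounds on the derivative

    def __add__(self, other):
        # sum rule: both the value and derivative intervals add
        return DualInterval(self.lo + other.lo, self.hi + other.hi,
                            self.dlo + other.dlo, self.dhi + other.dhi)

    def __mul__(self, other):
        # interval product for the value
        vs = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        # product rule u*v' + u'*v, each term by interval multiplication
        t1 = [self.lo * other.dlo, self.lo * other.dhi,
              self.hi * other.dlo, self.hi * other.dhi]
        t2 = [self.dlo * other.lo, self.dlo * other.hi,
              self.dhi * other.lo, self.dhi * other.hi]
        return DualInterval(min(vs), max(vs),
                            min(t1) + min(t2), max(t1) + max(t2))

def relu(x: DualInterval) -> DualInterval:
    # ReLU is Lipschitz but non-differentiable at 0, where its Clarke
    # subdifferential is [0, 1]; whenever 0 may lie in the value
    # interval, the derivative interval must cover both branches.
    lo, hi = max(x.lo, 0.0), max(x.hi, 0.0)
    if x.lo > 0:        # strictly in the identity region
        dlo, dhi = x.dlo, x.dhi
    elif x.hi < 0:      # strictly in the flat region
        dlo, dhi = 0.0, 0.0
    else:               # region straddles 0: join both branches
        dlo, dhi = min(0.0, x.dlo), max(0.0, x.dhi)
    return DualInterval(lo, hi, dlo, dhi)

# Evaluate f(x) = relu(x) * x over the input region [-1, 2], seeding the
# input's derivative interval with [1, 1].
x = DualInterval(-1.0, 2.0, 1.0, 1.0)
y = relu(x) * x
print(y.dlo, y.dhi)  # sound bound on the Clarke Jacobian of f over [-1, 2]
```

On this example the true Clarke derivative range is [0, 4] (f equals x² for x > 0 and 0 for x ≤ 0), and the analysis returns the looser but sound enclosure [-1, 4]: interval over-approximation trades precision for soundness.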
A dual number abstraction for static analysis of Clarke Jacobians