Research Article · DOI: 10.1145/3534678.3539428

Integrity Authentication in Tree Models

Published: 14 August 2022

Abstract

Tree models are widely used in machine learning and data mining practice. In this paper, we study the problem of model integrity authentication in tree models. In general, model integrity authentication is the design and implementation of mechanisms for checking or detecting whether a model deployed for end-users has been tampered with or compromised, e.g., through malicious modification. We propose an authentication framework that enables model builders/distributors to embed a signature into a tree model and to verify the presence of that signature with only a small number of black-box queries to the model. To the best of our knowledge, this is the first study of signature embedding on tree models. The proposed method simply locates a collection of leaves and modifies their prediction values; it requires neither training/testing data nor re-training. Experiments on a large number of public classification datasets confirm that the proposed signature embedding process has a high success rate while introducing only a minimal accuracy loss.

Supplemental Material

MP4 File
Integrity Authentication in Tree Models. Weijie Zhao, Yingjie Lao, and Ping Li. Code is available at: https://github.com/pltrees/abcboost
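
The abstract describes the mechanism only at a high level: a signature is embedded by perturbing the prediction values of selected leaves, and its presence is later verified through black-box queries. The minimal Python sketch below illustrates that general idea on a single decision tree. It is not the authors' implementation (their code is in the abcboost repository linked above); the class and function names, the keyed-perturbation scheme, and the choice of key inputs are all illustrative assumptions.

```python
# Minimal illustrative sketch (NOT the paper's algorithm or the abcboost code):
# embed a fragile signature into a single decision tree by nudging the
# prediction values of the leaves reached by secret "key inputs", then verify
# the signature later using only black-box predict() queries.
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    feature: Optional[int] = None      # split feature index; None for a leaf
    threshold: float = 0.0             # split threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0                 # leaf prediction value


def leaf_for(root: Node, x: List[float]) -> Node:
    node = root
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node


def predict(root: Node, x: List[float]) -> float:
    return leaf_for(root, x).value


def signature_bit(key: bytes, i: int) -> int:
    # One pseudo-random signature bit per key input, derived from a secret key.
    return hashlib.sha256(key + i.to_bytes(4, "big")).digest()[0] & 1


def embed_signature(root: Node, key_inputs: List[List[float]], key: bytes,
                    eps: float = 1e-3) -> List[float]:
    # Shift each touched leaf's prediction up or down by eps according to the
    # signature bit; return the original values so the owner can verify later.
    # (Assumes the key inputs reach distinct leaves.)
    originals = []
    for i, x in enumerate(key_inputs):
        leaf = leaf_for(root, x)
        originals.append(leaf.value)
        leaf.value += eps if signature_bit(key, i) else -eps
    return originals


def verify_signature(root: Node, key_inputs: List[List[float]],
                     originals: List[float], key: bytes,
                     eps: float = 1e-3) -> bool:
    # Black-box check: query the deployed model on the key inputs and confirm
    # that every embedded perturbation is still present; tampering with any of
    # the signed leaves breaks the check.
    for i, (x, orig) in enumerate(zip(key_inputs, originals)):
        expected = orig + (eps if signature_bit(key, i) else -eps)
        if abs(predict(root, x) - expected) > 1e-9:
            return False
    return True


if __name__ == "__main__":
    # Toy tree: one split on feature 0 at 0.5, two leaves.
    tree = Node(feature=0, threshold=0.5,
                left=Node(value=0.2), right=Node(value=0.8))
    key_inputs = [[0.1], [0.9]]
    secret = b"owner-secret-key"
    originals = embed_signature(tree, key_inputs, secret)
    print(verify_signature(tree, key_inputs, originals, secret))  # True
    tree.right.value = 0.8            # simulate tampering with a signed leaf
    print(verify_signature(tree, key_inputs, originals, secret))  # False
```

Because the perturbations are tiny and tied to secret key inputs, such a signature barely affects accuracy yet is fragile to modifications of the signed leaves, mirroring the trade-off described in the abstract.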


Cited By

  • (2023) Machine Unlearning in Gradient Boosting Decision Trees. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 10.1145/3580305.3599420, pages 1374-1383. Online publication date: 6-Aug-2023.
  • (2022) Marksman Backdoor. Proceedings of the 36th International Conference on Neural Information Processing Systems, 10.5555/3600270.3603042, pages 38260-38273. Online publication date: 28-Nov-2022.
  • (2022) NL2GDPR: Automatically Develop GDPR Compliant Android Application Features from Natural Language. 2022 IEEE Conference on Communications and Network Security (CNS), 10.1109/CNS56114.2022.10273858, pages 1-9. Online publication date: 3-Oct-2022.


Information

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022, 5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. model authentication
    2. security
    3. tree models

Conference

KDD '22

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)
