DOI: 10.1145/2939672.2939778
Research Article | Public Access

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

Published: 13 August 2016

Abstract

Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one.
In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
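
The recipe sketched in the abstract is: perturb the instance in an interpretable representation, query the black-box classifier on the perturbed samples, weight each sample by its proximity to the original instance, and fit a sparse linear surrogate whose coefficients serve as the explanation. The following is a minimal Python sketch of that idea for a text classifier. It is not the authors' released implementation: the function name explain_text_instance, the Ridge surrogate (the paper selects features with K-LASSO), the word-dropping proximity measure, and parameters such as num_samples and kernel_width are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge

def explain_text_instance(text, f, num_samples=1000, num_features=6,
                          kernel_width=0.25, seed=0):
    # Hypothetical LIME-style sketch, not the authors' reference implementation.
    rng = np.random.default_rng(seed)
    words = text.split()
    d = len(words)

    # Interpretable representation: binary vectors marking which words are kept.
    Z = rng.integers(0, 2, size=(num_samples, d))
    Z[0, :] = 1  # include the unperturbed instance itself

    # Map each binary vector back to a perturbed text and query the black box.
    perturbed = [" ".join(w for w, keep in zip(words, z) if keep) for z in Z]
    y = np.array([f(t) for t in perturbed])  # probability of the explained class

    # Proximity kernel: perturbations that drop fewer words get larger weights.
    distances = 1.0 - Z.sum(axis=1) / d
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # Fit a locally weighted linear surrogate on the interpretable features.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, y, sample_weight=weights)

    # The largest-magnitude coefficients are the explanation.
    top = np.argsort(np.abs(surrogate.coef_))[::-1][:num_features]
    return [(words[i], float(surrogate.coef_[i])) for i in top]

Here f is assumed to be any callable that maps a raw text string to the probability of the class being explained, for example (hypothetically) lambda t: clf.predict_proba(vectorizer.transform([t]))[0, 1] for a scikit-learn vectorizer/classifier pair.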

Supplementary Material

MP4 File (kdd2016_ribeiro_any_classifier_01-acm.mp4)

References

[1]
S. Amershi, M. Chickering, S. M. Drucker, B. Lee, P. Simard, and J. Suh. Modeltracker: Redesigning performance analysis tools for machine learning. In Human Factors in Computing Systems (CHI), 2015.
[2]
D. Baehrens, T. Schroeter, S. Harmeling, M. Kawanabe, K. Hansen, and K.-R. Müller. How to explain individual classification decisions. Journal of Machine Learning Research, 11, 2010.
[3]
A. Bansal, A. Farhadi, and D. Parikh. Towards transparent systems: Semantic characterization of failure modes. In European Conference on Computer Vision (ECCV), 2014.
[4]
J. Blitzer, M. Dredze, and F. Pereira. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Association for Computational Linguistics (ACL), 2007.
[5]
J. Q. Candela, M. Sugiyama, A. Schwaighofer, and N. D. Lawrence. Dataset Shift in Machine Learning. MIT, 2009.
[6]
R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Knowledge Discovery and Data Mining (KDD), 2015.
[7]
M. W. Craven and J. W. Shavlik. Extracting tree-structured representations of trained networks. In Neural Information Processing Systems (NIPS), pages 24--30, 1996.
[8]
M. T. Dzindolet, S. A. Peterson, R. A. Pomranky, L. G. Pierce, and H. P. Beck. The role of trust in automation reliance. Int. J. Hum.-Comput. Stud., 58 (6), 2003.
[9]
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32: 407--499, 2004.
[10]
U. Feige. A threshold of ln n for approximating set cover. J. ACM, 45 (4), July 1998.
[11]
A. Groce, T. Kulesza, C. Zhang, S. Shamasunder, M. Burnett, W.-K. Wong, S. Stumpf, S. Das, A. Shinsel, F. Bice, and K. McIntosh. You are the only possible oracle: Effective test selection for end users of interactive machine learning systems. IEEE Trans. Softw. Eng., 40 (3), 2014.
[12]
J. L. Herlocker, J. A. Konstan, and J. Riedl. Explaining collaborative filtering recommendations. In Conference on Computer Supported Cooperative Work (CSCW), 2000.
[13]
A. Karpathy and F. Li. Deep visual-semantic alignments for generating image descriptions. In Computer Vision and Pattern Recognition (CVPR), 2015.
[14]
S. Kaufman, S. Rosset, and C. Perlich. Leakage in data mining: Formulation, detection, and avoidance. In Knowledge Discovery and Data Mining (KDD), 2011.
[15]
A. Krause and D. Golovin. Submodular function maximization. In Tractability: Practical Approaches to Hard Problems. Cambridge University Press, February 2014.
[16]
T. Kulesza, M. Burnett, W.-K. Wong, and S. Stumpf. Principles of explanatory debugging to personalize interactive machine learning. In Intelligent User Interfaces (IUI), 2015.
[17]
B. Letham, C. Rudin, T. H. McCormick, and D. Madigan. Interpretable classifiers using rules and bayesian analysis: Building a better stroke prediction model. Annals of Applied Statistics, 2015.
[18]
D. Martens and F. Provost. Explaining data-driven document classifications. MIS Q., 38 (1), 2014.
[19]
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Neural Information Processing Systems (NIPS). 2013.
[20]
K. Patel, J. Fogarty, J. A. Landay, and B. Harrison. Investigating statistical machine learning as a tool for software development. In Human Factors in Computing Systems (CHI), 2008.
[21]
K. Patel, N. Bancroft, S. M. Drucker, J. Fogarty, A. J. Ko, and J. Landay. Gestalt: Integrated support for implementation and analysis in machine learning. In User Interface Software and Technology (UIST), 2010.
[22]
I. Sanchez, T. Rocktaschel, S. Riedel, and S. Singh. Towards extracting faithful and descriptive representations of latent variable models. In AAAI Spring Symposium on Knowledge Representation and Reasoning (KRR): Integrating Symbolic and Neural Approaches, 2015.
[23]
D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, and J.-F. Crespo. Hidden technical debt in machine learning systems. In Neural Information Processing Systems (NIPS). 2015.
[24]
E. Strumbelj and I. Kononenko. An efficient explanation of individual classifications using game theory. Journal of Machine Learning Research, 11, 2010.
[25]
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Computer Vision and Pattern Recognition (CVPR), 2015.
[26]
B. Ustun and C. Rudin. Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 2015.
[27]
F. Wang and C. Rudin. Falling rule lists. In Artificial Intelligence and Statistics (AISTATS), 2015.
[28]
K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning (ICML), 2015.
[29]
P. Zhang, J. Wang, A. Farhadi, M. Hebert, and D. Parikh. Predicting failures of vision systems. In Computer Vision and Pattern Recognition (CVPR), 2014.


Reviews

Mario A. Aoun

When Bohr introduced his theory of quantum jumps as a model of the inside of the atom, he said that quantum jumps exist but that no one can visualize them. At the time, the scientific community was outraged, because science is all about explaining and visualizing physical phenomena; indeed, "not being able to visualize things seemed against the whole purpose of science" [1]. This paper deals with a very similar phenomenon: instead of quantum jumps and what happens inside an atom, it concerns interpretable machine learning (IML) and what happens inside the machine as it learns facts and makes decisions (that is, predictions). IML is a very active topic right now [2]. The authors present local interpretable model-agnostic explanations (LIME), an IML method. First, "model-agnostic" means that the technique can explain the machine's behavior without referring to (that is, accessing) its internal parameters. Second, "local interpretable" means that the explanation applies in the neighborhood of the input being explained. As a result, LIME can be regarded as a "white box" that locally approximates the behavior of the machine around an input: it computes a linear summation of the input feature values scaled by weight factors. I enjoyed this paper: it is very well written and covers a fundamental building block of IML. I recommend it to any researcher interested in theorizing the basic aspects of IML.
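
To connect the reviewer's "linear summation scaled by a weight factor" description with the paper's formulation: LIME chooses the explanation g from a class G of interpretable models (sparse linear models in the paper) by trading off local faithfulness against complexity. Restated here as a paraphrase rather than a quotation, with \sigma denoting the kernel width:

\xi(x) = \operatorname*{arg\,min}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g), \qquad
\mathcal{L}(f, g, \pi_x) = \sum_{z, z' \in \mathcal{Z}} \pi_x(z) \, \bigl( f(z) - g(z') \bigr)^2, \qquad
\pi_x(z) = \exp\!\bigl( -D(x, z)^2 / \sigma^2 \bigr),

where f is the black-box classifier, z is a perturbed sample with interpretable (e.g. bag-of-words or super-pixel) representation z', g(z') = w_g \cdot z' is the linear surrogate whose weights w_g constitute the explanation, \pi_x is a proximity kernel centered on the instance x being explained, and \Omega(g) penalizes explanation complexity (e.g. the number of non-zero weights).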



Published In

KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2016
2176 pages
ISBN: 9781450342322
DOI: 10.1145/2939672
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. black box classifier
  2. explaining machine learning
  3. interpretability
  4. interpretable machine learning

Qualifiers

  • Research-article


Conference

KDD '16

Acceptance Rates

KDD '16 Paper Acceptance Rate 66 of 1,115 submissions, 6%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (Last 12 months): 17,399
  • Downloads (Last 6 weeks): 2,140
Reflects downloads up to 23 Sep 2024


Cited By

  • (2025) An interpretable TFAFI-1DCNN-LSTM framework for UGW-based pre-stress identification of steel strands. Mechanical Systems and Signal Processing, 222 (111774). DOI: 10.1016/j.ymssp.2024.111774. Online publication date: Jan-2025.
  • (2025) Decision Tree Clusters: Non-destructive detection of overheating defects in porcelain insulators using quantitative thermal imaging techniques. Measurement, 241 (115723). DOI: 10.1016/j.measurement.2024.115723. Online publication date: Feb-2025.
  • (2025) Comprehension is a double-edged sword: Over-interpreting unspecified information in intelligible machine learning explanations. International Journal of Human-Computer Studies, 193 (103376). DOI: 10.1016/j.ijhcs.2024.103376. Online publication date: Jan-2025.
  • (2025) A new approach for predicting oil mobilities and unveiling their controlling factors in a lacustrine shale system: Insights from interpretable machine learning model. Fuel, 379 (132958). DOI: 10.1016/j.fuel.2024.132958. Online publication date: Jan-2025.
  • (2025) Answering new urban questions: Using eXplainable AI-driven analysis to identify determinants of Airbnb price in Dublin. Expert Systems with Applications, 260 (125360). DOI: 10.1016/j.eswa.2024.125360. Online publication date: Jan-2025.
  • (2025) Precision strike: Precise backdoor attack with dynamic trigger. Computers & Security, 148 (104101). DOI: 10.1016/j.cose.2024.104101. Online publication date: Jan-2025.
  • (2025) Analyzing the impact of design factors on solar module thermomechanical durability using interpretable machine learning techniques. Applied Energy, 377 (124462). DOI: 10.1016/j.apenergy.2024.124462. Online publication date: Jan-2025.
  • (2024) Explainability of Image Generative AI for Novice and Expert Users: A Comparative Study of Static and Dynamic Explanations. Journal of Digital Contents Society, 25(8), 2261-2272. DOI: 10.9728/dcs.2024.25.8.2261. Online publication date: 31-Aug-2024.
  • (2024) Small Business Trade Area Analysis and Survival Prediction Using Public Data and Explainable AI Techniques. Journal of the Korean Institute of Industrial Engineers, 50(3), 173-188. DOI: 10.7232/JKIIE.2024.50.3.173. Online publication date: 15-Jun-2024.
  • (2024) Algorithmic Decision Making: Can Artificial Intelligence and the Metaverse Provide Technological Solutions to Modernise the United Kingdom's Legal Services and Criminal Justice? Frontiers in Law, 3, 28-39. DOI: 10.6000/2817-2302.2024.03.05. Online publication date: 15-May-2024.
