DOI: 10.1145/2858036.2858529
research-article

Interacting with Predictions: Visual Inspection of Black-box Machine Learning Models

Published: 07 May 2016

Abstract

Understanding predictive models, in terms of interpreting them and identifying actionable insights, is a challenging task. Often the importance of a feature in a model is only a rough estimate condensed into one number. Our research goes beyond these naïve estimates through the design and implementation of an interactive visual analytics system, Prospector. Its interactive partial dependence diagnostics let data scientists understand how features affect the prediction overall. In addition, its support for localized inspection lets data scientists understand how and why specific datapoints are predicted as they are, and lets them tweak feature values and see how the prediction responds. We evaluate the system in a case study involving a team of data scientists improving predictive models for detecting the onset of diabetes from electronic medical records.
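
The two diagnostics named in the abstract can be illustrated with a small sketch. It is not Prospector's implementation; it only assumes the standard definition of partial dependence (force one feature to a fixed value for every datapoint and average the model's predictions) and treats the model as an opaque function. The names `partial_dependence`, `local_response`, and `toy_model` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence

Row = List[float]
Model = Callable[[Row], float]  # assumed black-box: one row in, one score out


def partial_dependence(predict: Model, data: Sequence[Row],
                       feature: int, grid: Sequence[float]) -> List[float]:
    """Average prediction over the dataset with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)       # copy so the dataset is left untouched
            modified[feature] = value  # overwrite only the inspected feature
            total += predict(modified)
        curve.append(total / len(data))
    return curve


def local_response(predict: Model, row: Row,
                   feature: int, grid: Sequence[float]) -> List[float]:
    """Predictions for one datapoint as a single feature value is tweaked."""
    responses = []
    for value in grid:
        modified = list(row)
        modified[feature] = value
        responses.append(predict(modified))
    return responses


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for a trained model (a simple linear score)."""
    return 3.0 * row[0] + row[1]


if __name__ == "__main__":
    data = [[0.0, 1.0], [2.0, 3.0]]
    print(partial_dependence(toy_model, data, feature=0, grid=[0.0, 1.0]))  # [2.0, 5.0]
    print(local_response(toy_model, data[0], feature=1, grid=[0.0, 2.0]))   # [0.0, 2.0]
```

The first function summarizes how a feature affects predictions overall, while the second corresponds to inspecting a single datapoint and seeing how its prediction responds to a changed feature value. Because both only call `predict`, the same sketch applies to any model that exposes a scoring function.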

Supplementary Material

ZIP file: pn2457-file4.zip
Supplemental video: suppl.mov (pn2457-file3.mp4)


    Published In

    CHI '16: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems
    May 2016
    6108 pages
    ISBN:9781450333627
    DOI:10.1145/2858036

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. interactive machine learning
    2. partial dependence
    3. predictive modeling

    Qualifiers

    • Research-article

    Conference

    CHI '16: CHI Conference on Human Factors in Computing Systems
    May 7 - 12, 2016
    San Jose, California, USA

    Acceptance Rates

    CHI '16 Paper Acceptance Rate 565 of 2,435 submissions, 23%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
