Research Article | Open Access

How Teams Communicate about the Quality of ML Models: A Case Study at an International Technology Company

Published: 13 July 2021

Abstract

Machine learning (ML) has become a crucial component in software products, either as part of the user experience or used internally by software teams. Prior studies have explored how ML is affecting development team roles beyond data scientists, including user experience designers, program managers, developers, and operations engineers. However, there has been little investigation of how team members in different roles communicate about ML, in particular about the quality of models. We use the general term quality to look beyond technical issues of model evaluation, such as accuracy and overfitting, to any issue affecting whether a model is suitable for use, including ethical, engineering, operations, and legal considerations. What challenges do teams face in discussing the quality of ML models? What work practices mitigate those challenges? To address these questions, we conducted a mixed-methods study at a large software company, first interviewing 15 employees in a variety of roles, then surveying 168 employees to broaden our understanding. We found several challenges, including a mismatch between user-focused and model-focused notions of performance, misunderstandings about the capabilities and limitations of evolving ML technology, and difficulties in understanding concerns beyond one's own role. We found several mitigation strategies, including the use of demos during discussions to keep the team customer-focused.

