DOI: 10.1145/3491102.3517497
Research Article · Open Access

Predicting and Explaining Mobile UI Tappability with Vision Modeling and Saliency Analysis

Published: 28 April 2022

ABSTRACT

UI designers often correct false affordances and improve the discoverability of features when users have trouble determining whether elements are tappable. We contribute a novel system that models the perceived tappability of mobile UI elements with a vision-based deep neural network and helps provide design insights with dataset-level and instance-level explanations of model predictions. Our system retrieves similar mobile UI examples from our dataset using the latent space of our model. We also contribute a novel use of an interpretability algorithm, XRAI, to generate a heatmap of UI elements that contribute to a given tappability prediction. Through several examples, we show how our system can help automate elements of UI usability analysis and provide insights for designers to iterate on their designs. In addition, we share findings from an exploratory evaluation with professional designers to learn how AI-based tools can aid UI design and evaluation for tappability issues.
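The abstract describes retrieving similar UI examples via the latent space of a trained vision model: each screenshot is embedded as a vector, and neighbors are ranked by similarity. The paper does not publish its retrieval code, so the following is only a minimal sketch of that general idea, assuming hypothetical precomputed embeddings and using cosine similarity (the model, embedding dimensions, and gallery are illustrative, not from the paper).

```python
import numpy as np

def retrieve_similar(query_vec, gallery, k=3):
    """Return indices of the k gallery embeddings most similar to
    query_vec under cosine similarity (higher = more similar)."""
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # cosine similarity to each row
    return np.argsort(-sims)[:k]      # indices sorted by descending similarity

# Toy gallery of four 2-D "UI embeddings"; rows 0 and 3 point
# roughly the same way as the query, so they should rank first.
gallery = np.array([
    [1.0, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.9, 0.2],
])
query = np.array([1.0, 0.0])
print(retrieve_similar(query, gallery, k=2))  # → [0 3]
```

In a real pipeline the gallery would hold embeddings of dataset screenshots extracted from an intermediate layer of the tappability model, and retrieval would surface design precedents for the element under analysis.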

References

  1. Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, and Been Kim. 2018. Sanity Checks for Saliency Maps. In Advances in Neural Information Processing Systems, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), Vol. 31. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2018/file/294a8ed24b1ad22ec2e7efea049b8737-Paper.pdf
  2. Chongyang Bai, Xiaoxue Zang, Ying Xu, Srinivas Sunkara, Abhinav Rastogi, Jindong Chen, and Blaise Agüera y Arcas. 2021. UIBert: Learning Generic Multimodal Representations for UI Understanding. CoRR abs/2107.13731 (2021). https://arxiv.org/abs/2107.13731
  3. Rachel Bellamy, Bonnie John, and Sandra Kogan. 2011. Deploying CogTool: integrating quantitative usability assessment into real-world software development. In 2011 33rd International Conference on Software Engineering (ICSE). 691–700. https://doi.org/10.1145/1985793.1985890
  4. Zoya Bylinskii, Nam Wook Kim, Peter O’Donovan, Sami Alsheikh, Spandan Madan, Hanspeter Pfister, Fredo Durand, Bryan Russell, and Aaron Hertzmann. 2017. Learning Visual Importance for Graphic Designs and Data Visualizations. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology. ACM, Québec City, QC, Canada, 57–69. https://doi.org/10.1145/3126594.3126653
  5. Carrie J. Cai, Jonas Jongejan, and Jess Holbrook. 2019. The Effects of Example-Based Explanations in a Machine Learning Interface. In Proceedings of the 24th International Conference on Intelligent User Interfaces (Marina del Ray, California) (IUI ’19). Association for Computing Machinery, New York, NY, USA, 258–262. https://doi.org/10.1145/3301275.3302289
  6. Carrie J. Cai, Samantha Winter, David Steiner, Lauren Wilcox, and Michael Terry. 2021. Onboarding Materials as Cross-functional Boundary Objects for Developing AI Assistants. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama, Japan, 1–7. https://doi.org/10.1145/3411763.3443435
  7. Chunyang Chen, Sidong Feng, Zhenchang Xing, Linda Liu, Shengdong Zhao, and Jinshui Wang. 2019. Gallery D.C.: Design Search and Knowledge Discovery through Auto-created GUI Component Gallery. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (Nov. 2019), 1–22. https://doi.org/10.1145/3359282
  8. Chun-Fu (Richard) Chen, Marco Pistoia, Conglei Shi, Paolo Girolami, Joseph W. Ligman, and Yong Wang. 2017. UI X-Ray: Interactive Mobile UI Testing Based on Computer Vision. In Proceedings of the 22nd International Conference on Intelligent User Interfaces. ACM, Limassol, Cyprus, 245–255. https://doi.org/10.1145/3025171.3025190
  9. Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xin Xia, Liming Zhu, John Grundy, and Jinshui Wang. 2020. Wireframe-based UI Design Search through Image Autoencoder. ACM Transactions on Software Engineering and Methodology 29, 3 (July 2020), 1–31. https://doi.org/10.1145/3391613
  10. W. L. Cheng. 2016. Learning through the variation theory: A case study. The International Journal of Teaching and Learning in Higher Education 28 (2016), 283–292.
  11. Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hibschman, Daniel Afergan, Yang Li, Jeffrey Nichols, and Ranjitha Kumar. 2017. Rico: A Mobile App Dataset for Building Data-Driven Design Applications. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology. ACM, Québec City, QC, Canada, 845–854. https://doi.org/10.1145/3126594.3126651
  12. Biplab Deka, Zifeng Huang, Chad Franzen, Jeffrey Nichols, Yang Li, and Ranjitha Kumar. 2017. ZIPT: Zero-Integration Performance Testing of Mobile App Designs. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology (Québec City, QC, Canada) (UIST ’17). Association for Computing Machinery, New York, NY, USA, 727–736. https://doi.org/10.1145/3126594.3126647
  13. Biplab Deka, Zifeng Huang, and Ranjitha Kumar. 2016. ERICA: Interaction Mining Mobile Apps. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16). Association for Computing Machinery, New York, NY, USA, 767–776. https://doi.org/10.1145/2984511.2984581
  14. Camilo Fosco, Vincent Casser, Amish Kumar Bedi, Peter O’Donovan, Aaron Hertzmann, and Zoya Bylinskii. 2020. Predicting Visual Importance Across Graphic Design Types. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (Virtual Event, USA) (UIST ’20). Association for Computing Machinery, New York, NY, USA, 249–260. https://doi.org/10.1145/3379337.3415825
  15. Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann, and Wieland Brendel. 2018. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. CoRR abs/1811.12231 (2018). http://arxiv.org/abs/1811.12231
  16. Amirata Ghorbani, James Wexler, James Y. Zou, and Been Kim. 2019. Towards Automatic Concept-based Explanations. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2019/file/77d2afcb31f6493e350fca61764efb9a-Paper.pdf
  17. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778. https://doi.org/10.1109/CVPR.2016.90
  18. Zecheng He, Srinivas Sunkara, Xiaoxue Zang, Ying Xu, Lijuan Liu, Nevan Wichers, Gabriel Schubiner, Ruby B. Lee, and Jindong Chen. 2020. ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces. CoRR abs/2012.12350 (2020). https://arxiv.org/abs/2012.12350
  19. Andrew Head, Codanda Appachu, Marti A. Hearst, and Björn Hartmann. 2015. Tutorons: Generating context-relevant, on-demand explanations and demonstrations of online code. In 2015 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). 3–12. https://doi.org/10.1109/VLHCC.2015.7356972
  20. Forrest Huang, John F. Canny, and Jeffrey Nichols. 2019. Swire: Sketch-Based User Interface Retrieval. Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/3290605.3300334
  21. Andrei Kapishnikov, Tolga Bolukbasi, Fernanda Viégas, and Michael Terry. 2019. XRAI: Better Attributions Through Regions. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 4947–4956. https://doi.org/10.1109/ICCV.2019.00505
  22. Been Kim, Martin Wattenberg, Justin Gilmer, Carrie J. Cai, James Wexler, Fernanda B. Viégas, and Rory Sayres. 2018. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV). In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018 (Proceedings of Machine Learning Research, Vol. 80), Jennifer G. Dy and Andreas Krause (Eds.). PMLR, 2673–2682. http://proceedings.mlr.press/v80/kim18d.html
  23. Chunggi Lee, Sanghoon Kim, Dongyun Han, Hongjun Yang, Young-Woo Park, Bum Chul Kwon, and Sungahn Ko. 2020. GUIComp: A GUI Design Assistant with Real-Time, Multi-Faceted Feedback. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376327
  24. Luis A. Leiva, Asutosh Hota, and Antti Oulasvirta. 2020. Enrico: A Dataset for Topic Modeling of Mobile UI Designs. In 22nd International Conference on Human-Computer Interaction with Mobile Devices and Services (Oldenburg, Germany) (MobileHCI ’20). Association for Computing Machinery, New York, NY, USA, Article 9, 4 pages. https://doi.org/10.1145/3406324.3410710
  25. Toby Jia-Jun Li, Lindsay Popowski, Tom Mitchell, and Brad A. Myers. 2021. Screen2Vec: Semantic Embedding of GUI Screens and GUI Components. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445049
  26. Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, and Jason Baldridge. 2020. Mapping Natural Language Instructions to Mobile UI Action Sequences. In Annual Conference of the Association for Computational Linguistics (ACL 2020). https://www.aclweb.org/anthology/2020.acl-main.729.pdf
  27. Yang Li, Ranjitha Kumar, Walter S. Lasecki, and Otmar Hilliges. 2020. Artificial Intelligence for HCI: A Modern Approach. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI EA ’20). Association for Computing Machinery, New York, NY, USA, 1–8. https://doi.org/10.1145/3334480.3375147
  28. Zachary C. Lipton. 2017. The Mythos of Model Interpretability. arXiv:1606.03490 [cs.LG]
  29. Hoa Loranger. 2015. Beyond Blue Links: Making Clickable Elements Recognizable. Nielsen Norman Group. https://www.nngroup.com/articles/clickable-elements/
  30. Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Long Beach, California, USA) (NIPS ’17). Curran Associates Inc., Red Hook, NY, USA, 4768–4777.
  31. Nicolas Papernot and Patrick D. McDaniel. 2018. Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning. CoRR abs/1803.04765 (2018). http://arxiv.org/abs/1803.04765
  32. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, California, USA) (KDD ’16). Association for Computing Machinery, New York, NY, USA, 1135–1144. https://doi.org/10.1145/2939672.2939778
  33. Eldon Schoop, Forrest Huang, and Bjoern Hartmann. 2021. UMLAUT: Debugging Deep Learning Programs Using Program Structure and Model Behavior. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445538
  34. Jessica Schrouff, Sebastien Baur, Shaobo Hou, Diana Mincu, Eric Loreaux, Ralph Blanes, James Wexler, Alan Karthikesalingam, and Been Kim. 2021. Best of both worlds: local and global explanations with human-understandable concepts. CoRR abs/2106.08641 (2021). https://arxiv.org/abs/2106.08641
  35. Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In 2017 IEEE International Conference on Computer Vision (ICCV). 618–626. https://doi.org/10.1109/ICCV.2017.74
  36. Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 (Sydney, NSW, Australia) (ICML ’17). JMLR.org, 3319–3328.
  37. Amanda Swearngin and Yang Li. 2019. Modeling Mobile Interface Tappability Using Crowdsourcing and Deep Learning. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, Glasgow, Scotland, UK, 1–11. https://doi.org/10.1145/3290605.3300305
  38. Amanda Swearngin, Chenglong Wang, Alannah Oleson, James Fogarty, and Amy J. Ko. 2020. Scout: Rapid Exploration of Interface Layout Alternatives through High-Level Design Constraints. Association for Computing Machinery, New York, NY, USA, 1–13. https://doi.org/10.1145/3313831.3376593
  39. Robert Stuart Weiss. 1994. Learning from strangers: The art and method of qualitative interview studies. Free Press. ix, 246 pages.
  40. Gerhard Widmer and Miroslav Kubat. 1996. Learning in the Presence of Concept Drift and Hidden Contexts. Mach. Learn. 23, 1 (April 1996), 69–101. https://doi.org/10.1023/A:1018046501280
  41. Jason Wu, Xiaoyi Zhang, Jeffrey Nichols, and Jeffrey P. Bigham. 2021. Screen Parsing: Towards Reverse Engineering of UI Models from Screenshots. CoRR abs/2109.08763 (2021). https://arxiv.org/abs/2109.08763
  42. Ziming Wu, Yulun Jiang, Yiding Liu, and Xiaojuan Ma. 2020. Predicting and Diagnosing User Engagement with Mobile UI Animation via a Data-Driven Approach. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, Honolulu, HI, USA, 1–13. https://doi.org/10.1145/3313831.3376324
  43. Qian Yang, Alex Scuito, John Zimmerman, Jodi Forlizzi, and Aaron Steinfeld. 2018. Investigating How Experienced UX Designers Effectively Work with Machine Learning. In Proceedings of the 2018 Designing Interactive Systems Conference (Hong Kong, China) (DIS ’18). Association for Computing Machinery, New York, NY, USA, 585–596. https://doi.org/10.1145/3196709.3196730
  44. Xiaoxue Zang, Ying Xu, and Jindong Chen. 2021. Multimodal Icon Annotation For Mobile Applications. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3447526.3472064
  45. Xiaoyi Zhang, Lilian de Greef, Amanda Swearngin, Samuel White, Kyle Murray, Lisa Yu, Qi Shan, Jeffrey Nichols, Jason Wu, Chris Fleizach, Aaron Everitt, and Jeffrey P. Bigham. 2021. Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3411764.3445186

Published in
CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
April 2022, 10459 pages
ISBN: 9781450391573
DOI: 10.1145/3491102
Copyright © 2022 Owner/Author. This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rate
Overall acceptance rate: 5,789 of 24,782 submissions, 23%
