research-article
Open Access

Transparency in Measurement Reporting: A Systematic Literature Review of CHI PLAY

Published: 06 October 2021

Abstract

Measuring theoretical concepts, so-called constructs, is a central challenge of Player Experience research. Building on recent work in HCI and psychology, we conducted a systematic literature review to study the transparency of measurement reporting. We accessed the ACM Digital Library to analyze all 48 full papers published at CHI PLAY 2020; of those, 24 papers used self-report measurements and were included in the full review. Specifically, we assessed whether researchers reported What, How, and Why they measured. We found that researchers matched their measures to the construct under study and that administrative details, such as the number of points on a Likert-type scale, were frequently reported. However, definitions of the constructs to be measured and justifications for selecting a particular scale were sparse. Lack of transparency in these areas not only threatens the validity of individual studies but also compromises the building of theories and the accumulation of research knowledge in meta-analytic work. This work is limited to assessing the current transparency of measurement reporting at CHI PLAY 2020; however, we argue this constitutes a fair foundation for assessing potential pitfalls. To address these pitfalls, we propose a prescriptive model of a measurement selection process, which aids researchers in systematically defining their constructs, specifying operationalizations, and justifying why these measures were chosen. Future research employing this model should contribute to more transparency in measurement reporting. The research was funded through internal resources. All materials are available at https://osf.io/4xz2v/.
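The What/How/Why reporting dimensions described above can be made concrete as a simple checklist. The following is a minimal, hypothetical sketch (the class and field names are illustrative, not from the paper) of how a reviewer or author might record one self-report measure and flag which transparency dimensions remain unreported:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MeasurementReport:
    """One self-report measure, documented along the What/How/Why dimensions."""
    construct_definition: str = ""      # What: definition of the construct measured
    instrument: str = ""                # How: questionnaire/scale administered
    scale_points: Optional[int] = None  # How: e.g. number of Likert-type points
    justification: str = ""             # Why: rationale for choosing this measure

    def missing_fields(self) -> List[str]:
        """Return the transparency dimensions left unreported."""
        missing = []
        if not self.construct_definition:
            missing.append("What (construct definition)")
        if not self.instrument or self.scale_points is None:
            missing.append("How (operationalization)")
        if not self.justification:
            missing.append("Why (justification)")
        return missing


# A measure reported with administrative detail but no definition or rationale,
# mirroring the common pattern the review found:
report = MeasurementReport(instrument="Player Experience Inventory", scale_points=7)
print(report.missing_fields())
# → ['What (construct definition)', 'Why (justification)']
```

This mirrors the review's central finding: the "How" details are often present while "What" and "Why" go unreported.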

