Which Threshold Concepts do Computing Students Encounter while Learning Empirical Research Methods?

A strong foundation in empirical research methods is essential for computing students due to the societal impacts of digital technologies. However, learning empirical research is challenging because of a lack of a research-based approach and the absence of an established pedagogical culture for teaching empirical research methods. In this paper, we ask the research question: Which Threshold Concepts (TCs) do computing students encounter while learning empirical research methods? First, we conducted a systematic mapping review of the literature to identify the candidate TCs in learning empirical research methods. Next, we evaluated the candidate TCs in an explanatory case study of an introductory course in research methods offered to master’s students at the Department of Computer Science at the Norwegian University of Science and Technology (NTNU). We found that a particularly challenging and overarching TC may be developing and operationalizing a conceptual framework, and many other TCs can be linked to the conceptual framework. We also found that it can be difficult for computing students to grasp the nature of research and how empirical research is done. These findings may help in understanding student challenges while learning empirical research methods and in developing solutions to address these challenges.


INTRODUCTION
Students in many fields struggle to learn empirical research methods [15,23]. The reasons for this may be an absence of established pedagogical culture [31] and a lack of a research-based approach to teaching research methods [42]. As a result, teachers have to rely on trial-and-error strategies in research method courses [11,12], and students may not appreciate the value of such courses in their profession [12,39].
A strong foundation in empirical research is crucial for computing students because of the broader impacts of digital technologies, such as the changing nature of work, hate speech, fake news, and financial and social exclusion [24]. Computing students must learn to critically assess the environmental, economic, and societal impacts of their design decisions [8]. Working on research projects is a way of learning to think critically about a problem [7], formulate research questions, generate and analyze data, and draw conclusions to answer the research questions [33]. These skills may immediately apply to a wide range of computer science occupations [6,35]. However, research methods courses are often ignored in computing curricula [40]. As a result, computing professionals often use gray literature like blogs and social media content in their practice instead of engaging with empirical research on the topic [14].
A detailed study of computing students' challenges in learning empirical research methods is required [13]. Such an investigation needs to analyze the pedagogic actions of teachers and students to form a granular understanding of the student challenges [23,32]. An important issue is the subjective nature of learning empirical research, especially qualitative research methods [42]. Students "[undergo] a paradigm shift about qualitative research [by making] personal connections with the stories shared by their interviewees; and they also [learn] to draw on their own knowledge and subjectivity to develop their identity as qualitative researchers" [42]. Hence, a study of learning experiences in empirical research needs to consider subjective learning challenges like cognitive, emotional, and social aspects of learning [32].
Learning research methods "requires a combination of theoretical understanding, procedural knowledge, and mastery of a range of practical skills" [18]. These skills may include critical thinking [16], data analysis, and the use of theory [20]. However, developing these skills is challenging for students because research methods courses either fail to engage students or impart practical skills in isolation from the context in which the skills are applied [18]. Hence, a study of learning experiences in empirical research must consider the challenges of acquiring practical skills.
The theory of threshold concepts is relevant due to its emphasis on subjective learning challenges, transformative learning, and skill development [10,27,28,29]. A Threshold Concept (TC) is a piece of troublesome knowledge that, once comprehended, fundamentally transforms students' understanding and imparts a new way of thinking about the subject and themselves [21]. Learning empirical research methods may require computing students to change their worldview from problem-solving to critically evaluating empirical research. This makes the theory of TCs an ideal lens to study computing students' challenges in learning empirical research methods. Therefore, we ask the following Research Question (RQ) here: RQ: Which threshold concepts do computing students face while learning empirical research methods?
Identifying TCs in learning empirical research methods has been an ongoing thread in the literature (see, e.g., [20,21]). Since we aim to build on this literature, we first conducted a systematic mapping review of the literature to list candidate TCs in learning empirical research methods. Next, we conducted an explanatory case study of an introductory course in research methods offered to master's students at the Department of Computer Science at the Norwegian University of Science and Technology (NTNU). We aimed to evaluate the candidate TCs reported in the literature against computing students' learning experiences. The case study utilized 82 reflection reports and five semi-structured interviews.
This paper makes several contributions. First, we address the need for research on teaching research methods to computing students by studying a case of a methods course. Second, the results of our case study show that the candidate TCs reported in the literature can be mapped to the learning challenges of computing students. We found that a particularly challenging and overarching TC may be developing and utilizing a conceptual framework, and many other TCs can be linked to the conceptual framework. We also found that it can be hard for computing students to grasp the nature of research and how empirical research is done. A possible reason may be that computing education strongly emphasizes developing and applying technical skills like programming, problem-solving, and design thinking. While these skills are crucial for success in the field, they may overshadow the development of critical thinking skills. As a result, computing students may need help shifting their focus from technical problem-solving to critically evaluating empirical research methods and evidence.
The rest of this paper is organized as follows: Section 2 provides a background on the theory of TCs. Section 3 discusses the research method. We then present the results, including the systematic mapping review and the explanatory case study, and discuss them before concluding the paper.

BACKGROUND
We introduce the theory of Threshold Concepts (TCs) in section 2.1. Next, we identify some challenges and TCs that students encounter while learning empirical research methods in section 2.2.

Theory of Threshold Concepts (TCs)
Meyer and Land introduced the notion that in different fields of study, there exist "conceptual gateways" or "portals," and a student would partake in a journey "through" them to acquire new knowledge. These gateways often help the student grasp previously inaccessible or troublesome knowledge. This journey through the metaphorical portal would lead to a transformative shift in understanding, not bound to the discipline itself. It could challenge the student's perception of "the subject matter, subject landscape or even their worldview" [27]. They call a piece of knowledge that acts as one of these gateways a Threshold Concept (TC). A TC is said to have many of the following seven properties [10,27,28,29]:
1) Transformative: A TC, once understood, transforms the way the student looks at the concepts in the field.
2) Irreversible: Once students understand a TC, they cannot unlearn it.
3) Integrative: The student can identify previously hidden relationships among concepts and form an integrative view after learning a TC.
4) Bounded: The understanding of a TC delineates the boundaries of a concept for the student, or they can perceive the whole field as a unified entity.
5) Discursive: Understanding a TC provides a rich vocabulary to the student.
6) Reconstructive: The understanding of a TC significantly changes the student's interactions in the field.
7) Liminal space: The students tend to oscillate between the previous and new understanding before they cross the portal of a TC.

Challenges and TCs in learning empirical research methods
The point of departure of this paper is the understanding that students struggle to learn empirical research methods [15,23], and computing students are no exception. There may be many reasons for this: research methods courses are often ignored in computing curricula [40], and there is a general lack of a research-based approach [42] and a pedagogical culture [31] around teaching empirical research methods. While identifying the student challenges in empirical research methods, we also included papers in fields other than computing to have a broad and cross-disciplinary perspective.
Kiley and Wisker presented a study investigating how a supervisor could tell when their student has grasped a given threshold concept [21]. The TCs they worked with were developing research questions, self-directed reading, and working with data at different conceptual levels. Similarly, Kiley identified TCs in teaching scientific thinking and discussed how supervisors can help students who are stuck in their thinking to cross these thresholds [20]. The author identified TCs like the concept of argument or thesis, theory, framework (locating or bounding the research), knowledge creation (or novelty of a study), analysis, and framing of research within a paradigm. Although these TCs were identified as challenges in training researchers, they may also be relevant for learning challenges in an introductory research methods course, which is the focus of this paper.
Mercer & Weaver explored the topic of critical thinking about information in STEM fields [26]. They outlined the threshold concept of "critical thinking" and discussed how educators can help students develop critical evaluation skills. Also, Reio identified specific aspects of studies on empirical research that are important for evaluating the scientific merit of research papers [36]. The author identified ten research questions to help readers systematically analyze research papers. These questions would help readers identify elements like the research problem, research design, and sampling methods. This, in turn, may help readers increase their awareness of the strengths and weaknesses of empirical studies, which then would map to the TC of critical thinking. Again, the TCs of critical thinking and evaluation of research papers are interesting for exploring the learning challenges in an introductory research methods course.
As shown in the preceding discussion, student challenges in introductory courses on teaching empirical research still need to be explored. This is the research gap our paper tries to address. Combining the research on learning empirical research in general and our case study of an introductory course in empirical research methods, we aim to identify TCs faced by computing students.

RESEARCH METHOD
We applied a two-step research strategy to answer the RQ. First, we conducted a systematic mapping review of the literature to identify the TCs in learning empirical research methods reported in the literature. Second, we conducted an explanatory case study to identify computing students' challenges while learning empirical research methods. We also mapped the TCs reported in the literature to the challenges. This section presents the research methods used in this paper. Sections 3.1 and 3.2 present the details of the two-step research strategy, whereas sections 3.3 and 3.4 describe the data generation and analysis methods used in the case study, respectively.

Systematic mapping review
We did a systematic mapping review to ground our case study in the larger research area of learning empirical research methods. We followed the 5-step process of systematic mapping studies defined by Petersen et al. [34].

Explanatory case study
We aimed to explore computing students' challenges in learning empirical research methods. Since we had decided to use the theory of threshold concepts as a lens, we designed an explanatory case study to test this theory [43] by qualitatively analyzing the data [5]. Specifically, we did a systematic mapping review to identify a list of 11 candidate TCs, framed them as student challenges, and used them to explore the challenges faced by computing students. This was crucial in establishing the connection between existing literature and the case study findings.
The case in this study was the research methods course offered to master's students at the Norwegian University of Science and Technology (NTNU). The course is designed for students specializing in Computer Science (CS), Information Systems (IS), and Software Engineering (SE). This is a mandatory course for IS students, but students from other majors also take it. Around 100 students take this course every year to learn the skills and gain the knowledge needed to write their master's thesis. Some students taking the course will continue their education in the doctoral program, but most will start their professional careers after graduation. One of the objectives of this course is to help the students develop critical thinking skills. Learning about empirical research methods may help the students develop skills to ask research questions, gather and analyze data, and draw conclusions to answer the research questions. These skills may be applied to a wide range of computer science occupations [6,35], including system development in SE [1]. During this 21-week course, students learn about empirical research methods in computing. The timeline of the course is shown in Figure 1. The horizontal axis shows the week numbers, the blue rectangles show the parts of the course, and the grey rectangles show the course deliverables. The blue triangles denote oral feedback, and the blue circles represent written feedback.
The course is based on the textbook by Oates et al. [33], and it defines seven learning objectives (LO1-LO7), as shown in Table 1. Students are taught about various empirical research strategies like case studies, design-oriented approaches, surveys, and experiments, and they also develop practical skills by working on a research project. The initial phase of the course covers the theory module (see Figure 1), where the students are introduced to fundamental concepts of empirical research and the research process, including literature review, conceptual framework, research questions, strategies, and data gathering and analysis methods [33]. After the theory module, students work in group projects covering all stages of research, from planning to dissemination, mimicking a realistic researcher experience. While groups are formed based on background and interest, selecting a research topic is left to the groups. Over the semester, groups produce five distinct course deliverables, including paper drafts. Each group is guided by a staff member who serves as a mentor, providing written and oral feedback for improvement. The course culminates in a research conference presentation and individual reflection reports from each student about their learning experience. The subsequent sections outline the data generation and analysis methods used to study this research methods course.

Data generation methods
To provide a thorough understanding of the research topic, case studies use a variety of data sources to address a research question, a technique called triangulation [33]. Triangulation strengthens the overall robustness of the study, increases the validity and credibility of findings, and reduces research biases. Rich and contextualized insights into the phenomenon under investigation are possible as a result of this multidimensional approach [33]. We included two types of data generation methods to ensure triangulation in our case study: 1) 82 reflection reports delivered by students as part of the research methods course and 2) five interviews with the students who took the course in the previous academic year.
Reflection reports: A reflection report is submitted by each student to demonstrate that they can integrate what they learned in the course into a reflective text describing their experiences in the research project. A reflection report contains three parts: 1) what happened in the project, 2) what the course curriculum says in this case, and 3) a retrospective analysis of what could have been improved. We used the reflection reports in our case study because they offered a primary account of the students' learning in their own words. Also, these reports provide an otherwise inaccessible window into their thoughts on various concepts introduced in the course curriculum and how they are applied in a research project.
Semi-structured interviews: We recruited students from the previous academic year using the mailing lists of the course. We developed an interview guide using insights from the systematic mapping review (see sections 3.1 and 4.1), focusing on challenges in learning empirical research and candidate TCs (see Table 2). The interviews were conducted by a research assistant who was not part of the course staff to ensure open and comfortable responses. Each interview lasted between one and one and a half hours. We recorded and transcribed the interviews and used them in our data analysis. The interviews provided supplementary raw data to complement the structured reflection reports. Because of the semi-structured nature of the interviews, the interview questions could be tailored to dive further into each student's understanding and reasoning.

Data analysis methods
The data in this study were analyzed using a qualitative analysis approach. We used two steps in data analysis: 1) open coding to identify student challenges and categorize them into codes and 2) peer debriefing to map the codes to the candidate TCs resulting from the systematic mapping review of the literature (see Table 2). As the pre-processing step of data analysis, we imported the reflection reports and the interview transcripts into the NVivo program.
Open coding: Open coding was employed to systematically analyze and categorize the diverse range of individual experiences and perspectives gathered from the data sources [41]. This process involved carefully reading and classifying the text to create initial codes, forming the fundamental units of analysis. Through open coding, researchers could uncover emerging concepts and categories without imposing preconceived notions or established theoretical frameworks, providing a comprehensive exploration of the data [9]. After this step, the resulting codes were mapped to the candidate TCs obtained from the systematic mapping review in peer debriefing.
Peer debriefing to map codes to the candidate TCs: Peer debriefing is a technique of collaborative data analysis in which other researchers review the coding and themes, offer insightful comments, and contribute to the overall validation process. It has been used to ensure the dependability and credibility of the identified concepts and categories. Peer debriefing also serves as a form of quality control, guaranteeing that the research is thorough and trustworthy and providing chances for reflection and development [38]. Researchers can address potential biases, spot overlooked details, and improve the reliability of their qualitative research by participating in this collaborative process [4]. In this study, one of the researchers conducted the step of open coding, and an additional two researchers performed the peer debriefing step [37]. They read the coded portions of data and the descriptive codes and tried to map them to the candidate TCs identified in the systematic mapping review (see Table 3).

RESULTS
This section presents the findings of our two-step research strategy to answer the RQ. Firstly, section 4.1 presents the results of our systematic mapping review. We identified 11 candidate TCs from the literature. We mapped these TCs to the LOs of the research methods course to connect the student learning challenges reported in the literature to the course's curriculum. Secondly, sections 4.2-4.10 present the results of our case study.

Identifying the candidate TCs from literature and mapping them to the LOs
The systematic mapping review provided a direction to our case study by identifying the candidate TCs associated with learning empirical research methods. Table 2 presents a summary of the TCs. The first column gives a unique ID for each candidate TC, the second column shows the candidate TCs reported in the papers included in the systematic mapping review, and the third column presents the central concept to represent each TC. For example, Kiley studied the experience of learning to be a researcher and identified various candidate TCs, including the concepts of knowledge creation, developing an argument, and the use of theory and paradigms in research [20]. Similarly, Alpi & Hoggan studied information literacy competencies of novice researchers and identified candidate TCs like producing an abstract conceptual understanding from facts [2]. Further, Motjolopane studied a course on research methods and identified a candidate TC for developing an understanding of research methodology [30]. In our systematic mapping review, we synthesized all these candidate TCs into TC1: Practice of research.
Table 2 also shows how the Learning Objectives (LOs) were mapped to the 11 candidate TCs, which resulted from the systematic mapping review. The mapping between literature-derived TCs and the course curriculum is pivotal for the case study because it aligns the literature with the course's scope and ensures robust results. Moreover, the mapping facilitated the creation of codes, offering a clear direction for analyzing data. The last column of Table 2 shows the relevant LO of the course (see Table 1). For example, TC1: Practice of research is mapped to three LOs: LO1, LO4, and LO5. TC1 is mapped to LO1 because practical research issues like developing an argument and using theory in research [20] are relevant for designing and planning an empirical research project. TC1 is also mapped to LO4 because the concept of analysis [20] is suitable for conducting data analysis in a research project. Lastly, TC1 is also mapped to LO5 because critical thinking is relevant for rigorously evaluating empirical research. As shown in Table 2, almost every candidate TC from the systematic mapping review could be mapped to some LO in the course, except TC8 and TC10. This is not surprising because the course is not aimed at TC8: Interdisciplinary collaboration or TC10: Advancing professional practice. Since each concept (except TC8 and TC10) was shown to fit some LO in the research methods course that we studied, it is reasonable to believe that these candidate TCs are likely to be encountered by the students as challenges in the course.
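The TC-to-LO mapping can also be read as a small lookup structure. The following is a minimal sketch, not part of the study's tooling: the dictionary transcribes the last column of Table 2, and the names `TC_TO_LOS`, `unmapped_tcs`, and `tcs_per_lo` are hypothetical. The TC1 entry follows Table 2 as printed (LO3, LO4, LO5), whereas the running text lists LO1, LO4, and LO5, so that entry should be treated as an assumption.

```python
# Mapping of candidate TCs to course learning objectives, transcribed from
# the last column of Table 2. An empty list means no LO corresponds to the TC.
# Assumption: the TC1 entry follows Table 2 as printed.
TC_TO_LOS = {
    "TC1": ["LO3", "LO4", "LO5"],
    "TC2": ["LO1", "LO2"],
    "TC3": ["LO1", "LO2"],
    "TC4": ["LO4", "LO5"],
    "TC5": ["LO6"],
    "TC6": ["LO7"],
    "TC7": ["LO1", "LO2", "LO5"],
    "TC8": [],
    "TC9": ["LO3", "LO4"],
    "TC10": [],
    "TC11": ["LO1", "LO2"],
}


def unmapped_tcs(mapping):
    """Return the TCs that are not covered by any learning objective."""
    return [tc for tc, los in mapping.items() if not los]


def tcs_per_lo(mapping):
    """Invert the mapping: for each LO, list the TCs mapped to it."""
    inverted = {}
    for tc, los in mapping.items():
        for lo in los:
            inverted.setdefault(lo, []).append(tc)
    return inverted


if __name__ == "__main__":
    print("TCs without an LO:", unmapped_tcs(TC_TO_LOS))
    for lo, tcs in sorted(tcs_per_lo(TC_TO_LOS).items()):
        print(lo, "->", ", ".join(tcs))
```

Inverting the mapping in this way makes it easy to see which learning objectives carry the most candidate TCs, which may be useful when deciding where to look for student challenges in the data.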
As described in section 3.4, we first used open coding to identify a corpus of codes from the data sources. Next, we mapped the codes to the candidate TCs in peer debriefing. We present the results of this mapping here. Table 3 shows an overview of the mapping of codes to TCs. As shown in Table 3, no codes were mapped to TC8 and TC10 because these TCs were not mapped to any of the course LOs. In the following sections, we present the details of this mapping by bringing out representative quotes from the data sources.

TC1: Practice of research
A lack of prior research experience presents a significant challenge for students in an introductory course on research methods [11]. There can be several reasons for this. For example, the students may have limited knowledge and practical skills due to their unfamiliarity with the research process. They may struggle to make informed decisions regarding selecting a realistic research topic for their research project. We mapped these challenges to the candidate TC1: Practice of research. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Table 2
ID | Candidate TCs reported in the literature | Central concept | Relevant LOs
TC1 | "Knowledge creation, argument, theory and paradigms" [20]; "Produce an abstract that deal with concepts instead of facts" [2]; "Developing understanding of research methodology" [30] | Practice of research | LO3, LO4, LO5
TC2 | "Develop a research question from a topic" [2]; "Topic discovery and research problem formulation" [30]; "Clear research questions" [36] | Research questions | LO1, LO2
TC3 | "Develop an appropriate design" [2]; "Developing an understanding of the research method inclusive of data analysis" [30] | Research design | LO1, LO2
TC4 | "Ensuring students develop critical thinking" [30]; "Developing critical thinking skills" [36]; "Critical Evaluation of Information" [13] | Critical thinking | LO4, LO5
TC5 | "Ethical considerations when using data from human subjects" [19]; "Ethical considerations are central to [research]" [22]; "Plans for following ethical guidelines" [36] | Ethical considerations in research | LO6
TC6 | "Engagement in science communication" [25]; "Writing up qualitative data" [17] | Research communication | LO7
TC7 | "Knowledge as contextual and constructed" [2] | Iterative nature of research | LO1, LO2, LO5
TC8 | "Diversity of learners" [19]; "Bridge the gap [between disciplines]" [13] | Interdisciplinary collaboration | -
TC9 | "Concept of analysis" [20]; "Put forward an argument supported by evidence" [2]; "Developing an understanding of research method, inclusive of data analysis" [30] | Handling evidence (data generation and data analysis) | LO3, LO4
TC10 | "Enter dialogue with experts" [2] | Advancing professional practice | -
TC11 | "Concept of locating or bounding research in a framework" [20]; "Focus on conceptual frameworks, methodology and methods" [2] | Conceptual framework | LO1, LO2
Lack of prior experience in research: The student groups chose the research topics for their research projects in the research methods course we studied. This may have resulted in potentially incorrect decisions in estimating the amount of work and establishing the feasibility of the research project, as the following quote suggests: "As none in our group had any considerable experience with conduction any proper research, I found the start of the course to be rather hard. The lack of experience and proper understanding of how much work goes into research lead us to initially choosing a research topic that would not be feasible to complete." R/Gr1/St1 Note that the above quote was selected from a reflection report submitted by student 1 of group 1. The key "R/Gr1/St1" shows this connection, and we have specified similar keys for all the quotes presented in this paper to establish the chain of evidence in our findings [33]. We have italicized some words in the quote to add emphasis.
Hard to select a research topic: The lack of prior experience in research may also have made it difficult for the students to select a topic that was specific enough, as the following quote suggests: "It was a bit challenging to understand what type of topic we were supposed to write a paper about. We were going a bit back and forth about how specific it was going to be, and what kinds of things it could be about. But I think we eventually settled on a good topic. It took quite a bit to come to a conclusion." I4 Note that the above quote was selected from the transcript of interview 1, and the key "I1" shows this connection.
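The evidence keys attached to the quotes follow a regular pattern, so they can be decoded mechanically when tracing a quote back to its source. The sketch below is illustrative only: the key grammar ("R/Gr<group>/St<student>" for reflection reports and "I<number>" for interviews) is inferred from the examples given in the text, and the function name `parse_evidence_key` is hypothetical.

```python
import re

# Assumed key grammar, inferred from the examples in the text:
#   "R/Gr<group>/St<student>" marks a quote from a reflection report,
#   "I<number>" marks a quote from an interview transcript.
REPORT_KEY = re.compile(r"R/Gr(\d+)/St(\d+)")
INTERVIEW_KEY = re.compile(r"I(\d+)")


def parse_evidence_key(key):
    """Decode an evidence key into its data source and identifiers."""
    match = REPORT_KEY.fullmatch(key)
    if match:
        return {
            "source": "reflection report",
            "group": int(match.group(1)),
            "student": int(match.group(2)),
        }
    match = INTERVIEW_KEY.fullmatch(key)
    if match:
        return {"source": "interview", "interview": int(match.group(1))}
    raise ValueError(f"Unrecognized evidence key: {key!r}")


if __name__ == "__main__":
    for key in ["R/Gr1/St1", "I2", "R/Gr5/St4"]:
        print(key, "->", parse_evidence_key(key))
```

Decoding the keys in this way would allow quotes to be grouped by data source or by student group, which is one way to check that both reflection reports and interviews support a given code.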
Hard to select a research topic that is interesting and novel: Since the students were working in groups, the decision to select an interesting and novel research topic may have been particularly difficult, as the following quote from an interview transcript suggests: "Question: What do you think was the most challenging aspect of developing the first draft of your research proposal? Answer: Well that was figuring out what you wanted to write about I think because finding a sort of space to explore is not something you at that point done before" I2 The issue of topic selection in a group is also supported by the following quote from a reflection report: "I found the process of selecting a specific topic and finding a practical problem that all five group members agreed on to be quite challenging and time-consuming. Although the process was time-consuming and, at times, frustrating, I feel like the experience of working in teams was very valuable." R/Gr5/St1
Change of research topic: The difficulties in topic selection may have led to a change of research topic and further challenges concerning the research project, as the following quote suggests: "The main problems in this project seemed to stem from not being able to find a research topic with appropriate research questions. Our group changed the theme and research questions after the first research proposal draft, which slowed down the progress." R/Gr3/St4
Lack of understanding of amount of work in the course: The lack of prior experience in research may have resulted in underestimating the amount of work in the research project in the course, as suggested by the following quote: "At the time of choosing the topic however, we did not have a good understanding of how fleshed out this type of research was. In hindsight, we should've spent more time reading similar studies and developing a good understanding of how our research improves upon or contributes to the knowledge pool." R/Gr3/St2 The lack of prior experience and incorrect design decisions may have resulted in a lack of time for the research activities, as the following quote suggests: "With the limited time and the amount of work in this project, we only went with one data generation method instead. We wanted to triangulate our findings, and we wanted to generalize our answers. But we could not do so because of time constraints." R/Gr9/St1
Hard to conduct research when in group: Working in groups may involve coordination and collaboration challenges, as the following quote suggests: "We divided the workload by assigning a section to every person; person A wrote the results, person B wrote the discussion, etc. I think it was a mistake to change from a collaborative, iterative approach to an individual incremental one, as it negatively affected our productivity. The ones responsible for the later sections experienced much downtime in the early phases due to sections depending on previous sections (e.g., discussion depending on the results). Another downside of this approach was that the lack of collaboration and communication caused each individual's vision of the final product to gradually diverge, resulting in some inconsistencies in the report." R/Gr5/St1

TC2: Research questions
Developing research questions from a research topic is a candidate TC [2]. Students may struggle to balance ambitious research questions that contribute to the field and research questions that can be realistically addressed within an introductory course's resources and time constraints. Similarly, formulating a research question may be challenging because the research questions inform the design decisions throughout the research project [30,36]. We mapped these challenges to the candidate TC2: Research questions. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Hard to problematize literature: Defining research questions involves reading the recent literature on the research topic, finding hidden assumptions, and challenging them -a process called problematization [3].However, students often do not define the research question based on problematization as the following quote suggests: "One of the major obstacles we faced was trying to pick a good research question to focus on.In hindsight, it seems that we, to some extent started by choosing what kind of research answer we wanted, rather than starting with a practical problem."R/Gr5/St4 The students often rush to data generation instead of ensuring that the research questions are correct, as the following quote suggests: "In hindsight, it would have been more effective to take more time to narrow down the research question.We were eager to start the data generation , which might have impacted how we defined the research question."R/Gr9/St2 Hard to be unbiased when formulating research: The correct formulation of the research questions is also challenging for students.The students tend to formulate research questions that lead to specific data generation methods as the following quote suggests:"In retrospect, it is clear that the desire to use specific methods influenced the choice of research question unduly, in a sense putting the cart before the horse.To put it in textbook terms, we essentially worked through the process backwards."R/Gr2/St4

TC3: Research design
Selecting a research design, including a research strategy and the relevant data generation methods, is difficult for students [33]. The reasons can be a lack of knowledge and practical understanding of the research strategies [2,30]. Additionally, students may encounter practical constraints, such as limited time, resources, and access to participants or data sources. We mapped these challenges to the candidate TC3: Research design. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Choice of research strategy: The research question dictates the choice of a research strategy because selecting the right research strategy guarantees an answer to the research question [33]. However, students' selection of research strategy is often guided by practical considerations, like the amount of time available and access to participants, instead of the research questions, as the following quote suggests: "I've learned the importance of choosing the right strategy and method that provides the right type of data for the chosen research question. Our research questions were more exploratory in nature and sought to go in-depth on our topic of interest. Therefore, we should have generated more qualitative data instead of quantitative data that we could have analyzed and interpreted." R/Gr3/St1
Hard to reason why specific research methods were chosen: Since students usually do not have practical experience with many research strategies, they find it hard to justify their choice of a research strategy, as the following quote suggests: "If we were to repeat this process, I think we should have had more arguments for making our choices, and to a greater extent elaborate and reason for why we did exactly as we did." R/Gr2/St3

TC4: Critical thinking
Critical thinking is an essential analytical skill because it helps in many steps of a research project. For example, critical thinking helps in problematization [36]. It also helps in choosing the proper research methods and applying them [30]. Critical thinking is a handy skill in data analysis as it helps evaluate bias in analysis [13,20]. We mapped these issues to the candidate TC4: Critical thinking. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Hard to search for appropriate literature: Finding appropriate literature and critically evaluating it is the key to problematization [3,36]. However, professional education in computing is often focused on helping students apply the techniques of programming and problem-solving instead of challenging them. This may result in a lack of a critical view of empirical research for computing students. Hence, computing students may face challenges in focusing on the research topic, finding literature, and problematizing it, as the following quote suggests: "As our research topic was very broad, we started with very broad keywords such as "web accessibility" and "web accessibility guidelines." This yielded very large amounts of literature, and it was hard to choose which articles were more relevant than others" R/Gr9/St2
Hard to maintain internal validity: A lack of critical view of the information may lead to various types of bias in a research project, which may result in a low internal validity in the findings, as the following quote suggests: "There is undoubtedly some degree of experimenter bias in our study, which might be because of our qualitative analysis, a method that is prone to research bias. Furthermore, how we asked questions in our semi-structured interview might also have affected the results. We have thought of AI-based tools being a factor in our careers ourselves. Subconsciously, we might have affected the results towards 'low concern' to comfort ourselves because we will embark upon a software development career." R/Gr4/St1

TC5: Ethical considerations in research
Ethical considerations are particularly relevant for empirical research because it involves gathering and analyzing data from human participants [19]. This requires carefully following ethical guidelines regarding informed consent, data privacy, and security [22]. We mapped these issues to the candidate TC5: Ethical considerations in research. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Ethical considerations: Due to the lack of prior experience with empirical research and practical constraints, students tend to avoid ethical considerations instead of addressing them, as the following quote suggests: "the ethics of this research is both complicated and not something we have time to establish an ethical research strategy. If we were to use this approach, we would have to do a case study to make some observations. This could be problematic in many ways, as we would likely need to collect some medical information along with the personal information to have an idea of what kind of pain the subjects are in. Additionally, we would need access to observation and other data-collecting methods while also giving the test subjects access to VR equipment that they can use in periods of pain. This would require a NSD approval that we were not necessarily likely to achieve while also gaining access to information and subjects that are protected by a very strict standard that we would not have been able to uphold." R/Gr1/St4
Note that at the time of this writing, the Norwegian Centre for Research Data (NSD) was the national institution regulating compliance with the ethical guidelines for research in Norway.

TC6: Research communication
Research communication is often challenging because it requires bridging the gap between technical and non-technical audiences [25]. Also, research communication requires precise descriptions and arguments, which may be difficult when writing qualitative studies because of the rich and subjective descriptions [17]. We mapped these challenges to the candidate TC6: Research communication. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Research communication to unfamiliar people: Dissemination of results to a broader audience is challenging because it requires removing technical jargon and complexity and presenting a clear message. Students find dissemination difficult, as the following quote suggests: "Presenting at the conference was challenging because you have the feeling that you are going to present a lot of information to somebody that hasn't really been reading research papers on the topic like you have done. That felt a little bit like, I wouldn't say, terrifying but a little bit scary." I1
Hard to be precise: Research communication requires a precise description of the argument, which is often challenging for students, as the following quote suggests: "It was interesting to learn exactly how to phrase things and actually what to have in the paper. Because it's a bit more structured than things we would have written previously. In previous courses, it's more like 'write whatever you can think of related to the idea, and then structure it a bit.' Here it was more: we need specific things written in a certain way, and that needed multiple iterations to get right." I4

4.8 TC7: Iterative nature of research
Research tends to be iterative because knowledge is often contextual and constructed [2]. Learning the scientific method involves developing "an understanding that research is iterative, questions are based on gaps in available information, and that as questions are addressed, new questions may arise" [2]. However, students find it difficult to iterate in various stages of the research project. We mapped this challenge to the candidate TC7: Iterative nature of research. We provide the reasoning for this mapping using the codes identified from the data sources and a representative quote in the following paragraph.
Revisions to research proposal: It was difficult for students to revise the research proposal in response to the feedback from the course staff. Such a revision was complicated if it involved revising the research questions because of the long-term implications of the revision, as the following quote suggests: "After redoing our research questions from the feedback after attending the supervision meeting, we concluded how to conduct our interviews. The decision to conduct interviews and even the question designs were made even earlier. Thus, we had to change our interview questions to match the new research questions. This shook up the process a bit, which resulted in us doing the interviews very late." R/Gr8/St4

4.9 TC9: Handling evidence (data generation and data analysis)
Data generation and data analysis are challenging because of many issues, like access to data [30], lack of technical skills in data generation, statistical analysis, and visualization [20], time and resource constraints, and dealing with uncertainties and bias [2]. We mapped these challenges to the candidate TC9: Handling evidence (data generation and data analysis). We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Experiment execution (hard to recruit participants): Students face many challenges in finding and recruiting participants fulfilling the criteria of their research projects, as suggested by the following quote: "The recruitment of participants was also something we learned how hard can be, as we spent a notable amount of time trying to find enough participants, and without any way to incentivize participation it is hard to see how we could have made this easier." R/Gr1/St1
Hard to prepare for data generation: Gathering accurate and useful data is also a challenge, as the following quote suggests: "Even though our questionnaire was effective in gathering enough respondents and answers, it became clear that we could have spent more time reviewing the questions we included. All questions were closed questions, which could have put answers into the respondents' minds. We could have received more accurate data by using open questions where the respondents could list the answers" R/Gr9/St2
Hard to conduct data analysis: Due to a lack of experience, the students find it difficult to select and use different data analysis methods. As a result, students often confuse and misrepresent various data analysis methods, as the following quote suggests: "We chose a grounded theory approach, where we looked at the distribution of answers without connecting it to a theory. This can be seen as a limitation of the study, as it resulted in a mere summary of results as opposed to a quantitative analysis using statistical functions." R/Gr3/St4
Hard to ensure external validity: Due to uncertainties and bias in data gathering and analysis, the students find themselves in a difficult position where they cannot generalize from their findings. This may result in a lack of external validity in research studies, as the following quote suggests: "In our study, there is sampling bias, where the participants substantially differ from the overall population. As mentioned in the paper, our focus group consists of three people who are all in their first year of master's degree and have known each other for a long time. To generalize our results, we would have required a more diverse participant group, consisting of students and people who work" R/Gr4/St1
The lack of ability to generalize from findings is also supported in the following quote: "Based on a low response rate, it can be assumed that those who responded to the survey were more interested in the topic, and perhaps had stronger attitudes as compared to average Norwegian medical student. We don't have enough data to generalize any conclusion to other groups of people or situations." R/Gr3/St1

4.10 TC11: Conceptual framework
Developing and using a conceptual framework is challenging for students because a conceptual framework represents abstract relationships among variables, concepts, and ideas [2,20]. Also, the conceptual framework is helpful in data analysis and generalizing from the findings of a study, and this may be challenging for students. We mapped these challenges to the candidate TC11: Conceptual framework. We provide the reasoning for this mapping using the codes identified from the data sources and some representative quotes in the following paragraphs.
Hard to develop appropriate conceptual framework: Due to the lack of experience, the students find it hard to develop a conceptual framework, as the following quote suggests: "I've learned that we should have developed a more exhaustive conceptual framework, because we did not have as much theory as we could interpret our results data with." R/Gr3/St1
Hard to conduct data analysis: Using the conceptual framework in data analysis is crucial, as the following quote suggests: "I'd like to mention one aspect we could improve. This is the analysis of what our results really meant, this part of our conceptual framework was weak, and we should have taken more inspiration from similar studies in this part." R/Gr3/St2

DISCUSSION
This section synthesizes our findings on student challenges while learning empirical research methods. To address our RQ, we aim to connect the general challenges (see Table 3) to the computing context. The following sections discuss three salient TCs and present the implications of these TCs for research methods education in computing.

TC11: Conceptual framework as an overarching TC
A conceptual framework plays at least three roles in structuring the research project [20]. First, it helps define the project's scope by defining and linking the main variables and concepts studied. An important issue is to maintain a balance between simplicity and comprehensiveness. The conceptual framework must be sufficiently simple to provide a clear understanding of the research context yet comprehensive enough to capture the complexity and interrelationships of the variables or concepts under investigation. Second, the conceptual framework lays out a road map for the project by helping the researcher make the design decisions for the research project. Third, the conceptual framework is often reused in the data analysis as a lens or structure to develop theory. Due to these broader roles, developing and operationalizing a conceptual framework can be seen as an overarching TC connecting many candidate TCs presented in Table 3. It can be directly related to formulating research questions (TC2), research design (TC3), and iterative nature of research (TC7).
Defining and operationalizing a conceptual framework may be particularly challenging for computing students because an empirical research project in computing involves at least two domains: a problem domain (e.g., students cheating in assignments using large language models) and a solution domain (e.g., automatically detecting the text generated by large language models). Students often get carried away in the problem domain by investigating, e.g., the causes and effects of cheating, without giving much thought to the solution domain. Alternatively, the students might focus too much on the solution domain by, e.g., developing a tool for detecting generated text without thinking about the usage setting of the tool. This challenge can be addressed by having multidisciplinary groups in research projects in the methods courses. This interdisciplinary collaboration in research can help students broaden their perspectives and integrate knowledge and expertise from various disciplines.

TC1: Practice of research in computing
The duality of problem-solution domains in empirical research in computing has broader implications for research practice (TC1). For instance, focusing too much on the solution domain and ignoring the problem domain may explain the adverse societal impacts of computing tools like hate speech, fake news, and financial and social exclusion [24].
The duality of problem-solution domains may help explain the practical difficulties of computing students in selecting a research topic, doing a research project in a research methods course (see section 4.2), and disseminating the results to a broader audience (TC6, see section 4.7). Similarly, too much focus on the solution domain may explain why computing students ignore ethical considerations in empirical research instead of addressing them (TC5, see section 4.6).

TC4: Critical thinking and professional education in computing
Professional education in computing often places a significant emphasis on developing and applying technical skills, such as programming, problem-solving, and algorithmic thinking. While these skills are crucial for success in the field, they may overshadow the development of critical thinking skills. As a result, computing students may face challenges in shifting their focus from technical problem-solving to critically evaluating empirical research methods and evidence. This lack of critical thinking may explain the practical difficulties of computing students in problematizing literature and dealing with uncertainties and bias in research (see section 4.5). Also, this lack of a critical view may explain why computing professionals use gray literature like blog posts and social media content in their practice instead of engaging with empirical research [14]. This challenge can be addressed by offering courses in empirical research methods to computing students, like the course we studied in this paper, where the students select a research topic and do an empirical research project.

Reflections and limitations
Some of the authors of this paper were part of the course staff for the introductory research method course studied here. Our role as course staff and our perception of evaluating students' course deliverables have been a concern throughout the research process. Accordingly, we have consciously tried to avoid the interference of personal ideas. Although a research assistant who was not part of the course staff conducted the interviews, some personal input may have been included in the results due to our role in the data analysis. We used the peer debriefing technique in the data analysis to reduce the impact of this problem.
One of the limitations of this research is the use of students' reflection reports as a data source. Reflection reports were used because of the challenges of recruiting current course students for interviews, e.g., students may think that discussing the course openly may affect their grades. For a long-term view, we interviewed five students who had taken the course the previous year. However, we could not tie a reflection report of a student to the interview with the same student, although doing so may have strengthened our findings.

CONCLUSIONS
This paper uses the theory of Threshold Concepts (TCs) as a lens to study the challenges faced by computing students when learning empirical research methods. The study began with a systematic mapping review of the literature to identify candidate TCs in learning empirical research methods across disciplines. We then contextualized the candidate TCs to computing in an explanatory case study of a research methods course, answering the call for a research-oriented approach to teaching research methods [42]. The findings reveal that candidate TCs from the literature align closely with computing students' learning challenges. One particularly formidable TC is developing and operationalizing a conceptual framework, which is related to many other TCs.
The study uncovers a potential duality of problem and solution domains in an empirical research project in computing, which may have broader implications for teaching empirical research methods. A strong prioritization of technical skills in computing education may overshadow the development of critical thinking skills, which may be essential for conducting and evaluating empirical research. Our approach offers insights for both research and practice in teaching research methods to computing students.

Table 2: Threshold Concepts (TCs) reported in literature mapped to Learning Objectives (LOs) in the course

Table 3: Mapping codes to the candidate TCs