U.S. Latines in Computing: A Review of the Literature

Though enrollment disparities across race/ethnicity are a known issue in computing, U.S. Latin American (Latine) students have remained chronically underrepresented for decades. In order to actively support the recruitment, retention, performance, and experiences of U.S. Latines, we must make sure that we are carrying out studies with this population across outreach practices, interventions, student outcomes, and testimonies. To offer a more comprehensive understanding of the existing body of work on U.S. Latines in computing, we reviewed computing education papers that either fully center around or dedicate specific analyses to U.S. Latines. We analyzed 53 papers and found the following results: most research focuses on Student Experiences and Institutional Perspectives with very few studies on Research Opportunities; the educational setting tended to be K-12 with a severe lack of community college studies; few popular pedagogical practices have been studied with U.S. Latines; research has mainly focused on students, but seldom on families despite strong Latine connections to family; there was a strong spike in studies in 2021; and the U.S. location of the studies tended to line up with the U.S. Hispanic populations, but many states are under-performing. We discuss the implications of our findings and suggest future research directions to better understand, recruit, and support Latines in computing.


INTRODUCTION
Computer science remains a consistently lucrative career choice [7], and its enrollment growth continues steadily [35,40,45], with projections for 2029 indicating a faster growth rate compared to other fields [29]. However, a significant and well-known under-representation issue persists in computer science, particularly for women and Black, Latine, Native American, and Pacific Islander (BLNPI) students and professionals [38,39,44]. Despite U.S. Hispanics being the nation's second-largest racial or ethnic group and experiencing rapid growth between 2010 and 2020 [2], recent longitudinal studies show that they face a persistent and troublingly low participation rate in computer science [23], limiting their access to the abundant career opportunities, competitive pay rates, and social advantages offered by the field. Among Latine/Hispanic students enrolled in computer science, a study conducted by Lewis et al. has highlighted various barriers they encounter in the field [21]. These include, but are not limited to, low sense of belonging, negative sentiments around identity observations, and the importance of community in the field [21]. Addressing these challenges is crucial to creating the more inclusive and supportive environment that we have continuously been aiming to cultivate in computing.
If we are to make progress in the field of diversity for Latines in computing, a review of the literature would help provide information on published studies. Without such a review, it would be difficult both to be aware of what has been shown to be effective and to make new progress for Latines. Documenting what has been done could help researchers replicate existing studies and pave paths to new projects. Additionally, this could help practitioners pinpoint effective strategies to implement in their classrooms. This paper contributes to the field of computing education by summarizing the current state of studies focusing on U.S. Latines in computer science, to help guide practitioners and researchers searching for ways to better serve this community.

TERMINOLOGY
For the purposes of this study, the term "Latine" refers to people of Latin American origins. We are not using Hispanic as it limits the population to Spanish-speaking Latin Americans. We are not using the term Chicanx/o/a as this limits the definition to those of Mexican descent. We will not be using Latin@/a/o as these are not gender-inclusive terms. The use of Latine over Latinx is a purposeful stance, even though both are used as gender-inclusive markers. Latinx is perceived as a U.S.-English-centric term [34], a form of "linguistic imperialism" against Spanish speakers [12,37]. A study by Noe-Bustamante et al. shows that only 3% of the Latin population in the U.S. use the term Latinx [26], while Latine is growing faster in popularity than Latinx [36]. This is possibly due to the fact that the -e morpheme already exists in Spanish, making it easier to adopt in the Spanish language [10,22,36], and that the term originates from a Latin American country, Argentina.
We will also not be using the term "URM", which has been widely used in computing to refer to under-represented minorities. This is due to the many critiques that it can be a dehumanizing, oppressive, and racist label to use [1,25,41,42]. Instead, we will be using the more inclusive term "BLNPI", which refers to Black, Latine, Native American, and Pacific Islanders.

PREVIOUS WORK
A literature review by Holanda et al. detailed the initiatives to increase diversity in introductory programming courses (CS1) [15]. They analyzed 67 papers covering pedagogical changes, assessment of student sentiment, establishment of learning communities, and mentoring events aiming to increase the inclusion of "groups underrepresented in [CS] majors (such as women, African-Americans, [Latine], and Native Americans)" [15]. Their findings show that teaching changes, curriculum changes, and student support were the main interventions to increase diversity in CS1, with most studies found in California, New Jersey, New York, and Texas. We follow a similar motivation and approach to Holanda et al.; however, our work differs in that we study what research has been done for one specific underrepresented group: Latines.
Ortiz-Lopez et al. similarly created a systematic mapping of studies; however, they focused on the retention of BLNPI students, which explicitly includes the Latine population [30]. They found a total of 76 papers covering educational initiatives for retention of BLNPI students, showing that most studies took place in California and Texas, in line with Holanda et al. Our work differs in that we report studies where Latines are the main focus or are specifically analyzed in the paper.

METHOD
In order to better understand the current studies on U.S. Latines in computing, we look for 1) the categories of work being done to identify the main area of focus for Latines in computing, 2) the breakdown of the educational settings studied to see what environments are most represented, 3) the breakdown of pedagogies to see which pedagogical practices are most often studied with Latines, 4) the breakdown of the population focuses (i.e., students, faculty, etc.) to verify what populations have been primarily studied, 5) the yearly frequencies to evaluate the yearly progress that computing education research makes towards actively including Latines, and 6) the mapping of studies across U.S. states to verify if studies are reflecting and reaching Latine communities.

Research Questions
To understand the previous literature on U.S. Latines in computing, we asked the following research questions: (1) What are the main categories of studies conducted? (2) What educational settings are primarily studied? (3) What pedagogies have been evaluated specifically on Latines? (4) What is the population focus (e.g., students, professionals, family, etc.) for these studies? (5) What are the yearly frequencies of these studies? (6) What is the breakdown of U.S. states for these studies?

Paper Selection
The goal of this project is to explore the current literature on U.S. Latines in computing and identify common trends and findings. To do this, we began by searching for any relevant literature related to Latine students in computer science. We searched the ACM Digital Library (ACM DL), IEEE Xplore, and the Computer Science Education (CSE) journal from Taylor & Francis (T&F). The query we used for the search was the following: "Latine Latinx Latin@ Latina Latino Chicanx Chicana Chicano Hispanic Spanish Portuguese". Each term was separated with an "OR" relationship and confined to search within the paper abstracts. This was done because we believe that if a study focuses on Latines (as a primary or sole focus), then the study would most likely have at least one of these terms in the abstract. The reasoning for each of the terms was as follows: Latine/x/@/a/o: gender-neutral (e/x/@) and gendered (a/o) identifiers for people of Latin descent; Chicanx/a/o: gender-neutral (x) and gendered (a/o) identifiers for people of Mexican descent in the U.S.; Hispanic / Spanish / Portuguese: linguistic identifiers for people in Latin America.
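As a rough illustration of the query described above, the following Python sketch joins the listed terms with "OR" and applies a comparable filter to an abstract held locally. The function names, the whole-word matching, and the optional plural "s" are assumptions made for illustration; the paper does not state the exact syntax or matching rules that each library's search tool applies.

```python
import re

# Search terms exactly as listed in the query above.
TERMS = ["Latine", "Latinx", "Latin@", "Latina", "Latino",
         "Chicanx", "Chicana", "Chicano", "Hispanic", "Spanish", "Portuguese"]

def build_query(terms):
    """Join the terms with an OR relationship, as described in the text."""
    return " OR ".join(terms)

def abstract_matches(abstract, terms=TERMS):
    """Hypothetical local filter: True if any term occurs as a whole word.

    Assumption: case-insensitive matching with an optional plural "s".
    Real search engines may stem or tokenize differently.
    """
    for term in terms:
        # Lookarounds serve as word boundaries that also work for "Latin@".
        pattern = r"(?<!\w)" + re.escape(term) + r"s?(?!\w)"
        if re.search(pattern, abstract, flags=re.IGNORECASE):
            return True
    return False

query = build_query(TERMS)
```

A filter like this only approximates an abstract-confined search; a study that analyzes Latines without naming any of these terms in its abstract would be missed either way.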
We used the ACM DL on the ACM Full-Text Collection with the filters on content-type set to "Research Article" and searched through both "Proceeding Series" and "Journal/Magazine Names". For "Proceeding Series", we set the venues to Technical Symposium on Computer Science Education (SIGCSE), Innovation and Technology in Computer Science Education (ITiCSE), International Computing Education Research (ICER), Koli Calling, Special Interest Group for Information Technology Education (SIGITE), and Global Computing Education Conference (CompED). For "Journal/Magazine Names", we set the venues to Transactions on Computing Education (TOCE), Journal of Computing Sciences in Colleges (JCSC), and SIGCAS Computers and Society (SIGCAS).
We used the IEEE Xplore advanced search page and applied the filters for "Conferences" and "Journals". Additionally, under "Publication Topics" we filtered for "computer science education". As we could not filter by venue, we received results from a variety of venues, 34 in total.
We used the standard search tool from Taylor & Francis. Under the "Journal" options, we filtered for "Computer Science Education".
Our search was confined to these venues as we wanted our initial list of papers to represent the mainstream computing education research community. The most recent time this search was run and confirmed was on the 17th of August of 2023, resulting in 82 papers for ACM, 143 papers for IEEE, and 5 papers for CSE. The breakdown of the ACM paper venues is as follows: SIGCSE (35), ITiCSE (12), ICER (4), Koli Calling (1), SIGITE (1), CompED (0), TOCE (14), JCSC (10), SIGCAS (5). The breakdown of the four IEEE venues that had papers pass the inclusion criteria is as follows: Frontiers in Education (FIE) (33), Research in Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT) (25), International Conference on Computational Science and Computational Intelligence (CSCI) (3), and Integrated STEM Education Conference (ISEC) (2). All 5 papers for T&F were published in CSE.
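The reported counts can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Counts as reported in the text for the search confirmed on 17 August 2023.
acm_by_venue = {
    "SIGCSE": 35, "ITiCSE": 12, "ICER": 4, "Koli Calling": 1, "SIGITE": 1,
    "CompED": 0, "TOCE": 14, "JCSC": 10, "SIGCAS": 5,
}
results_by_library = {"ACM": 82, "IEEE": 143, "CSE": 5}

acm_total = sum(acm_by_venue.values())
initial_pool = sum(results_by_library.values())

# The ACM venue counts should add up to the reported ACM total.
assert acm_total == results_by_library["ACM"]
print(acm_total, initial_pool)  # prints: 82 230
```

The ACM venue counts sum to the reported 82, and the three libraries together yield 230 papers, which is the size of the pool assessed in the first pruning round.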
In the initial pruning round, the first two authors independently assessed all 230 papers, deciding to include, exclude, or discuss further with other authors. They examined titles and abstracts, excluding papers unrelated to U.S. Latines in computing. In the second pruning round, the first two authors resolved inclusion conflicts. The third round involved all three authors collectively addressing remaining conflicts, resulting in 53 analyzed papers. The venue breakdown of these papers is found in Table 1. The 53 papers included in this study, numbered a1 to a53 (as well as the full list of papers before pruning), can be found at the following link: https://bit.ly/Latines-Computing-Literature-Review. The rest of this paper uses this numbering scheme to refer to the 53 papers due to space limitations.

RESULTS
A summary of results for all the papers analyzed can be found here: https://bit.ly/Latines-Computing-Literature-Review.

Categories
The first two authors met for thematic analysis to identify the categories of the studies for U.S. Latines. They individually tagged ten studies each with topics related to the paper. After the initial tagging, the authors met to identify related topics across tags to create categories. Once these categories were identified, the authors individually tagged the remaining studies. Another meeting occurred to verify the categorization of new tags found. Any conflict in determining a tag's category throughout the process was settled by reviewing the context in the paper for further deliberation. If conflicts persisted, they were settled with the third author's input.
The following six categories were identified: 1) Student Experiences, 2) Institutional Perspectives, 3) External Factors, 4) Pedagogy, 5) Student Outcomes, and 6) Research Opportunities. Table 2 provides detailed information for each category. Papers may span more than one category; therefore, the total number of papers in the table exceeds 53.
Papers may span more than one category or educational setting; therefore, the total number of papers on our graph may exceed 53 and the total number per category may exceed the values in Table 2. The majority of papers for Pedagogy and Student Outcomes are in K-12, whereas Student Experience seems to have a more even distribution between K-12 and 4-year University. The sole Informal educational setting work studied Pedagogy, External Factors, and Student Experiences. There were few papers for Community Colleges (CC) and Professionals. Research Opportunities had a fairly even distribution across K-12, University, and CC.
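To see why the totals can exceed the number of papers, consider the minimal Python sketch below. The paper IDs and tag assignments are made-up toy data, not taken from the reviewed papers; only the category names come from the list above.

```python
from collections import Counter

# Toy data only: hypothetical paper IDs and category tags, not from the review.
paper_categories = {
    "p1": ["Student Experiences", "Pedagogy"],
    "p2": ["Institutional Perspectives"],
    "p3": ["Student Experiences", "External Factors", "Student Outcomes"],
}

# Each paper contributes one count to every category it was tagged with.
category_counts = Counter(
    tag for tags in paper_categories.values() for tag in tags
)

n_papers = len(paper_categories)
n_tags = sum(category_counts.values())
print(n_papers, n_tags)  # prints: 3 6
```

Because a paper is counted once per category it touches, the sum of the per-category counts (6 here) is larger than the number of distinct papers (3 here). The same reasoning applies when papers are also split by educational setting.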

Pedagogical Practices
The breakdown of pedagogies used for U.S. Latines in computing is found in Table 3. The pedagogical practices implemented were Culturally Relevant Computing (CRC), Socially Responsible Computing (SRC), Bilingual Instruction, Computational Thinking Curriculum (CT), Peer-Led Team Learning (PLTL), Media Computation (MC), Sheltered Instruction (SI), Pair Programming (PP), and Consequential CS Learning (CCSL). The majority of studies used Culturally Relevant Computing. Popular pedagogical practices, such as Pair Programming and Media Computation, are not well represented.

Population Focus
The breakdown of the populations that the studies focused on can be found in Table 4. The focus is determined by the population being studied through an intervention, survey, or involvement (e.g., interventions carried out with students are student focused; surveys carried out with faculty members are faculty focused; interventions directly involving family members are family focused). We see a majority of papers focusing on students. Faculty, family, and professional focused studies have smaller shares.

Categories Across Years
We can find the paper category frequencies across the years in Figure 2, where each line on the graph corresponds to a category. One paper published in 1995, which fell under the Institutional Perspective category, is excluded from this graph for visualization purposes. There is a fluctuating pattern across all categories from 2008 to 2018. However, we can see a large spike in studies in 2021, with most categories then dropping back down to their previous ranges.

Geographic Distribution
Figure 3a shows the map of Hispanic population proportions by state (percent of Hispanics in a state) in the U.S. in 2020 [6], and Figure 3b shows the geographic distribution of paper publications. The U.S. Census uses the term "Hispanic" for Latine people. If a paper describes the population location, we use the U.S. state in which the study took place. If a study takes place in more than one institution (e.g., a45), each institution's state is counted. If there is no clear indication of location, we list the first author's institution's state as the state of publication because we assume the study is conducted at or near the author's institution. If a paper is a national study (e.g., a19), we list the first author's institution's state as the state of publication because this institution is leading the study for Latines across the nation. We can see that the maps align well. The higher proportions lie in California, Florida, Illinois, and Texas for both maps. However, there seem to be higher proportions of Hispanics in various states in the South and West that do not have similar amounts of Latine student papers being published.
... and recruitment efforts. This could be because Student Experience studies have been gaining in popularity across the field [24] and Institutional Perspective studies are mainly quantitative statistics on administrative data, leading to less of a necessity of carrying out interventions. However, only 9.4% included Research Opportunities as a way to engage U.S. Latines with computing. All university studies for research opportunities belonged to Hispanic-serving Institutions. Interestingly, a large portion of these studies were in K-12 (specifically high school) and Community Colleges. This is important to note as research experience is a vital part of the application process for graduate school, which can lead to professorships and ultimately have Latines serve as role models for others to join the field. As of now, Hispanics are at 2.2% of CS Ph.D. enrollments in the U.S. [45]. Additionally, undergraduate research programs have been shown to be generally effective recruitment and retention tools [18,31]. We note that research opportunity papers have increased over the past years, but opportunities for BLNPI students remain low [16]. We recommend that future studies provide research opportunities for Latines and publish results on both the efficacy of these programs and the experiences of the students.

Educational Setting Predominantly K-12
From the breakdown of educational setting across categories (see Figure 1), we see studies predominantly focused on K-12 rather than the 4-year University setting, particularly in the "Pedagogy" category. We believe that this might be due to populations being more racially/ethnically concentrated in the K-12 setting for BLNPI students [27], which is a rarity in the university setting.
There is a more equal proportion of Institutional Perspective university studies compared to K-12.This is likely due to a fieldwide emphasis on recruitment and retention of BLNPI students.Though it is imperative to focus on these types of studies for Latines, it is also important to study the Student Outcomes and Student Experience as these would play a role in fostering a successful and welcoming community in computing.Therefore, we encourage future work in the university space to study the Student Outcomes and Student Experiences of Latines in computing.
We noticed a low number of studies in Community Colleges (CC), with a total of 5 studies being performed and few of those studies focusing on Student Outcomes. This result is alarming when coupled with the fact that upon completing high school, 46% of Latine students enroll in CCs [17], 35% of Latines earning a bachelor's degree begin their education in CCs [5], and the majority of the emerging HSIs are two-year CCs [8]. However, we do recognize that research is not generally part of the responsibilities for CC faculty [32,43], creating a lack of a research culture in the CC setting [14]. We also understand that 67% of CC students are part-time students [28], allotting them less time to be a part of research studies. We urge universities to create research-practice partnerships with CCs, which our findings show as promising avenues for collaboration [11].

Low Number of Popular Pedagogies
In terms of the pedagogies that were carried out for U.S. Latine populations (see Table 3), the main interventions are Culturally Relevant Computing (CRC), Computational Thinking (CT), Socially Responsible Computing (SRC), and Bilingual Instruction. This is to be expected as these pedagogical techniques have been increasingly popular to attract BLNPI students in CS and STEM education, particularly CRC [19,20]. The majority of the studies from these pedagogies are from the K-12 educational setting, possibly due to more concentrated populations of Latines as mentioned in Section 6.2. However, we do not see many studies from popular computing pedagogies such as Pair Programming (PP), Media Computation (MC), Peer Instruction, and POGIL. We believe there is a possibility that Latines have taken part in research on these pedagogies, but these studies may not be reporting the specific demographic analyses in their abstracts. Additionally, they might not have reported on the demographics of their studies in general, in line with the findings of Heckman et al., in which over half of the studies analyzed fell into the category of "Weakly Supports Replication" [13]. We encourage future studies in pedagogical spaces to provide specific analyses for BLNPI students in order to identify the efficacy of pedagogies and, hopefully, help recruit and retain these populations in computing.

High Student Focus, Low Family Focus
As expected, the majority of the studies are student focused (see Table 4). There is a low number of papers focusing on faculty, possibly due to the underrepresentation issue, with only 2.8% of computing faculty being Latine [45]. Additionally, it is discouraging to see that there are not as many family focused studies, given the strong Latine cultural value tied to one's family known as familismo. It has been shown that Latines have high levels of familismo [33], with higher rates than those of European and Asian descent [4], the predominant populations in computing. This is important to note as Latines have reported high perceived support through familismo [3], making it a potential avenue for research on its effect on their experience and retention in computing. The low number of family focused studies may be because it is more difficult to involve families in a student's educational journey in the CC and university contexts, where students are adults and do not require any family involvement for research studies. Unsurprisingly, all of the family focused studies are in K-12, possibly because families are more accessible there, being in contact with the research teams for consent permissions. We encourage future studies across educational settings to carry out family focused studies to examine the concept of familismo and its role for Latines in computing.

Fluctuating Across Years; Spike in 2021
Overall, we see categories averaging around one paper a year, with the exception of Student Experience at two papers a year (see Figure 2). We see general inactivity of studies for U.S. Latines in computing until the 2010s. During this decade, there is a focus mainly on Institutional Perspective and, particularly, Student Experience studies. This could be due to similar findings from Section 6.1. During the 2020s, there was a spike in 2021, possibly due to the COVID-19 pandemic and the renewed focus on institutionalized racism in the U.S., perhaps prompting an increased interest in BLNPI populations. COVID-19 seems to have been a ripe topic for gauging Student Experience during distance learning, and also possibly helped with the consideration of non-academic External Factors that affect students in their studies. We urge future work to include specific analyses on Latines in existing studies and/or establish studies with Latines as the main population across all categories.

Similar Geographic Alignment
From the geographic distribution of papers (see Figure 3), we see a similar alignment between both maps. This is expected, as states with higher Hispanic proportions likely have more opportunities to carry out studies with Latines. However, we find states with high proportions of Hispanics but few to no studies centering around Latines. States such as New York (NY), Oregon (OR), and Washington (WA) have fewer studies despite visibly higher proportions of Hispanics in the state. We would also expect more studies from states at or near the Mexican border; however, we see a lack of studies from Arizona (AZ), Nevada (NV), and New Mexico (NM).
However, it is possible that, though there are higher proportions of Hispanics in many of these states, they may not have as extensive a network of research universities. The Carnegie Classification of Institutions of Higher Education [9] suggests that the low publication counts of some states and territories relative to their Hispanic population proportions, such as AZ, NV, NM, and Puerto Rico, may be due to their lower numbers of R1/R2 institutions (4, 2, 2, and 4, respectively). However, states like NJ and NY have many R1/R2 institutions (9 and 19, respectively) and relatively high Hispanic population proportions (22% and 20%, respectively), but low numbers of studies for Latines in computing (2 and 2, respectively). We urge future work in these states to include specific analyses on Latines in existing studies and/or establish more studies with Latines as the main population. We also encourage states with high Hispanic population proportions (e.g., AZ, NV, NM, etc.) to include specific analyses on Latines in their existing studies, as they may have a higher representation of Latines in their classrooms.

LIMITATIONS AND THREATS TO VALIDITY
A limitation in our study is the exclusion of short papers and panel sessions. By doing so, we may be missing valuable works from shorter studies that may provide insight on Latines in computing. Another limitation of this study is that studies on BLNPI students carried out in U.S. states with a high proportion of Latines may have majority-Latine samples. However, since these studies did not report on their demographics, we could not verify the racial/ethnic makeup of their samples. A further limitation of our study is the exclusion of non-computing venues (e.g., Journal of Latinos in Education, Research in Higher Education) from our search. By doing so, we may have missed valuable works from venues outside of computing education. A threat to the validity of our study is having our search criteria solely scan the abstracts of papers. There is a possibility that a study performed specific analyses on Latines but did not explicitly report those findings in the abstract, and was therefore not found in the initial search. Lastly, a limitation is the exclusion of studies that were conducted in Latin America. We focused our results on U.S. Latines, but future work should review effective strategies used in Latin America and compare these efforts to what has been done for U.S. Latines.

CONCLUSION
In this paper, we reviewed 53 conference and journal works that studied U.S. Latines in computing. We found that studies are primarily distributed across Student Experiences and Institutional Perspectives with very few studies involving Research Opportunities; most research is in K-12 and is severely lacking work in the community college setting; the main pedagogical practice is Culturally Relevant Computing with popular pedagogies (e.g., Pair Programming, Peer Instruction, etc.) being poorly researched; predictably, most studies are focused on students, but we hope to see more work involving family members; there are typically 1-2 papers for each category per year regarding U.S. Latines in computing, but a spike was found in 2021, likely due to COVID; and several states with high Hispanic population proportions are lacking in the number of studies published on U.S. Latines in computing. We believe that our study enables researchers and practitioners to understand the current state of research related to U.S. Latines in computing and to help improve the status quo by focusing on areas of potential improvement.

Figure 1: Study categories separated by educational setting. Figure 1 is a stacked bar graph of the breakdown of U.S. Latine study papers across categories separated by educational setting.

Figure 2: Number of papers per category across years

Figure 3: Comparison between Hispanics by State in the U.S. and Distributions of Papers by State in the U.S. (a) Proportion of Hispanics by State, from 2020 U.S. Census [6]

Table 1: Breakdown of venues

Table 2: Breakdown of Categories

Table 3: Breakdown of Pedagogies

Table 4: Breakdown of Population Focus