Testing, Socializing, Exploring: Characterizing Middle Schoolers’ Approaches to and Conceptions of ChatGPT

As generative AI rapidly enters everyday life, educational interventions for teaching about AI need to cater to how young people, in particular middle schoolers who are at a critical age for reasoning skills and identity formation, conceptualize and interact with AI. We conducted nine focus groups with 24 middle school students to elicit their interests, conceptions of, and approaches to a popular generative AI tool, ChatGPT. We highlight a) personally and culturally-relevant topics to this population, b) three distinct approaches in students’ open-ended interactions with ChatGPT: AI testing-oriented, AI socializing-oriented, and content exploring-oriented, and c) an improved understanding of youths’ conceptions and misconceptions of generative AI. While misconceptions highlight gaps in understanding what generative AI is and how it works, most learners show interest in learning about what AI is and what it can do. We discuss the implications of these conceptions for designing AI literacy interventions in museums.


INTRODUCTION
Generative artificial intelligence (AI) technologies are increasingly entering people's lives at work, home, and in school, highlighting the need for greater AI literacy: "competencies users need in order to effectively interact with and critically evaluate AI" [58]. Developing AI literacy is especially crucial among youth, as middle schoolers become increasingly exposed to generative AI technologies [69] at an age when they are beginning to form their identities [91] and make decisions about their future [86]. Prior work has centered around AI education in classrooms, focusing on computing standards and elective AI courses highlighting the technical aspects of AI [2,29,36,90,96]. On the other hand, culturally-relevant approaches (i.e., incorporating learners' cultural knowledge and practices in educational interventions) [52] have been shown to lower the barrier of entry to and foster interest development in computing subjects [11,62]. Furthermore, interventions in informal learning spaces such as museums have been shown to reach broader audiences of learners [60,75]. Thus, understanding how young people approach and conceptualize generative AI technologies is key to developing effective, culturally-relevant AI literacy interventions [53,58,77].
While existing studies offer valuable insights into children's perceptions of AI, there remains an unaddressed need for focused research on their conceptions of generative AI [25,65]. Specifically, limited research has examined how open-ended interactions with conversational agents (CAs) might reveal middle school-aged children's thinking about generative AI [77]. Additionally, the relatively novel and popular CA, ChatGPT, offers flexible and comprehensive interactions across a variety of use-cases, requiring new research [79].
Our study aims to identify underrepresented middle school learners' personally and culturally-relevant topics and improve our understanding of their approaches to and conceptions of generative AI through their free exploration of a popular generative AI tool (ChatGPT). This unstructured, self-guided, and short-term interaction is analogous to the experience in museum settings, where visitors often engage with exhibits for a short time, in a similarly exploratory manner [42]. Our research aims to inform the design of public AI literacy learning interventions in free-choice learning environments. More specifically, we intend to inform the design of interactive museum exhibits with the goal of fostering and broadening AI literacy. Our analysis is guided by the following three research questions:
RQ1. What personally and culturally-relevant topics do middle schoolers show interest in discussing with ChatGPT? Identifying personally and culturally-relevant areas of interest can inform the design of educational interventions leveraging topics youth already engage with.
RQ2. How can we characterize middle schoolers' approaches to interacting with ChatGPT? This dimension allows us to inform the design of AI literacy interventions that support diverse approaches to generative AI.
RQ3. What conceptions and misconceptions of AI capabilities emerge through middle schoolers' conversations with ChatGPT? Characterizing learners' thinking, both accurate conceptions and misconceptions, will highlight areas to target in AI literacy interventions.
In this paper, we present our findings from a qualitative analysis of a focus group study at a science museum with 24 middle school-aged participants who freely interacted with ChatGPT. Our contributions are as follows:
(1) We surface personally and culturally-relevant topics to middle schoolers when interacting with ChatGPT.
(2) We identify three distinct approaches that participants exhibited in portions of their interactions with ChatGPT: (1) AI testing-oriented, characterized by participants' testing of ChatGPT's knowledge and capabilities, (2) AI socializing-oriented, characterized by their treatment of ChatGPT as a peer, and (3) content exploring-oriented, characterized by their curiosity for the content that ChatGPT produces (e.g., essays, poems, jokes).
(3) By analyzing participants' group dialogue through the lens of Long & Magerko's AI literacy framework [58], we surface themes and gaps in participants' thinking around what AI is, what it can do, and how it works.
(4) We highlight the importance of examining contextual conversations to characterize participants' interaction patterns with generative, conversational AI tools. We provide example codebooks to qualitatively analyze participants' dialogue (see Table 3), prompts (see Table 4) as well as the CA's replies (see Table 5).
(5) We discuss four considerations informed by our findings for the design of interactive museum exhibits to foster and broaden public AI literacy.

RELATED WORK

Studying User Interactions with AI
Prior studies have examined user interactions with CAs in various contexts. For example, Sube et al. [83] studied factory workers' interactions with a specialized AI system and delineated 13 patterns of interaction within three core themes: cognitive, emotional, and social. Wang et al. [89] focused on graduate students' perceptions of a virtual teaching assistant across five features such as verbosity and readability. Importantly, this prior research has substantiated the feasibility of inferring user perceptions such as emotions [80], personality traits [63], conversational breakdowns [55], ascribed humanness [71], and politeness [13] from linguistic cues in interactions with CAs. However, existing literature often targets specific settings (e.g., customer service [48], education [89], or gaming [33]). Though research in CAs has advanced their capabilities dramatically in recent years [10,33,70], the launch of generative AI made flexible and comprehensive interactions across a variety of use-cases possible, requiring new research into user experiences in using CAs [79]. Specifically, ChatGPT "can engage in fluent, multi-turn conversations out-of-the-box, substantially lowering the barriers to creating passable conversational user experiences" [94]. To understand users' perception of LLMs, recent studies, such as Korkmaz et al. [50], have approached ChatGPT analysis through sentiment analysis of social media, finding generally positive user attitudes. Mogavi et al. [64] analyzed social media data to understand user perspectives of AI in various education sectors, and identified productivity, efficiency, and ethics as key discussion points. Skjuve et al. [79] surveyed users about their experiences with ChatGPT and suggest that ChatGPT was mainly understood from three perspectives: its (1) technical functionality, (2) uses and purpose, and (3) interaction capabilities. Other research looked into users' prompting behaviors and barriers with large language models for creative writing [33] and collaborative programming [28,95]. Users' interaction with LLMs in creative writing highlighted the need for a deeper look into users' mental models of such tools as participants exhibited varied notions of the AI system [33].
Our work distinguishes itself in two major ways: First, our study design allows for participants to guide their free exploration of ChatGPT, unbounded by specific tasks, capturing more authentic experiences and offering a broader understanding of user conceptions and approaches. Second, our analysis takes into account the role of participants' group dialogue in addition to other contextual evidence. This holistic approach echoes Rapp et al.'s [71] view that perceptions of chatbots vary based on multiple factors including context, participant objectives and cues that the chatbot exhibits.

Children's Understanding of AI
The field of child-computer interaction has investigated how children conceptualize AI. One strand of research has noted the age-dependent differences in how children understand and interact with AI technologies [19,65]. For example, Nguyen [65] reported variances between different teenager groups in their perceptions of CAs as more or less competent, trustworthy, sociable, and knowledgeable. Druga et al. [15] contrasted the perceptions of children (4-10 years) and parents in their interaction with smart devices through a maze-solving activity. Their findings indicated that children focus more on sensory and social-emotional aspects, whereas parents more often reference cognitive abilities. Druga et al. [19] argued that playful and interactive ways of probing children's understanding of AI could be key to advancing our knowledge of their "cognitive and conceptual development." They also call on future research to investigate "tasks without a clear goal, such as social interaction tasks" and to examine "the views of teenagers and young adults."
Recent studies have also touched on the impact of social and contextual factors on children's views and interactions with intelligent technologies [31]. For example, Rubegni et al. [76] examined the role of social interactions and settings in shaping middle school-aged children's hopes and fears about social robots. Druga et al. [17] extended this line of inquiry by exploring the imagination and expectations of children across four countries (USA, Germany, Denmark, and Sweden), noting the influence of socio-economic and cultural backgrounds on their understanding of AI. Interestingly, children from lower socio-economic statuses displayed stronger collaborative abilities, while those from higher socio-economic backgrounds exhibited a deeper understanding of AI concepts, highlighting a need for AI literacy interventions catered towards children from lower socio-economic backgrounds.
Another avenue of research has concentrated on design characteristics that could influence a young user's interaction with AI [18,19,88,93]. Druga et al. [18] and Woodward et al. [93] proposed design recommendations focusing on elements like voice and prosody, and error detection and correction techniques, respectively. In summary, while existing studies offer valuable insights into younger children's perceptions and conceptions of AI, there remains an unaddressed need for focused research on teenagers' conceptions of these technologies [25,65] and of CAs, more specifically. Our work aims to contribute to this area of research by informing the design of AI literacy interventions specifically tailored to the middle school-aged group's conceptions of and approaches to AI.

AI Literacy for Middle School-Aged Children
While teaching AI at the K-12 level is not yet widespread, researchers argue it is important for developing future societal readiness [12]. Most research on novice AI learning has focused on classrooms [2,36,88,90,96] with initiatives like AI4K12 [87], organizations like the Center for Integrative Research in the Computing and Learning Sciences [30], and commercial efforts such as AI4All, ReadyAI and Concord Consortium [1,72,73].
Although initiatives are emerging, the formal integration of AI literacy into K-12 curriculum remains limited, creating a significant public education gap [12]. Long & Magerko [58] define AI literacy as "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." With nascent nationwide computing standards and elective AI courses, most students lack exposure, especially those from lower-resourced schools [5]. Much existing classroom AI content focuses on technical aspects rather than ethical/social implications [2,29,32,38,45,92,96]. Prior work shows that real-world, social, and cultural relevance motivate historically underrepresented groups, such as girls, in CS and AI [12,23,43]. We define cultural relevance as "knowledge and practices in family and community life" [52] including beliefs, ritual practices, art forms (e.g., music, movies, video games), and academic subjects (e.g., science, history, literature), as well as informal cultural practices such as language, gossip, stories, and rituals of daily life [68,85]. Culturally-relevant approaches in education have been shown to provide a low barrier of entry to computing subjects [11,62] and empower "students intellectually, socially, emotionally, and politically" [51]. However, few AI curricula address learners' lived experiences or cultural backgrounds, which research shows empowers diverse engagement [44,62,78,82]. Thus, while progress in formal learning is being made, informal approaches are also crucial to build inclusive public AI literacy.
[20] have explored the perceptions of AI bias among children and proposed an AI literacy framework focusing on algorithmic justice. These frameworks support the growing trend towards utilizing informal learning environments, such as museums, after-school programs and at-home resources, to expand the reach of AI education [1,15,16,22,24,53,57].

CHI '24, May 11-16, 2024, Honolulu, HI, USA
While prior work has identified general design recommendations and direction to support informal AI learning activities, little work specifically targets designing museum exhibits. Informal learning spaces like museums have historically been integral to public science engagement, such as supporting science knowledge, interest development, and improved interdisciplinary connections [3,4,37,84]. There have only been a few museum exhibits (e.g., Robot Revolution at Chicago's Museum of Science and Industry [67], AI: More than Human at The Barbican [7], Exploring AI: Making the Invisible Visible at Boston's Museum of Science [66]) focused on AI so far [53,57], likely due to the novelty and opacity of most AI systems. For example, Lee et al. [53] designed an AI-related exhibition to cultivate critical thinking competencies and found that the exhibit supported youth in relating AI to their lives. Long et al. [57] have also explored the role of museum exhibits in fostering AI literacy among family audiences and suggest design considerations specific to this context. The conceptualization of museum spaces as free-choice, constructivist learning environments, as highlighted in Long et al.'s work [57], prioritizes learner agency and embodied social interaction in designing learning interventions in museums. To allow for this active process of meaning-making, we focus on a self-guided exploration in our focus group study. Based on this self-guided, free exploration interaction, we also use a holistic approach to understanding participants' interaction with ChatGPT as the foundation for the design of informal learning interventions in museums.

METHODS
Our study differentiates itself by employing a self-guided, open-ended design that allows for a more holistic analysis of how middle schoolers interact with and understand ChatGPT. In addition to considering the participants' group conversation to understand their interaction approaches, we also incorporate the participants' prompts and ChatGPT's replies.

Recruitment and Participants
We focus on middle school-aged children (i.e., 9-14 years old) as the target demographic for our current study and design decisions as prior work has shown that children younger than six struggle with understanding AI reasoning processes as they are still developing a theory-of-mind [91] and because introducing AI at the middle school level exposes children to the topic at an age when they are beginning to make decisions about their future [86]. Since a key focus of our larger project is broadening public AI literacy, we also emphasize underrepresented learner populations in AI and CS. More specifically, we focus on engaging middle school-aged learners without a CS background, middle school-aged girls, and students of Title 1 schools (i.e., schools that receive funding from the federal government to support low-income students). As we aim to inform designs for museum visitors with varying prior knowledge of AI, we did not explicitly require or inquire about participants' prior interactions with AI.
Following Institutional Review Board (IRB) approval, participants were recruited through our museum partner's mailing lists: a member mailing list with approximately 20,000 contacts, and a mailing list of people who have previously participated in user studies with approximately 3,000 contacts listed. Participants, with their parents, completed a screening questionnaire identifying their age, gender, race and ethnicity, computer science background, and current school they attend, as well as their legal guardian's name and contact information to ensure that their age ranged from 9 to 14 and that they fit at least one of the following criteria: a) identify as female, b) attend a Title 1 school, and/or, c) have not previously taken a CS course. Eligible participants were then contacted to complete consent and assent forms, and choose a focus group timeslot. In total, we recruited 24 participants (P1-P24) from 17 different families (see Table 1). Some participants had familial ties (e.g., siblings or cousins) or were friends prior to the study. The participants identified mostly as female (n=17 female, n=7 male, n=0 non-binary / third gender), and their ages ranged from 10 to 14 years, with the largest age group (n=8) being 12 years old. About half of the participants (n=11) attend a school either listed as Title 1 or as eligible for Title 1 funding and the majority of participants (n=18) had not previously taken any CS course. Participants were compensated with free admission to the museum for the day for them and their immediate family (i.e., parents, legal guardians, and siblings), free parking, and $20 in cash.

Study Design
This focus group study took place in April 2023, at a museum of science in a major Northwestern city in the U.S., as the first step in a larger effort to design interactive museum exhibits for AI literacy. Overall, we conducted nine focus group sessions (FG1-FG9) (see Table 1). Each focus group lasted approximately 30 minutes and consisted of two separate, consecutive activities. In the first activity, participants were asked to name several examples of their interests (not related to AI). This activity is not included in the analysis presented here as it does not directly relate to children's interactions with AI; we describe it only because it may have primed participants' prompt topics during the second activity. The second activity required participants, in small groups of sizes ranging from two to four, to interact with ChatGPT (version 3.5) for 10-15 minutes (totaling 99 minutes and 12 seconds of recording overall). We deliberately structured the second activity to have a duration of no more than 15 minutes, mirroring the typically brief duration of interactions that visitors have with museum exhibits [42]. Participants were told that ChatGPT is "a chatbot that can hold conversations just like humans, it can talk to you about anything you want, and you can ask it any questions you want." They were encouraged to discuss their ideas for conversation and prompts out loud as a group, type them in the text box, and send the message by hitting the enter key. Following the interaction, we asked each group their thoughts about ChatGPT.

Data Collection
All focus groups were audio and video recorded, and each group's chat log with ChatGPT was saved from each computer with participants' consent. Authors 1 and 2 generated manual transcriptions for each focus group's second activity and included prompts sent to ChatGPT in the transcriptions at the appropriate moments.
The final dataset consisted of 1) the prompts participants sent to ChatGPT during the study, 2) the responses generated by ChatGPT, and 3) transcripts of participant discussions from the audio and video recordings, which also capture certain movements pertinent to the interaction (e.g., pointing to the computer, direction of speech).

Data Analysis
We conducted a multifaceted qualitative analysis of the study data, as shown in Table 2, to elucidate students' topics of interest, interaction patterns, and AI conceptions and misconceptions when engaging with ChatGPT. The first two authors met regularly to discuss emerging observations and refine the codes to capture nuances. Coding was iterative, with regular discussion among all authors to reach consensus and consolidate codes. Resolution strategies for disagreements included: revisiting definitions of codes, expanding the discussion to other authors, and examining additional data points.
To address our first research question, we categorized each prompt by its overall topic to understand the topics participants chose to explore with ChatGPT. To address our second research question, we used an inductive thematic analysis approach [9] to examine three key data dimensions simultaneously: 1) the transcripts of discussions between participants where they explicitly stated their goals such as "I would like to see how much it knows and question it" (P1) or reactions such as "It feels good to talk to someone who's actually as smart as me" (P9), 2) the types of prompts participants sent to ChatGPT, and 3) the types of responses ChatGPT provided. We examined all three aspects of the data simultaneously to provide a more holistic and richer account of each FG's interaction. More specifically, participants' goal-oriented thoughts and reactions surface their motivations in interacting with ChatGPT and their intent behind each of their prompts. Participants' prompts demonstrate how they implemented their motivations, and ChatGPT's answers provide additional context for their reactions and subsequent prompt choices.
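As a rough illustration of how coded dialogue can be summarized into a group's most prominent approach, the sketch below tallies approach codes over utterances. It is a minimal sketch under stated assumptions, not the authors' procedure: the study reached its categorizations through iterative qualitative coding and discussion, and the names `CodedUtterance`, `APPROACHES`, and `dominant_approach` are hypothetical. The four example utterances and their approach codes are the FG4 quotes reported in the findings; the simple frequency count is our assumption.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical short labels for the three approaches named in the paper.
APPROACHES = ("testing", "socializing", "exploring")


@dataclass(frozen=True)
class CodedUtterance:
    """One participant utterance with the approach code assigned to it."""
    group: str      # focus group label, e.g. "FG4"
    speaker: str    # participant label, e.g. "P9"
    quote: str      # transcribed utterance
    approach: str   # one of APPROACHES

    def __post_init__(self):
        if self.approach not in APPROACHES:
            raise ValueError(f"unknown approach code: {self.approach!r}")


# FG4 quotes and their codes as described in the findings section.
FG4_UTTERANCES = [
    CodedUtterance("FG4", "P9",
                   "it feels good to talk to someone who's actually as smart as me",
                   "socializing"),
    CodedUtterance("FG4", "P9",
                   "it's making me look like an idiot",
                   "socializing"),
    CodedUtterance("FG4", "P9",
                   "I already know it, I'm just gonna see if it knows",
                   "testing"),
    CodedUtterance("FG4", "P9",
                   "will space keep expanding? Ask it, I really want to know",
                   "exploring"),
]


def dominant_approach(utterances):
    """Return (most frequent approach code, full tally) for a list of utterances.

    Assumption: prominence is approximated by raw frequency. Ties resolve to
    whichever code was encountered first, so a tie should be reviewed by hand.
    """
    if not utterances:
        raise ValueError("no coded utterances to summarize")
    counts = Counter(u.approach for u in utterances)
    label, _ = counts.most_common(1)[0]
    return label, counts


if __name__ == "__main__":
    label, counts = dominant_approach(FG4_UTTERANCES)
    print(label, dict(counts))
```

A tally like this only summarizes codes that human coders have already assigned; prompt types and ChatGPT's responses would still need to be read alongside the dialogue, as the analysis above describes.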
During the analysis of participants' dialogues and interactions, we observed patterns of AI-related conceptions and misconceptions.To further investigate these patterns, we conducted a round of targeted coding of the dialogues to extract participants' AI-related

FINDINGS
In the following section, we first detail participants' topics of interest when freely interacting with ChatGPT. Second, we describe how participants approach their interaction with ChatGPT and identify three distinct interaction approaches: AI testing-oriented, AI socializing-oriented, and content exploring-oriented. Finally, we surface conceptions and misconceptions shared by participants across all three approaches when discussing ChatGPT among their respective FGs.
While participants' prompts highlight their interests, some of them also voiced additional preferences for their future interactions with CAs such as nine participants discussing their struggles with spelling words in text-based interactions ("I wish I knew how to spell more elements. I really know a lot of elements but don't know how to spell a lot of them" (P9)), and four suggesting "mak[ing] answers short because it's a lot to read I was just skipping" (P12). While this analysis reveals the breadth of subjects participants explored with ChatGPT, understanding the rationale behind their choice of prompt requires examining the sequence of their interaction holistically.

How Can We Characterize Middle School-Aged Children's Approaches to ChatGPT?
Of the 24 participants, only three, from FGs 3, 6, and 8, indicated minimal prior familiarity with ChatGPT, stating that "Oh, ChatGPT, you can use this for essays" (P21) and "it's blocked over school computer though, so annoying" (P15). We examine three key dimensions of each FG's interaction sequence including a) each focus group's dialogue, b) their prompts to ChatGPT along with the prompt's type, and c) ChatGPT's corresponding answer in order to examine participants' motivations in interacting with ChatGPT, how their motivations were implemented through their prompts, and the context for their subsequent reactions. To clarify our approach, we provide a walkthrough of FG4's interaction sequence (see Figure 1). FG4 consisted of three participants where P9 asked the most questions, while P8 and P10 mostly spectated. As shown in Figure 1, FG4 prominently exhibited dialogue that represents socializing with AI with P9 actively comparing his own knowledge to ChatGPT's, stating that "it feels good to talk to someone who's actually as smart as me" and "it's making me look like an idiot" when ChatGPT answered with more specificity than anticipated or introduced information previously unknown to P9. In some instances, participants in FG4 also discussed testing AI, stating that "I already know it, I'm just gonna see if it knows" (P9) and exploring AI-generated content, saying "will space keep expanding? Ask it, I really want to know" (P9). FG4 asked six questions in total. Four of the six questions were categorized as fact questions such as "will space keep expanding?", and "how long ago was the big bang?" One was a creative request: "make a poem about Chicago", and one was an operationalized opinion question: "how many years till the next major ice age?" ChatGPT replied with correct answers to all fact questions, produced a poem for the creative request, and replied with a caveat to the operationalized opinion question (see Figure 1). Overall, FG4's favored interaction approach appears to consist of socializing with AI, which motivated P9's factual questions about science, a "shared interest" between P9 and ChatGPT.
However, not all fact-based questions were given in such a social manner. Some groups were explicitly trying to trick the AI with fact questions they thought it would not have an answer to, such as FG1's "what day was the Normandy invasion of 1944 planned to be on?" which is a different day than when it actually happened because of a heavy storm off the coast of France. Therefore our inductive categorization of groups took into account not only the prompts but also their dialogue and stated intentions. While FGs may have exhibited multiple kinds of behaviors, we categorize them here based on the most prominent approach they took during their interaction. Overall, FG1, FG2, FG7, and FG9 mostly focused on testing AI's knowledge and capabilities; we refer to these groups as AI testing-oriented. FG4, FG5, and FG8 prominently approached ChatGPT as a social entity; we refer to these groups as AI socializing-oriented. FG3 and FG6 mostly exhibited interest in the content ChatGPT was producing rather than ChatGPT itself; we refer to these groups as content exploring-oriented. We present each approach with examples from FGs where that approach is most prominent.

Prompt codebook excerpt (type, definition, example):
Operationalized Opinion Question. Prompts where participants seek subjective assessments on topics that can be framed in a way that allows for measurable or quantifiable criteria, even though opinions themselves may not be objectively defined. Example: "What is the best Harry Potter movie?"
Speculative. Speculative prompts consist of questions that explore hypothetical scenarios or future possibilities, imaginative speculation and conjecture beyond factual or knowable information. Example: "What happens when you die[?]"

4.2.1 AI Testing-Oriented Approaches.
FG1, 2, 7, and 9 showed the most instances of testing AI. These groups repeatedly expressed their intent to test ChatGPT's knowledge and capabilities. For example, P1 in FG1 explicitly stated "I would like to see how much it knows and question it." Similarly, when brainstorming their first question, P23 in FG9 said to P24 "let's think about how knowledgeable it can be, let's give it a random question," and P24 replied "we can give it a history question that is hard to understand." While FG2 and FG7 did not state their intentions as explicitly, they would react to ChatGPT's answers by qualifying its correctness. For example, P4 in FG2 repeatedly said "got it right" or "yess!! It got it correct" whenever ChatGPT answered, and P18 in FG7 pointed out that ChatGPT forgot one of the book titles when asked about books in a series and her sister, P19, qualified it as "stupid."
While all four AI testing-oriented FGs aimed to test ChatGPT, their approaches to testing it varied, as shown by the types of prompts they elected. Fact questions were a popular option for testing ChatGPT, but other types of prompts such as speculative, anthropomorphized, and creative, among others were also used. For example, FG2 chose to focus solely on fact questions to test ChatGPT's knowledge of elements in the periodic table, musical instruments, and the number of humans on earth. On the other hand, FG7 and 9 opted for "unknowable" speculative questions to test ChatGPT's limits, such as "what percent of people want to be an animal?" FG1 opted for trick questions in the form of fact prompts. First, they asked "what day was the normandy invasion of 1944 planned to be on[?]" which is a different day than when it actually happened because of a heavy storm off the coast of France, and followed by asking "if a fire is happening in a building, what door do the cops go to first?" expecting ChatGPT to answer that firefighters are the first to enter the building, not police officers. While ChatGPT's answer explained the different roles that first responders hold, FG1's participants believed that they had tricked the AI. Compared to AI socializing-oriented groups (see section 4.2.2), AI testing-oriented groups' prompts about ChatGPT as an anthropomorphized entity were often directed towards ChatGPT's capabilities rather than its preferences. For example, FG1 asked "how much do you know about history?" and FG7 asked "can you do my math homework?" Additionally, AI testing-oriented FGs' operationalized opinion prompts required expert opinions or speculations on topics such as ocean pollution ("how many gallons of oil are in the ocean?" (FG9)) and legal matters. FG9's creative prompt was also crafted in a specific and complex manner: "write a poem about dancing monkeys and black tutu skirts eating avocados."

Response codebook excerpt (type, definition, example):
Responses involve the AI elucidating its limitations or abilities as a language model, asserting its competence to respond or acknowledging its inability based on its designed functions. Example: "As an artificial intelligence language model, I do not have a gender or a physical body, so I am neither a boy nor a girl."
AI produced something. Responses consist of generated content, such as creative outputs, or solutions, demonstrating the AI's generative nature. Example: "Oh Cheese Flavored Oreos, how strange you seem, A savory twist on a classic cream-filled dream. No longer sweet, but rather cheesy and bold, . . ."
Incorrect answer. Responses offer information that contradicts established facts or provides an answer that is inaccurate or unsupported by evidence. Example: "To find the product of these numbers, you simply need to multiply them together. Using a calculator, the result is: 1 * 2 * 3 * 8 * 555 = 33,120 . . ."
4.2.2 AI Socializing-Oriented Approaches.
FGs exhibiting the most instances of socializing with AI (FG4, FG5, FG8) treated ChatGPT as a peer, such as by discussing a "shared interest" with ChatGPT, comparing themselves or others to ChatGPT, or attributing a persona to ChatGPT. As demonstrated in the interaction sequence walk-through of FG4 above, FG4 bonded with ChatGPT through their "shared interest" in science and through comparing their knowledge and creativity to ChatGPT's. FG5, composed of four previously-acquainted female participants, attributed a persona to ChatGPT, stating that "You should be able to talk to this. If they made this into something where [...] it was just like Siri but Chad" (P12), asking questions such as "can you tell me a joke", "how are you feeling", "who is the hottest man on earth" and sending a number of prompts consisting of different emojis. FG8 prompted ChatGPT for input on different situations or debates or for advice, asking it to settle a debate between two participants on which sport is best. P20 asked "how can I get faster for track" before stating, "that's some good advice" and taking a picture of ChatGPT's answer on her phone. FGs tried to socialize with ChatGPT through different types of prompts such as anthropomorphized entity prompts, fact questions, or speculative questions. Unlike the anthropomorphization questions aimed at testing AI, these questions were mostly directed towards AI's embodied and non-embodied feelings and preferences instead of its capabilities, such as "are you a boy or a girl" (FG5), "do you like the name Chad" (FG5), and "what's your favorite color" (FG8). Additionally, the speculative questions asked by AI socializing-oriented groups are more unanswerable, such as "what happens when you die" (FG5), "what[ is] heaven like" (FG5), and "when will pigs fly" (FG8), as opposed to other groups who asked speculative questions about expert topics such as ocean pollution or about album releases and game updates that can be potentially answered in the near future.

Content Exploring-Oriented Approaches. FG3 and FG6 more often expressed curiosity about the content that ChatGPT produces than about ChatGPT itself. These groups stated that they "would definitely use it, just ask random questions that I wanna know the answer to, just randomly" (P16, FG6) or would "try this for school" (P7, FG3).
FG3 and FG6 explored AI-generated content through different prompt types such as speculative, creative, or operationalized opinion questions. FG3's speculative prompts all referenced upcoming musical album releases and game updates, such as "when is pplayboi carti releasing a album," and "what is the 1.21 minecraft update going to be." FG6's speculative question referenced a fictional battle that they "always wonder[ed] to know" (P17) about: "who would win, all the marvel heroes or the jedis from star wars." ChatGPT was unable to answer the majority of their speculative prompts, displaying a disclaimer about its abilities as an LLM and its knowledge cutoff date of September 2021. Nevertheless, participants in FG3 and FG6 did not reflect on ChatGPT's answers and persevered in asking those types of questions. Additionally, the creative requests of FG3 and FG6 were generally broad compared to those of AI testing-oriented groups, for example, "write a story about the world" (FG6) or "write an essay about tigers" (FG3). This observation, coupled with the lack of reflection on ChatGPT's abilities, suggests that FG3 and FG6 may have been more curious about the content ChatGPT was producing than about ChatGPT as an entity or its capabilities.

What Conceptions and Misconceptions Do Middle School-Aged Children Have About ChatGPT?
Separate from the analysis of interaction approaches, we also examined the entire corpus of transcripts for conceptions and misconceptions shared by the participants during their interaction with ChatGPT. We extracted a total of 57 AI-related conceptions and misconceptions from the data (see Figure 2). Of these, 15 referred to "What is AI?", 33 to "What can AI do?", and 9 to "How does AI work?" Our analysis of the extracted conceptions and misconceptions surfaced themes such as assigning human attributes to AI, comparing AI to other tools or entities, and sharing assumptions about AI's intelligence, capabilities, and modality of interaction (see Figure 3). We note that the categories have soft boundaries and the extracted conceptions and misconceptions might overlap between guiding [...]
[...] Actually no, that kind of makes sense." Both the FGs that mainly focused on testing AI and those that mainly focused on socializing with AI described ChatGPT with human-like attributes. AI testing-oriented groups used attributes such as forgetfulness, "they forgot a book, 2 books" (P19, FG7) [M-7], intentional lying or deception, "liar liar liar liar" (P19, FG7) [M-6], or being intelligent, "He's smart" (P1, FG1) [M-4]. AI socializing-oriented groups, on the contrary, focused on "feelings" and "experiences", stating "It'll say I don't have any feelings" (P12, FG5) [C-3] or "He's gonna be like I don't know, I don't have that experience" (P12, FG5) [C-4], referring to ChatGPT's disclaimer about lacking human experiences. Most participants used "AI", "it" or "they" when they referred to ChatGPT. Some participants assigned a male gender to ChatGPT by using the pronoun "he" (n = 4). However, none of the participants used "she", and one female participant specifically stated "not a girl, don't pick a [...]
P8 explained to her brother why AI's answer differed from what he expected: "I know, but the computer might not understand that" (FG4); however, P16 (FG6) remarked, "it doesn't matter, it's super smart it will know what you're trying to say," conveying a misconception about the system's ability to infer meaning.
FG5 saw ChatGPT as superior to Google for feedback [C-8] and specificity [C-9], stating, "but this gives you specific answers" (P13, FG5). Groups who mostly showed interest in exploring AI-generated content and socializing with AI conveyed that ChatGPT's answers are always correct, for example: "it's making me look like an idiot" (P9, FG4). FG9 misconceived ChatGPT as providing factual answers, rather than making statistical predictions about language, because it writes "informative essays", reflecting a perception of intelligence (P23, FG9) [M-11]. Regardless of their interaction approach, many FGs correctly recognized a number of ChatGPT's capabilities (e.g., ChatGPT can interpret emojis) and shared no misconceptions about any of its capabilities. For example, FG3 made a statement about ChatGPT being able to hold a conversation and receive feedback, stating, "we are gonna give it a thumbs up" (P5) [C-10]. FGs with the most instances of socializing with AI noticed that ChatGPT can give "good advice" (P20, FG8) [C-12], cannot answer speculative questions [C-13], and can interpret emojis (P12, FG5) [C-14], and suspected that it is potentially tracking their answers (P13, FG5) [C-15]. AI's capabilities in writing poems and stories were mentioned across all interaction approaches, for example, "It is good at making poems" (P8, FG4) [C-11]. Lastly, only FGs who mostly focused on testing AI and exploring AI-generated content talked about potential applications of AI, including ChatGPT focusing on one topic [C-17] and using ChatGPT as an Intelligent Tutoring System (ITS): "if we need help on homework, instead of directly giving the answer it could help you and walk you through the steps" (P23, FG9) [M-12]. While ChatGPT can provide guidance for completing assignments, there is no guarantee that the provided steps are pedagogically sound, accurate, or consistent [6,54,81], as opposed to ITS, which are specifically designed for educational purposes.
FGs that mostly socialized with AI shared considerably more conceptions (n=11) than misconceptions (n=5), and more conceptions than groups that mainly focused on testing AI or exploring AI-generated content. Additionally, they noticed a larger number of ChatGPT's capabilities compared to AI testing-oriented and content exploring-oriented groups. Importantly, all three interaction approaches highlighted ChatGPT's creative abilities.

How Does AI Work?
There were 8 statements about how AI works and all statements were classified as misconceptions, with no correct conceptions observed. Themes of misconceptions included assumptions about information access, AI's operational mechanics, and modalities of interaction. Across all interaction approaches, FG2, FG5, FG6, and FG9 exhibited misconceptions about ChatGPT's information access, with four participants mentioning it in their conversation, two of whom assumed that ChatGPT "has to search through the web" (P17, FG6) [M-15] or is "connected to Google" (P12, FG5) [M-14]. Two other participants in FG2 and FG9 noticed that ChatGPT does not have access to real-time information when ChatGPT responded with its knowledge cutoff date.
Regarding AI's operational mechanics, P23 (FG9) anthropomorphized AI's functioning, stating that "it types so fast" [M-20]. All other misconceptions about AI's operational mechanics were made by participants when socializing with AI. P12 did not comprehend why ChatGPT repeatedly provided the disclaimer text "As an AI language model . . .," stating the disclaimer was "overused" (FG5) [M-18]. P13, also in FG5, was unsure if building AI systems involved coding and engineering or "building a physical thing that it can talk" [M-19]. Additionally, P8 (FG4) assumed AI uniformly uses voice commands for interaction [M-16] and later asked if she could communicate with ChatGPT through drawings [M-17]. This range of misconceptions indicates gaps in understanding how AI actually operates across all three interaction approaches.

DISCUSSION
To address our broader aim, our discussion section is guided by the question: What do our findings mean for the design of informal learning interventions that foster AI literacy? Our suggested considerations are important not only for fostering AI literacy but also for designing AI systems to support children's needs and interests, and more broadly for imagining alternative ways in which generative AI agents could be designed to support children's AI literacy. More specifically, the following considerations could be adapted for museum settings in the form of interactive exhibits to be explored collaboratively by groups of museum visitors. This could encourage visitors to engage in dialogue and reflection, practices that have been shown to be effective learning mechanisms in these settings [14,27,74]. We discuss our considerations related to Highlighting Personally and Culturally-Relevant Topics, Expanding the Scope of AI Exploration, Leveraging Anthropomorphism as an Approach to Understanding Generative AI, and Leveraging Creativity as an Approach to Understanding Generative AI. Table 6 details the mapping between our design considerations and themes of our findings.

Highlighting Personally and Culturally-Relevant Topics
Our findings point to a diverse range of topics that hold personal and/or cultural relevance for children, such as school and academics, hobbies and interests, pop culture, and technology and gaming. In light of this, educational interventions might benefit from incorporating a wide range of topics or open-ended activities, enabling students to delve into their own interests through AI. While interests are varied, common threads like science or creativity emerge as potential focal points for group engagement. Incorporating design elements familiar to youth culture, such as Minecraft's or Roblox's design features (e.g., pixelated visuals), could serve as a shared platform around which groups of children discuss AI topics. This observation aligns well with the insights from Ellis et al. [23], which suggest that embedding technical AI content in socially relevant contexts can engage a broader spectrum of learners. Therefore, a nuanced approach that blends individual interests with broadly appealing elements could enhance the learning experience.

Expanding the Scope of AI Exploration
Given our open-ended study design, participants explored ChatGPT in their own ways; this led to a number of missed opportunities in their explorations. More specifically, our analysis indicates that AI testing-oriented groups primarily focused on evaluating ChatGPT's knowledge and capabilities through fact-based questions they already knew the answers to. We suggest that this approach may restrict a full exploration of ChatGPT's abilities (e.g., its creative potential), as it only surfaced ChatGPT's ability to generate what one participant referred to as "informative essays" (P23, FG9, AI testing-oriented), giving the illusion of competency. Further, this focus on fact-based questions could actively perpetuate the misconception that ChatGPT is a fact repository that delivers accurate [...] Even so, specific interests and prompts did occasionally surface new areas of exploration by highlighting important limitations of the system. For example, P4 in FG2 was surprised and curious about ChatGPT not having access to real-time information, after asking a seemingly simple, factual question (i.e., "how many people are on earth?"). This aligns with insights from Ellis et al. [23], suggesting the challenge for educators is to help students "address their misconceptions and develop an increasingly sophisticated understanding" of AI. We observe that participants' types of questions (e.g., fact-based) and topics of interest are important to highlight because participants made sense of the agent through questions and prompts about those topics. We suggest a constructivist approach in which these types of questions and topics of interest are leveraged to scaffold participants towards an increasingly sophisticated understanding, so that they can accurately assess the system's strengths and weaknesses. While ChatGPT's disclaimers about its capabilities (e.g., knowledge cutoff date) captured participants' attention, more detailed explanations seem necessary to render this information meaningful for them. For example, P13 in FG5 mentioned "it says it's an AI model [. . .] but I don't know what it is." We also propose making AI's explanations of itself more accessible for the target age group, considering that verbose text that does not match the learners' reading level might be counterproductive, as expressed by four of our participants. This suggestion can also broadly benefit the design of educational AI systems aimed at middle-school classrooms. The quality of explanations is crucial, as highlighted in Wang et al.'s work: "recognizing user perception of CAs and providing appropriate feedback to help users revise their perceptions is thus critical in building smooth human-CA interactions" that also allow users to revise their mental model [89].
In informal learning settings, such as museums, it is widely acknowledged that learning arises from the conversations visitors have around the exhibit, and revisions to individuals' mental models often happen through this productive talk [14,27,74]. We noticed that AI socializing-oriented groups shared a higher number of statements related to AI than the other two group types; this may be due to the social context they created or their interest in the topics discussed with ChatGPT. This dialogue can provide them with more opportunities to adjust their mental models. While we do not argue that the AI socializing-oriented approach is superior to the other two, we highlight the importance of approaches that promote conversations between all visitors and encourage future work to examine why this approach leads to more conversation.
Overall, given the difficulty in crafting effective prompts, as underscored by Liu et al. [56] and Zamfirescu-Pereira [94], we suggest the design and use of pre-designed prompt guides or templates that cover important aspects of the AI agent. Alternatively, for learners who seem more interested in AI's answers than in AI itself, the suggested prompts could incorporate their topics of interest to encourage AI exploration. For AI testing-oriented participants, designing short challenges to assess AI's capabilities and knowledge can help them uncover more dimensions of AI (e.g., ethics).

Leveraging Anthropomorphism as an Approach to Understanding Generative AI
Anthropomorphism emerged as a notable trend in our study, especially among participants in the AI testing-oriented and socializing-oriented groups. AI testing-oriented groups ascribe diminished humanness to ChatGPT (e.g., stupidity, lying, smart) as opposed to AI socializing-oriented ones, who ascribe high humanness (e.g., feelings and experiences) [71]. This anthropomorphic approach aligns with previous research about personification as an inherent strategy for young learners to grasp the concept of programmability [19]. Further, the use of personification has been shown to be valuable in introducing or explaining scientific issues to young learners (e.g., steam is escaping through a valve) [21,34]; however, caution is advised to avoid inaccurate mappings and to prevent false inferences [46]. Additionally, differences in the attribution of humanness, among similar topics such as the presence of emotion in AI, often evoke strong opinions in students, making them "a potential hook for engagement" [23].
Similarly, incorporating embodied interactions in museum exhibits can not only render concepts more understandable to those with limited prior knowledge but also create a more engaging visitor experience [39][40][41]74]. A potential design direction might involve allowing children to "step into the agent's shoes" [17], providing them an opportunity for perspective-taking that could deepen their understanding of what AI is, how it works, and what it can do. For example, this perspective-taking could support a much-needed understanding of how AI works, as highlighted by the prevalence of misconceptions around ChatGPT's operational mechanics in our findings. Additionally, exhibits could consider assigning personalities to AI agents, be it a "spy", "writer", or "scientist", to emphasize particular features or capabilities. For example, the "spy" agent could highlight the trust dimension whereas the "writer" agent could emphasize the creative capabilities of AI. Previous work has shown that this approach has the potential to change users' perceptions of AI's capabilities [49].

Leveraging Creativity as an Approach to Understanding Generative AI
Prior research highlights that open-ended creative exhibits promote prolonged engagement and facilitate visitor-led learning experiences that can lead to more personally relevant meaning-making [8,26,35]. Our findings show that all three approach groups (AI testing-oriented, AI socializing-oriented, and content exploring-oriented) noticed and were interested in ChatGPT's creative abilities, a major capability of recent generative AI platforms [33]. For example, P8 and P10, who were initially disinterested in interacting with ChatGPT, regarding it as too STEM-focused, used poem generation as an entry point to re-engage with it. Additionally, P8 asked if they could draw with ChatGPT, suggesting that offering multiple modalities of interaction (e.g., voice, text, image) can foster more authentic personal expression. Highlighting different modalities of interaction would also allow participants to recognize the existence of different types of AI [60]. Furthermore, participants often expressed misconceptions about how AI functions, likely stemming from the lack of transparency in ChatGPT's interface. Allowing participants to program or customize an AI agent as a way of co-creating with it can provide a personally meaningful learning experience [17,61,62]. For example, in music co-creation, providing sliders that allowed users to control different parameters of the AI agent "not only increased users' trust, control, comprehension, and sense of collaboration with the AI, but also contributed to a greater sense of self-efficacy and ownership of the composition relative to the AI" [61]. Additionally, learner-centered explanations of AI could be leveraged to achieve learning objectives as users interact with generative AI systems [47].

LIMITATIONS & FUTURE WORK
Our study presents observations based on short-term exposure to ChatGPT, as we aim to inform the design of museum exhibits where visitors often engage with exhibits for a short time. Our observations may not fully capture the nuances of long-term engagement or the potential shifts in learners' conceptions and behavior over time. Additionally, we studied middle school-aged children's open-ended interactions with ChatGPT, which was in line with our focus on free-choice learning experiences. This approach resulted in variance in the number of prompts sent by different FGs.
Our recruitment was also focused on historically underrepresented groups in CS and AI who had limited experience with such AI tools. The identified personally and culturally-relevant topics and interaction approaches are not exhaustive; other potential topics and approaches are also possible and may depend on the context, population background, and prior experiences with AI. Future work can build on our study by exploring different populations, including teenagers from other geographic areas, with varying prior knowledge about and experiences with AI tools, and highlighting the differences across variables. Our future work also aims to develop museum exhibits guided by our design considerations (in Section 5) and subsequently assess their effectiveness using relevant evaluation frameworks [e.g., 59,74].

CONCLUSION
Our study offers key insights into how middle schoolers engage with generative AI, specifically through their interactions with ChatGPT. We identify three distinct user approaches (AI testing-oriented, AI socializing-oriented, and content exploring-oriented), each revealing unique interests and gaps in understanding. We discuss the need for educational initiatives that address specific misconceptions and cater to diverse interaction approaches. Ultimately, our research informs the design of AI literacy interventions, such as interactive museum exhibits, aimed at an age group that is critical for cognitive and identity development.

Figure 1: Overview of FG4's interaction sequence with ChatGPT. Each column denoted with a roman numeral displays the corresponding code applied for each data type (i.e., participant dialogue type, participant prompt type, and ChatGPT's answer type).

Figure 2: Total number of statements referring to conceptions (left) and misconceptions (right) about AI or ChatGPT shared by each interaction approach group, broken down by Long & Magerko's AI literacy framework themes.

Figure 3: Overview of the AI-related conceptions and misconceptions shared by the participants during their in-group discussions. Conceptions are denoted by a [C] and misconceptions are denoted by an [M] with their respective numbering.

4.3.2 What Can AI Do? Across all interaction approaches, participants made the most statements about what AI can do compared to other aspects of AI, and all groups had conceptions and misconceptions. Common themes included comparison to humans and Google, AI's intelligence, and capabilities and ideas for application of AI. When comparing AI to humans, FGs who prominently socialized with AI made comparisons about AI's ability to infer hidden context [C-7, M-8], writing skills [C-6] and knowledge [M-9]. For example

Table 2: Summary of research questions and corresponding data analysis
Example statement: "It'll say I don't have any feelings". Identified as: conception. Categorized as: What is AI? Corresponding theme: Assigning human attributes to AI.

Table 3: Codebook for qualitative analysis of participants' dialogues focusing on goal-oriented thoughts and reactions when interacting with ChatGPT
Code: Exploring AI-generated content. Definition: Statements that exhibit intent to explore the content generated by ChatGPT without focusing on exploring ChatGPT as an AI entity itself. Example: "I would definitely use it, just ask random questions that I wanna know the answer to, just randomly" (P16)
[...] conceptions and misconceptions. For instance, the statement "It has to search through the web and the web is not always the most reliable thing" (P17) suggests a misconception that AI searches the web to answer any question asked of it, while "It'll say I don't have any feelings" reflects an understanding that AI does not have human-like feelings or emotions. In total, we extracted 57 excerpts exhibiting AI conceptions or misconceptions. We used Long & Magerko's AI literacy framework [58] to deductively categorize the extracted conceptions and misconceptions into five overarching questions: "what is AI?", "what can AI do?", "how does AI work?", "how should AI be used?" and "how do people perceive AI?" Then, we inductively surfaced recurring themes of conceptions and misconceptions under each overarching question, such as attributing human qualities to ChatGPT or comparing ChatGPT to other tools.
4.1 What Personally and Culturally-Relevant Topics are Middle School-Aged Children Interested in Talking to ChatGPT About?
In total, participants explored 10 broad topic categories in their prompts to ChatGPT: school and academics, hobbies and interests, pop culture, technology and gaming, AI, relationship to self or others, current world events and issues, food, life and death, and other. Additionally, 8 prompts were nonsensical words or emojis, such as "som[e]thing", and were coded as gibberish. School and academic subjects were the most common, with 17 prompts on topics like history ("how much do you know about history"), math ("what is 1"), and science ("will space keep expanding"). Participants' hobbies and interests were also prevalent, with 16 prompts on poetry ("make a poem about Chicago"), jokes ("Can you tell me a joke"), sports, and more. Pop culture was another popular topic, encompassing 15 prompts about music ("when[']s yeat rele[a]sing a[n] album"), celebrities ("who is drake"), and movies. Other notable topics included technology/gaming (n = 6) like Minecraft or Roblox; AI characteristics (n = 6) regarding ChatGPT's gender, age, or preferences; relationship to self or others (n = 5) focused on body image and intelligence; current world events/issues (n = 3) such as politics, population, or pollution; and food (n = 3) on topics like cheese and milk. Less frequent topics were existential questions (n = 2) about death and the afterlife and other conversation statements without clear topics (n = 4) such as greetings.
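The per-topic counts reported above can be tallied and ranked in a few lines. The following is a minimal sketch, not part of the study's analysis pipeline: the dictionary simply transcribes the counts stated in this paragraph, and the names `topic_counts`, `gibberish_count`, and `summarize` are hypothetical. The computed total is only the sum of the counts listed here and is not a figure reported by the authors.

```python
# Prompt counts per topic category, transcribed from the paragraph above.
topic_counts = {
    "school and academics": 17,
    "hobbies and interests": 16,
    "pop culture": 15,
    "technology and gaming": 6,
    "AI": 6,
    "relationship to self or others": 5,
    "current world events and issues": 3,
    "food": 3,
    "life and death": 2,
    "other": 4,
}

# Gibberish prompts were coded separately from the 10 topic categories.
gibberish_count = 8


def summarize(counts):
    """Return the total and a list of (topic, count, percent), largest first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return total, [(topic, n, round(100 * n / total, 1)) for topic, n in ranked]


topic_total, ranked_topics = summarize(topic_counts)

for topic, n, percent in ranked_topics:
    print(f"{topic:35s} {n:3d}  {percent:5.1f}%")
print(f"{'sum of topic-coded prompts':35s} {topic_total:3d}")
print(f"{'gibberish (coded separately)':35s} {gibberish_count:3d}")
```

Read this way, the three most frequent categories (school and academics, hobbies and interests, pop culture) account for 48 of the topic-coded prompts listed in the paragraph.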

Table 4: Codebook for qualitative analysis of prompts sent to ChatGPT
Code: Anthropomorphized entity. Definition: Prompts that engage with ChatGPT as a human-like entity, attributing [...] Example: "Are you a boy or girl?"
Code: Creative. Definition: Creative prompts involve participants providing open-ended requests to ChatGPT, asking it to generate creative content, ideas, or responses. Example: "make a poem about cheese flavored oreos"
Code: Fact question. Definition: Fact questions are inquiries that seek factual and objective information from ChatGPT. These questions have verifiable answers rooted in existing knowledge, data, or established facts. Example: "is francium the biggest element?"

Table 5: Codebook for qualitative analysis of responses received from ChatGPT

Table 6: Mapping design considerations to the themes identified in the Findings