Back to School - Sustaining Recurring Child-Robot Educational Interactions After a Long Break

Maintaining the child-robot relationship after a significant break, such as a holiday, is an important step for developing sustainable social robots for education. We ran a four-session user study (n = 113 children) that included a nine-month break between the third and fourth session. During the study, participants practiced math with the help of a social robot math tutor. We found that social personalization is an effective strategy to better sustain the child-robot relationship than the absence of social personalization. To become reacquainted after the long break, the robot summarizes a few pieces of information it had stored about the child. This gives children a feeling of being remembered, which is a key contributor to the effectiveness of social personalization. Enabling the robot to refer to information previously shared by the child is another key contributor to social personalization. Conditional for its effectiveness, however, is that children notice these memory references. Finally, although we found that children's interest in the tutoring content is related to relationship formation, personalizing the topics did not lead to more interest in the content. It seems likely that not all of the memory information that was used to personalize the content was up-to-date or socially relevant.


INTRODUCTION
As part of an effort to reduce math deficiencies in primary education, we are developing a social robot math tutor. This work is part of the SOROCOVA project¹ and started after the COVID-19 pandemic led to a widening of the divide in math competences between children [47,48]. Through a large-scale user study (n = 130, 8-11 y.o.) that included three sessions within a week, we have shown that the math robot tutor is effective in improving children's math performance [37].
However, to truly reduce math deficiencies, children need to keep practicing throughout their school careers [24,44]. Keeping children engaged during recurring interactions with a tutor robot spanning a long period of time is a key challenge in the field of human-robot interaction (HRI) [33,38]. Previous work has shown that fostering the child-robot relationship provides a more sustainable motivation for engagement than the novelty effect [23,38,49]. Furthermore, a more relatable robot can potentially increase children's intrinsic motivation to work on math with the robot [10,15].
However, an aspect of long-term HRI that has not yet been studied much is how to sustain a relationship after a significant break. In schools, holidays provide a natural gap between interaction sessions, and breaks due to inconsistent use of robots by teaching staff or robot malfunctions are also common [52]. In this paper, we investigate such a break in the interaction and how to best sustain the child-robot relationship in an educational setting.
After the three initial sessions (phase 1), we returned nine months later to the same six schools to host a fourth robot math tutoring session (phase 2) with the same participants (n = 113, 9-12 y.o.). The participants had moved up from group 6² to group 7.
To better foster the child-robot relationship, we implemented a memory-based personalization strategy [38] that makes the conversation more personal by referring to information that the child had shared previously [34], matching the conversational topics of the math dialogs to children's interests [37], and co-creating a secret handshake [40]. To address the break in the interaction, we developed a novel 'getting reacquainted' module for the memory-based personalization strategy to allow children to 'catch up' with the robot after a break.
In this paper, we focus on the social design components (discussed in Sections 2 and 3) of our robot math tutor and investigate to what extent they sustain the child-robot relationship after the nine-month break. In particular, we were interested in the contribution of memory references, personalized math dialogs, and the 'reacquainted moment' to keeping children interested in the robot after the break.

RELATED WORK

Challenges of Long-Term Social Robot Tutors
Designing social robots that have to operate for a longer time in primary education is challenging for two reasons. Firstly, during their first twelve years, children experience crucial cognitive and social-emotional developments that rapidly change their educational needs, personal interests, and even their preferred interactions with robots [34,51]. Secondly, children tend to learn more from tutors who are socially meaningful, often out of distrust of information coming from unfamiliar agents [8,19,32]. Over time, the robot thus needs to adapt its content and behaviors to reflect these personal changes and remain interesting, engaging, and relevant to the child [33,55].

Social Personalization by Educational Robots
Personalization is an important concept in educational robotics [3,55]. Often the focus lies on adapting the educational content [35], feedback [16], or learning strategies [36] to improve children's performance and competences [3,22]. However, to address the social aspects of learning we also need social personalization; for example, by adapting the tutor's motivational prompts to the affective state of the child [14].
In this paper we focus, in particular, on social personalization that contributes to relationship formation. Inspiration can be drawn from how student-teacher relationships develop. Children feeling heard and seen by the teacher is not only crucial for relationship building [45], but also creates an inclusive learning atmosphere [41], benefiting both the student and the teacher [12].
One key improvement is for the child to feel remembered by the robot and to experience a development in the relationship [25,38]. Asking the child to disclose information about themselves, and subsequently enabling the robot to refer back to that information and use it to make the interaction more personal, is a powerful strategy for giving children the feeling that they are heard and remembered by the robot, thereby fostering the relationship over time [21,25,30,34,39].

DESIGN RATIONALE FOR SOCIAL PERSONALIZATION
In this section, we present the specification of a memory-based personalization strategy that a robot math tutor can use to remain socially engaging and relevant, even after a break. In particular, we focus on three individual components to obtain this social relevancy: using memory references, personalizing the math dialogs, and the 'reacquainted moment'. We are interested in studying the effects of these components on sustaining the child-robot relationship after a break. To steer our investigation, we formulated hypotheses about the design as a whole and each individual component. We will test these hypotheses with a user study comparing the social personalization strategy with a non-personalized equivalent alternative.

Social Robot Math Tutor
The key design principle of the social robot math tutor is to interweave social interaction with practicing math. This principle is theoretically grounded in social constructivism, which argues that learning is inherently a socially interactive process [1,28,46,54]. One practical reason is that previous research has shown that social behaviors can actually distract from the educational activity if they are not directly linked [27,29].
The main mode of interaction of the social robot tutor is a conversation. During every session the robot chitchats with the children and asks them questions about their hobbies and interests. This is followed by a series of math dialogs, in which the robot shares brief anecdotes about its fictional past jobs. A multiplication problem is embedded in each math dialog. For example, "I used to work as a dishwasher. During one shift, I had to clean 7 stacks of 14 plates. How many plates did I have to clean that shift? What is 7 times 14?".
The robot displays the problem on a tablet next to the robot. Children are given a pencil and paper to do the calculations (see Figure 1). While they are calculating the robot waits patiently. There are buttons located on the robot's feet. Children can press one button to signal to the robot that they are ready to verbally answer or press the other button to signal that they need help. If speech recognition fails after two attempts, children can provide the answer via the tablet. After a verbal answer, the robot repeats the answer and gives children the opportunity to correct the answer if necessary.
If children signal for help or answer incorrectly, the robot breaks down the problem following the principles of progressive schematization [13,20,50]. For example, "14 can be split into 10 and 4. You can solve 7 x 14 by adding the results of 7 x 10 and 7 x 4". For each new type of problem the child encounters, the robot proactively demonstrates how to break it down.
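The place-value breakdown in this example can be written out as a small sketch. The function names `split_multiplication` and `explain`, and the choice to split only the second factor, are illustrative assumptions and not the tutor's actual implementation.

```python
def split_multiplication(a: int, b: int) -> list[tuple[int, int, int]]:
    """Split a x b into partial products by the place values of b.

    Hypothetical helper for illustration; assumes non-negative integers.
    Returns (a, part_of_b, a * part_of_b) tuples, largest place value first.
    """
    parts = []
    place = 1
    remaining = b
    while remaining > 0:
        digit = remaining % 10
        if digit:
            parts.append((a, digit * place, a * digit * place))
        remaining //= 10
        place *= 10
    return list(reversed(parts))  # e.g. 14 -> 10 first, then 4


def explain(a: int, b: int) -> str:
    """Phrase the breakdown in the style of the robot's hint."""
    parts = split_multiplication(a, b)
    pieces = " and ".join(str(p[1]) for p in parts)
    steps = " and ".join(f"{p[0]} x {p[1]}" for p in parts)
    return (f"{b} can be split into {pieces}. "
            f"You can solve {a} x {b} by adding the results of {steps}.")


print(explain(7, 14))
print(sum(p[2] for p in split_multiplication(7, 14)))  # 70 + 28
```

Summing the partial products recovers the answer, which is the step the child is asked to perform on paper.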
We developed a rule-based artificial cognitive agent³ to allow the robot to autonomously manage a multi-session child-robot conversation.

Memory-based Personalization
Memory-based personalization is the primary strategy for the robot to make the tutoring appear more personal. The robot persistently stores children's answers to the questions it asks during the chitchat. In this way, the robot collects children's hobbies, interests, and preferences. The robot uses this information in three ways. Firstly, it includes references to this information throughout the conversation in the current or following sessions. We term these memory references. Secondly, the robot uses the collected interests and preferences to personalize the math dialogs. Thirdly, the robot can summarize the collected information to communicate to the child that it remembers them after a break. This is a core feature of the 'getting reacquainted' component. These individual components are discussed in more detail below.
After the three sessions in the first phase, we found no main effect of personalization on the child-robot relationship [37]. In previous work, differences in relationship scores due to personalization started emerging after four or five fifteen-minute sessions. After the third session relationship scores started decreasing overall. However, the decline was less strong when the robot personalized the interaction [38]. Since the length of the exposures and the content of personalization in our study are similar, we expect similar effects even though there is a nine-month break between sessions three and four in our study. Thus, we hypothesise that personalization will result in a stronger child-robot relationship after the fourth session compared to no personalization (H1a) and that the child-robot relationship will decline overall between sessions three and four (H1b).

³ Code is available here: https://bitbucket.org/socialroboticshub/sorocova-back-toschool/src/main/

Memory References
The key principle underpinning memory references is communicating to the child that the robot listens to them, recognizes them, and remembers them. The robot does this in two ways. The first is by creating a secret handshake in the first session. Children can choose a song fragment and physically move the robot's arms around to create the handshake (see Figure 1). The robot displays this secret handshake during every greeting and goodbye. When the robot does not personalize, it uses a default wave instead.
Secondly, the robot refers back to information shared by the child in a previous conversation. The robot fills the slots of the templated dialogs to make a reference. A memory reference is often used to motivate the inclusion of a new conversational topic; for example, "You like [horses], right? I have a cool story about [horses]". It can also make connections between the current and past conversations; for example, "My robot friend's head was [orange], just like your favorite color". A non-memory equivalent utterance would be, for example, "My robot friend's head was blue, just like its favorite color".
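A minimal sketch of this slot filling, assuming a simple bracketed-slot template format. The slot name `favorite_color`, the function `make_utterance`, and the rule of falling back to the non-memory utterance when a slot is unknown are hypothetical; the released agent may represent templates differently.

```python
import re

# A slot is written as [slot_name] inside a template (assumed notation).
SLOT = re.compile(r"\[(\w+)\]")


def make_utterance(template: str, generic: str, memory: dict[str, str]) -> str:
    """Fill the template from memory, or fall back to the generic utterance.

    `memory` maps slot names to what the child shared earlier. If any slot
    is missing (or the robot has no memory access), the non-memory
    equivalent is used instead.
    """
    slots = SLOT.findall(template)
    if all(slot in memory for slot in slots):
        return SLOT.sub(lambda m: memory[m.group(1)], template)
    return generic


template = "My robot friend's head was [favorite_color], just like your favorite color."
generic = "My robot friend's head was blue, just like its favorite color."

print(make_utterance(template, generic, {"favorite_color": "orange"}))
print(make_utterance(template, generic, {}))  # no-personalization condition
```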
The memory references are designed to be explicit and reinforce the child-robot relationship. During the first phase we observed that children interpreted statements that were not informed by the memory as memory references. We expect that after more exposure (i.e. more sessions) the difference between a genuine memory reference and non-memory equivalent statements will be clearer. We hypothesise that children will perceive more memory references in the personalization condition (H2a). Children will perceive fewer memory references after more exposure when the robot does not personalize (H2b) and perceive an equivalent amount of memory references after more exposure when it does personalize (H2c). Finally, we expect a positive relationship between the perception of memory references and relationship formation between the child and the robot (H2d).

Personalized Math Dialogs
We created a collection of templated math dialogs. Each math dialog centers around one topic. These topics are related to one or more of the topics discussed during chitchat. Some math dialogs follow up on others. The topic and dependencies are specified in the metadata of each math dialog.
The math dialogs are personalized in two ways. Firstly, the artificial agent uses the information collected about the child and the interaction history to reason about which math dialog is the best match to include next in the conversation. When the robot does not personalize, the agent randomly selects a math dialog. Secondly, the robot uses slot filling to tailor a math dialog to a child's known preference for that topic. For example, "In the restaurant that I worked at, I used to prepare a lot of [pizza]". When the robot does not personalize, a predefined generic slot fill is used.
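The selection step can be sketched as follows. This is an illustration under stated assumptions, not the agent's actual reasoning rules: the `MathDialog` fields, the first-match choice among interest-matching dialogs, and applying the dependency check in the random condition as well are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MathDialog:
    name: str
    topic: str
    # Names of dialogs that must have been told before this one (assumed metadata).
    requires: list[str] = field(default_factory=list)


def next_dialog(dialogs, interests, history, personalize, rng=random):
    """Pick the next math dialog, or None when nothing is left."""
    # Keep dialogs that are unused and whose prerequisites are in the history.
    available = [d for d in dialogs
                 if d.name not in history
                 and all(r in history for r in d.requires)]
    if not available:
        return None
    if personalize:
        matching = [d for d in available if d.topic in interests]
        if matching:
            return matching[0]
    # No personalization, or no interest match: random selection.
    return rng.choice(available)


dialogs = [
    MathDialog("dishwasher", "restaurant"),
    MathDialog("stable", "horses"),
    MathDialog("stable_2", "horses", requires=["stable"]),
]
print(next_dialog(dialogs, {"horses"}, [], personalize=True).name)
print(next_dialog(dialogs, {"horses"}, ["stable"], personalize=True).name)
```

Keeping topic and dependencies as metadata on each dialog means the same collection can serve both conditions; only the selection rule differs.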
The aim of personalizing the math dialogs is to make them more interesting for children and reinforce the child-robot relationship. We expect that after more exposure, as the novelty of the math dialogs decreases, personalization becomes more important. We hypothesise that children will find the personalized math dialogs more interesting (H3a). When the robot does not personalize, children's interest in the math dialogs decreases after more exposure (H3b), and when the robot does personalize, children's interest in the math dialogs remains equivalent after more exposure (H3c). Finally, we expect a positive relationship between dialog interest and relationship formation (H3d).

Getting Reacquainted
To address the nine-month break between the two phases we designed a getting reacquainted moment at the start of phase 2. As always, the robot greets the child by their name and displays their secret handshake. Then the robot provides a short recap of the conversations the child and the robot had during phase 1. The primary goal of the recap is to reinforce the notion that the robot remembers the child and feels fondly about their previous interactions. It recalls five interests or preferences of the child. Finally, the robot provides a brief refresher of how the tutoring and communication works.
When the robot has no access to the memory and does not personalize, it instead greets the child with a default wave and the blanket statement "Hi, nice to see you again". Instead of recapping the previous interaction, the robot reintroduces itself and talks about why it likes to do math. It does still provide the tutorial recap.
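The two variants of the reacquainted moment can be sketched as below. Only the blanket statement and the limit of five recalled items come from the description above; the function name `reacquaint`, the personalized wording, and the order in which stored items are recalled are hypothetical.

```python
def reacquaint(name: str, memory: dict[str, str], personalize: bool,
               max_items: int = 5) -> list[str]:
    """Return the robot's opening lines for the first session after a break."""
    if not personalize or not memory:
        # No memory access: blanket greeting (a self-reintroduction follows).
        return ["Hi, nice to see you again."]
    lines = [f"Hi {name}, nice to see you again!"]
    # Recall at most `max_items` stored interests or preferences.
    for topic, value in list(memory.items())[:max_items]:
        lines.append(f"I remember that your favorite {topic} is {value}.")
    return lines


for line in reacquaint("Sam", {"color": "orange", "animal": "horse"}, True):
    print(line)
```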
The aim of the 'reacquainted moment' is not only for children to refamiliarize themselves with the robot, but also to give them the feeling that they are remembered by the robot. This in turn should further foster their relationship. We hypothesise that personalization will lead to children feeling more remembered by the robot (H4a) and that there is a positive relationship between feeling remembered and relationship formation (H4b).

METHOD
We ran a multi-session user study in two phases. First, participants practiced math with the help of a tutor robot for three sessions within a week (phase 1). Then the robot left the school for nine months and came back for a fourth session⁴ with the same participants (phase 2).

Participants
In phase 1, 130 children (group 6; 8-10 y.o.; 63 boys and 67 girls) completed the experiment. In phase 2, 113 children (group 7; 9-12 y.o.; 55 boys and 58 girls) from the first phase (the original sample) completed the experiment. Five participants who took part in phase 2 did not complete the experiment for technical reasons. Seventeen participants who took part in phase 1 did not take part in phase 2 because they were not at school during the experiment (e.g. they were sick or had changed schools).
We followed the guideline of Hogg & Tanis (2020) to recruit at least 30 participants per condition (and at least 100 respondents in total) [18]. With 4 (original) conditions, our minimum desired sample was therefore 120 participants. We aimed for 130-140 participants to account for possible dropouts during the recurring study.
During phase 1, participants of the same age, gender, and math level were randomly split over the experimental conditions. Those conditions were the same in phase 2. Participants and their legal guardians signed informed consent forms before participating in phase 1 that also gave consent for participation in phase 2. This study was approved by the ethical committee of the institution of the last author (ref. number: 2022-054032).

Experimental Design
The full experiment had a mixed factorial design, with personalization and scaffolding as the between-subjects factors and sessions as a within-subjects factor. To address the hypotheses in this paper, we only focus on personalization (without vs. with) as the between-subjects factor and phases 1 and 2 as the within-subjects factor. The dependent variables were measured once after phase 1 and once again after phase 2.

Quantitative Measures and Instruments
The four dependent variables in phase 2 are relationship formation, perceived memory references, dialog interest, and feeling remembered. They were measured using a self-report questionnaire. The relationship formation scale was based on the scales developed by [53] and [8] and contained six items (Cronbach's α, phase 1: .73; phase 2: .85). The items cover aspects of comfort, similarity, friendship, and willingness to continue interacting. To accommodate children, a 4-point Likert scale was used: No, definitely not; No, not so; Yes, a little; Yes, definitely so [9]. The other variables were measured using a single-item manipulation check on the same 4-point scale. The full questionnaire is available as supplementary material.
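For readers unfamiliar with the internal consistency measure, the standard formula for Cronbach's α is α = k/(k − 1) · (1 − Σ item variances / variance of the sum scores), with k the number of items. A minimal sketch, using toy data rather than the study's responses; the function name and data layout are assumptions:

```python
from statistics import variance


def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha; responses[i][j] is participant i's score on item j."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)


# Toy example on a 4-point scale: three items, four participants.
toy = [[1, 1, 2], [2, 2, 2], [4, 3, 4], [3, 3, 4]]
print(round(cronbach_alpha(toy), 2))
```

When all items move together perfectly, α equals 1; when items are unrelated, α approaches 0.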

Qualitative Measures and Analysis
We conducted semi-structured interviews after session 3 and before and after session 4. We analysed the answers to four different questions. To investigate the impact of the robot's memory we asked participants before session 4 to speculate about what they thought the robot might remember about them (1). As part of the manipulation check children rated whether they felt the robot remembered them. During the post interview for session 4 we asked participants how they noticed the robot remembered them (or not) (2) and what they thought about the robot (not) doing that (3). To investigate differences in the experience of the relationship before and after the break we asked participants, after sessions 3 and 4, to explain why the robot did (not) suit them (4).
Due to time constraints we analysed about half (n = 3 interviews per participant × 55 participants = 165) of the interviews (a random sample balanced on condition (n = 29 and n = 26), gender, and math level). The interviews were recorded on audio and automatically transcribed with Whisper (medium model) [43]. We performed a thematic analysis [5] in which we followed a structured data-driven coding approach [11]. First, the sample was divided over five coders who collected the participants' responses to the five questions and coded each response. Secondly, the codes were reviewed by another coder. Finally, during a collective discussion the codes were refined and grouped under themes.

Set-up and Procedure
The set-up and procedure for phases 1 and 2 were almost identical. The study took place in an otherwise unoccupied room in the school during normal school days. A 57 cm tall V6 Nao (humanoid) robot was used (see Figure 1). It was placed on the ground. On one side, a 9.9 inch Lenovo Tab4 10 tablet was placed in a tablet stand. On the other side a lap table was placed with paper and pencil on it. A rug was placed in front of the robot to allow the participants to sit and a handycam camera was positioned behind the robot to record the participants' behaviors during the interaction. The robot operated autonomously and was started from a laptop by a research assistant. The research assistant remained in the room but was positioned far behind the participant to avoid unnecessary contact. The research assistant would only intervene in the case of a system crash. After a reboot, the participant could continue the interaction where they left off.
Participants came into the room one by one. There were three sessions on separate days within one week in phase 1. After nine months session 4 took place as part of phase 2. At the start of phases 1 and 2, participants received general instructions about the study and the robot, and were reminded that they could stop at any moment without giving any reason and without consequences. Session 1 in phase 1 started with a tutorial on how to talk to the robot and how the math exercises worked. Session 4 in phase 2 included a 'reacquaintance moment' (described in Section 3.5) and a recap tutorial.
All four sessions consisted of a math activity (specified in Section 3.1). After phases 1 and 2, participants could say goodbye to the robot and were taken to a separate room where they filled in the digital questionnaire and were interviewed by another research assistant who was unaware of the experimental condition to prevent a bias in the questioning.

RESULTS
The method of Brunner et al. (2002) [4], as implemented in the nparLD R-package [42] for non-parametric analysis of longitudinal data in factorial experiments (with the Wald-Type Statistic [WTS]), was used to perform a phase × personalization analysis (n = 55 and n = 58) on relationship formation, perceived memory references, and dialog interest. Wilcoxon signed-rank and Mann-Whitney U post-hoc tests were used to test for main effects of phase and personalization respectively. For hypotheses 2 and 3, Wilcoxon equivalence tests were run using the TOSTER R-package [31]. Reported effect sizes of similar constructs in comparable studies (e.g. [39]) are typically high (> .8). To be on the safe side we selected the smallest effect size of interest (SESOI) to be .5.
Furthermore, a Mann-Whitney U test was used to check for a main personalization effect on feeling remembered. All data points are reported as median [quartiles]. Spearman's rank-order correlation tests were run to assess the relationship between relationship formation and perceived memory references, dialog interest, and feeling remembered (the latter only in phase 2). The correlations for phase 1 are reported in Table 1 and for phase 2 in Table 2. Aside from the relationship formation scale as a whole, correlations are also reported for the individual items of this scale.
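To make the rank-based statistics concrete, the sketch below computes the Mann-Whitney U statistic and Spearman's ρ from average ranks, using toy scores on the 4-point scale. It is not the study's analysis, which was run in R, and it omits p-values. The function names are our own, and note that software differs in which group's U it reports.

```python
from math import sqrt


def midranks(values):
    """Rank values 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U for group x: rank sum of x minus n_x (n_x + 1) / 2."""
    ranks = midranks(list(x) + list(y))
    return sum(ranks[:len(x)]) - len(x) * (len(x) + 1) / 2


def spearman_rho(x, y):
    """Pearson correlation computed on the midranks of x and y."""
    rx, ry = midranks(list(x)), midranks(list(y))
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy data: "feeling remembered" scores in two hypothetical groups.
personalized = [4, 4, 3, 4]
not_personalized = [3, 2, 3, 1]
print(mann_whitney_u(personalized, not_personalized))
print(spearman_rho([1, 2, 2, 4], [1, 3, 2, 4]))
```

The U values of the two groups always sum to n_x · n_y, which gives a quick consistency check on any reported U.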

Feeling Remembered
Participants felt significantly more remembered by the robot (see Figure 2; bottom right) in the personalization condition (4 [4, 4]) than in the no-personalization condition (3 [2, 3]), U = 394, p < .00001, d = .91. Before session 4 we asked participants what they expected the robot to remember about them. Slightly more than half of the sample expected the robot to have a memory (Personalization: 69% vs. No-Personalization: 42%). We identified that participants expected the robot to have a social memory and remember their name, interests, or the secret handshake (P: 48% vs. NP: 31%), their math level (P: 14% vs. NP: 4%), or remember things "because it is a computer" or "it probably has access to the cloud" (P: 17% vs. NP: 12%). Finally, in the no-personalization condition participants were less sure what to expect: 46% could not answer the question versus 20% in the personalization condition.
After session 4 we asked how they could tell the robot remembered them or not.In the personalization condition 93% of the sample explicitly referred to the robot recalling their interests (81%), name (48%), or the secret handshake (44%) as the reason they felt remembered.The remaining 7% felt remembered, but could not specify why.In the no-personalization condition 19% (incorrectly) thought the robot recalled their interests, name, or secret handshake (all 40%).Another 19% mentioned the lack of memory references as a reason they did not feel remembered.38% felt remembered, but could not specify why.Finally, 12% could not answer the question.
Participants (13%) who did not feel remembered (i.e. score 1 or 2; all in the no-personalization condition) were mostly ambivalent (71%) or sometimes disappointed (29%) about it. A positive point remarked by one participant is that it offered the opportunity to share more current information with the robot after the break. For those who felt remembered by the robot (87%) it was hard to articulate what they thought about it. The majority (79%) gave nondescript positive remarks (e.g. "it was cool the robot did that"). A specific remark that was given more than once (7%) was "[I appreciated] that the robot did not have to ask the same questions again, because he can remember my answers from last time."
We also asked participants why the robot suited them. After 3 sessions 15% of the sample argued that the robot was a good fit because it had similar interests and preferences. The feeling of similarity increased to 40% after session 4, with participants often explicitly citing the memory references used during the 'getting reacquainted' moment. The feeling of similarity replaced enjoyment of the interaction as a key motivation for a good fit. Enjoyment was mentioned by 31% of the sample after session 3 and by 20% after session 4.

DISCUSSION

Effects of Social Personalization on the Child-Robot Tutor Relationship After a Break
Participants felt a significantly stronger relationship after the break (i.e. after session 4) when the robot personalized the interaction than when it did not personalize. We can therefore accept hypothesis 1a. The difference in relationship scores between personalization and no personalization was not statistically significant before the break (i.e. after session 3). Overall, the relationship scores were high across the board, but they did decline slightly, and statistically significantly, between sessions 3 and 4. We can therefore accept hypothesis 1b.
This prompts the question of what causes the decline in the relationship scores. Perhaps the break cooled the relationship somewhat. Interestingly, however, Ligthart et al. (2022) reported a similar decline in relationship scores between a third and fifth interaction, but without a break. Those interactions were similar in length (i.e. between 15 and 20 minutes per session) and format (personalized conversation). Those scholars argue that it does not mean that children experience a weaker relationship over time. Instead, they suggest that children have not yet experienced a relationship with the robot and, as a consequence, rate the relationship scale items based on what they hope the relationship will be. After more exposure to the robot and experiencing more of the relationship, they rate the items based on that experience [39]. In most cases it does not live up to the high expectations [6], reducing the gratification, thus resulting in lower scores [26]. Our results add to the evidence supporting this explanation. The internal consistency of the relationship formation scale improved between phases 1 and 2. This suggests that participants were more sure about how to rate the items after phase 2 than after phase 1.
Ligthart et al. (2022) furthermore reported that, similar to our findings, a main effect of personalization on relationship formation was found only after the fifth session [39]. Besides the lack of a break, another important difference between the studies is that ours centered around an educational task, while theirs was purely a social conversation.
The findings contribute in the following five ways. Firstly, we replicate the results of [39] and add to the evidence that memory-based personalization is an effective strategy for relationship building. Secondly, we strengthen the evidence for the hypothesis that children first rate their relationship with the robot as what could be, and after more exposure switch to a more genuine evaluation. This stresses the importance of running more longitudinal studies, because especially effects that rely on self-report might be inflated by unrealistic expectations.
Thirdly, we demonstrate that relationship building also works when the personalization is integrated in the educational task. This is important, because if the educational task and social behaviors are separated, the social behaviors can actually distract from the educational task [27,29]. Fourthly, the findings suggest that the amount of time passed between interactions is not as important as the amount of exposure to the robot for relationship formation and assessment. The break did not seem to affect the relationship scores, when comparing the scores with a similar study that did not have a break [39]. Finally, the break also did not seem to reduce the effectiveness of the personalization strategy. Thus, memory-based personalization is a robust strategy for relationship building even if there are breaks in the interaction.

Contribution of Memory References
When the robot personalized, children perceived it to include more information that they had previously shared with the robot. This was significantly the case in both phases. We can therefore accept hypothesis 2a. Interestingly, after phase 1 participants in the no-personalization condition were reasonably sure that the robot made personal memory references, while in fact the dialogs were all scripted and did not use stored information. This might be caused by a form-function attribution bias [17], where the social behaviors, and perhaps the humanoid appearance, of the Nao robot made children assume it made memory references [2]. In the no-personalization condition participants perceived significantly fewer memory references after phase 2 than after phase 1. In the personalization condition this perception remained equivalently high between both phases. We can therefore accept hypotheses 2b and 2c respectively. Looking at the scores after phase 2, participants were fairly sure whether or not the robot used memory references, indicating that the form-function attribution bias reduces with increasing exposure to the robot.
As the differences between conditions and between phases 1 and 2 grow, the importance of including memory references becomes clearer. After phase 2, there is a significant but weak positive relationship between perceiving the memory references and relationship formation (ρ = .25). We can therefore accept hypothesis 2d. By looking at the correlations between the perception of memory references and the individual items of the relationship formation scale, we can gain more detailed insights about which parts of relationship formation are probably supported by the memory references. Our study shows that children who perceive more memory references feel the robot is a better fit (ρ = .34) and that the robot feels more like a friend (ρ = .23). This is confirmed by the interviews. Memory references, however, do not significantly relate to children's willingness to interact more with the robot.
Finally, the use of memory references in earlier sessions does increase children's expectations that the robot will do so in future sessions, even after a long break. It also becomes more important for relationship formation, especially for feeling more similar to the robot. After session 4 more participants indicated that memory references were a key reason why they felt similar to the robot, compared to session 3.

Contribution of Personalized Math Dialogs
The differences in interest scores of the math dialogs with or without personalization were not statistically significant. We therefore have to reject hypothesis 3. Whereas children did find personalized math dialogs more interesting after phase 1, this difference dissipated in phase 2. In both conditions the scores after phases 1 and 2 were equivalent. We can therefore reject hypothesis 3 and accept hypothesis 3.
The math dialogs were generally rated as interesting regardless of personalization and phase. This confirms that we succeeded in creating appealing math dialogs that are generally aligned with children's interests. It is possible that personalizing them only marginally improves them. However, we would like to suggest two possible factors that could have reduced the effectiveness of the personalization in phase 2.
The robot used information from previous sessions as well as the current session to personalize the math dialogs. In phase 2, about 60% of the math dialogs were tailored with information from phase 1, nine months prior. It could be that the information had become outdated, because children's interests had changed or certain topics had become less relevant to them. Another plausible explanation is that children were never interested in one or more of the topics in the first place. For example, they provided an answer when the robot asked what their favorite wild animal was, even though they did not actually have a strong preference. Regardless of their lack of preference, they appreciated the robot using it to personalize a math dialog during phase 1. After the break, either this appreciation faded or they simply forgot what answer they had given, so that they did not recognize the robot's personalization attempt.
Furthermore, we found that the more interesting the math dialogs were to the child, the stronger the relationship formation ( = .40). We can therefore accept hypothesis 3. Looking at the correlations between dialog interest and the individual relationship formation items, we found that children felt more comfortable around the robot, that the robot was a better fit and felt more like a friend, and that children were more willing to continue interacting with the robot. The correlations became stronger between phases 1 and 2, signaling the importance of investing effort in keeping the math dialogs interesting as children keep interacting with the robot.
To improve our design we need to better account for changing or inaccurate interests and preferences. Verifying the relevance and accuracy of collected information about the child could be a key improvement. This verification should be integrated into the conversation in a clever way, i.e. without being obtrusive or repetitive, both when new information is received and after a significant break. The weak positive correlation in phase 2 between perceiving memory references and dialog interest signals another opportunity for (minimally) improving the personalization strategy. If the robot communicates more clearly to the child that it weaves information previously shared by the child into the math dialogs, it might make them more interesting.
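One way to picture the suggested verification step is sketched below. It is a minimal illustration under our own assumptions, not the implementation used in the study: the `MemoryItem` fields, the 90-day threshold, and the limit of two checks per session are all hypothetical. An item is flagged when the child has never confirmed it (newly received information) or when the last confirmation is older than the threshold (e.g. after a long break). The number of checks per session is capped so that the robot does not become repetitive.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class MemoryItem:
    topic: str                          # e.g. "favorite wild animal"
    value: str                          # the answer the child gave
    stored_on: date                     # session date on which it was shared
    verified_on: Optional[date] = None  # last time the child confirmed it


def needs_verification(item: MemoryItem, today: date,
                       max_age: timedelta = timedelta(days=90)) -> bool:
    """Flag items never confirmed, or confirmed longer ago than max_age."""
    if item.verified_on is None:
        return True
    return today - item.verified_on > max_age


def select_items_to_verify(items: List[MemoryItem], today: date,
                           max_age: timedelta = timedelta(days=90),
                           limit: int = 2) -> List[MemoryItem]:
    """Pick at most `limit` flagged items, oldest stored first."""
    flagged = [i for i in items if needs_verification(i, today, max_age)]
    flagged.sort(key=lambda i: i.stored_on)
    return flagged[:limit]


# Hypothetical memory after a nine-month break (all values are made up).
memory = [
    MemoryItem("favorite wild animal", "tiger", date(2023, 1, 10), date(2023, 1, 12)),
    MemoryItem("favorite sport", "football", date(2023, 1, 12), date(2023, 1, 12)),
    MemoryItem("favorite food", "pancakes", date(2023, 10, 9)),
]
to_check = select_items_to_verify(memory, today=date(2023, 10, 9))
```

Capping the checks at a small number per session is one possible way to keep the verification unobtrusive; the remaining flagged items would simply be checked in later sessions.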

Contribution of Getting Reacquainted
Children felt significantly more remembered by the robot in the personalization condition. We can therefore accept hypothesis 4. During the interview participants indicated that the robot's summary of their preferences was a key reason they felt remembered by the robot. One in five participants in the no-personalization condition explicitly mentioned that the lack of memory references reduced their feeling of being remembered. Memory references are an important mechanism for feeling remembered ( = .56). Feeling remembered was strongly appreciated by the participants.
Having a recap, regardless of whether children actually (fully) recognize themselves in it, might also contribute to this effect. Furthermore, in the no-personalization condition, the robot offered no recap, but did say "nice to see you again". Some children indicated that this statement alone was enough for them to feel remembered by the robot. How much the robot must share, and how accurate that information must be, for children to feel remembered is an interesting question for future work.
There was a weak positive relationship between feeling remembered and relationship formation ( = .19). We can therefore tentatively accept hypothesis 4. The robot felt like a better fit and more like a friend when children felt it remembered them more. Feeling more remembered did not, however, mean that children also felt more comfortable or were more willing to continue interacting.

Limitations
Although the study included multiple sessions and took place in a school setting, it was still relatively short (in an educational context) and controlled. For example, there was a researcher in the room during the interactions. Even though the robot operated autonomously and had some ability to adapt to each child, the majority of the interaction was scripted. More work is needed to verify how much of this holds up unsupervised in a real classroom (e.g., as in [7]) with more sessions over a longer time.
The investigation of the contribution of the individual design components is based on single-item manipulation checks. Although this does not necessarily diminish the value of children's responses, the conclusions do depend on the interpretation of this one item. Although we are confident that the children understood these items well, this needs to be kept in mind when considering the findings.
Finally, the significant correlations we reported were mostly weak to moderate. This re-emphasizes the complexity of the factors influencing child-robot interaction, which must not be oversimplified. We believe our designs are effective, but that does not mean they are the best or indeed the only way to achieve the intended effects.

Future Work
A key finding of the discussed work regarding personalizing interactions is the importance of signalling that the robot is using information previously shared by the child. How to achieve this effectively, without becoming repetitive, gimmicky, or outright obnoxious over time, will be an important question for future research.
In this paper we took a deep dive into the social design elements of the robot math tutor. A similar investigation focusing on the math design elements themselves would also be valuable.
Finally, we are currently analysing task and robot engagement on the basis of the collected video data. The results will be reported in future work.

CONCLUSION
Memory-based personalization that makes tutoring by a robot more personal is an effective strategy to make the tutoring more sustainable. We have shown that our personalization strategy robustly fosters the child-robot relationship past a nine-month break.
Enabling the robot to refer to information that the child previously shared with it is a key component of the personalization strategy. Conditional for the effectiveness of these memory references is that children notice the robot making them. Feeling remembered by the robot also contributes to relationship formation, albeit to a lesser extent. Including a moment to get reacquainted after a longer break between sessions supports the feeling of being remembered. However, personalizing the math dialogs seems to be less successful. Although children's interest in the math dialogs is positively related to relationship formation, the way in which they were personalized did not increase children's interest. It seems likely that the information used to tailor the math dialogs was not sufficiently relevant or meaningful. Validating the social relevance of memory information is an important next step, especially when there are breaks in the interaction.
By making robot tutoring more personal, a social robot tutor can increase its long-term impact and finally fulfil its purpose to support children in their learning around the world.

Figure 1: Experimental set-up with the robot in the middle, a small lap table with a paper and pencil on the right, and a tablet on the left. The child and the robot are doing their secret handshake.

Figure 2: Relationship formation (top left), perceived memory references (top right), dialog interest (bottom left) scores per condition per phase, and feeling remembered scores per condition (bottom right).

Table 1: Spearman's rank-order correlation between the individual relationship formation questionnaire items and perceived memory references and dialog interest scores for phase 1.

Table 2: Spearman's rank-order correlation between the individual relationship formation questionnaire items and perceived memory references, dialog interest, and feeling remembered scores for phase 2.