Revealing Networks: Understanding Effective Teacher Practices in AI-Supported Classrooms using Transmodal Ordered Network Analysis

Learning analytics research increasingly studies classroom learning with AI-based systems through rich contextual data from outside these systems, especially student-teacher interactions. One key challenge in leveraging such data is generating meaningful insights into effective teacher practices. Quantitative ethnography has the potential to close this gap by combining multimodal data streams into networks of co-occurring behavior that drive insight into favorable learning conditions. The present study uses transmodal ordered network analysis to understand effective teacher practices in relation to traditional metrics of in-system learning in a mathematics classroom working with AI tutors. Incorporating teacher practices captured by position tracking and human observation codes into modeling significantly improved the inference of how efficiently students improved in the AI tutor beyond a model with tutor log data features only. Comparing teacher practices by student learning rates, we find that students with low learning rates exhibited more hint use after monitoring. However, after an extended visit, students with low learning rates showed learning behavior similar to that of their high learning rate peers, achieving repeated correct attempts in the tutor. Observation notes suggest that differences in conceptual and procedural support can help explain visit effectiveness. Taken together, offering early conceptual support to students with low learning rates could make classroom practice with AI tutors more effective. This study advances the scientific understanding of effective teacher practice in classrooms learning with AI tutors and methodologies to make such practices visible.


INTRODUCTION
Learning analytics has a long-standing tradition of generating insights about learning from log data of student interactions with AI-based systems, for example, AI tutors [3,28,47]. Yet, in recent years, there has been a growing recognition that learning with AI-based systems can only be partially understood through interactions with these systems themselves. Tutor log data alone is limited because, in real-world classrooms, students face contextual challenges, such as motivational, content-level, or interface-interaction issues that constrain effective practice [4]. Similarly, students receive support outside of AI tutors, such as teacher visits [23]. In the present study, we define in-tutor data as interactions students have with interface elements of an AI tutor captured in log data (e.g., hint requests, correct attempts) and out-of-tutor data as captured events based on the context around AI tutor use, for example, teacher-student interactions.
Leveraging out-of-tutor data to study in-tutor learning and understand effective classroom practices around AI tutors requires appropriate methods to jointly study in-tutor and out-of-tutor data. Combining both data types is challenging as it requires processing, annotating, and analyzing these data in a way that leads to meaningful and interpretable findings for research and practitioners [11]. Nonetheless, some interesting work has been done that combines different data sources. For example, recent work highlights the importance of teacher visits to specific students for inferring student disengagement and learning in AI-supported classrooms [23]. Similarly, the distribution of teacher attention has been shown to relate to student learning gains [22]. Yan et al. [45] report correlations between teachers' classroom behavior based on position sensors and collaborative learning measures, such as group cohesion. These three studies, however, do not provide a comprehensive picture of how student learning varies with classroom context and teaching practices. Generating such a picture is crucial for teacher-facing applications, for example, reflection tools [24,29].
The present study leverages network methods to fill this gap in understanding the interplay of in-tutor learning and classroom practices. Epistemic Network Analysis (ENA) [35] and Ordered Network Analysis (ONA) [38] in quantitative ethnography (QE) [35] receive increasing attention in learning analytics and promise interpretable insight into learning processes [34]. These methods encode temporal relationships between virtual and physical events and establish ties of which behaviors frequently co-occur (e.g., student disengagement and teacher monitoring). Comparing these temporal relationships across learners can drive the interpretation of favorable learning conditions in AI tutors. The present study compares teacher practices (e.g., student visits) across groups of students with low and high learning rates, representing how efficiently students improve in AI tutors by practicing problem-solving steps related to a given skill. QE methods are promising in understanding the role of out-of-tutor events for in-tutor learning yet are largely unexplored in this context. This is crucial because collecting, processing, and analyzing out-of-tutor data is costly (e.g., money and teacher time when buying and deploying sensors). This research presents a case study in understanding classroom practices in a typical application area of AI-based learning systems: classrooms with individualized problem-solving practice. We combine Transmodal Analysis [36] and ONA [38] to study learning differences in the linear equation-solving tutor Lynnette [28], where our past work found teacher practices related to student learning [23].
The present study's contribution is three-fold. First, the study provides evidence that the inference of student learning rates in AI tutors significantly improves when considering out-of-tutor events and spatial teacher information. Second, we distill relevant teacher practice features associated with learning, specifically teacher screen alignment and the teacher talking to students. Further, after teacher visits to students with low learning rates, their in-tutor behavior approximated that of students with high learning rates, contextualizing prior findings suggesting that such visits help students learn [23]. Third, differences in conceptual teacher help might explain these differential associations between teacher practice and learning. All three contributions can guide learning analytics for effective teacher support.

BACKGROUND
2.1 Learning in AI-Supported Classrooms
AI-supported classrooms are instructional settings in which students learn with the help of AI-based systems while the teacher orchestrates and facilitates learning [23,46]. Prior work on AI tutors, which offer step-level guidance and feedback during problem-solving practice combined with individualized mastery learning [1], has delineated how teacher practice changes when these systems are used in classrooms. For example, Kessler et al. [25] report how teachers in these settings focus on moving around the classroom, giving one-to-one conceptual and socio-emotional support. Early qualitative work on AI tutors found that teachers can provide more individualized support when students learn with AI tutors [33].
Other works have studied how teacher-facing learning analytics can guide teacher practices during individualized learning. Analytics of student disengagement and struggle can support teacher decision-making. For example, Holstein et al. [20] report that teachers with access to student behavioral states during learning via mixed reality subsequently focused more on students with low initial knowledge, which resulted in strong improvements in learning gains, especially for students with low initial knowledge, who therefore had more to learn. Similarly, Yang et al. [46] co-designed a tool that allows teachers to dynamically pair students based on classroom dashboards informed by data from AI tutors.
The role of teacher practice for effective learning with AI tutors is understudied [39]. Recent review papers on multimodal learning analytics highlight a lack of studies where student learning is analyzed through the lens of teacher practices [9]. One study documented how student disengagement relates to the teacher's choice of what students to help [23]. However, recent work also suggests that the effects of teacher-facing analytics on teacher classroom behavior may differ by teacher traits and characteristics [40], making it even more crucial to establish standard methodologies and taxonomies to understand effective teacher practices in AI-supported learning settings. The present study addresses this gap by using quantitative ethnography to distill insights into teacher practices related to how well students learn.

MMLA for Understanding Teacher Practices and Learning
Multimodal learning analytics (MMLA) embraces the idea that classroom learning processes can only be understood by drawing from rich data of learners and their environment [16]. Past work has fused data from modalities including audio, video, eye-tracking, clickstream, and speech [9]. Applying MMLA comes naturally in learning settings where technologies complement or augment traditional forms of instruction and generate log data and other learner data, such as blended learning [31], hybrid learning [30], and individualized learning via AI-based systems in classrooms.
Learner modeling in MMLA comes with unique challenges. An overview article by Cukurova et al. [11] highlights research-level challenges (e.g., processing and analyzing data) and end-user-level challenges (e.g., generating meaningful and interpretable insights for teachers). Yet, most past MMLA research has focused on classical approaches to inferential and predictive outcome modeling [9]. For example, past MMLA work predicted academic performance in higher education [8], behavior change in students with special education needs [7], and student engagement in online learning [6]. The limitations of outcome-based modeling in MMLA are two-fold. First, when the number of potential predictors is large, criteria for feature selection and associations to present in student- or teacher-facing analytics become non-trivial.
Distilling insights requires domain-level knowledge, theory, and qualitative interpretation [43]. Second, outcome-based modeling does not capture which events precede or succeed each other. Yet, speaking to temporality is vital in communicating insights about learning. Consider, for example, an association between student engagement and teacher practice. Only temporality can distinguish between student-elicited changes in teaching and teacher-elicited changes in student engagement. Yet, distinguishing between both cases is key for effective teacher-facing analytics such as reflection tools [29]. Quantitative ethnography, which we survey next, can fulfill both requirements.

Quantitative Ethnography Methods in Learning Analytics
Quantitative Ethnography (QE) is a methodological lens that combines ethnographic (i.e., qualitative, case-based) and statistical (i.e., quantitative, data-driven) methods to study human behavior [21]. The key affordances of QE methods are that they can (a) capture complex dependencies between feature-rich data sets and (b) yield interpretable insights for practitioners, scaffolding interpretation processes that would otherwise rely on statistical expertise. While traditionally relying on codes from qualitative discourse, QE studies have utilized log data collected from digital tools to understand fine-grained learning processes, for example, to situate log traces with video or text replays to interpret the context of human-system interactions, enabling rich descriptions of learning at a large scale [34].
Due to the unique affordances of unifying descriptions and quantitative representation, QE has seen rising interest in learning analytics. For example, Epistemic Network Analysis models behaviors captured in learning environments connected via temporal co-occurrence, which supports interpretations of learner strategy differences and their relationship to outcomes [12,15,48]. Fougt et al. [15], for instance, distinguish student proficiency levels in higher education writing assessment to support grading via keywords. Fernandez-Nieto et al. [14] use epistemic network analysis to model and visualize student spatial behavior during nursing education simulations, showing that instructors valued and overall consistently interpreted insights generated from such visualizations about team performance and behavior. Similar evaluation studies are scarce. While prior work investigated collaborative learning using epistemic network analysis [5], to the best of our knowledge, no study has triangulated student learning data with teacher practice data in that fashion.
There is a gap in constructing interpretable qualitative networks of behavior to understand effective teaching practice, which could be incorporated into teacher-facing dashboards [29]. The present study offers a methodology and analysis to bridge established learning constructs from tutor log data (e.g., hint use and attempts in AI tutors) with teacher spatial data and teacher practice gleaned from observation codes. Specifically, we leverage the emergent methodology of Transmodal Analysis (TMA) [36] and Ordered Network Analysis (ONA) [38], further described in Section 3.4.1.

The Present Study
Our research questions revolve around the viability of and insights generated from applying QE networks to understand effective teacher practices across students with low and high learning rates in an AI-based tutoring system for linear equation solving. We study behavioral connection-making across teacher and student behavior, which our networks capture via temporal co-occurrence. Our three research questions are as follows:
RQ1: How much does learning rate inference improve when considering out-of-tutor teacher practices?
RQ2: How does behavioral connection-making of teacher and student behavior differ by student learning rate?
RQ3: How do these differences in connection-making relate to whether students have been visited by the teacher?

Data Sets
We combined three data sets, resulting in a consecutive stream of timestamped events (N = 23,486), representing student interaction data with an AI-based tutoring system (n = 19,796), classroom observation notes (n = 565), and teacher spatial positions during classroom practice (n = 3,125). We ensured the synchronization of the internal clocks of position trackers, AI tutors, and the observation coding software via rigorous testing for data merging.

3.1.2 Tutor Log Data.
All students learned with Lynnette, an AI-based tutoring system for equation-solving. Lynnette provides step-wise error feedback and hints [28]. Students received the same 12 problem sets, totaling 48 problems.
Problem difficulty ranged from elementary equations and progressed gradually to more intricate ones. All student transactions in the tutoring system (i.e., problem-solving step attempts and their correctness, hint use) were recorded via timestamped log data following standard practices [26]. We also employed detectors to better understand student learning in Lynnette. Detectors are means to infer behavioral states (e.g., disengagement, affect, doing well) from tutor log data, including but not limited to using decision rules and machine learning. The detectors used in the present study use decision rules to generate timestamped student states at each tutor transaction, that is, the presence of idle behavior (i.e., inactivity for 2 minutes), tutor misuse (i.e., exploiting feedback and hints to progress in the problem), and struggle (i.e., inability to master skills in the system despite repeated attempts). All detectors are further described in [20].
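As an illustration of how such decision rules operate (a minimal sketch, not the detectors' actual implementation, which is described in [20]; the function name and data shapes are ours), the idle detector described above can be approximated by flagging any transaction preceded by more than two minutes of inactivity:

```python
from datetime import datetime, timedelta

# Illustrative sketch of a decision-rule detector (not the implementation
# from [20]): flag a transaction as "idle" when more than 2 minutes
# elapsed since the student's previous transaction.
IDLE_THRESHOLD = timedelta(minutes=2)

def label_idle(timestamps):
    """Given a student's time-sorted transaction timestamps, return a
    binary idle label per transaction (True = idle period preceded it)."""
    labels = [False]  # the first transaction has no predecessor
    for prev, curr in zip(timestamps, timestamps[1:]):
        labels.append(curr - prev > IDLE_THRESHOLD)
    return labels

ts = [datetime(2023, 5, 1, 9, 0, 0),
      datetime(2023, 5, 1, 9, 0, 40),
      datetime(2023, 5, 1, 9, 3, 5)]   # 2 min 25 s gap -> idle
print(label_idle(ts))  # [False, False, True]
```

Rules for tutor misuse and struggle would follow the same pattern of timestamped binary labels per transaction, which is what makes them directly usable as network codes later in the analysis.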

3.1.3 Observation Notes.
Following standard practice for classroom observations working with AI tutors [19], one observer at the back of the classroom during instruction collected timestamped codes representing different classroom events. The events were recorded using the "Look Who's Talking" software and included teacher actions related to specific students (e.g., "talking to student #1") and students' behaviors (e.g., "raising hand"). All recorded codes are described in Section 3.2. Furthermore, the observer noted any special occurrences during the classroom session, such as noteworthy dialog between the teacher and a specific student. In the present study, we use these notes for a qualitative analysis that contextualizes different student behaviors between groups of students.

3.1.4 Spatial Teacher Data.
Spatial teacher information in the classroom, in the form of timestamped X-Y coordinates at second-level granularity, was collected using Pozyx's UWB (ultra-wideband) position sensors. The positioning system estimates a person's real-time position based on the signal transmitted by UWB tags in a lanyard worn around their neck and six anchors in the classroom's periphery. More information on the system can be found in [23].

Feature Engineering
Engineering features for this study required establishing timestamped codes (i.e., events) across tutor log data, observation data, and teacher position data. By codes, we mean timestamped binary indicator variables representing the presence or absence of certain behaviors at a given timestamp, with all timestamps established via student tutor transactions (see Section 3.1.2), human observation notes (see Section 3.1.3), and teacher position logs (see Section 3.1.4).
Using tutor log data, we created four codes related to standard measurements of student success and assistance (i.e., tutor support) during learning. They included students' hint requests in the tutoring system, correct attempts, and incorrect attempts at problem-solving steps. We additionally differentiated between correct attempts and correct first attempts at problem-solving steps, which are routinely used to assess student knowledge while learning with AI tutors [27]. Behavioral states (see Section 3.1.2), that is, tutor misuse, struggling, and idling, represented three additional codes.
Codes representing teacher practices in the classroom were partially taken from raw observation logs (see Section 3.1.3) and partially engineered from teacher spatial position data. Observation codes included the teacher talking to a specific student and students' hand raises. Based on teacher position logs, we engineered a code representing teacher monitoring of a specific student's screen, which we call screen alignment. Teacher screen alignment means that the teacher's inferred orientation is aligned with the direction in which a given student's screen is facing, which can contribute to understanding teaching practices [13]. Screen alignment may represent the teacher's ability to attend to (from afar) and help (from close by) specific students working with the AI tutors, as they are able to see the student's screen. The computation of screen alignment is based on (a) inferring teacher orientation from the teacher's movement by taking the difference between the two most recent teacher position coordinates, (b) computing the cosine similarity of that trajectory to the direction a student's screen faces, and (c) setting a minimal cosine similarity cutoff for whether the teacher's current orientation is facing a given student's screen. We set the threshold such that the 90 degrees (out of 360, i.e., one quarter) closest to the direction the teacher faces constitute alignment. This decision was based on informal classroom observations during data collection as well as considerations of the human field of vision. All behavioral codes' descriptions, examples, and base rates are in Table 1.
Table 1. Behavioral codes based on different types of learning events, including their base rates in the study sample (N = 23,486), which represent the frequency with which each code is present in the consecutive stream of events in the multimodal data set. [Table body largely lost in extraction; one surviving row, Screen Alignment: "The teacher faces the student's screen based on trajectory," base rate 0.280.]
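The three-step screen-alignment computation described above can be sketched as follows. This is our illustration under stated assumptions, not the study's codebase: function and variable names are hypothetical, and the cutoff simply encodes the stated 90-of-360-degree rule as a cosine threshold of cos(45°):

```python
import math

# Sketch of the screen-alignment rule: a student's screen counts as
# "aligned" when the angle between the teacher's movement direction and
# the screen-facing direction is within 45 degrees on either side
# (90 of 360 degrees), i.e., cosine similarity >= cos(45 degrees).
COS_CUTOFF = math.cos(math.radians(45))

def cosine_similarity(u, v):
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0  # no movement -> no alignment

def screen_alignment(pos_prev, pos_curr, screen_dir):
    """(a) infer teacher orientation from the two most recent X-Y
    positions, (b) compare it to the direction the student's screen
    faces, (c) apply the 90-degree (cos 45) cutoff."""
    orientation = (pos_curr[0] - pos_prev[0], pos_curr[1] - pos_prev[1])
    return cosine_similarity(orientation, screen_dir) >= COS_CUTOFF

# Teacher walking roughly along +x toward a screen facing +x: aligned.
print(screen_alignment((0.0, 0.0), (1.0, 0.2), (1.0, 0.0)))  # True
```

Thresholding at cos(45°) ≈ 0.707 is one natural way to express the quarter-circle rule; any equivalent angular comparison would do.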

To answer RQ2, we group students by whether they have a comparatively high learning rate (i.e., a high rate of improvement in the tutoring system) or a comparatively low learning rate based on a median split. While both learning rates are relative to one another, we refer to both groups as low and high learning rate groups for simplicity. Overall learning rate differences across students can be estimated via iAFM modeling. iAFM modeling is a variation of AFM modeling, a binomial regression model estimating whether students get a first attempt at a problem-solving step right without tutor help [27]. The model assumes that the probability of getting a problem step right depends on the skills or knowledge components associated with that step. The model estimates, for each knowledge component, an intercept for the initial difficulty and a learning rate. Learning rates refer to how much students improve at getting a step right as a function of how many prior opportunities they had to apply the skill needed in the tutoring system (and receive feedback on their attempt, from which they can learn). AFM models usually also include a student-level intercept that represents the student's initial proficiency across all knowledge components. As our specification of the knowledge components targeted in the instruction, we used Lynnette's standard knowledge component model [28]. iAFM modeling extends AFM modeling by leveraging linear mixed models to estimate individualized learning rate parameters per student. These learning rates represent differences across knowledge components in how fast students improve per opportunity in the AI tutor. Grouping students based on this parameter (as done in this study) can then distinguish between students who learn faster and those who learn slower while working with the AI tutors.
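Written out, the AFM specification described above takes the following form (notation is ours: student i, step j, knowledge component k; the iAFM extension adds a per-student learning-rate deviation):

```latex
% AFM: log-odds that student i answers step j correctly on the first attempt.
% \theta_i: student proficiency intercept; q_{jk}: KC-to-step mapping;
% \beta_k: KC difficulty (intercept); \gamma_k: KC learning rate;
% T_{ik}: prior practice opportunities of student i on KC k.
\mathrm{logit}\, p_{ij} = \theta_i + \sum_k q_{jk}\bigl(\beta_k + \gamma_k T_{ik}\bigr)

% iAFM: an individualized learning-rate term \gamma_{ik} per student,
% estimated as a random slope in a linear mixed model.
\mathrm{logit}\, p_{ij} = \theta_i + \sum_k q_{jk}\bigl(\beta_k + (\gamma_k + \gamma_{ik}) T_{ik}\bigr)
```

The median split described above is then taken over the estimated per-student learning-rate parameters.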
To answer RQ3, we infer teacher visits to specific students based on teacher position data and compare behavioral connection-making before and after them. We used an algorithm that infers a teacher visit to a specific student if the teacher stops (i.e., stays within a certain radius for a minimum amount of time) in the proximity of that student.
An evaluation of that algorithm is in [37]. We group students by whether they have been visited at least once. This decision was based on the low number of visits students experienced (Mdn = 3, M = 4.25 over three study days) and avoids making assumptions about how long effects from an initial visit would spill over to subsequent learning.
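The stop-based visit heuristic can be sketched as follows. This is our illustration, not the evaluated algorithm from [37]: the radius and duration constants are placeholders, not the study's calibrated parameters:

```python
import math

# Sketch of the stop-based visit heuristic: the teacher "visits" a
# student if they remain within STOP_RADIUS of the student's position
# for at least MIN_DURATION seconds. Both constants are placeholders.
STOP_RADIUS = 1.0      # meters
MIN_DURATION = 10.0    # seconds

def detect_visit(track, student_pos):
    """track: time-sorted list of (t_seconds, x, y) teacher positions.
    Returns True if the teacher stops near student_pos long enough."""
    start = None
    for t, x, y in track:
        near = math.hypot(x - student_pos[0], y - student_pos[1]) <= STOP_RADIUS
        if near:
            if start is None:
                start = t              # entered the student's proximity
            if t - start >= MIN_DURATION:
                return True            # stayed long enough: a visit
        else:
            start = None               # left the proximity; reset timer
    return False

track = [(0, 5, 5), (4, 0.5, 0.2), (9, 0.4, 0.1), (15, 0.3, 0.2), (20, 6, 6)]
print(detect_visit(track, (0.0, 0.0)))  # True: near (0, 0) from t=4 to t=15
```

A production version would additionally return the visit's start and end times so pre- and post-visit phases can be segmented, as the RQ3 analysis requires.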

Qualitative Text Replay Analysis
To better understand teacher events with different network ties to student actions across students with low and high learning rates, we employ log data replays inspired by prior work on labeling tutor log data [32]. Specifically, we sample a context window of three teacher actions (omitting observation notes unrelated to the teacher; see Section 3.1.3) around student actions of interest.

Ordered Network Analysis.
ONA represents ordered connections between behavioral codes [38]. ONA constructs a sliding window for each observation and calculates connection counts between any two codes within it. ONA accumulates connection strength by summing window-based connections per unit code. ONA then produces normalized and centered connection strengths, plotted as line weights. Based on the standardized connection strength, ONA performs a dimensional reduction to generate a pair of ONA scores for each unit code and uses these scores to plot units in a two-dimensional space. We performed a means rotation to generate the first dimension (MR dimension), which maximizes the group differences between students with low and high learning rates.
We conducted statistical tests on the ONA scores to compare the differences in connection patterns between groups.
In addition to ONA scores, each unit can be represented by a network with nodes and edges. Within a node, the radius of a colored circle and the saturation of its color reflect the frequency of self-transitions for the code. The outer radius of a node reflects the frequency of a code responding to other codes. A big node in the space indicates that the code is a common response to other codes. A pair of triangles indicates the bi-directional connections between two codes, while a dark arrow marked on the edge indicates the overall directionality of connections. ONA then determines node positions using co-registration: ONA finds optimized node positions that minimize the distance between ONA scores and network centroids for all units. Thus, co-registration enables the interpretation of connection patterns based on the location of their ONA scores: the adjacency between unit ONA scores and codes provides an interpretation of connection-making for the unit. A unit on the left side of a dimension tends to make more connections among codes on the same side. ONA can also average line weights and node weights to generate group mean networks. To compare the differences between two groups (e.g., students that have and have not been visited by the teacher, as featured in RQ3), ONA can subtract the mean line weights of one group from another to generate a subtracted network.
In the subtracted network, the saturation of colors and the thickness of edges indicate stronger ordered connections or self-transitions of one group. Visualized differences in line weights between two groups can also be statistically tested.
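The core of the sliding-window connection counting can be illustrated with a minimal sketch (ours, under simplifying assumptions: a fixed event-count window and no normalization, rotation, or plotting, which the ONA tooling performs):

```python
from collections import Counter

# Minimal sketch of ONA-style ordered connection counting: within a
# sliding window of the `window` most recent events, each new event
# "responds to" the events preceding it, incrementing a directed
# (responding_to -> response) count.
def ordered_connections(events, window=4):
    counts = Counter()
    for i, response in enumerate(events):
        for prior in events[max(0, i - window + 1):i]:
            counts[(prior, response)] += 1
    return counts

events = ["SCREEN_ALIGNMENT", "HINT", "INCORRECT", "HINT", "CORRECT"]
conn = ordered_connections(events, window=3)
print(conn[("SCREEN_ALIGNMENT", "HINT")])  # 1
print(conn[("HINT", "CORRECT")])           # 1
```

Accumulating these directed counts per unit (e.g., per student), then normalizing and reducing dimensionality, yields the line weights and ONA scores discussed above.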

Transmodal Analysis.
The present study uses an augmented Transmodal ONA model (T/ONA) to understand learning processes from multimodal classroom data. Transmodal Analysis (TMA) is a conceptual and methodological framework to model human activities and processes (e.g., learning, communication) by representing temporally sensitive connections between events across multiple modalities [36]. According to Shaffer et al., TMA can specify the unique temporal impact of each modality and augment existing state-dependent models, such as Epistemic Network Analysis, Ordered Network Analysis, and Process Mining. Instead of deriving separate models for each modality or source, TMA integrates and models the relationships across events in a holistic model. To specify these impacts, TMA allows researchers to adjust temporal influence functions (TIFs) per modality. For example, according to our qualitative analysis, the impact of the teacher talking lasts longer than a student's attempt at a problem step in the tutoring system. Thus, to represent such temporal impacts, TMA specifies mathematical functions of time to describe the effects of teacher talk and in-tutor submissions. Under the specification of different TIFs, the estimation of connection strength better represents transmodal relationships in an augmented state-dependent model [36].
Temporal influence functions: Based on qualitative classroom observations, the impact window of an event varies by the type of learning event. Due to the rapid consecutive actions prompted in the tutoring system, we specified a relatively short impact window for events in tutor logs and detector predictions compared to out-of-tutor interactions and teacher location changes. Thus, we specified the windows for tutor logs, detector predictions, non-spatial out-of-tutor interactions (raising hand and talking), and the spatial code (screen alignment) as 5 s, 10 s, 15 s, and 20 s, respectively.
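As a concrete illustration of the 5/10/15/20 s specification, the TIFs can be represented as modality-specific step functions (a simplification we introduce for exposition; TMA supports arbitrary functions of time [36], and the dictionary keys below are our labels):

```python
# Modality-specific temporal influence windows (in seconds), mirroring
# the specification above. Labels are ours, not TMA terminology.
TIF_WINDOWS = {
    "tutor_log": 5.0,
    "detector": 10.0,
    "observation": 15.0,   # raising hand, teacher talking
    "spatial": 20.0,       # screen alignment
}

def influence(modality, elapsed_seconds):
    """Step-function TIF: 1.0 while an event of this modality still
    'impacts' later events, 0.0 once its window has passed."""
    return 1.0 if elapsed_seconds <= TIF_WINDOWS[modality] else 0.0

# Eight seconds after the fact, a tutor transaction no longer counts as
# context, but a screen-alignment event still does.
print(influence("tutor_log", 8.0))  # 0.0
print(influence("spatial", 8.0))    # 1.0
```

In connection counting, such a function would weight each prior event's contribution to a new event by the time elapsed between them, rather than using a single fixed window across all modalities.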
3.4.4 QE Model Evaluation.
QE models can be evaluated based on interpretive alignment [41,42]. Interpretive alignment refers to a claim warranted by both qualitative and quantitative interpretations; it shows consistency between connections illustrated by network models and qualitative stories from the data. For example, the present study combines inquiry into connections between out-of-tutor interactions and in-tutor learning with qualitative text replay analyses of observation notes. Good interpretive alignment is achieved when statistical differences in connection-making for unit networks contextualize and are consistent with qualitative observations about classroom learning. For the quantitative tests in RQ2, we used the Wilcoxon Rank Sum Test to compare the T/ONA scores of the low and high learning rate groups. Similarly, we used the Wilcoxon Rank Sum Test to statistically compare line weights and the connection strength of specific codes across groups. For RQ3, we compared each student's pre- and post-visit phases as a repeated measures test, applying Wilcoxon Signed Rank Tests to both the T/ONA scores and the line weights.
To infer low or high learning rates based on connection-making patterns, we performed logistic regressions for the two T/ONA models. In both cases, we modeled learning rate (low vs. high) via T/ONA scores on the first and second dimension. To test whether one model described the study sample significantly better than another (RQ1), we bootstrapped T/ONA scores by randomly selecting units, performing logistic regressions, and constructing a distribution of AIC scores for each model (n = 1,000 samples). Then, we performed a t-test to compare whether one model's mean AIC score significantly differs from another's. All analysis code is publicly available, with data available upon request.
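The bootstrap comparison can be sketched in a few lines (our illustration, not the study's analysis code: the toy `fit_aic` stands in for "fit a logistic regression on the resample and return its AIC", and all values are made up):

```python
import random
import statistics

# Sketch of the bootstrap model comparison: resample units with
# replacement, refit per resample, collect one AIC per resample, then
# compare the two AIC distributions with a two-sample (Welch) t-test.
def bootstrap_aic(units, fit_aic, n_samples=1000, seed=0):
    rng = random.Random(seed)
    return [fit_aic([rng.choice(units) for _ in units])
            for _ in range(n_samples)]

def welch_t(a, b):
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return (ma - mb) / ((va / len(a) + vb / len(b)) ** 0.5)

# Toy stand-in for "fit a logistic regression and return its AIC":
def fit_aic(sample):
    return statistics.mean(sample)

multimodal = bootstrap_aic([148, 151, 152, 150], fit_aic)  # toy values
unimodal = bootstrap_aic([157, 159, 158, 160], fit_aic)    # toy values
print(welch_t(unimodal, multimodal) > 0)  # True: multimodal AIC is lower
```

Because lower AIC indicates better fit, a positive t statistic here corresponds to the multimodal model describing the data better, mirroring the RQ1 comparison.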

RESULTS
4.1 RQ1: Learning Rate Inference Fidelity when Considering Out-of-Tutor Teacher Practices
Comparing two models with and without out-of-tutor teacher practice codes, we conducted logistic regressions to infer whether a given student's learning rate is low or high using the corresponding T/ONA scores in the two-dimensional space. The multimodal T/ONA model with both in-tutor and out-of-tutor interactions (AIC = 150.60) described the data better than the unimodal model with in-tutor behaviors only (AIC = 158.30). This AIC decrease was significant based on bootstrapped units for each T/ONA model and their resulting AIC distributions (t = 12.76, CI 95% = [7.06, 9.63], p < .001). The differences in line weights contribute to the statistical differences in T/ONA scores of units. According to a Wilcoxon Rank Sum Test on T/ONA scores on the first dimension, there was a significant difference between students with low and high learning rates (Mdn_low = 0.15, Mdn_high = −0.26, U = 4,813, r = 0.72, p < .001). Thus, the learning rate groups differed regarding network connections for in-tutor and out-of-tutor data.

RQ3: Connection Patterns before and after Teacher's First Visit
RQ3 pertains to connection-making differences between in-tutor behavior and teacher practices before and after a given student had at least one visit. Similar to RQ2, we break out this analysis by students with low and high learning rates.
We constructed group mean plots and subtracted plots for pre- and post-visit phases by learning rate group (Figure 2).
Prior to the teacher's visit, the teacher usually monitored students via SCREEN ALIGNMENT, followed by students requesting hints. However, after the teacher's visit, there was a significantly stronger connection from FIRST CORRECT ATTEMPT to CORRECT ATTEMPT (Mdn_pre = 0.14, Mdn_post = 0.50, V = 16, r = 0.92, p < .001). That is, after a teacher's visit, students with low learning rates tended to achieve more consecutive correct attempts.
Figure 2 (right) displays connection-making for students with high learning rates. After teacher visits, these students had a significantly stronger connection from FIRST CORRECT ATTEMPT to CORRECT ATTEMPT (Mdn_pre = 0.50, Mdn_post = 0.54, V = 575, r = 0.33, p = .030), which indicates consecutive correct attempts after the teacher's visit. Furthermore, after the teacher's visit, SCREEN ALIGNMENT was also a common response to both FIRST CORRECT ATTEMPT (Mdn_pre = 0.13, Mdn_post = 0.17, V = 525, r = 0.34, p = .026) and CORRECT ATTEMPT (Mdn_pre = 0.14, Mdn_post = 0.21, V = 493, r = 0.38, p = .013). The teacher tended to follow up on the students with high learning rates by monitoring their screens after correct responses in the AI-based tutoring system.
Testing T/ONA scores on the MR dimension for the two groups, there was a significant difference between pre-visit and post-visit for the low learning rate group (Mdn_pre = −0.33, Mdn_post = −0.08, V = 303, r = 0.49, p = .022); however, there was no significant difference between pre-visit and post-visit for the high learning rate group (Mdn_pre = 0.18, Mdn_post = 0.25, V = 1,071, r = 0.25, p = .096). The teacher's first visit significantly impacted students with low but not with high learning rates. For students with lower learning rates, post-visit connection-making included a higher frequency of consecutive correct attempts in the tutor than pre-visit.

Qualitative Text Replay Analysis
T/ONA analysis revealed that teacher support in the form of SCREEN ALIGNMENT and TEACHER TALKING tended to be followed by hint requests from low learning rate students and by correct attempts from high learning rate students (see Section 4.2). As described in Section 3.3, we qualitatively interpret observation notes related to teacher actions around student actions of interest. We start by summarizing themes of SCREEN ALIGNMENT and TEACHER TALKING actions preceding hint requests and correct attempts across students with low and high learning rates.
Differences in teacher support during SCREEN ALIGNMENT and TEACHER TALKING could help explain why low learning rate students used more hints while high learning rate students achieved more correct attempts in the tutor after these codes. Students with high learning rates often received abstract hints on approaching problem-solving steps in linear equations. One observation note read "good job; you subtract x from both sides [...] should you multiply or divide both sides" and another "we are going to do this on both sides". The teacher would also prompt students to anticipate the next problem-solving step, asking, "what are you going to do next?". Such anticipatory self-explanations have been found to support learning [2]. Conversely, low learning rate students received comparatively procedural and concrete recommendations on what to input into the tutor. For example, human observation notes about the teacher-to-student dialog read "you can just write 2x = 4" and "you need to put 2x=4. Hmm, so that's 4/10". As an approximation of prompting for explanations, observation notes for low learning rate students included only 27 questions compared to 80 for high learning rate students. However, students with high learning rates also had more recorded teacher-related observation notes overall (335 compared to 66). A Poisson regression model for count data indicated that students with low learning rates were prompted more frequently per observation note than high learning rate students, with an incidence rate ratio of IRR = 1.71, 95% CI = [1.11, 2.53], p < .001. We note that students in the low learning rate group also faced difficulties in working with the AI tutors, with observation notes reading "there are eight problems, you need to press enter" or "I would just write x =. No. No. How did you get to do [...] use slash? Oh!", which also included prompts.
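The incidence-rate-ratio logic can be reproduced from the note tallies above: for a Poisson model of prompt counts with one binary group predictor and log(note count) as an exposure offset, the maximum-likelihood rate ratio equals the ratio of observed rates. The sketch below uses the counts reported in the text; the Wald confidence interval is a standard approximation and may differ slightly from the interval of the fitted model reported above.

```python
# Minimal sketch of the incidence rate ratio (IRR) for prompt counts per note,
# using the tallies reported in the text (27/66 vs. 80/335). The Wald CI is a
# textbook approximation, not necessarily the study's exact interval.
import math

prompts_low, notes_low = 27, 66     # low learning rate group
prompts_high, notes_high = 80, 335  # high learning rate group

rate_low = prompts_low / notes_low
rate_high = prompts_high / notes_high
irr = rate_low / rate_high  # prompts per note, low vs. high group

# Wald 95% CI on the log scale: se = sqrt(1/count_low + 1/count_high)
se = math.sqrt(1 / prompts_low + 1 / prompts_high)
ci = (math.exp(math.log(irr) - 1.96 * se), math.exp(math.log(irr) + 1.96 * se))
print(f"IRR = {irr:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")  # IRR ≈ 1.71
```

This recovers the reported IRR of 1.71, consistent with low learning rate students being prompted more often per observation note despite receiving fewer notes overall.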
A second key finding was that students with low learning rates achieved more consecutive correct attempts after visits, while high learning rate students already did well before visits. Therefore, we examined replay windows of observation notes related to visits for both student groups. Two qualitative observations help explain why teacher visits were effective for students with low learning rates. First, the teacher would remind students with low learning rates not to abuse hints: "don't use hints so much" and "I can see what you are doing - hint abuse". These observations indicate that students with low learning rates often exploited the tutoring system's feedback to advance on the problem when the teacher visited them. In other words, disengaged behavior might have prompted teacher visits, which aligns with recent work on classroom analytics showing associations between visits and disengagement [23]. Second, the teacher would also engage in longer interactions with low learning rate students, now using self-explanation prompts. For example, one note read "Do you know what to do here? What does this say? [...] Maybe use the diagram". Similarly, one observation note stated "Would you subtract 3 on both sides? [...] Would that be a good thing to do?". In comparison, for high learning rate students, observation notes indicated that visits constituted brief check-ins (e.g., "How are you doing?") or brief support to help the student advance on the task, for example, "Use slash to show division", "Keep the two", and "You always have to click enter". The number of observation notes related to visits was too small to establish reliable statistical comparisons of the frequency of these different teacher behaviors.
Taken together, SCREEN ALIGNMENT and TEACHER TALKING tended to include procedural instead of conceptual support for students with low learning rates.These students subsequently had a high frequency of tutor hint use.
However, around teacher visits, which tended to be prompted by disengagement, low learning rate students received more elaborate and conceptual support.Subsequently, they experienced higher rates of correct attempts in the AI tutor.

DISCUSSION AND IMPLICATIONS
The present study investigated effective teacher practices in AI-supported mathematics classrooms using ordered network analysis.It distilled relevant features of teacher practice that had differential associations with in-tutor actions across students with low and high learning rates in an AI-based tutoring system for linear equation solving.We discuss insights generated from the presented analyses that inform teacher-facing dashboards and reflection tools [29].
Our first research question asked whether including data about teacher practices improves the inference of student learning rates within the AI tutor, which traditionally considers only student-tutor interaction data. We found a significant improvement in model fit (adjusting for model complexity), suggesting that teacher practices are associated with the efficacy of tutored problem-solving as measured in learning rates [27]. Specifically, as we discuss, our results indicate that teachers monitoring students' screens (SCREEN ALIGNMENT) and talking to students reliably distinguished between students with low and high learning rates. However, after extended visits, the in-tutor behavior of students with low learning rates aligned more with that of students with high learning rates. This finding extends recent work relating visits to learning gains [23] by capturing learning process differences after visits. More broadly, while past studies in MMLA have established how the distribution of teacher attention relates to student learning outcomes [22,45], our analysis contributes evidence and methodologies regarding (a) what teacher behaviors relate to in-the-moment learning differences and (b) how slower learners receiving support related to these teacher behaviors exhibited more desirable learning behavior afterward. Intervening on these teacher practices through teacher-facing classroom analytics could help improve the effectiveness of AI-supported practice in mathematics, as similar intervention studies making student disengagement and struggle states visible to the teacher have demonstrated significant learning gains [20]. However, to test this hypothesis, future work would need to investigate whether teachers have sufficient resources to change their practices (i.e., time, attention) and whether the associations we found are causal.
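The model-comparison logic behind RQ1 can be sketched generically: fit a baseline model on in-tutor features only, then a model adding teacher-practice features, and compare complexity-adjusted fit. The sketch below uses synthetic data and a Gaussian AIC for OLS as the criterion; the study's actual model and fit criterion may differ.

```python
# Illustrative sketch (synthetic data, not the study's model): compare a learning
# rate model using only in-tutor features against one adding teacher-practice
# features, using AIC to adjust fit for model complexity.
import numpy as np

def fit_ols_aic(X, y):
    """OLS fit; returns Gaussian AIC = n*ln(RSS/n) + 2k (k = params incl. intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = np.sum((y - X1 @ beta) ** 2)
    n, k = len(y), X1.shape[1]
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(7)
n = 200
tutor_feats = rng.normal(size=(n, 2))    # e.g., hint rate, error rate
teacher_feats = rng.normal(size=(n, 2))  # e.g., visit time, screen monitoring
# Simulate learning rates that depend on both feature sets.
y = tutor_feats @ [0.5, -0.3] + teacher_feats @ [0.4, 0.2] + rng.normal(0, 0.5, n)

aic_base = fit_ols_aic(tutor_feats, y)
aic_full = fit_ols_aic(np.column_stack([tutor_feats, teacher_feats]), y)
print(f"AIC in-tutor only: {aic_base:.1f}; with teacher features: {aic_full:.1f}")
# A lower AIC for the full model indicates the teacher features improve adjusted fit.
```

In this synthetic setup, the lower AIC of the full model mirrors the paper's finding that adding teacher-practice features improves inference of learning rates beyond log data alone.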
Our second research question asked how the temporal co-occurrence of teacher practices and student behavior differs by student learning rate. T/ONA analysis distilled SCREEN ALIGNMENT and TEACHER TALKING as having differential associations within the two student groups: low learning rate students had more hint requests after these teacher behaviors, while high learning rate students had more correct attempts. Two potential explanations can help make sense of this difference.
First, both behavioral codes might represent different kinds of teacher help across student groups, resulting in different learning behaviors.Our qualitative text replay analysis supported this interpretation by highlighting how students with lower learning rates tended to receive more procedural hints (i.e., what to put into the tutor) from the teacher around SCREEN ALIGNMENT and TEACHER TALKING.In contrast, high learning rate students received more conceptual help.
Conceptual help sometimes consisted of asking the student to reflect and self-explain, which prior work found beneficial to student learning [2]. When students require instructional support from the teacher but get procedural help as to what to put into the tutor, they do not learn to generalize that instruction to new problem instances, prompting them to continue to use hints. Indeed, excessively requesting hints that eventually reveal the answers to problem-solving steps is expected to, and has been empirically shown to, relate to low-to-flat learning rates [17]. Conversely, students receiving self-explanation prompts from the teacher might have learned from them, leading to subsequent correct attempts in the system. A second, alternative interpretation attributes the difference to disengagement, as students are generally expected to learn from AI tutor instruction if they try to. Indeed, the hint request code connected to states of idleness and tutor misuse in the presented T/ONA analysis. If students are disengaged and the teacher does not provide conceptual support to re-enter a state of learning, then students cannot learn from tutor support and continue to engage in behavior like gaming the system, that is, exploiting tutor feedback and hints to advance in the problem without learning [3].
Our third research question asked how associations between teacher practice and in-tutor student learning depend on whether students have been visited. Connection-making of low learning rate students became more similar to that of high learning rate students after visits, with more connections to correct attempts after SCREEN ALIGNMENT and TEACHER TALKING. Why were visits effective for students with low learning rates? The presented qualitative text replay analysis suggested that (a) visits included extended conceptual teacher support for low learning rate students and (b) were prompted by student disengagement. The interpretation that the teacher may have visited students because they were disengaged is in line with recent work on teacher practices in AI tutor classrooms [23]. An alternative explanation is that learning after visits might have improved because disengagement was smaller post-visit, not because of conceptual help that directly aided learning. Future work could use the retrospective reflection technique [14] by asking teachers to think aloud while inspecting ONA networks, gauging consistency in network interpretations across teachers.

Limitations and Future Work
We see three limitations to the present study that guide future work. First, our current analytical lens is limited in investigating why the teacher gave different types of support to different students. Our analytical results concerning effective teacher practices coincided with qualitative data pointing to the teacher offering conceptual over procedural support to students. What is missing is an analysis of how teacher assumptions (e.g., about when different types of student support are warranted and effective; [18]) relate to specific actions in the present study's behavioral teacher codes. For example, the teacher featured in this study was familiar with the concept of hint abuse in AI tutors, which could have influenced the type of support the teacher delivered to different students. Future work may employ interview methods or audio recordings of teacher interactions during classroom learning to better understand teacher decision-making and how teacher-facing analytics change these assumptions. Such work could be guided by past efforts to study teacher knowledge changes about students through teacher dashboards [44]. Future work is also encouraged to investigate a larger sample of teachers with potentially different assumptions, as our present study was restricted to a single teacher.
Second, our current analysis focuses on a limited set of antecedents that may have initiated teacher-student interactions.
Next to student disengagement, as supported by our data, more factors could have prompted interactions with specific students. Future work could investigate who initiates teacher-student interactions and why. While T/ONA can encode the temporal order of events, it cannot speak to their specific causality. Future work could investigate to what extent teacher-student interactions are a function of students' deliberate actions (via verbal requests or hand raises) versus actions that the teacher initiates (e.g., triggered by student disengagement). Differentiating between the two could have important ramifications for designing teacher support tools that help different students most (e.g., shy compared to extroverted students). Third, the present study grouped students by a global measure of student learning: overall learning rates. However, student learning rates could fluctuate during classroom sessions and be associated with teacher practice throughout the session. Future work could employ instructional factors analysis [10] to explore how in-the-moment differences in learning relate to teacher practice. Such analysis could guide live analytics on how teachers could allocate their limited resources most effectively during AI-supported classroom learning.

SUMMARY AND CONCLUSIONS
The present study advances the scientific understanding of effective teacher practice in classrooms learning with AI tutors and methodologies to make such practices visible. Considering teacher practices beyond in-tutor interactions significantly improved the inference of learning rates, a measure of favorable learning conditions during tutored problem-solving. Compared to prior work relating teacher behaviors to learner outcomes, we demonstrated how ordered network analysis can distill teacher behaviors related to learning rates and distinct learner interaction profiles, which may inform analytics-based teacher reflection and support tools. Students with low learning rates did not convert teacher screen monitoring and talking episodes into desirable learning behaviors and continued to use hints. However, after teacher visits, these students' in-tutor behavior approximated that of students with high learning rates. Our qualitative analysis suggested that differences in conceptual teacher help might explain these differential associations between teacher practice and learning rates. Prompting teachers to offer early conceptual support to students with low learning rates via teacher-facing analytics might make classroom practice with AI tutors more effective.

3.1.1
Classroom Context. All data stem from a classroom study conducted in the summer of 2022 over three days at a public school in the United States. The study involved eighty-five 7th-grade students from five different classes taught by the same math teacher, who had 16 years of experience at the participating school and had previously taught with AI tutors. In 2022, the school reported that 45.9% of its students were classified as "Below Basic" based on Algebra 1 end-of-course test scores. Data collection occurred during the students' regular math class and lasted approximately 20 minutes daily.

3.4.3
Model Specification. We specify five parameters to generate T/ONA models: (1) Unit: For RQ1 and RQ2, we set the smallest unit of analysis to each student with a given learning rate on a specific date and class participation period. For RQ3, we additionally split the smallest unit of analysis into pre-visit and post-visit phases. (2) Horizon of observation rules: Given the structure of an AI-supported classroom, all students can access the teacher's talk and location changes as public observations and activity on their own screens as a private observation; however, students cannot access content on their peers' screens. In TMA, we used a filtering function to operationalize these rules and form personalized contexts for each individual. (3) Means rotation (MR) parameter: To have the first dimension maximize the difference between the two student groups, we used the grouping variable indicating low and high learning rates as the MR parameter. (4) Codes: To compare model performance and fit based on different combinations of modalities, we included different numbers of codes for the two T/ONA models with and without out-of-tutor interactions. (5)

4.2
RQ2: Connection Patterns for Students with Low and High Learning Rates
For a T/ONA model incorporating out-of-tutor data, student connection patterns are distinguished based on node positions on the left and right side of the means rotation dimension (x-axis; Figure 1). Students with low learning rates had more consecutive HINT REQUEST connections (M_low = 0.20, M_high = 0, U = 4,356, r = 0.55, p < .001). By contrast, students with high learning rates exhibited a strong connection from FIRST CORRECT ATTEMPT to CORRECT ATTEMPT (M_low = 0.29, M_high = 0.43, U = 1,384, r = 0.51, p < .001), indicating consecutive correct problem-solving step attempts.

Fig. 1. Connection-making between in-tutor and out-of-tutor behavioral codes for students with low learning rates (red) and high learning rates (blue).

Fig. 2. Connection-making before and after the teacher's visit for students with low learning rates (left) and high learning rates (right).
[...] (see Section 3.1.3) before or after students' actions of interest (e.g., hint requests) to better understand these student actions and their teacher practice context. We sample from 429 human observer notes taken during data collection that relate to teacher actions (i.e., notes on what content the teacher discussed with students or notable teacher behaviors). We repeat this procedure separately for data from low and high learning rate students, comparing teacher practice context by group and summarizing notable observation notes and qualitative themes across them.

3.4
Quantitative Ethnography Methods
3.4.1 Ordered Network Analysis. This study uses Ordered Network Analysis (ONA) to study teacher practices in classrooms working with AI tutors. ONA is a visual and mathematical representation of ordered relationships among codes.

To compare both explanations, future work could provide analytics-based interventions that nudge teachers to offer more early or conceptual support (or both) to disengaged students. Prior work on tools that guide teacher attention toward disengaged students suggests that such interventions can greatly improve learning [20]. Evaluations of ONA network visualizations with teachers pose fruitful future work to gauge how they could be effectively employed in classroom settings or teacher professional development. As our present ONA visualizations are static, future work could generate dynamic visualizations that allow exploring network ties and connection strength.