Measuring Affective and Motivational States as Conditions for Cognitive and Metacognitive Processing in Self-Regulated Learning

Even though engagement in self-regulated learning (SRL) has been shown to boost academic performance, the SRL skills of many learners remain underdeveloped. Learners often struggle to productively navigate multiple cognitive, affective, metacognitive and motivational (CAMM) processes in SRL. To provide learners with the required SRL support, it is essential to understand how learners enact CAMM processes as they study. More research is needed to advance the measurement of affective and motivational processes within SRL, and to investigate how these processes influence learners’ cognition and metacognition. With this in mind, we conducted a lab study involving 22 university students who worked on a 45-minute reading and writing task in a digital learning environment. We used a wearable electroencephalogram device to record learners’ academic emotional and motivational states, and digital trace data to record learners’ cognitive and metacognitive processes. We harnessed time series prediction and explainable artificial intelligence methods to examine how learners’ emotional and motivational states influence their choice of cognitive and metacognitive processes. Our results indicate that emotional and motivational states can predict learners’ use of low cognitive, high cognitive and metacognitive processes with considerable classification accuracy (F1 > 0.73), and that higher values of interest, engagement and excitement promote cognitive processing.


INTRODUCTION
Self-regulated learning (SRL) is theorised as a dynamic, goal-driven group of learning processes in which learners actively set their learning goals and consciously enact different learning strategies to pursue their goals [61,62,63]. Self-regulated learners oversee their progress in relation to learning goals and task requirements, and, if discrepancies are identified, they adjust their learning strategies or modify their goals to fulfill task requirements [46,63]. Engagement in SRL has been shown to boost learner academic performance in different subjects [14] and promote the development of lifelong learning skills [10,32,69], a critical set of skills for future professionals to thrive in an ever-changing modern job market. For this reason, the development of learner SRL skills has been an important part of researchers' and educators' agendas over the past few decades.
Regardless of the theorised significance and empirically documented benefits of SRL, the SRL skills of many learners remain underdeveloped. Learners often struggle to productively navigate the multiple cognitive, affective, metacognitive and motivational (CAMM; [2,5,40]) processes that underlie SRL. For example, a student studying a history book chapter may struggle to use an appropriate learning strategy to aid reading comprehension (cognitive process) while overseeing the effectiveness of that strategy relative to task requirements (metacognitive process), regulating their excitement prompted by the content in the chapter (affective process) and motivating themselves by increasing their own interest in the task (motivational process). In this way, a learner may miss the opportunity to fully benefit from SRL. To provide learners with the required SRL support and help them boost their SRL engagement and learning performance, it is hence essential to understand how students enact SRL processes throughout a studying session. Many of the SRL processes, however, have not been easy to observe and measure, which, in turn, has hindered the understanding and support of SRL in the past.
Recent advances in technologies for collecting multi-modal trace data that students generate as they study in digital learning environments have opened up new opportunities to improve the measurement of SRL [3,40]. For this reason, SRL researchers have been increasingly leveraging student digital trace data (e.g., navigational logs, mouse movements, and keyboard strokes), eye-gaze and electrodermal activity (EDA) data to study CAMM processes in SRL [4,15,16,18,20,37,46,56]. Using these data channels, researchers have obtained an account of several CAMM processes (e.g., enactment of studying strategies [39,53], metacognitive monitoring and control [46,47], co-occurrence of emotions [33], and student attention allocation during studying [55]) salient to student learning experiences and achievement. These processes could not be reliably and unobtrusively detected in the past using more traditional data collection methods, such as self-report surveys and think aloud protocols [62]. Despite this considerable progress, SRL researchers to date have mainly focused on measuring cognitive and metacognitive processes, and have rarely examined learners' affective and motivational processes within SRL (but see [29,56,57]), even though these processes have been theorised to affect learners' cognitive and metacognitive engagement as they study [61,63]. For example, a learner feeling excited about the topic they study may engage in prolonged metacognitive monitoring to ensure they comprehend key information properly. A possible reason may be that researchers have typically found affective and motivational states difficult to detect using the data channels available in the past [5]. Accordingly, there is a growing need among SRL researchers for introducing additional data channels to enhance the measurement of affective and motivational states, and afford a more comprehensive analysis and understanding of CAMM processes in SRL.
With the recent development of technologies for psychophysiological measurement of brain activity, including the electroencephalogram (EEG), which can unobtrusively record an individual's brainwaves in different contexts, researchers and practitioners in different domains have become increasingly interested in analysing brain activity data to understand human psychological processes enacted under different conditions, e.g., epileptic seizures and sleep disorders (for a review see [65]). Despite its promise to advance understanding of human psychological processes, including affective and motivational states, to our knowledge, educational researchers have yet to leverage brain activity data to deepen understanding of CAMM processes within SRL [40]. To address this research gap and advance scholarly understanding of CAMM processes, in the present study, we explored the viability of using a wearable EEG device to collect data about academic emotions (i.e., excitement, relaxation and stress) and motivational constructs (i.e., interest and engagement) of learners who worked on a 45-minute reading and writing task using a digital learning environment in the research lab. We examined these data to understand the extent to which they may predict and explain learners' cognitive and metacognitive processes as they emerge during the same session.
Our results indicate that affective and motivational states measured via the EEG channel can predict learners' engagement in low-level cognitive processes, i.e., first-time reading and re-reading, in high-level cognitive processes, i.e., elaboration and organisation of learning content, and also in metacognitive processes, i.e., task orientation, planning, monitoring and evaluation, with considerable prediction performance (F1-measure > .73). In further analysis, we found that higher values of interest, engagement and excitement encourage, whereas higher values of stress inhibit, learners' cognitive processes during the session.

BACKGROUND
2.1 Measuring CAMM processes in SRL with learning analytics methods
SRL has been theorised as a temporal process unfolding over four loosely ordered phases: task understanding, goal setting and planning, strategy enactment and adaptation [61,63]. Self-regulated learners develop a perception about task requirements and achievement standards (e.g., by studying task instructions and scoring rubric), set goals and develop plans specifying how to go about the task, and enact cognitive strategies to address the task. Throughout the learning session, self-regulated learners oversee (i.e., metacognitively monitor) the utility of strategies they enacted and modify (i.e., metacognitively control) their strategies as they deem appropriate. Cognitive and metacognitive processes are hence considered central to SRL. Importantly, underlying conditions external (e.g., task instructions, length and time constraints) and internal to the learner (i.e., knowledge, motivation and affect) influence how an individual learner will engage cognitive and metacognitive processes during the task [62]. For example, a learner may perceive the utility of the same learning strategy differently depending on their emotional and motivational state at that point in a learning session. For this reason, affective and motivational processes are also considered central to SRL.
Over the past years, researchers have utilised different learning analytics methods to detect and study the relationships among CAMM processes. To this end, researchers have often collected data from multiple data channels, e.g., self-reports, voice, log, eye-tracking and, more rarely, physiological data including EDA and EEG (for an overview see [40]). In most of the studies published so far, researchers have examined the relationship between cognitive and metacognitive processes within SRL. This line of work has contributed significant new knowledge to SRL research, e.g., by understanding differences in the use of learning strategies among students based on their log and eye-tracking data [19,50,55] and by informing innovative and scalable SRL interventions based on student learning processes inferred from log data [36,37].
In a rather small group of studies, researchers investigated learners' affective and motivational states that condition SRL. For instance, Tormanen et al. [57] utilised video and EDA data to reveal the relationship between learner affective conditions and their emotion regulation behaviour, and demonstrated that learners who were in need of restoring their emotional grounds for collaboration with peers were more likely to regulate their emotions. Taub and Azevedo [54] analysed video and log data to understand learners' metacognitive monitoring and strategic learning behaviours as the learners tested hypotheses in a game-based learning environment. The authors found that learners who used strategic testing behaviours demonstrated high levels of efficiency and emotions during the task. Gašević et al. [24], Hong et al. [29] and Zhou and Winne [68] analysed self-reports and learners' log data, and documented the relationships between learners' motivation and their approaches to learning [24], metacognitive behaviours [29] and goal orientation [68].
In line with findings in [40], we posit that more research is needed to advance the measurement of affective and motivational processes, and to investigate how these processes affect learners' cognition and metacognition as they unfold during the learning session. In the present study, we made a step towards this goal by harnessing the potential of student log data and EEG technology to dynamically and unobtrusively record cognitive, metacognitive, affective and motivational processes.

Electroencephalogram (EEG) technology in educational research
Electroencephalogram is a psychophysiological measurement examining the relationship between physiological (i.e., activity of neural cells in the brain cortex) and mental processes [65]. The EEG headset contains small electrodes that are placed on designated positions on the participant's scalp to record the voltage caused by brain activity. The positions of the electrodes are determined based on prior psychophysiological research that linked different psychological processes to different parts of the brain cortex and its levels of activity, e.g., low activation of the right frontal cortical hemisphere may indicate an approach response, i.e., a positive emotion [48].
The EEG method has been widely utilised in different domains to appraise human-machine interaction, enhance user authentication and security, and, most often, identify individuals' mental states such as confusion, fatigue and emotions [60].
Due to its capability to identify psychological processes critical for learning, educational researchers have been increasingly using EEG to study learning in classroom and lab environments. To date, researchers appear to have mainly focused on using EEG to measure learners' attentional and meditational constructs [65], while in a limited group of studies researchers have utilised EEG to measure emotional and motivational processes. For example, Salvador Inventado et al. [31] examined the change in student frustration and excitement after receiving feedback as they were studying topics in object-oriented programming. More recently, Xu et al. [66] investigated confusion as learners engaged in logical reasoning and showed that learners who were more confused tended to experience a higher cognitive load during the task. Ghergulescu and Muntean [25,26] utilised EEG to measure learner engagement, one of the main indicators of motivation [7]. The authors unveiled the changes in learner engagement during the task in a game-based learning environment [26] and demonstrated the methodological benefits of using sensor-based over questionnaire-based methods to measure changes in learner motivation over time [25]. In the present study, we take the application of EEG in educational research a step forward and use this method to detect learners' emotional and motivational processes and further document how these processes relate to SRL. More formally, our study was guided by the following research questions:
RQ1: To what extent can learner affective and motivational processes, automatically detected as learners work on the reading and writing task, predict their engagement in cognitive and metacognitive processes during the task?
RQ2: To what extent can learner affective and motivational processes, automatically detected as learners work on the reading and writing task, explain their choice of specific cognitive and metacognitive processes during the task?

METHOD
3.1 Context and participants
We conducted a laboratory study at a large, research-intensive university in Australia. We recruited 22 participants, including 9 undergraduate and 13 graduate students who reported various majors (e.g., data science, pharmaceutical science and finance). One of the participants opted out during the study and we excluded this participant's data from the analysis. The study involved three main tasks: pre-task assessment, essay writing from multiple texts, and post-task assessment. In the pre-task assessment, the participants were asked to complete a pre-test that included 15 multiple-choice questions assessing the participants' prior knowledge of the topics included in the reading materials for the task. After completing the pre-task, the participants received a brief tutorial on how to use the learning environment created to support reading and writing tasks (detailed in Section 3.2).
During the main task, the participants engaged in a 45-minute multi-text writing session. They were asked to selectively read nine brief texts (mean length=178.55 words, SD=125.24) written in English. The readings included two main topics: (1) "Artificial Intelligence" (AI, four texts, mean length=182.25 words, SD=166.55), introducing the concept of AI and discussing some of its current limits; and (2) "The School of the Future" (five texts, mean length=175.60 words, SD=102.67), describing how the use of modern AI-based technologies such as virtual and augmented reality has enhanced education, discussing the concept of AI literacy and summarising the perspectives of educational stakeholders towards using AI-based technologies in education. Based on the texts they read, the participants were tasked to compose a short, 200-300-word essay to (1) explain the concept of AI, (2) describe the application of AI in their daily life, and (3) describe the changes they envision that AI-based technologies will introduce to education in the future.
After the multi-text writing session, the participants completed a post-task assessment consisting of 15 multiple-choice questions that assessed their knowledge of the reading topics. An AUD$60 gift card was offered to each student to compensate for their time participating in the study.

Learning environment
For the purpose of the study, we developed a technology-enhanced learning environment that was integrated into the Moodle learning management system (Figure 1). The environment included (1) a navigation area with task instructions, a scoring rubric and reading pages, (2) a reading area, and (3) an essay writing area. The environment also included several theory-aligned tools that support students as they engage in reading comprehension and writing production. For instance, students were afforded the opportunity to highlight spans of text in the reading materials, and then assign different cognitive and metacognitive tags [64], e.g., useful, concept, important, and confusing, or attach open-text notes to those highlighted spans to further aid text comprehension. The students could review their annotations by expanding the List of Annotations. The Searching Tool allowed students to retrieve specific annotations with keywords, whereas the Planner Tool allowed students to plan their writing session, i.e., to specify how they want to engage with the task (e.g., dedicate 5 minutes to skim over the reading list, then dedicate 15 minutes to thoroughly read and annotate selected texts, and then dedicate another 15 minutes to develop an essay draft). The Timer Tool helped students oversee the time remaining for the task.

Data collection
To answer our research questions, we collected digital trace data and EEG data learners generated during the session. In addition, we collected learner eye-gaze and physiological data and used these two channels as baselines in examining the prediction performance of EEG data, because these data channels had been widely utilised in broader research detecting individual affective and motivational states [34,42,45].

Digital trace data.
The participants' interactions with the learning environment were automatically recorded by the learning platform in the form of timestamped navigational logs, keyboard strokes, and mouse activities (i.e., clicks and moves).

EEG.
We utilised a 14-sensor, industry-standard headset, the Emotiv EPOC X, to record participants' EEG data (see our digital appendix for the layout of the electrodes). The raw EEG data were sampled at 256 Hz and captured by the Emotiv Pro software. The algorithms in the software were used to analyse and process the raw EEG data. The algorithms computed several performance metrics in real time from the EEG data, as described in the headset documentation. Specifically, the metrics included "stress" (sense of ease with a task), "engagement" (alertness and conscious direction of attention to task-related materials), "interest" (degree of attraction to the current activity), "excitement" (the feeling of positive physiological arousal), "relaxation" (the ability to recover from intense concentration) and "attention" (fixed attention to a specific task). The output metric values were scaled to the range of 0 to 1, indicating the corresponding levels.

Eye-tracking.
We recorded the participants' gaze data using a screen-based eye-tracker, the Tobii Pro Nano. The eye-tracker sampled gaze data at 60 Hz, with 9-point calibration. It was mounted at the bottom of a screen whose resolution was 1920 x 1080. A chin rest was introduced to ensure the reliable collection of gaze data. Tobii Pro Lab was used to record, inspect and analyse the participants' gaze data. The default algorithms in Tobii Pro Lab [41] were adopted to process participants' fixations (i.e., periods where eyes remain focused at a particular location), saccades (i.e., rapid eye movements between two consecutive fixations), and pupil diameters.

CAMM process measurement
3.4.1 Cognitive and metacognitive processes. We utilised the learners' digital trace data to infer the cognitive and metacognitive processes learners enacted during the multi-text writing task. We utilised the theoretical framework of SRL processes proposed by Bannert [8] to guide our work at this stage. In particular, the framework categorises SRL processes into three major categories: metacognitive, cognitive (low and high) and motivational. Metacognitive processes include Orientation, Planning, Monitoring and Evaluation, whereas cognitive processes include processes resembling low cognition (First-reading and Re-reading) and high cognition (Elaboration and Organisation). We note that the definition and coding of motivation-related processes in the framework rely on learner verbal expressions as commonly collected with think aloud protocols. These could not be observed from digital trace data, and motivation-related processes were hence excluded from the analysis, following the approach reported in a previous study [20]. Details about the theoretical framework describing cognitive and metacognitive processes in SRL are provided in Table 1.
Further, we utilised a trace parser that was based on the theoretical framework above. The trace parser has been empirically validated and used in several prior studies on SRL (e.g., [18,20,37,53]) to automatically transform raw trace data into discernible SRL processes according to pre-determined rules. There were two major components to the trace parser: the action library and the process library. The participants' actions captured in the trace data were labelled according to the action library. For illustration, if a learner created, deleted or edited highlights while reading textual materials, such an action was labelled as HIGHLIGHT_EDITING. In total, the action library comprised 17 distinct learning actions, detailed in Table 2.
Subsequently, the sequences of annotated learning actions were mapped to the SRL processes defined in Table 1 according to the process library. An example First-reading process is READING->HIGHLIGHT_EDITING/NOTE_EDITING->READING, in which learners read the textual materials, highlight specific text spans and/or take notes, then continue reading. A total of 27 sequences of learning actions were categorised into the SRL processes, detailed in Table 3. In this way, we identified learners' cognitive and metacognitive processes as they unfolded during the learning session.
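The two-stage mapping described above (labelled actions, then action sequences matched against a process library) can be sketched as follows. This is a minimal illustration and not the validated trace parser itself: only the First-reading rule is taken from the text, the `TIMER` action label is hypothetical, and the left-to-right, non-overlapping matching strategy is an assumption.

```python
from typing import List, Sequence, Set, Tuple

# One rule taken from the text; the real process library holds 27 sequences.
# Each pattern step is the set of action labels allowed at that position.
PROCESS_LIBRARY: List[Tuple[str, List[Set[str]]]] = [
    ("First-reading",
     [{"READING"}, {"HIGHLIGHT_EDITING", "NOTE_EDITING"}, {"READING"}]),
]


def matches_at(actions: Sequence[str], start: int, pattern: List[Set[str]]) -> bool:
    """Return True if `pattern` matches the labelled actions beginning at `start`."""
    if start + len(pattern) > len(actions):
        return False
    return all(actions[start + i] in allowed for i, allowed in enumerate(pattern))


def label_processes(actions: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Scan left to right and return (process, first_index, last_index) per match.

    Matches are non-overlapping (an assumption of this sketch).
    """
    found = []
    i = 0
    while i < len(actions):
        for name, pattern in PROCESS_LIBRARY:
            if matches_at(actions, i, pattern):
                found.append((name, i, i + len(pattern) - 1))
                i += len(pattern)
                break
        else:
            i += 1  # no rule starts here; move to the next action
    return found


# "TIMER" is a hypothetical label used only to show an unmatched action.
trace = ["READING", "NOTE_EDITING", "READING", "TIMER"]
print(label_processes(trace))
```

Representing each pattern step as a set of allowed labels keeps alternatives such as HIGHLIGHT_EDITING/NOTE_EDITING in a single rule.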

3.4.2 Affective and motivational processes. The EEG headset was accompanied by a full suite of software tools (Emotiv Pro, see Section 3.5.4), which streamlined the extraction of affective and motivational features from the raw EEG data. We thus extracted the following affective features: (1) stress, (2) excitement and (3) relaxation. We opted to measure stress and excitement as stress-related and positive psychological processes have been considered important academic emotions in SRL [43]. Since relaxation represents the individual's ability to recover from intense focus and productively change learning behaviour, which is a characteristic of skilled self-regulated learners, we included this feature in the analysis as well, to examine its effects on learner cognition and metacognition. Further, we extracted the following motivational features: (1) engagement and (2) interest. These processes have been commonly considered important indicators of learner motivation in prior research (e.g., [7,27]).

Data preparation
3.5.1 Time-series approach to detecting cognitive and metacognitive processes. To answer our research questions, we treated the prediction of learner cognitive and metacognitive processes as a time-series prediction problem. Hence, we used learners' data from the EEG, eye-tracking and physiological channels, each channel separately, within a specific time span to predict learners' cognitive and metacognitive processes in the next time span. Inspired by previous research reporting on a broader EEG prediction task that examined the EEG predictive performance in different time spans [12], we adopted a time span of 1 second (1 Hz) in our study. In other words, data from the current second were used to predict the processes in the next second, guided by our research questions.
Following this approach, we obtained 33,944 samples in which learners engaged in SRL processes in the next second and 13,535 samples in which we did not detect evidence of engagement with SRL processes.More specifically, the samples with identified future self-regulated processes comprised 16,610 low cognitive processes

Table 1. Cognitive and metacognitive processes in SRL

Metacognition
  Planning: Plan the learning process by arranging activities and determining strategies, e.g., planning the time to complete the writing task.
  Monitoring: Oversee the learning progress according to task requirements and/or plans, e.g., checking if the time remaining is sufficient for completing the task.
  Evaluation: Evaluate the learning process, e.g., assessing whether note-taking benefits the understanding of reading materials.
Low cognition
  First-reading: Read learning content.
  Re-reading: Re-read learning content.
High cognition
  Elaboration: Elaborate by connecting content-related comments and concepts to reason about and associate components of the content.
  Organisation: Organise learning content by creating an overview, writing down information point by point, summarising, adding newly generated information, and editing information by rephrasing it or integrating it with prior knowledge.

We note that, if more than one SRL process was observed in a following second, we considered the first process chronologically observed as the process to be detected. It is worth mentioning that our study treated each time interval from the sessions as a unit of analysis as we aimed to promote the generalisability of such predictive modelling to inform educational design.
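The next-second labelling scheme can be sketched as follows: features from second t are paired with the process observed in second t+1, and when several processes fall in the same second the chronologically first one is kept. The function names, the `"none"` label for seconds without a detected process, and the toy feature values are assumptions made for this illustration.

```python
from typing import Dict, List, Sequence, Tuple


def first_process_per_second(events: Sequence[Tuple[float, str]]) -> Dict[int, str]:
    """Map each whole second to the earliest process observed within it.

    `events` holds (timestamp_in_seconds, process_name) pairs in any order.
    """
    labels: Dict[int, str] = {}
    for timestamp, process in sorted(events, key=lambda e: e[0]):
        labels.setdefault(int(timestamp), process)  # keep the first one only
    return labels


def make_samples(
    features_per_second: Sequence[List[float]],
    labels: Dict[int, str],
) -> List[Tuple[List[float], str]]:
    """Pair the features of second t with the process label of second t + 1."""
    samples = []
    for t in range(len(features_per_second) - 1):
        target = labels.get(t + 1, "none")  # "none": no SRL process detected
        samples.append((features_per_second[t], target))
    return samples


# Toy data: one hypothetical feature value per second for four seconds.
events = [(1.7, "Monitoring"), (1.2, "First-reading"), (3.0, "Elaboration")]
features = [[0.1], [0.2], [0.3], [0.4]]
for x, y in make_samples(features, first_process_per_second(events)):
    print(x, y)
```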
3.5.2 Gaze data pre-processing. We obtained learners' timestamped gaze information, including their pupil diameters and their pupillary states (i.e., whether they were fixating or saccading). Because pupil sizes varied among learners, we calculated the percentage changes in pupil diameters relative to each learner's baseline measurement.
Following [58], the baseline measurements were determined by averaging the learners' pupil diameters from their fixations in the initial calibration carried out at the start of the study. Additionally, we one-hot encoded learners' pupillary states (i.e., "10" suggested that the learner was fixating and "01" suggested that the learner was saccading). Since the gaze data were sampled at a different frequency (60 Hz) compared to the transformed EEG data (4 Hz), we down-sampled the extracted gaze features to 4 Hz. For the percentage changes of pupil diameters, we averaged each block of 15 consecutive values. For the pupillary states, we used "11" to suggest that learners engaged in both fixations and saccades, and "00" to indicate that learners' pupillary states were unclassified (i.e., the software either could not detect both eyes or could not determine whether individuals were engaging in fixations or saccades).
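A minimal sketch of these gaze pre-processing steps is given below, assuming each 60 Hz sample carries a pupil diameter and a state marker. The marker values `"F"`, `"S"` and `None`, the function names and the handling of a trailing incomplete block (dropped) are assumptions of this illustration; the codes "10", "01", "11" and "00" follow the text.

```python
from typing import List, Optional, Sequence

FACTOR = 15  # 60 Hz gaze samples per 4 Hz output value (60 / 4)


def percent_change(diameters: Sequence[float], baseline: float) -> List[float]:
    """Percentage change of each pupil diameter relative to the learner's baseline."""
    return [100.0 * (d - baseline) / baseline for d in diameters]


def downsample_mean(values: Sequence[float], factor: int = FACTOR) -> List[float]:
    """Average each block of `factor` consecutive values (incomplete tail dropped)."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


def downsample_states(states: Sequence[Optional[str]], factor: int = FACTOR) -> List[str]:
    """Collapse each block of states into a two-character code.

    "10" fixation only, "01" saccade only, "11" both, "00" unclassified.
    """
    codes = []
    for i in range(0, len(states) - factor + 1, factor):
        block = states[i:i + factor]
        fixating = "1" if "F" in block else "0"
        saccading = "1" if "S" in block else "0"
        codes.append(fixating + saccading)
    return codes


# One second of toy data: four blocks of 15 samples each.
states = ["F"] * 15 + ["S"] * 15 + ["F"] * 7 + ["S"] * 8 + [None] * 15
print(downsample_states(states))
print(downsample_mean(percent_change([3.3] * 15 + [2.7] * 15, baseline=3.0)))
```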

3.5.3 Wristband data pre-processing. We utilised the PyEDA toolkit [30] to pre-process the EDA data collected by the wristband (e.g., applying a low-pass Butterworth filter). The BVP readings were processed following a standard practice [67] (i.e., applying a 4th-order Butterworth bandpass filter with cut-off frequencies of 1 Hz and 8 Hz). Then, the processed BVP readings were down-sampled to 4 Hz. The heart rate values were padded with repeated values to achieve the same 4 Hz frequency.
3.5.4 EEG data pre-processing. The pre-processing steps were conducted with the built-in algorithm of the Emotiv Analyzer web tool, which was designed to clean the EEG signals captured by the headset and thus ensure the data were not contaminated by occasional noise (e.g., head movements). In addition to the pre-processing, removing the artefacts from the EEG data was critical to ensure that the captured data mainly reflected brain activity instead of other activities (e.g., eye blinks and heart beats) [13]. We utilised the EEGLAB MATLAB toolbox [13] for this purpose. Specifically, we adopted the Infomax Independent Component Analysis (ICA) to identify the artefacts in the EEG data [38]. The components categorised as containing ocular (e.g., eye blinks), muscular or cardiac artefacts were removed from the data. Informed by prior research [23], we obtained fine-grained power spectrum metrics from the EEG data by calculating the power in 5 frequency bands (delta: 1-4 Hz, theta: 4-8 Hz, alpha: 8-12 Hz, beta: 12-30 Hz and gamma: 30-50 Hz). This was achieved by running the Fast Fourier Transform (FFT) [11] on each block of 256 EEG readings (1 second) with a Hanning window of 64. The FFT yielded 70 features (14 channels x 5 bands).
Additionally, for each second, 4 band power values were calculated from the 256 raw readings, producing 4 Hz data.
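The band power computation can be illustrated for a single channel as follows. This is a dependency-free sketch, not the authors' pipeline: a naive discrete Fourier transform stands in for the FFT, the four values per second are assumed to come from four non-overlapping 64-sample Hann windows, and each band is treated as the interval (low, high] so that the 4 Hz frequency bins fall into exactly one band. Repeating it for 14 channels would give the 70 features mentioned above.

```python
import cmath
import math
from typing import Dict, List, Sequence

FS = 256   # sampling rate in Hz (from the text)
WIN = 64   # Hann window length in samples (from the text)
# Band edges from the text; the (low, high] convention is an assumption.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 50)}


def hann(n: int) -> List[float]:
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(segment: Sequence[float]) -> List[float]:
    """Squared DFT magnitude of the Hann-windowed segment for bins 0..n/2."""
    n = len(segment)
    windowed = [s * w for s, w in zip(segment, hann(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def band_powers(segment: Sequence[float]) -> Dict[str, float]:
    """Sum spectral power over the bins that fall inside each band."""
    spectrum = power_spectrum(segment)
    resolution = FS / len(segment)  # 4 Hz per bin for a 64-sample window
    return {
        name: sum(p for k, p in enumerate(spectrum) if low < k * resolution <= high)
        for name, (low, high) in BANDS.items()
    }


def band_powers_per_second(one_second: Sequence[float]) -> List[Dict[str, float]]:
    """Four band power rows per second, one per non-overlapping window."""
    return [band_powers(one_second[i:i + WIN]) for i in range(0, FS, WIN)]


# A synthetic 12 Hz sine should place most of its power in the alpha band.
signal = [math.sin(2 * math.pi * 12 * t / FS) for t in range(FS)]
rows = band_powers_per_second(signal)
print(len(rows), max(rows[0], key=rows[0].get))
```

With a 64-sample window at 256 Hz the frequency resolution is only 4 Hz, so the delta and theta bands each contain a single bin; a longer window or zero-padding would be needed for finer estimates.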
3.5.5 EEG performance metrics. We utilised the performance metrics computed by the Emotiv Pro software without further processing, as the algorithms were developed through rigorous experiments in prior research to produce reliable measures [17]. The metric values were converted to the same frequency of 4 Hz by padding with repeated values.

Model implementation and feature extraction
To answer our research questions, we developed several supervised learning models trained on different sets of data and compared their classification performance.Since our prediction task was formulated as a time series classification problem, we have utilised the state-of-the-art time-series classification method ConvTran [21], as it can capture dependencies among processes over time using the transformer-based architecture [59].Specifically, it incorporates two position encoding techniques designed for time series data to encode the positions and order of the measurements in time series.Additionally, ConvTran incorporates disjoint temporal and spatial convolution in its encoder [22].This unique feature enhances its ability to capture spatial correlations among the input EEG channels, which is particularly relevant to our study.The default parameters of ConvTran were utilised in our study.
To answer our RQ1, we trained four models using: 1) wristband data; 2) gaze data; 3) EEG data and 4) EEG performance metrics extracted from the Emotiv headset, to predict the SRL processes (i.e., high cognitive, low cognitive or metacognitive processes, Table 1) in the next time span.We thus formulated the task as a binary classification problem predicting each group of cognitive and metacognitive processes separately.For example, if metacognition was predicted to occur in the next time window, label 1 was assigned to the corresponding feature, whereas the remaining features (i.e., low and high cognition) were assigned label 0. To address the imbalance in the dataset (see Section 3.5.1),following prior research [36], we randomly under-sampled data from the majority class with the RandomUnderSampler of the Python package Imbalanced-Learn [35].We performed five random under-sampling procedures for each binary classification and reported the models' average performance (with standard deviation) to ensure that our results were reliable.We randomly retained 20% of the data for testing the implemented models.The remaining 80% of data were further randomly split in an 80:20 ratio, with 80% used for training the models and 20% used for validating the models.For each model, we run 100 epochs with a batch size of 128, an initial learning rate of 0.003 and a dropout rate of 0.01.The model with the best validation performance over 100 epochs was retained as the final model and was evaluated based on its classification accuracy, precision, recall and F1 scores.
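The class balancing and data splitting steps can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: a hand-written under-sampler stands in for Imbalanced-Learn's RandomUnderSampler, the function names and seed are hypothetical, and the toy labels do not reflect the study's class sizes.

```python
import random
from typing import List, Sequence, Tuple


def random_undersample(
    samples: Sequence, labels: Sequence[int], rng: random.Random
) -> Tuple[list, List[int]]:
    """Randomly drop majority-class samples until both classes are equal in size."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(positives), len(negatives))
    keep = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(keep)
    return [samples[i] for i in keep], [labels[i] for i in keep]


def split(indices: Sequence[int], holdout: float, rng: random.Random):
    """Shuffle `indices` and split off a `holdout` fraction at the end."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * (1 - holdout))
    return shuffled[:cut], shuffled[cut:]


def train_val_test(n: int, rng: random.Random):
    """20% test, then an 80:20 train/validation split of the remainder."""
    rest, test = split(range(n), 0.2, rng)
    train, val = split(rest, 0.2, rng)
    return train, val, test


rng = random.Random(0)  # hypothetical seed; five seeds would give five runs
X = list(range(100))
y = [1] * 30 + [0] * 70
X_bal, y_bal = random_undersample(X, y, rng)
train, val, test = train_val_test(len(X_bal), rng)
print(len(X_bal), len(train), len(val), len(test))
```

Repeating the under-sampling with different seeds and averaging the resulting scores mirrors the five-run procedure described above.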
To answer RQ2, we examined the best performing model and interpreted its features using DeepLift [52], an explainable AI technique that has been used to reliably extract important features in broader EEG-based and time series classification tasks [1,49,51]. DeepLift estimates the importance of each feature in a sample by comparing it to a reference input and examining how the differences in the selected features contribute to the differences in the predicted outcome. To determine a reliable reference input that accurately reveals feature importance, we selected a sample that had been correctly predicted by the models in all five runs from RQ1.
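The idea of attributing a prediction to differences from a reference input can be illustrated with a deliberately simple case. For a purely linear scorer, the contribution of each feature reduces to its weight multiplied by its difference from the reference, and the contributions sum to the difference between the two outputs. The sketch below shows only this special case; it is not DeepLift applied to ConvTran, and the weights, feature values and function names are hypothetical and do not come from the study.

```python
from typing import Dict


def linear_score(weights: Dict[str, float], x: Dict[str, float], bias: float = 0.0) -> float:
    """Output of a toy linear scorer (an assumption; ConvTran is not linear)."""
    return bias + sum(weights[name] * x[name] for name in weights)


def reference_contributions(
    weights: Dict[str, float],
    x: Dict[str, float],
    reference: Dict[str, float],
) -> Dict[str, float]:
    """Contribution of each feature: weight times its difference from the reference."""
    return {name: weights[name] * (x[name] - reference[name]) for name in weights}


# Made-up numbers for illustration only; NOT results or parameters from the study.
weights = {"interest": 0.5, "engagement": 0.25, "stress": -0.5}
reference = {"interest": 0.5, "engagement": 0.5, "stress": 0.5}
sample = {"interest": 1.0, "engagement": 0.5, "stress": 0.0}

contributions = reference_contributions(weights, sample, reference)
output_delta = linear_score(weights, sample) - linear_score(weights, reference)
```

In this toy case, a feature that equals its reference value receives a contribution of zero, which is why the choice of reference input matters for the resulting importance scores.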

RQ1 - Prediction of cognitive and metacognitive processes?
We observed that the baseline model using eye-gaze data identified high and low cognitive processes with considerable classification performance (F1 scores of 0.67 and 0.63, respectively), whereas its performance in identifying metacognitive processes was noticeably lower (F1 score of 0.43, Table 4). The other baseline model, using EDA, BVP and HR data collected by the wristbands, identified high cognitive, low cognitive and metacognitive processes with considerable classification performance (F1 scores of 0.63, 0.60 and 0.62, respectively, Table 5).
The model based on the EEG data identified learners' high and low cognitive processes with outstanding classification performance (F1 scores of 0.92 and 0.91, respectively), whereas its performance in identifying metacognitive processes was considerably lower (F1 score of 0.55, Table 6).
The model trained using the EEG performance metrics computed by Emotiv Pro identified all the processes with considerable classification performance, with F1 scores of 0.83, 0.83 and 0.73 for low cognitive, high cognitive and metacognitive processes, respectively (Table 7).
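For readers less familiar with the reported metrics, the sketch below shows how precision, recall and the F1 score are computed for one binary classification, and how scores from repeated runs can be summarised as a mean with standard deviation. It is a minimal sketch with made-up predictions and run scores, not values from the study; the use of the sample standard deviation is also an assumption, as the variant used in the study is not stated.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def summarise_runs(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of a metric over repeated runs."""
    return mean(scores), stdev(scores)


# Made-up labels and predictions: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

# Made-up F1 scores standing in for five under-sampling runs.
toy_run_f1 = [0.5, 0.75, 0.5, 0.75, 0.5]
f1_mean, f1_std = summarise_runs(toy_run_f1)
```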
Even though the EEG-based models generally outperformed the baseline models, our results suggest that relying on the EEG data alone was not sufficient to predict learners' metacognitive behaviours in the next time span. As the model based on EEG performance metrics identified all the SRL processes with considerable accuracy,

RQ2 - Explanation of cognitive and metacognitive processes?
We show the important emotional and motivational features in Figure 2. Our results indicate that stress contributed negatively, and interest and engagement contributed positively, to learners' use of high cognitive processes. In other words, learners' interest and engagement were likely to trigger high cognition, whereas their stress was likely to inhibit high cognition in the next time span. We also observed that relaxation was relatively unimportant in facilitating learners' high cognitive processes. Similarly, we observed that learners' levels of interest and stress were related to their low cognitive processing. However, learners' level of excitement appeared to be more related to low cognition, whereas engagement was more related to high cognition. We also found that learners' relaxation may prevent them from engaging in low cognitive processes. Unlike for the cognitive processes, we noticed that learners' interest in one time span was not relevant to their metacognitive processing in the next time span. We also found that learners' relaxation and excitement were more likely to prompt their metacognition, with engagement likely hindering metacognition.

DISCUSSION AND IMPLICATIONS
Even though affective and motivational processes have been theorised to influence learner cognitive and metacognitive engagement in SRL [61], limited research has been done to date to unobtrusively measure those processes and analyse the relationships among them as they unfold during the learning session [40]. To contribute new knowledge to SRL research, we utilised multiple data channels and measured learners' CAMM processes as learners worked on a multisource writing task in a digital learning environment. Specifically, we observed learners' cognitive and metacognitive processes from their digital trace data, and learners' affective and motivational processes from their brain activity data recorded over the same learning period using the EEG headset. We also demonstrated that the state-of-the-art framework ConvTran can be successfully applied to predict SRL processes between different data channels and in a temporal manner, making this algorithm a potential addition to educational technologies that dynamically measure and support student SRL processes as they emerge over time.
Our results demonstrate that learner affective and motivational states in one time span may predict learner cognitive and metacognitive behaviours in the following time span with considerable accuracy (RQ1). This finding empirically confirms theoretical assumptions about SRL as a temporal and contextual process in which learner affect and motivation set conditions for later enactment of cognitive and metacognitive strategies [61][62][63], e.g., reading, elaborating and monitoring. Conditions where learner interest and engagement are high, and stress is low, in one time span appeared to promote learner cognitive behaviours in the next time span, in particular their engagement in high cognition (RQ2). In light of prior research (e.g., [44]), this finding confirms the theorised positive relationship between learners' situational motivation and their use of deep information processing strategies such as elaboration and organisation in the context of our study. The finding also confirms the inverse relationship between stress levels and learners' deep approaches to studying [9]. Further, learner excitement about the topic, the ability to refocus their own mental processes, and their low levels of stress appear to create productive conditions that prompt metacognitive strategy use (RQ2), e.g., task orientation, planning and monitoring. This resonates with prior research that entertained the positive relationship between individuals' abilities to control their mental processes and to engage in metacognition while maintaining positive affect during the task (e.g., [6,28]). However, the observed negative effects of increased engagement on metacognitive strategy use (RQ2) are somewhat counterintuitive and should be further investigated in future studies.

LIMITATIONS AND FUTURE WORK
We identified several limitations to our study. First, since the participants worked on the same task in the lab, and their prior knowledge measured at the outset of the lab session was not statistically significantly related to the volumes of their cognitive and metacognitive processes during the task, we deemed the external SRL conditions to be approximately the same across all the participants. For this reason, the external conditions were not examined in this study. We, however, acknowledge that differences in external conditions may become more prominent across a larger population of students working on tasks with different requirements in classroom settings. We hence posit that the constructs representing external conditions should be studied more thoroughly in future research. Second, even though our choice of the 1-second time span for temporal analysis of SRL processes was motivated by prior research [12], we consider the choice of the most appropriate time span to be context-dependent and plan to further evaluate our work using different time spans in the future. Third, some motivational constructs, e.g., self-efficacy, utility value and task value, were not measured in our study. As these constructs are commonly related to learner perceptions, they have been considered challenging to measure dynamically during the task.

Figure 1: Learning Environment and Multi-Text Writing Task

Table 1: Theoretical framework for trace-based measurement of SRL processes
requirements, prior knowledge, feelings about the task and learning activities needed to accomplish the task, e.g., reading task instructions or scoring rubric.

Table 2: The action library for labelling learning actions.

Table 4: Classification performance with gaze data collected from Tobii Nano Pro.

Table 5: Classification performance with wristband data collected from Empatica E4.

Table 6: Classification performance with EEG data collected from Emotiv Epoc X.

Table 7: Classification performance with performance metrics extracted from Emotiv Epoc X.