Designing Instructions using Self-Determination Theory to Improve Motivation and Engagement for Learning Craft

Recent HCI research has shown significant interest in digital working instructions for guiding novices through manual tasks. While performance enhancement has been a primary focus, it is increasingly recognized that technology's impact extends beyond objective metrics. Trainee motivation and engagement play a pivotal role in enhancing learning outcomes and effectiveness. This paper investigates the use of principles from Self-Determination Theory (SDT), namely clear attainable goals, meaningful rationale, and perspective taking, in designing multimedia instructions to enhance novice users' indicators of psychological well-being. We present findings from a between-subjects experiment involving real-world woodworking, in which novice users followed interactive, in-situ projection-based guidance. Results demonstrate that adhering to SDT postulates can positively influence perceived competence, intrinsic motivation, and task execution quality. These findings offer valuable insights for designing digital instructions to guide and train novices, emphasizing the importance of psychological well-being alongside task performance.


INTRODUCTION
Several industrialized countries are faced with an aging workforce [25,57], resulting in an increasing number of experts set to retire in the coming years. This is especially problematic for crafting skills that are traditionally learned through apprenticeship. Examples of such crafting skills include woodworking, welding, brazing, and butchery. Novel strategies are therefore needed to train unskilled workers effectively without burdening expert workers [7].
Here, digital instructions are a viable alternative to in-person instruction, as they do not require continuous presence of an expert, can sustain self-controlled learning and practice, and are thus more scalable.
Prior research in designing instructions has mainly focused on operators' productivity and efficiency when embedding different types of media, such as images, videos [20,59], and 3D models [26,31]. Whether instructions can also be designed to improve the motivation and engagement of operators has been unexplored [7]. This is crucial because research clearly shows that learning environments rooted in an understanding of the drivers of human motivation and psychological well-being can improve learners' intrinsic motivation and persistence, and reduce drop-out rates [5,53,55]. Ignoring these factors, on the other hand, can lead to disengagement, stress, and even burnout in professional settings [14,18].
Self-determination theory (SDT) [49] is a theory of human motivation that offers a framework to investigate facilitating factors that can positively influence people's engagement, persistence, and motivation in everyday activities. While SDT divides human motivation into two categories, intrinsic and extrinsic motivation, research shows that intrinsic motivation often results in higher persistence, performance, and satisfaction [49]. Research in SDT has investigated several facilitating factors theorized to increase intrinsic motivation. For example, studies in educational contexts show that when teachers provide clear, specific goals and immediate feedback on students' performance, students are more likely to take ownership of their learning experience and feel more competent [34]. Furthermore, research in sports shows that justifying instructions and providing a rationale when giving feedback leads to increased intrinsic motivation [42]. Additionally, avoiding controlling language [6] and acknowledging participants' negative feelings also positively influences competence, self-esteem, negative affect, and subjective well-being [6,43].
SDT has been applied to e-learning environments in the context of formal education for transferring abstract knowledge, such as math, languages, or chemistry. These systems, however, often focus on the design of new features in e-learning environments to enhance the experience and increase the intrinsic motivation and engagement of students. Examples include features that offer students control over the pace of instructional content [3,22,56] and features to support multiple learning paths [3]. In contrast, crafting skills are learned by following step-by-step instructions and by observing and replicating experts [19]. Transferring crafting skills thus requires a careful design of instructions and a first-person view of the expert. Furthermore, in the context of crafts, it is especially important to intrinsically motivate trainees to adopt steps in their own practices and to persevere, as activities often progress slowly. The details of every step are, however, essential to achieve the desired result. Hence, it is important to investigate how SDT could be effectively applied to the design of digital instructions for learning crafts. For example, formulating instructions in a way that explains the rationale for a step or acknowledges trainees' feelings of doubt could be comforting and increase motivation and engagement.
In this paper, we study how SDT can be applied to the design of digital instructions for crafting tasks to improve trainees' intrinsic motivation and engagement. To this end, we conducted a user study in which participants were guided by digital instructions to perform crafting activities. Our experimental conditions compared task performance and subjective indicators of engagement and motivation when applying different SDT facilitating factors to the design of digital instructions. This paper makes the following contributions: (a) We show how SDT heuristics can be applied to the design of digital instructions for crafting tasks. (b) We empirically evaluate the impact of these SDT facilitating factors on task performance as well as cognitive and affective states, such as competence, intrinsic motivation, meaning, and stress. (c) We offer guidelines on how researchers and practitioners can leverage our results in the design of digital instructions for crafting tasks.

RELATED WORK
This work draws from and builds upon prior work on the use of SDT in educational settings and e-learning applications, as well as theories and best practices for designing digital instructions.

SDT-Supported Strategies in In-person Educational Settings
According to SDT, intrinsic motivation refers to motivation that is sustained by the "satisfactions inherent in the activity itself" [50] (e.g., enjoyment), and can be contrasted with extrinsic motivation, in which the reason to persist in an activity depends on external rewards or reinforcements (e.g., performance-contingent rewards or threat of guilt, shame, or punishment) [50]. In addition to the two basic forms of motivation, SDT proposes three psychological needs: competence (to feel effective), autonomy (self-endorsement and ownership of actions), and relatedness (to feel connected and involved with others) [49]. Intrinsically motivating contexts are associated with the fulfillment of these needs, with positive effects on subjective (e.g., affect and vitality, meaning, reduced stress) and objective outcomes (e.g., work performance). Conversely, extrinsically motivating contexts are usually associated with psychological need thwarting. In educational settings, SDT provides an extensive collection of teacher/coach behaviors that effectively support psychological needs in in-person settings. These can broadly be categorized into three groups: improving participants' understanding of what activity is performed, improving their understanding of why it is performed, and optimizing how these two aspects are communicated.
Facilitating factors that relate to users' understanding of "what" should be done include providing choice among learning activities, offering additional information on the goal of the activity, and providing feedback. Choice supports autonomy, but too much choice can overwhelm; hence, structure in terms of clear goals is necessary [30]. Research shows that clear, specific, and transparent goals enhance self-regulation, as individuals can assess their progress compared to benchmarks and apply corrective efforts, thereby increasing perceived competence and hence intrinsic motivation [35,51]. However, goals are effective only when they are formulated in terms of mastery orientation (i.e., those that target competence) instead of performance orientation (doing well relative to others) [49]. In addition to goals, providing positive and corrective feedback is crucial to maintaining interest and persistence in activities [30,43].
Offering a rationale, or "why" instructions and feedback are given, is essential for providing structure and creating an environment that fulfills individuals' psychological needs ([49], p. 247). For example, providing a rationale helps participants endorse the goal when it is not immediately obvious, thus leading to a willingness to perform the activity rather than feeling externally coerced [58]. Additionally, a concrete and well-formulated rationale adds meaning and relevance to activities [58] and enhances participants' persistence [13]. When instructing or when offering feedback in activities, a rationale increases engagement and intrinsic motivation, as this reasoning allows individuals to stay focused on the task and more confidently monitor their behavior without feeling overwhelmed or hurt [42]. In addition, it reduces signs of psychological ill-being, such as pressure or anxiety [12]. However, similar to the goal, in order for a rationale to be effective it needs to be intrinsically oriented, that is, related to the task and not the person [58].
Finally, the communication style, or "how" information is communicated, is important. Reducing controlling language (e.g., "you should" or "you must") while communicating goals or offering feedback emphasizes options (i.e., autonomy) instead of exercising control and reduces the pressure to adopt a behavior [49]. Research shows that taking participants' perspective and acknowledging obstacles they might experience enhances self-determination [6]. While a rationale justifies the instruction, an acknowledgement explicitly validates negative feelings (e.g., frustration) that can coexist when encountering obstacles; it increases perseverance and stimulates intrinsic motivation [12,48,49,58]. Research also shows that in tasks that are not immediately enjoyable, rationales are most effective when combined with acknowledgements [52]. This is because, according to SDT, when tasks or steps are not immediately enjoyable, participants may feel compelled to follow instructions, accompanied by negative feelings (e.g., frustration, stress, or tension). Providing a rationale clarifies the personal usefulness of the action and maintains focus on the task, while the acknowledgement validates negative feelings (e.g., frustration) that can coexist with the activity, leading to persistence [12,49,58].

Application of SDT in the E-learning Domain
Several systems and studies show how to present and structure digital educational content in a way that improves students' self-determination. These systems often introduce features in e-learning environments to replicate best practices in the classroom. For instance, features include offering choice to students on which modules to follow [3,8,16,22,56], allowing control over the pacing and sequencing of instructional content [3,22,56], supporting multiple learning paths [3], explaining rationale [41,56], and using non-directive language [41,56]. Other features include offering virtual feedback [3,8], displaying roadmaps [56], visualizing progress [8,16], and virtual breakout rooms [3,8]. Overall, these studies show that introducing novel features to support SDT facilitating factors in e-learning environments can positively impact intrinsic motivation. In contrast, research shows that supporting SDT facilitating factors through gamification features in e-learning environments, such as avatars, points, badges, leaderboards, achievements, and praise, has less effect on intrinsic motivation, as such features could act as extrinsic motivators [4,11,17,36,37].
Prior research has largely focused on supporting SDT in e-learning systems in the context of formal education and the transfer of abstract knowledge and skills, such as learning languages [3,41], chemistry [16], math [37], writing essays [22], decision making [8], programming [17], or psychological theories [11,36]. This research instead investigates how SDT can be introduced to step-by-step instructions for learning crafting tasks. Intrinsically motivating trainees is especially important in crafting tasks, as these require persevering through important, detailed instructions that have implications in later steps.

Design of Multimedia Instructions
A large body of empirical research focuses on the design and presentation of multimedia instructions that combine multiple modalities (e.g., text, audio, pictures, videos, and animations). For instance, event segmentation theory [44,59] shows that instructions reflecting the visual perception of events by humans improve retention. Similarly, the cognitive theory of multimedia [20,38,39] offers principles to reduce the cognitive load of instructions by considering the limited capacity of human working memory. Studies also show that providing explicit explanations in multimedia educational content may be better for learning than having students self-generate the rationale based on their understanding of the learning material [10]; however, explanations are beneficial only for beginners and can be redundant for experts [33]. Digital instructions for supporting operators in industrial settings have been explored mainly for their effects on productivity and efficiency when embedding different types of media, such as images, videos [20], and 3D models [26,31], ignoring aspects of motivation and engagement.
There are some indications that multimedia could support goal clarity. Krijgsman et al. [34] complemented instruction videos on a tablet with additional information provided by a teacher who leveraged SDT principles in their communication style. While students could view the videos in a self-paced manner, the teacher offered additional feedback and information on the goal (goal clarification). The authors found no differences in need satisfaction compared to the control group. They therefore hypothesize that the novelty and fun of using tablets may have enabled students to "self-generate" goals and feedback from the videos. Following this reasoning, multimedia may possess inherent capabilities to represent goals and provide feedback. This is also supported by research on the effectiveness of multimedia instructions, which shows that videos and animations are generally superior to photos in communicating a continuous process [9,46,54]. On the other hand, images offer "uninterrupted, immediate access" [45] to information and are processed more actively, as participants are forced to make inferences rather than passively watch videos [2,9,29]. Therefore, researchers suggest adding "traces" to include persistent information when showing instruction videos to improve retention [1,45]. Finally, text is better at transmitting descriptive information, whereas pictures are better at depicting information, but the information should not be redundant [39]. Hence, a combination of videos, pictures, and text could draw attention to important details.

STUDY
To apply SDT to the design of step-by-step instructions, we focus on the facilitating factors goal clarity, acknowledgement, and rationale. We picked these three known facilitating SDT factors as a starting point for our research, as they have demonstrated their effectiveness for intrinsically motivating learners in other learning contexts [41,56]. Other known factors, such as supporting interaction with experts [3,8] and automated progress tracking [16], are mainly implemented by supporting additional features in learning environments. We consider this beyond the scope of our work, as we focus on the design of the instructions themselves.
To support the three aforementioned facilitating factors of goal clarity, acknowledgement, and rationale in step-by-step instructions for learning craft, we carefully considered the characteristics of different digital media:
• Goal Clarity: Videos are commonly used to design craft instructions. While they effectively depict an entire process or workflow (Section 2.3), they are fast-paced and often less effective at drawing attention to specific details. Annotated still images and text, on the other hand, can offer more detailed explanations and draw attention to important details. We use these characteristics of text and still images to clarify the goal of sub-tasks in our work.
• Acknowledgement: As demonstrated in past research [41], textual information can be effective at acknowledging participants' feelings of, for example, frustration. In our research, textual fragments are therefore added to acknowledge participants' feelings in different steps.
• Rationale: Textual information is also highly effective in communicating the relevance of a step. Clarifying the cause-and-effect chain as well as the potential personal benefits one can gain ensures that trainees are engaged and persist [58].
To study the effects of introducing the three facilitating factors to the design of digital instructions, we conducted a between-subjects experiment in which participants received step-by-step digital instructions to complete a crafting activity: creating finger joints on two sides of a wooden plank using a plunge router and carpentry jigs (Figure 1). This workpiece will be the tabletop for a small pedestal shown in Figure 1h.
We chose a woodworking activity as it engages the users in a physical activity with a tangible result and confronts apprentices with a workmanship of risk [47], meaning some errors are non-recoverable. This uncertainty makes the task challenging, as the quality of the end result depends on how well participants comprehend and execute the instructions, as well as on some decisions that participants make themselves. Even though we provide participants with physical and cognitive scaffolding through the use of jigs and digital instructions, quality and performance variations are inevitable. For example, the workpiece can be incorrectly aligned with the carpentry jig.

Experimental Conditions
Our between-subjects experiment consists of three conditions: a control condition (L0) and two conditions with increasing levels of facilitating factors (L1 and L2). All three conditions were identical and formulated using non-controlling language, apart from the adaptations to the multimedia instructions to implement our experimental manipulations. Table 1 summarizes the conditions and how the instructions were changed for each of them via text and multimedia. The multimedia instructions were created by filming a woodworking expert performing the activity using a GoPro camera from multiple perspectives, and then segmenting and editing the individual steps. Together with the expert, we also designed the instructions for each of the conditions using the principles covered in Section 3, as follows:
• Level 0 (L0): This condition was given to the control group. Participants were told that they would take part in a guided woodworking activity. The instructions consist of a combination of basic text fragments and video instructions, sufficient to complete the task.
• Level 1 (L1): In addition to the information in L0, this condition included our implementation of goal clarity. At the start of the study, participants were told in more detail that they would create the countertop of a wooden pedestal. In addition to the instruction videos used in L0, pairs of photos detailing the before and after stages were included for each step, and a goal clarity fragment was added to the text (Figure 3).
• Level 2 (L2): In addition to the information in L1, this condition included rationale and acknowledgement. At the start of the study, participants were told they would create the top of a wooden countertop to learn the basics of woodworking, a sustainable hobby they could pursue in the future. The same videos and before and after photos as in L1 were used for the instructions, but the textual instructions for each step also included rationale fragments that explained the cause and effect, for example, the value of a step within the workflow or why it is useful outside the scope of this study. In steps that we perceived as tedious or difficult, we also added an acknowledgement text fragment to recognize participants' possible negative feelings (e.g., "this step is somewhat tricky", or "we know this is a lot of information, but we ask you to persevere because it will be referred to in the upcoming steps"). Rationale and acknowledgement fragments are combined in one condition, as research shows that together they are more effective at intrinsically motivating participants [52].
As our focus is on introducing SDT to instruction design to support factors that positively influence engagement and intrinsic motivation, we take special care to minimize the risk of extrinsically motivating participants. Participants can pause/replay videos or go back to prior instructions as often as they like. Where appropriate, we offer choice to participants by showing multiple viable solutions. The communication with participants uses non-directive language. Participants do not receive any performance objectives, such as time limits or error counts. Finally, participation in the experiment is voluntary, and no reward is offered.
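The cumulative structure of the three conditions can be illustrated with a minimal sketch. It assumes each step stores its text fragments separately and that fragments are concatenated in a fixed order; the names `Condition`, `Step`, `compose_text`, and `media_for`, as well as the placeholder fragment texts, are hypothetical and do not reflect the actual implementation or wording used in the study.

```python
from dataclasses import dataclass
from enum import IntEnum


class Condition(IntEnum):
    L0 = 0  # control: basic text and video
    L1 = 1  # adds goal clarity (text fragment and before/after photos)
    L2 = 2  # adds rationale and, for some steps, acknowledgement


@dataclass(frozen=True)
class Step:
    basic: str                 # shown in every condition
    goal: str = ""             # goal clarity fragment (L1 and L2)
    rationale: str = ""        # rationale fragment (L2 only)
    acknowledgement: str = ""  # only present for tedious or difficult steps (L2 only)


def compose_text(step: Step, condition: Condition) -> str:
    """Build the instruction text for one step (fragment order is an assumption)."""
    fragments = [step.basic]
    if condition >= Condition.L1 and step.goal:
        fragments.append(step.goal)
    if condition >= Condition.L2:
        fragments.extend(f for f in (step.rationale, step.acknowledgement) if f)
    return " ".join(fragments)


def media_for(condition: Condition) -> list:
    """Media shown next to the text: the same videos everywhere, photos from L1 on."""
    media = ["video"]
    if condition >= Condition.L1:
        media.append("before_after_photos")
    return media


if __name__ == "__main__":
    # Placeholder texts, not the study's wording.
    step = Step(
        basic="Clamp the workpiece in the jig.",
        goal="<goal clarity fragment>",
        rationale="<rationale fragment>",
        acknowledgement="<acknowledgement fragment>",
    )
    for condition in Condition:
        print(condition.name, media_for(condition), compose_text(step, condition))
```

Keeping the fragments separate per step makes it straightforward to verify that L1 and L2 differ from L0 only by the added fragments and photos.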

Task Design
We guided participants through various steps to create the tabletop of a small wooden pedestal, shown in Figure 1h, using a plunge router. We instructed participants to use a carpentry finger joint jig, which facilitates using a plunge router as it defines the tool path to guide movement. This allows for using a plunge router with repeated accuracy without requiring significant woodworking skills. The assembled pedestal has two side pieces and a back piece. We pre-cut the wooden planks that participants would work on to the required dimensions to fit the jig precisely and pre-fabricated the two sides of the table and the back to keep the entire study time within a 1-hour time frame.
The task design for every condition is identical and consists of 5 tasks, each consisting of multiple steps, as shown in Figure 2. In the first task (6 steps), the finger joint jig is assembled from laser-cut parts and clamped onto the workbench. Here, it is important that the jig is assembled without any gaps to ensure the workpiece properly aligns with the jig. The second task (5 steps) introduces participants to the use of the plunge router, and participants are instructed to correctly position the clamps on the jig to prevent interference with the movement of the router. We also deliberately gave all participants two options in this step for configuring the router: milling the full depth of the finger joints at once or in three stages. Increasing the depth over three milling iterations results in a cleaner cut, reduces the force required, and prolongs the life of the milling bit. An incorrect depth setting will lead to a piece that does not fit and may require rework. In the third task (8 steps), the workpiece is clamped in the jig, and the router is used to mill the fingers on one edge of the plank. In the fourth task (7 steps), the workpiece is flipped both horizontally and vertically and clamped back into the same jig, and the fingers are milled on the other edge. In the fifth and final task, the workpiece is assembled together with [...]
The video recordings were edited and annotated to highlight and clarify complex steps. For conditions L1 and L2, we created photos to visualize the before and after stages for each step. We additionally introduced annotations and cues in some photos to draw participants' attention to important details (Figure 3), for example, locking the depth of the plunge router (Figure 3a) or ensuring that there is no gap between the workpiece and the jig (Figure 3b). For difficult steps, we also created a split view, slowed down the video, and added visual annotations to frames, for example, to show how the clamp is operated. In total, we made 21 instruction videos with lengths varying between 8 seconds and 37 seconds, depending on the complexity of the step. The same instruction videos were used in all three conditions.
As covered in Section 3.1, the textual instructions complementing the videos were adapted to the conditions. Table 2 offers example textual instructions for the same two steps in all three conditions.

Hypotheses
Research shows that providing clear goals can improve the feeling of competence [35].Competence can be further enhanced when the rationale for steps is clear [42].Therefore, H1: as the number of facilitating factors in instruction increases, the perceived self-competence increases.
Next, while clear goals and rationale tend to increase positive experiences, acknowledgements are comforting when experiencing negative feelings [6].Hence, H2: as the number of facilitating factors increases, the perceived sense of pressure decreases.
Providing rationales makes it more likely that instructions are considered meaningful, relevant, and resonate with personal values [58].Hence, H3: as the number of facilitating factors increases, the perceived sense of meaning increases.
Research shows that increasing feelings of self-competence are directly related to a sense of enjoyment, a core predictor of intrinsic motivation [49]. Therefore, H4: as the number of facilitating factors increases, intrinsic motivation increases.
The introduction of facilitating fragments in conditions L1 and L2 can impact the mental load of instructions. Hence, H5: as the number of facilitating factors increases, the perceived mental load increases.
As the introduction of facilitating fragments in conditions L1 and L2 can impact self-confidence (H1), it can also impact the number of times instructions are consulted. Therefore, H6: as the number of facilitating factors increases, the number of times instructions are consulted decreases.
Elements that affect the level of self-competence (H1) and intrinsic motivation (H4) can also impact the quality of workmanship. Therefore, H7: as the number of facilitating factors increases, the quality of workmanship (i.e., the end result) increases.

Study Apparatus
Our experimental setup consists of an interactive Wizard-of-Oz, projection-based augmented reality application implemented in PyQt6. Figure 4 shows the layout of the application, which consists of three main components: instructional text, accompanying media, and interactive controls. Figure 4 also shows the differences in application layouts between the three experimental conditions. The bottom part of the interface consists of controls for playing, pausing, or repeating the video and moving between steps. All interactions with the system are logged.
The application was projected onto the work table using a short-throw projector with sufficient brightness for an illuminated room (4000 lumens, 1920x1080 resolution) (Figure 5). The projected display was 90 cm wide and 55 cm high, ensuring sufficient clarity and visibility. We chose spatial projection of instructions because it blends with the work area and does not consume a dedicated space on the workbench. At the same time, the projected instructions are close to the participants' work area and in their field of view, which facilitates viewing instructions. We could have also used a head-mounted device (HMD); however, HMDs have been known to cause physical discomfort when worn for long durations [32], which may have introduced confounding factors. The current task and step number and a progress bar are continuously shown to indicate the progress throughout the activity. The controls were triggered by participants through touch but were operated by a wizard, who observed participants through a video feed from an overhead webcam. This video feed was also used to record each session (Figure 5). We specifically decided not to track tools or the workpiece, nor to project visual cues on top of the workpiece, such as where to position clamps. Research shows that such overlaid instructions can result in passivity and a loss of agency [21], and preserving agency is crucial for our study design. Our projected instructions did not consider the current state of tools or the workpiece. As such, participants still had to think and decide how to execute every step, such as where to position the clamps on the jig. Doing so requires cognitive engagement throughout the activity, rather than passively following instructions.
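The interaction logging mentioned above can be sketched independently of the user interface. The following is a minimal illustration, not the study's implementation: the class `InteractionLog` and the action names `"enter"`, `"replay"`, and `"finish"` are assumptions made for this example. It shows how a flat event log could yield both the number of times each step's instructions were consulted and the time spent per step.

```python
import time
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class InteractionLog:
    """Flat list of (timestamp, step, action) tuples, appended in chronological order."""
    events: list = field(default_factory=list)

    def record(self, step: int, action: str, timestamp: float = None) -> None:
        # Hypothetical action names: "enter" (a step is shown), "replay", "finish".
        ts = time.monotonic() if timestamp is None else timestamp
        self.events.append((ts, step, action))

    def consult_counts(self) -> Counter:
        """Count how often each step was opened or its video replayed."""
        return Counter(
            step for _, step, action in self.events if action in {"enter", "replay"}
        )

    def time_per_step(self) -> dict:
        """Sum the time between entering a step and the next 'enter' or 'finish'."""
        totals = {}
        current = None  # (step, entered_at) of the step currently on display
        for ts, step, action in self.events:
            if action not in {"enter", "finish"}:
                continue
            if current is not None:
                shown_step, entered_at = current
                totals[shown_step] = totals.get(shown_step, 0.0) + (ts - entered_at)
            current = (step, ts) if action == "enter" else None
        return totals


if __name__ == "__main__":
    log = InteractionLog()
    log.record(1, "enter", 0.0)
    log.record(1, "replay", 5.0)
    log.record(2, "enter", 20.0)
    log.record(1, "enter", 30.0)   # participant goes back to step 1
    log.record(1, "finish", 42.0)
    print(log.consult_counts())
    print(log.time_per_step())
```

In a Wizard-of-Oz setup, `record` would be called from the handlers of the controls operated by the wizard, so the log reflects the participant's touch actions as relayed by the wizard.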

Study Procedure
Before the start of the study, our task design, study apparatus, and the conditions were pilot-tested to ensure all instructions were sufficiently clear to perform the task. Based on feedback from these sessions, the contrast in the videos and pictures was increased to ensure readability across all conditions.
Thirty participants from a university campus took part in our between-subjects study. Participants were recruited via a combination of public mailing lists and convenience sampling. This resulted in 10 participants for each of the three conditions (24 males, 6 females; 11 participants between 18-24 years of age, 12 between 25-34, 6 between 35-44, and 1 between 45-54). Participants were randomly assigned to one of the three conditions, but we tried to ensure gender balance between the conditions (8 males and 2 females per condition). No participant had prior experience in using a handheld router. Twenty-four participants had a background in computer science/information technology, and 6 participants had non-technical backgrounds. Four participants reported having some exposure to woodworking, but not enough to be considered actual know-how. One participant, for example, mentioned seeing his father perform woodworking. Two others reported participating in some basic DIY woodworking before, but not using a handheld router. The fourth participant had used a CNC router a few times. These participants were distributed across the conditions (two in L1, one each in L0 and L2). No compensation or reward was provided for taking part in the experiment.
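The text above does not describe the exact assignment procedure. One common way to combine random assignment with balance on a single attribute is stratified randomization, sketched below under that assumption; the function `assign_balanced` and the participant identifiers are hypothetical.

```python
import random
from collections import Counter


def assign_balanced(participants, conditions=("L0", "L1", "L2"), seed=None):
    """Stratified random assignment.

    participants: iterable of (participant_id, stratum) pairs, e.g. ("P07", "female").
    Within each stratum, participants are shuffled and then dealt to the
    conditions in turn, so each condition receives an equal share of every
    stratum whenever the stratum size is a multiple of the number of conditions.
    """
    rng = random.Random(seed)
    strata = {}
    for pid, stratum in participants:
        strata.setdefault(stratum, []).append(pid)

    assignment = {}
    for stratum in sorted(strata):
        ids = strata[stratum]
        rng.shuffle(ids)
        for i, pid in enumerate(ids):
            assignment[pid] = conditions[i % len(conditions)]
    return assignment


if __name__ == "__main__":
    # Hypothetical identifiers matching the reported sample: 24 males, 6 females.
    sample = [(f"M{i:02d}", "male") for i in range(24)]
    sample += [(f"F{i:02d}", "female") for i in range(6)]
    result = assign_balanced(sample, seed=42)
    print(Counter(result.values()))
```

With 24 males and 6 females, this procedure yields 8 males and 2 females in each of the three conditions, which matches the reported distribution.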
Participants were briefed that the aim of the experiment was to understand how users interacted with digital instructions while carrying out crafting tasks. Before beginning the experiment, they were given a general safety briefing on wearing eye and ear protection during the experiment and some general precautions to take when using the plunge router. Participants were told that the observer would only intervene in the event of a safety issue and would not answer any questions they may have about the activity. Additionally, the digital instructions for all groups included safety precautions to be taken when carrying out the activity. To avoid additional pressure on participants to perform, we refrained from asking them to execute the activity as quickly or with as few mistakes as possible.
Participants were told that the projected application works like a touchscreen, as their hands were tracked via the webcam. Following informed consent (as approved by the ethics committee of the university) and after the participants had the safety equipment on, we requested them to start whenever they felt ready. Participants could repeat or revisit the instructions as often as they liked. The last page of the application instructed them to signal when they were finished with all the tasks. The participants then filled in a digital questionnaire, and the study ended with a semi-structured interview. The interview started with a question about participants' general experience ("Tell me about your experience performing the activity"), followed by how the instructions helped them ("What did you think about the instructions?") and how they experienced decisions that had to be made in some of the tasks ("Tell me why you went with option X/Y"). Finally, the observer provided participants with feedback on their performance. The entire session lasted, on average, 80 minutes.

Measurements
The online questionnaire consisted of five instruments. Perceived self-competence and tension were measured using the standard subscales from the Intrinsic Motivation Inventory (6 and 5 items, respectively, Likert scale, 1-7) [40]. The perceived sense of meaning was measured using a standardized scale from Huta et al. [28] (12 items, Likert scale, 1-7). Mental load was measured using the NASA TLX scale [24]. Participants were asked to fill out these questionnaires while considering their experience of the task they performed in the study. Next, we assessed participants' intrinsic motivation by asking them to rate their willingness to repeat this activity in the near future, using the Situational Motivation Scale (4 items, Likert scale, 1-7) [23]. Our projected interface also logged the time spent in each step. All scales are standardized and designed to be applied to a broad range of activities; hence, no modification was necessary.
The quality of the result was measured by two expert examiners. The first expert combines multiple years of practical experience in managing a large maker space lab with a strong interest in woodworking. He teaches product design students and makers a variety of courses, ranging from using handheld tools to CNC machines. He was also involved in designing the activity (subsection 3). The second expert is a certified furniture maker and a woodworking enthusiast but does not pursue woodworking as a profession. The experts together devised a scale from 1 to 7 to evaluate the quality. Five of these points assessed basic quality: a 1 out of 5 was assigned for a non-functional cut requiring substantial rework, 2 if significant areas were left uncut, 3 if cuts were clearly imperfect but still functional, 4 if there were slight imperfections, and 5 if the cut was finished exactly as intended. An additional point was awarded if the cuts were milled to the intended depth, and another point was awarded when the cuts were properly aligned on the x, y, and z axes. After the experiment, the experts evaluated the workpieces independently. The scores of the two examiners were averaged to obtain a final score.
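The scoring rubric above can be written down as a short sketch. The function names and the example ratings are hypothetical and only illustrate how the 1-7 score is composed and averaged; they are not taken from the study materials.

```python
from statistics import mean


def expert_score(basic: int, depth_ok: bool, aligned: bool) -> int:
    """Score one workpiece on the 1-7 scale described in the text."""
    if not 1 <= basic <= 5:
        raise ValueError("basic quality must be between 1 and 5")
    # One extra point for the intended depth, one for alignment on the x, y, and z axes.
    return basic + int(depth_ok) + int(aligned)


def final_score(expert_a: int, expert_b: int) -> float:
    """Average the two independent expert scores for one workpiece."""
    return mean([expert_a, expert_b])


# Hypothetical ratings for a single workpiece (not data from the study).
score_a = expert_score(basic=4, depth_ok=True, aligned=False)
score_b = expert_score(basic=4, depth_ok=True, aligned=True)
print(final_score(score_a, score_b))
```

Keeping the basic score and the two bonus points separate mirrors how the results later report a basic task completion score and a total score.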

Data Analysis
Table 3 shows the descriptive statistics for the questionnaire data relevant to H1-5. All instruments met the reliability criterion (Cronbach's α > 0.7). All variables met the normality assumption (Shapiro-Wilk test, p > 0.05), and all but one variable met the homoscedasticity assumption (Levene's test, p > 0.05).
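As a reminder of what the reliability criterion checks, the following minimal sketch computes Cronbach's alpha from a participants-by-items matrix using the standard formula. The ratings are hypothetical and the function name is an assumption; the study's raw item data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """responses[p][i] is participant p's rating on item i."""
    k = len(responses[0])                      # number of items in the scale
    items = list(zip(*responses))              # one tuple of ratings per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical 1-7 Likert ratings: 5 participants x 4 items (not study data).
ratings = [
    [6, 5, 6, 7],
    [3, 3, 4, 3],
    [5, 5, 5, 6],
    [2, 3, 2, 2],
    [6, 6, 7, 6],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f}, meets criterion: {alpha > 0.7}")
```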
The scores representing the quality of the end result given by both experts had a high inter-rater agreement (Cohen's weighted kappa = 0.923). Table 4 shows the descriptive statistics for our metrics on task completion time, video and switching events, and quality of the end result, relevant to H6 and H7. Video events count playing, pausing, and repeating the video, and switching events count switches between steps. Both of these violate the normality and homoscedasticity assumptions. Although task completion time was not our main focus, as it was not directly relevant to our hypotheses, L0 took the longest (M=3065.6s, SD=512.049), followed by L2 (M=2845.3s, SD=482.081) and L1 (M=2470.5s, SD=673.139). These differences, however, were not statistically significant (ANOVA F(2, 27) = 2.866, p = 0.074). Figure 6 shows the time taken in each condition to complete each of the five tasks; the values can be consulted in Table 5 in the appendix.
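The inter-rater agreement reported above is a weighted kappa, which penalizes large disagreements between the two experts more than small ones. The text does not state which weighting scheme was used, so the sketch below assumes linear weights by default; the function name and the example ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters on an ordinal scale."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2    # any other value means quadratic weights
    disagreement = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power
            disagreement += w * observed[i][j]
            expected += w * row[i] * col[j]    # disagreement expected by chance
    return 1 - disagreement / expected


# Hypothetical 1-7 quality scores from two experts for ten workpieces (not study data).
scale = [1, 2, 3, 4, 5, 6, 7]
expert_1 = [3, 5, 6, 4, 7, 2, 5, 6, 4, 5]
expert_2 = [3, 5, 7, 4, 7, 2, 4, 6, 4, 5]
print(round(weighted_kappa(expert_1, expert_2, scale), 3))
```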

Perceived Pressure (H2)
H2 is based on the averaged pressure/tension scale from the Intrinsic Motivation Inventory. It states that the perceived level of pressure or tension when performing the activity decreases as the number of facilitating factors increases. Neither a one-way ANOVA (F(2,27) = 1.751, p = 0.193, η² = 0.115) nor a linear contrast (t(27) = -1.370, p = 0.182) is significant. As Figure 7b shows, L0 reported the highest levels of stress (M=4.08, SD=1.337). L1 (M=3.1, SD=0.909) and L2 (M=3.36, SD=1.236) were similar in their ratings and lower than L0.
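The effect size reported with each one-way ANOVA in this section (0.115 for H2) is consistent with eta squared, which can be recovered from the F statistic and its degrees of freedom. The sketch below is only a consistency check on the reported numbers; the helper name is an assumption.

```python
def eta_squared(f_value: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA, derived from F and its degrees of freedom."""
    return f_value * df_between / (f_value * df_between + df_within)


# F statistics and effect sizes as reported for H2, H4, and H7, all with F(2, 27).
reported = {"H2": (1.751, 0.115), "H4": (3.202, 0.192), "H7": (2.240, 0.142)}
for name, (f_value, reported_eta2) in reported.items():
    recovered = eta_squared(f_value, 2, 27)
    print(f"{name}: recovered {recovered:.3f}, reported {reported_eta2}")
```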

Perceived Meaning (H3)
H3 stated that participants who received additional rationale and acknowledgement in the instructions would perceive the activity as more meaningful, measured by the averaged meaning scale from [28]. Figure 8a shows that the means are rather similar, and a Kruskal-Wallis test confirms this (H(2) = 0.113, p = 0.945, η² = 0.069). Regardless of the level of instructions, participants found the activity almost equally meaningful (L0: M=3.682, SD=1.491; L1: M=3.927, SD=1.410; L2: M=3.8, SD=0.385). L2 has a much lower standard deviation than the other two conditions, as indicated by the significance of Levene's test.

Intrinsic Motivation (H4)
H4 postulated that participants' intrinsic motivation to practice the activity in the future will increase with the levels of instruction.
The scores for intrinsic motivation from the Situational Motivation Scale were averaged and analyzed. Results of a one-way ANOVA are just short of significance (F(2,27) = 3.202, p = 0.057, η² = 0.192).

Quality of End Result (H7)
H7 states that the quality of workmanship will increase as the instruction level increases. Figure 10c shows the results of the experts' assessment of the quality of workmanship. The basic task completion scores of L0 (M=3.0, SD=1.453) and L1 (M=3.3, SD=1.252) are similar, with L2 achieving a higher score (M=4.2, SD=0.753). For the total score (including the additional two points on workmanship), L0 received the lowest mean score (M=3.650, SD=1.617, with one outlier), followed by L1 (M=4.6, SD=1.647), and L2 had a slightly higher quality score (M=4.9, SD=0.626). A one-way ANOVA reveals that the differences in mean scores between conditions are not statistically significant (F(2, 27) = 2.240, p = 0.126, η² = 0.142); however, a linear contrast reaches significance (t(27) = 2.027, p = 0.05), supporting the linear relationship between instruction level and workmanship quality stated in H7.
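A linear contrast tests whether the group means rise (or fall) steadily from L0 to L2. The sketch below recomputes the contrast from the reported means and standard deviations, assuming contrast weights of (-1, 0, 1), equal group sizes of 10, and a pooled error variance; these are assumptions about the analysis, not details stated in the text. With the rounded summary statistics it lands close to the reported t values for H7 and H2.

```python
from math import sqrt


def linear_contrast_t(means, sds, n, weights=(-1, 0, 1)):
    """t statistic and degrees of freedom for a contrast over equally sized groups."""
    # Pooled within-group variance (mean square error) for equal group sizes.
    mse = sum(sd ** 2 for sd in sds) / len(sds)
    estimate = sum(w * m for w, m in zip(weights, means))
    std_error = sqrt(mse * sum(w ** 2 for w in weights) / n)
    df = len(means) * (n - 1)
    return estimate / std_error, df


# Summary statistics as reported for the total quality score (H7) and pressure (H2).
t_h7, df_h7 = linear_contrast_t([3.650, 4.6, 4.9], [1.617, 1.647, 0.626], n=10)
t_h2, df_h2 = linear_contrast_t([4.08, 3.1, 3.36], [1.337, 0.909, 1.236], n=10)
print(f"H7: t({df_h7}) = {t_h7:.3f}")
print(f"H2: t({df_h2}) = {t_h2:.3f}")
```

Because the middle weight is zero, this contrast only compares L0 with L2, which is why it can reach significance even when the omnibus ANOVA across all three groups does not.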

Qualitative Feedback
We also considered and further analyzed participants' responses from the semi-structured interview. Below, we discuss their subjective experience with the presented instructions, as well as how they felt about decisions they had to make as part of some tasks.
4.9.1 Comments related to the instructions. In condition L0, although the instructions were basic, participants found them clear. As P1 (L0) mentioned: "they were clear, I guess they were short and yeah, effective". P4 (L0) found them "good and clear", similar to P7: "it was very clear to me how to perform the activity...in terms of the instructions themselves and such, I found it quite easy to follow". P13 and P23 (L0) similarly stated "the instructions were clear and I knew what I had to do" and "the instructions were quite clear, so that was a relief for me". P16 (L0) felt positive: "I think I learned a lot [...] the AR tool did help in explaining the things that I had to do". However, some participants felt uncertain about the steps they were instructed to perform. As P21 (L0) mentioned: "I want to do things right. So I was, like, watching steps twice or thrice to make sure I was doing it correctly". One participant, P27 (L0), was frustrated after making a mistake: "for me, it would have been easier if it had been pointed out to check carefully [...] and so in the end I started to, to view 2 or 3 times each video just to make sure I didn't lose any detail". Participants P21, P10, and P7 suggested that the instructions could be improved by immediately communicating the goal in each step, unknowingly referring to the goal clarity that was introduced in the other conditions.
In condition L1, participants' comments were mostly positive about the level of detail of the instructions, and participants reported fewer difficulties with following the steps compared to L0. As P5 (L1) stated, "I sometimes got a little bit unsure about am I doing what is expected, and can I do something wrong? [...] But overall, I felt that I was supported enough to be able to achieve it". P20 (L1) commented that he also found the instructions helpful: "I think they were quite clear, so, if I was able to do it, I guess somebody else will also be able to do it". P22 (L1) especially liked that the before and after photos allowed him to figure out what to do himself: "for me, there was plenty of information, uh, only using the pictures. I also like to figure out a little bit myself".
Similar to L1, participants also felt positive about the instructions they received in condition L2. As P3 (L2) mentioned, "It was very interesting. The first time I used the router, um, it was very helpful with the videos. Also, the text alone gave a first idea". P12 (L2) noted, "It was well explained, and the system worked pretty good. First time working with such a router for me [...] I would do it again". P26 (L2) mentioned "it was actually quite fun. And in the end it didn't seem that difficult". Similarly, P6 (L2) stated: "it was quite enjoyable and relaxing. I had quite a lot of fun, actually". Overall, we observed that participants in condition L2 were more vocal about how they enjoyed the activity compared to conditions L0 or L1.
4.9.2 Comments related to decisions in steps. In Task 2, in all three conditions, we deliberately gave participants the option of milling the finger joint at the full depth at once or in three steps with incremental depths. As detailed in Table 2, we only added a rationale for this decision in condition L2. In both L0 and L1, 3 participants used the maximum depth and 7 used incremental depths; in L2, 2 participants used the maximum depth and 8 used incremental depths. During the interview, we asked participants across all conditions why they picked one option or the other.
In conditions L0 and L1, participants gave diverse reasons. P14 (L1) transferred knowledge from a different domain: "I worked on CNCs before...And if you cut too much at a time, it can result in a more difficult cut or in a less good finish or something. So I just thought, let's do it in steps". Other participants trusted that this more advanced procedure would be proposed for a good reason. For instance, P24 (L0) argued: "I didn't know what the intention was. I thought, okay, it's optional, but let's just follow it, because there might be a reason why people might use it". Some participants, such as P27 (L0), experienced anxiety because of a lack of information: "I didn't know why I had to do it in three steps or why I could do it in maximum. And that makes you then as somebody completely ignorant with woodwork to doubt and to hesitate. Shall I continue? Should I stop? And you know, you're not secure anymore or confident in what you're doing". These variations in rationale are in line with existing research showing that when individuals are uncertain and the environment does not explicitly provide information to support motivation, they fall back on their individual motivational orientations toward the environment [49]. Lastly, a few participants in L0 and L1 used the maximum depth, and most did so because it seemed to them the easier option. As P16 (L0) stated, "it seemed easier to me to just do the maximum, so I went for the easy option". Similarly, P22 (L1) thought "it was easy to just use maximum immediately". One participant, P25 (L1), could not distinguish between the depth stops on the router and went with the default option: "I couldn't see the screw that was pointing up, so I just went with what I got already".
In L2, participants expressed more confidence when asked why they picked the multi-depth option. As P12 (L2) mentioned: "I wanted to do it step by step. Yeah. Took it a little bit slower and seems cleaner". Likewise, P23 (L2) mentioned: "Yeah, because the explanation told me that it would give a better result". P26 milled in incremental levels of depth on one edge of the countertop but tried milling at full depth on the other edge afterwards: "I wanted to start with three, but then I was like, this takes a bit too long, so now I'm just going to try what it gives with one depth. So just to see the difference". However, P4 (L2) went for the maximum option "because the video said it was optional, and I thought it was easier", and P9 (L2) wanted to save time and "not put too much time to figure out the instructions".

The Effectiveness of Applying SDT's Recommendations to Digital Instructions
The results of our study show that applying SDT's suggestions to the design of instructions can increase users' perceived competence (H1) and intrinsic motivation to engage in the activity (H4). As reported in Section 4.9, some participants who performed the study without these facilitating factors (L0) expressed feelings of doubt. Quantitative results corroborate these findings, showing that participants in L0 navigated more between steps to consider previous instructions again (H6). Our observation that introducing goal clarity, rationale, and acknowledgement in digital instructions improves perceived competence also aligns with prior research investigating these principles in in-person learning [30]. Hence, our study demonstrates the applicability of SDT to designing multimedia instructions for the craft domain, and replicates results obtained previously in formal education [41,56].
Although we could not accept H2, the level of pressure (or anxiety) felt by participants in condition L0 (M=4.08) still stands out compared to both L1 and L2 (M=3.1 and 3.36, respectively). More in-depth studies are needed to reach conclusive answers. We also could not conclude that facilitating factors increased participants' sense of perceived meaning for the activity (H3). However, in condition L2, in which we communicated the meaning of the task, the standard deviation was significantly lower compared to L0 and L1. We postulate that clearly communicating the meaning of a task convinces users of the value of the task. In conditions L0 and L1, on the other hand, we did not explicitly do this, and participants seem to have filled in those details themselves. Huta [27] suggests that perceived meaning in an activity depends on whether the activity resonates with one's values and aspirations. We believe this is true for our crafting activity, as many participants could find meaning in the task without us explicitly mentioning the value.
Although we expected that the additional content in the instructions in L1 and L2 would increase the mental load, we did not find any significant differences between the conditions with respect to this metric (H5). This shows that adding information to convey goal clarity, rationale, and acknowledgements to digital instructions does not necessarily have negative effects on perceived mental load, and that such information should therefore not be left out merely to lower the mental load. In contrast, we did find significant differences between conditions in how participants used the instructions (H6). Without these principles in the instructions (L0), participants consulted the instructions of all steps twice on average, whereas participants in conditions L1 and L2 consulted them only once.
Our study shows that introducing SDT's recommendations to instructions has a linear relationship with workmanship quality (H7). Both conditions L1 and L2 saw an increase in workmanship quality compared to L0. This could be attributed to the additional details included in the instructions, some of which highlight aspects that are important for achieving a high-quality result. We also observed that the workmanship quality in condition L2 was more consistent than in L1 (Figure 10c). We believe this is because participants in condition L2 spent more time on task 2 (Figure 6). This task is instrumental for the quality, as it focuses on the correct setup of the router and workpiece. Condition L2 explicitly acknowledged the challenges in this task and instructed participants to persevere and devote enough time to absorb the information.

Guidelines on Applying SDT to Designing Instructions
First, using video recordings of a workflow as the main resource for instructions results in uncertainty, as the many participant comments in Section 4.9 show. Introducing goal clarity by, for example, complementing the video with before and after photos for every step (L1) increases perceived competence (H1), intrinsic motivation (H4), and task quality (H7). At the same time, it reduces the number of times participants need to consult the instructions (H6). Second, our results show that introducing rationale and acknowledgement in digital instructions leads to further improvements. The qualitative results (Section 4.9) show that if users understand the rationale for a decision, they are more confident about the option they picked. When facing a difficult task, an acknowledgement of the difficulty and additional rationale highlighting the importance of the step help participants persevere. While introducing these principles in condition L2 increased the completion time, some participants described the experience as relaxing, while others mentioned deliberately taking their time with the steps. The results, however, show that this additional information and time result in improved quality (H7), higher competence (H1), and intrinsic motivation (H4).
Third, one would expect that introducing goal clarity, rationale, and acknowledgements in instructions increases mental load, as they increase the amount of textual and visual information, so processing these instructions also requires more time. We could, however, not accept H5, as the mental load did not significantly differ across conditions. One reason for this could be that different kinds of mental effort were required in the different conditions [15,39]. In condition L0, extraneous mental load is dominant, as the mental load results mainly from participants having to self-generate concrete goals because of the minimalistic instructions. In condition L2, however, intrinsic and germane mental loads are more present, as the instructions are highly detailed and the challenge results from the inherent load of the instructions and the execution of the task itself. So even though the overall mental load across conditions is similar, it is always the goal in educational design to minimize extraneous mental load. Hence, one could postulate that introducing SDT-derived techniques to digital instructions is beneficial for mental load, as it can reduce extraneous mental load when applied properly.
Finally, prior research shows that when a task is uninteresting or the goal is unclear, providing a rationale can make it more meaningful [12,58]. In our study, however, introducing a rationale did not significantly affect the perceived sense of meaning (H3). As argued in Section 5.1, when the task itself has a clear end result that users can grasp and reason about, a rationale may not always add meaning for participants.

Limitations and Future Work
While our work offers concrete results and practical guidelines on how to effectively improve the engagement and motivation of operators, as well as task quality, by applying SDT's recommendations to multimedia instructions, we identified four limitations that offer opportunities for future research. First, thirty subjects participated in our between-subjects experiment. Given the observed effect sizes in our study, a larger sample size may have helped us to increase the power of the tests and possibly confirm additional hypotheses that are just short of significance because of a single outlier, such as H2 and H7 (Figures 7b and 10c). Since this is the first study of its kind, the effect sizes obtained could serve as reference values for pre-study power analyses to design better studies in the future. Using the average observed effect size of f = 0.4 in our study and 70% power, future studies should consider at least 54 participants. Yet, in crafting domains, large sample sizes may not be feasible, as raw materials may not be reusable, resulting in increased material costs and wastage.
Second, some indicators of psychological well-being, such as perceived sense of meaning, are hard to assess or measure within the context of a study. Our study also did not show any significant difference across conditions in how participants personally valued the task they performed. Either there are no differences between the conditions, or the context of the study did not allow us to correctly measure the value of these tasks within the participants' lives. Future studies could be embedded in real-life situations to mitigate such difficulties.
Third, we followed guidelines in existing research for embedding multimedia content in instructions [38]. As such, we did not exhaustively study the impact of introducing goal clarity, rationale, and acknowledgement for all types of media. Future research could investigate this further, for example by applying these suggestions to instructions offered via particular modalities or other interactive technologies, such as novel virtual or mixed-reality systems.
Finally, our research focuses on applying SDT to the design of instructions for craft. Our results show that the introduced facilitating factors are effective in increasing the fulfillment of the psychological need for competence. Future research can further investigate facilitating factors to support the two other basic psychological needs, autonomy and relatedness, for learning crafts. This can include designing new features, such as providing multiple activities for learning and practicing specific crafting skills, or features to facilitate communication with other trainees and experts in the learning environment.
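The first limitation above expresses effect sizes as Cohen's f, whereas the results report eta squared. A minimal sketch of the standard conversion, applied to the three ANOVA effect sizes reported for H2, H4, and H7, is shown below; the helper name is an assumption, and the sketch does not attempt the sample-size calculation itself, which requires a noncentral F distribution.

```python
from math import sqrt


def cohens_f(eta_squared: float) -> float:
    """Convert eta squared to Cohen's f."""
    return sqrt(eta_squared / (1 - eta_squared))


# Eta squared values as reported for the one-way ANOVAs of H2, H4, and H7.
for name, eta2 in {"H2": 0.115, "H4": 0.192, "H7": 0.142}.items():
    print(f"{name}: eta squared = {eta2}, f = {cohens_f(eta2):.3f}")
```

All three values fall in the range that Cohen's conventions label medium to large (f between 0.25 and 0.5), in line with the average of f = 0.4 quoted in the text.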

CONCLUSION
Learning new activities is not limited to correctly following instructions. Research shows that when people feel competent and self-determined, they are willing to put in more effort and are intrinsically motivated to practice activities. From the HCI perspective, past work on designing interactive multimedia instructions for manual tasks has mostly focused on comparing how various types of visualizations or multimedia affect task time and errors. While SDT has been used in the design of e-learning applications, it has not been applied to the design of multimedia instructions for crafting tasks to explore whether the instructions could be designed to improve participants' perceptions of self-competence, meaningfulness of the task, and intrinsic motivation, and to reduce stress.
To fill this research gap, we investigated how SDT could be harnessed to modify the content of digital multimedia instructions in a crafting task. We successively applied the SDT principles of goal clarity, rationale, and acknowledgement and evaluated them in a between-subjects experiment. Results indicate that applying these principles increased participants' perceived competence, intrinsic motivation, and task quality, and decreased the need to refer to instructional content, without any significant changes in task time or mental load. The results of this study have implications for designing multimedia instructions to train operators in various domains of crafting and manual work. Future investigations can consider how other kinds of content visualizations (such as 3D models) or system-level technological capabilities (e.g., closed-loop systems) could be developed in line with recommendations from SDT.

Figure 1: Overview of the activity and final assembly resulting in a small wooden table.

Figure 2: Step-by-step sequence of the activity. There are five tasks and 27 steps in total.

Figure 3: Start and end pictures used to clarify goals in each step. The annotations and cues provide additional support to visually inspect the quality of completion. The left pair (a), from task 2, shows the change in state when the router depth is set; the right pair (b) belongs to task 3, where the workpiece is clamped.

Figure 4: Prototype of the user interface showing how instructional content was shown

Figure 6: Execution time for each of the five tasks in each condition. Values can be consulted in Table 5 in the appendix.

Figure 7: Boxplots with the score distribution per instruction level for (a) perceived competence and (b) perceived pressure.

Figure 8: Boxplots with the score distribution per instruction level for (a) perceived meaning and (b) intrinsic motivation.

Figure 9: Mental load with different levels of instruction.

Figure 10: Boxplots with the score distribution per instruction level for (a) video events, (b) switching events, and (c) workmanship quality.

Table 1: Successive enhancement of instructions from L0 to L2 (surviving cells: "L1 + acknowledgement and rationale fragment"; "Same as L1").
[...] the pre-fabricated two side pieces. Every session thus consists of 27 steps in total.

Table 2: Text excerpts from two different instructions.
This allows to start cutting with a shallow depth and moving on to intermediate and maximum depths in a step-wise manner.
L2 (acknowledgement and rationale fragment): This may sound counterintuitive, but using the maximum depth right away will not give you smooth and clean cuts. Instead, if you divide the cutting process into stages, where you start with a shallow cut, and deepen it in steps, you will prevent tearouts and prolong the life of the bit.

Table 3: Descriptive Statistics and Test Results for Normality and Homoscedasticity

Table 4: Descriptive Statistics and Test Results for Normality and Homoscedasticity