Understanding the Effect of Reflective Iteration on Individuals’ Physical Activity Planning

Many people do not get enough physical activity. Establishing routines to incorporate physical activity into people’s daily lives is known to be effective, but many people struggle to establish and maintain routines when facing disruptions. In this paper, we build on prior self-experimentation work to assist people in establishing or improving physical activity routines using a framework we call “reflective iteration.” This framework encourages individuals to articulate, reflect upon, and iterate on high-level “strategies” that inform their day-to-day physical activity plans. We designed and deployed a mobile application, Planneregy, that implements this framework. Sixteen U.S. college students used the Planneregy app for 42 days to reflectively iterate on their weekly physical exercise routines. Based on an analysis of usage data and interviews, we found that the reflective iteration approach has the potential to help people find and maintain effective physical activity routines, even in the face of life changes and temporary disruptions.


INTRODUCTION
Maintaining regular physical activity helps improve people's quality of life [8]. It is recommended that adults engage in at least 150 minutes of physical activity per week [78]. Moreover, physical activity must be performed on a regular basis to achieve a desirable outcome [88]. In the United States, however, a low proportion of people meet the recommended physical activity level across all age, gender, and race groups [52].
Though prior studies found that incorporating physical activity into an individual's routine helped them maintain it [97], people often lack knowledge of their routines [31]. As planning has been widely studied as a way to improve individuals' physical activity levels [3,66,90,107], the "when, where, and how" of forming physical activity plans is well understood [40]. Prior studies have also determined several aspects of individual knowledge that facilitate physical activities, including exercise history [94], health knowledge [88], and preferred activity level [79]. Nevertheless, in situations where people lack that knowledge [99], their physical exercise plans may be less feasible. Additionally, it is already difficult for individuals to explore long-term patterns in their health data [15,41,43,62]. When facing a change in their lives [81-83], individuals' previous behavior change plans may no longer be valid or practical [95,107]. Deviation from established routines is also very common [30,31,76], which interrupts activities planned within them. Whether individuals are able to maintain their physical exercise plans in the long term remains an open question [15,62].
Self-experimentation has been discussed as a way to assist individuals in filling those knowledge gaps [20, 26-29, 44, 49, 51, 57, 85], though recent research argues for better scaffolding of ordinary people's practice in this regard. Self-experimentation is defined as an iterative process for seeking individual knowledge by generating hypotheses, conducting experiments to test hypotheses, and reflecting on the results [51]. For individuals who are new to certain contexts or activities and have no past experience to reflect on, self-experimentation serves as a way to generate self-meaningful knowledge [57]. Research found that Quantified-Selfers' initial self-experimentation practices [20] may not follow a standardized self-experimentation framework (e.g., starting tracking without any goals in mind, tracking multiple aspects at the beginning, comparing data without explicit control conditions) [51]. Additionally, individuals found it difficult to set up self-experimentation [26,59] in complicated scenarios (e.g., sleep quality [29], chronic health conditions [9]). Thus, prior works call for better support for ordinary people to initiate self-experiments [19,27,29,57].
Reflecting on previous experiences, as demonstrated in prior works [4,39,58,59,101,105], helps individuals obtain self-knowledge (such as behavioral patterns and trends [23,60,105]) to form better behavior change plans. Similarly, when engaging in self-experiments, individuals iteratively reflect on their records until the data collected is sufficient to confirm or deny their hypotheses [49,51,59]. Nonetheless, recent research argues that individuals' behavior change should be a continuous process that extends beyond a single experiment [26]. Individuals also express interest in restarting self-experiments [27] to further explore other aspects of their behaviors even after the current experiment has ended [29]. Self-reflection, defined as a process of reviewing collected data to draw connections between ideas and behaviors [36,101], supports individuals in carrying the self-knowledge they gained into the next iteration [1,35,58,59,101,102]. In this study, we are interested in exploring how individuals can leverage their prior experiences to initiate the next iteration of their behavior change plans.
Building on previous works, we propose a framework we call "reflective iteration," which we implemented in a mobile app, Planneregy. The reflective iteration framework refers to the practice where individuals iteratively specify behavior change plans and reflect on their practice to identify preferable ways of carrying out desirable behaviors. In this study, participants used the Planneregy app to plan physical activities every seven days and iteratively reflect on their records to figure out physical exercise routines that worked for them. With this tool, we address the following questions: (1) How does the reflective iteration framework impact the way individuals gain self-knowledge about physical exercise routines? (2) How does the reflective iteration framework impact the way individuals iterate on their physical activity plans? (3) How should the reflective iteration framework be designed to support individuals' reflection and iteration on physical activity planning?
To answer these questions, we deployed Planneregy with 16 participants and carried out a 42-day user study. Findings from interviews show that 1) bundling physical activity plans into "strategies" can assist individuals in unpacking and testing multiple aspects of their physical activity plans, 2) iterating physical activity strategies allows participants to update their self-knowledge when facing life changes and temporary disruptions, and 3) providing high-level evaluation and reflection methods helps facilitate and guide iteration of participants' physical activity plans. Based on these findings, we offer further insights into designing future personal informatics systems.
Our paper contributes to the literature in three ways: (1) We provide the Planneregy app as a behavior change tool for supporting people in gaining self-knowledge that allows them to establish and improve their physical exercise routines; (2) We demonstrate the utility of applying the "reflective iteration" framework in helping users iterate physical activity strategies with flexibility and supportive information (e.g., satisfaction score, keyword descriptions of a strategy); (3) We offer several implications for designing personal informatics systems, including supporting flexibility in creating behavior change plans by unpacking complexities in life scenarios, facilitating iteration to coordinate with life changes and temporary disruptions, and leveraging reflection practice for better iteration.

RELATED WORK
Self-experimentation has the potential to help people obtain individual knowledge that informs their planning for physical exercise.
Recent studies have explored designing self-experimentation for personal use by leveraging the power of ubiquitous computing systems. In this section, we first review the relevant literature regarding designing self-experimentation experiences. We look into the iterative nature of performing self-experimentation and point out the close connection between iteration and reflection in a continuous self-experimentation practice. We further discuss the knowledge gaps and issues within existing self-experimentation designs, which provide insights and inform the design of the reflective iteration framework.

Evolving Self-Experimentation Methods
Self-experimentation refers to a personal informatics method for individuals to perform experiments on themselves through trial and error. It originates from single-case designs (SCDs), or n-of-1 trials [51,63], which have long been used in the medical field [7]. Compared to randomized controlled trials (RCTs) [22], self-experimentation was found useful in helping individuals gain self-meaningful knowledge [19,20], as it allowed individuals to test variables that only matter to themselves [50]. To apply the self-experimentation methodology to assist individuals' practice, Karkar et al. specified a self-experimentation framework that consists of several steps [51]: specifying variables to test, conducting self-experiments, and reviewing the results to determine the intervention for behavioral change.
With the development of ubiquitous computing technology, researchers have investigated the use of mobile devices [11, 27-29, 50, 89, 100, 103] to facilitate individuals' self-experimentation practices, suggesting scaffolding it by balancing user burden and scientific rigor [27,28,50]. Taylor et al. developed QuantifyMe [89,100] for novice individuals to conduct self-experiments and evaluated it through a 6-week study. TummyTrials [50] aims to assist people with irritable bowel syndrome in conducting self-experimentation with scientific rigor. SleepCoacher [28] collects individuals' sleep data and sends personalized recommendations for users to test; the system iterates on its recommendations based on users' feedback. Aimed at helping people experiment with ways to improve sleep quality, SleepBandits [29] allows users to generate their own hypotheses to test. In the Self-E study [27], researchers developed a general self-experimentation tool for individuals to investigate multiple aspects of their daily lives.
Though the self-experimentation framework requires individuals to form hypotheses in the first step [51], novice self-experimenters face challenges in initiating experiments. By examining Quantified-Selfers' practices, Choe et al. [20] found that many people started tracking with no specific goal in mind. Individuals who lack prior self-experimentation experience find it challenging to identify variables to track and articulate them into hypotheses [26,59]. Moreover, life is complicated [100]: multiple factors may be involved in understanding different life scenarios (e.g., sleep quality [28,29], chronic health conditions [9]), posing challenges for individuals to perform self-experimentation. Unlike in clinical settings, performing self-experimentation in the real world may have to allow for flexibility to make it accessible to the general public [51]. Further, Karkar et al. pointed out the importance of guiding individuals to set up proper experimentation [51]. The TummyTrials study [50] suggests that self-experimentation systems should be flexible enough to account for individuals' needs in creating experiments around their own concerns. Therefore, systems should be designed to better support individuals in setting up self-experimentation in various life contexts [20,27,29,57].

Reflective Iteration
While Karkar et al.'s self-experimentation framework suggests that people's self-experimentation ends when the data reveals a statistically confident result [51], recent studies argue for supporting continuous self-experimentation, where individuals keep experimenting with different aspects of life scenarios. Participants in the SleepBandits [29] study expressed an interest in continuing with another self-experiment after finishing the current one. Apart from helping individuals find knowledge that leads to positive behavior changes, it is argued that self-experimentation systems should also support post-experiment behavior change to help people maintain the intended behaviors [26]. To assist people's practice in a post-outcome stage, a system may enable users to either retest a result that they are not confident in or continue with the next thing to test [50]. Daskalova et al. suggested having iterations in setting up experiments, where individuals can restart their experimentation to test the same condition multiple times [27]. Lee et al. found that individuals could come up with behavior change plans that are more realistic and personalized through iterations [57]. A study also found that reflectively iterating on experimental variables, devices, and hypotheses supports individuals in choosing the right tracking method and variables to test [26]. As self-experimentation helps individuals gain new personalized knowledge [20,44], it is worth investigating whether and how individuals can apply this knowledge. Thus, we want to further understand how to design systems that support individuals who face constant changes in life [81,83] in iterating their behavior change plans.
Self-reflection, as a common component in self-experimentation designs [11,27,29,50,51], can be further leveraged to assist individuals' iterations on self-experiments. Self-reflection describes a process where individuals reflect on the feedback information [68,70] collected from performing activities [60] to gain personal insights [19]. To provide guidance for individuals' iterations of their behavior change plans, we looked into existing studies about self-reflection, which has been commonly investigated in personal informatics as a way to help people seek personalized goals [58] and solutions [57]. Reflecting on previous experience helps individuals generate insights to inform their decision-making [105,106] regarding future actions [59]. Prior work enforced weekly reflections to help individuals reflect on their data [5,6,9]. Ayobi et al. [9] examined individuals' iterative thought process of unpacking the complicated chronic pain issue into multiple aspects and reflecting on it. Bentvelzen et al. [16] pointed out that reflection itself is an iterative process where individuals experience constant changes in their lives and suggested different levels of abstraction to help users interact with their data. When interpreting experimental results, people may look for answers beyond a single "yes" or "no" result [50] and seek a high-level summary of their self-experimentation results [29,103]. Given the close connection between iteration and reflection, we still lack an understanding of how individuals can apply the knowledge gained through reflection for future iterations.

METHOD
Previous research suggests a move from the quantified-self toward the qualitative-self [18,25]. When applying the self-experimentation framework to promote individuals' behavior change, as opposed to clinical practice, researchers may consider sacrificing scientific rigor so as to reduce user burden [49-51] and compensate for uncertainties in people's daily lives [77,81,105]. It is more motivational and engaging for users to try creative ways of forming behavior change plans rather than following traditional n-of-1 trials [57].
Inspired by self-experimentation, but recognizing that its rigor may not be a good fit for the complex process of establishing physical activity routines, we propose an approach we call "reflective iteration" to support the process of finding routines that work. Reflective iteration resembles Karkar et al.'s self-experimentation framework [51], and the single-case design approach it is based on, in that it supports the systematic articulation of hypotheses (i.e., behavioral strategies believed to result in "good" routines), time-bounded behavioral commitments, collection of outcome data, and periodic review of the relationships between behaviors, outcomes, and contextual factors to evaluate hypotheses and draw conclusions. Reflective iteration differs from self-experimentation in several ways. It does not include randomization of stimuli within time boxes; rather, the participant applies the chosen strategy for the entire time span. Unlike SCDs, but similarly to Karkar et al. [51], reflective iteration does not seek to draw statistical inferences from outcome data. Not only would this be infeasible without randomization, but the nature of both plans and outcomes is sufficiently fuzzy that qualitative reflection would be more productive than statistical inference. Reflective iteration builds upon self-experimentation by explicitly building in iteration via regular, scaffolded reflection and revision sessions wherein participants review results from the previous time box and articulate revised strategies to attempt in the next one.
The reflective iteration framework consists of three major components: planning, tracking, and reflection (Fig. 1). This framework asks users to 1) specify multiple behavior change plans and bundle them as one strategy by summarizing it in their own words; 2) track and report on those behavior change plans; and 3) reflect on their records to decide whether to maintain or change the strategy (i.e., start a new strategy or change back to a previously used one) and start over.
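As a minimal sketch, the data model behind one cycle of this framework and the week-end decision could look like the following. All names and shapes here are our illustrative assumptions, not Planneregy's actual schema:

```typescript
// Illustrative data model for one reflective-iteration cycle.
// Names and shapes are hypothetical, not Planneregy's actual code.
type Plan = { activity: string; day: string; time: string };

type Strategy = {
  name: string;       // user-given strategy name
  keywords: string[]; // summary keywords, e.g. "morning", "outdoor"
  plans: Plan[];      // the bundle of plans for the 7-day time box
};

type Decision = "maintain" | "revisit-previous" | "start-new";

// Resolve what next week's strategy should be, given the user's choice.
function nextStrategy(choice: Decision, current: Strategy, history: Strategy[]): Strategy {
  if (choice === "maintain") return current;
  if (choice === "revisit-previous" && history.length > 0) {
    return history[history.length - 1]; // e.g., the most recent prior strategy
  }
  return { name: "", keywords: [], plans: [] }; // start a new strategy from scratch
}
```

Note that in the app, revisiting a previous strategy opens a selection page rather than defaulting to the most recent one; the sketch simplifies that step.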
To evaluate this reflective iteration framework in the domain of physical activity, we designed, developed, and deployed a physical activity planning app, Planneregy, that allowed users to plan out physical exercise routines and iterate on them through weekly reflections. Using this app as a probe, we conducted a 42-day study with 16 participants to understand their experiences in gaining self-knowledge about their physical activity routines and improving them.

Planneregy: An App that Helps Users Iterate Physical Activity Routines Through Weekly Reflection
Planneregy consists of three major components: planning, tracking, and reflection. Since people usually have weekday-specific activity patterns, physical activity plans that work for one day may not work for others [10,31,34,76,105]. Instead of presenting participants with their data on a daily basis, we chose to let them reflect on their data weekly [12] (Fig. 1). In this way, users can have a comprehensive view of their execution of plans on different days of a week. We instructed participants to bundle the plans they specified for 7 days into one single "strategy" that they would reflectively iterate on. The Planneregy app was built using the React Native framework with a Google Firebase backend. Due to time and equipment constraints, the team only developed the iOS version of the app, which was distributed to participants seamlessly using Apple's TestFlight service. We open-source the code here and encourage future researchers to leverage this app and its features to further investigate the potential of the reflective iteration framework as well as build upon it.
3.1.1 Planning. On the first page of the planning screen (Fig. 1. 1-4), participants planned activities by specifying activity type, date, and time [3,32,105] (Fig. 1. 2b). They planned all the activities that they wanted to do in the next seven days. We provided a calendar view (Fig. 1. 2a/5a/13b), which lists both their Planneregy plans and their Google Calendar events, to help them better make sense of their daily schedule [3,24,32,55,105]. We provided participants with different types of contextual factors (Fig. 1. 2c), such as temporal factors (i.e., schedule, time-of-day, and day-of-week) and activity information (i.e., activity type, duration, satisfaction level, and completion) to help them understand their data [27,71] and create feasible plans [46,54,61,77,80]. We were aware that weather and temperature could influence users' planning for physical activity and the way they interpret their records. Thus, we provided a weather forecast for the next 15 days and included the weather information for all their previous activity records. Once an activity was planned, the app reminded them 1 hour prior to its start time.
Prior work suggests scaffolding individuals' sense-making of their own experiences by using personally meaningful substances (e.g., keywords, symbols, and figures) [9,96] and grouping records around common characteristics to facilitate sense-making [20,105]. In light of this, participants were asked to come up with a few keywords (Fig. 1. 3a) to summarize their plans [92]. We provided a few examples of keywords (e.g., light exercise, morning, outdoor).
Those keywords helped summarize common characteristics of plans, but we kept it open-ended and encouraged participants to try creative ways to specify their own keywords. Similar to hypotheses in previous self-experimentation frameworks [20,49,51] (i.e., "under these conditions, how well can I enact my plans?"), these keywords would be evaluated by users later in the reflection phase.
Lastly, participants bundled all their plans and keywords into one "strategy" and gave it a name (Fig. 1. 4a). Then they were directed to the tracking screen. The app would remind them to reflect on and evaluate this strategy after 7 days. Users were only able to pursue one strategy every 7 days. For a detailed case study regarding how participants bundled physical activity plans into a strategy, please refer to Section 4.1.1.

Tracking.
After specifying their plans and strategy, participants were directed to the tracking page (Fig. 1. 5-8), where they logged their completion information for the following 7 days. This tracking process was inspired by the experimentation stage of the self-experimentation framework [51], which helped collect information for weekly reflection [5,6]. We chose the manual tracking and self-reporting approach as a way to improve participants' awareness of their activity [20]. Participants reported their completion of a plan using a multiple-choice question (Fig. 1. 9-12): 1) completed; 2) didn't complete; and 3) completed differently (i.e., participants performed a different activity or performed the activity at a different time) (Fig. 1. 9a). In all cases, participants were encouraged to report other activities outside of their plans. For days without any plans, participants were asked to report whether they had done any other activities.
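A hypothetical encoding of the three daily completion reports, mapped to the color coding described in the Figure 1 caption, could look like this (an illustration, not the app's actual code):

```typescript
// The three report options for a planned activity, as described above.
type Completion = "completed" | "not-completed" | "completed-differently";

// Map a report to the color used in the app's completion visualization:
// green for as-planned, yellow for a changed activity or time, gray for none.
function completionColor(report: Completion): "green" | "yellow" | "gray" {
  switch (report) {
    case "completed": return "green";
    case "completed-differently": return "yellow";
    case "not-completed": return "gray";
  }
}
```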
Participants could visit all their previous strategies in the collection popup (Fig. 1. 7a). By providing comparisons [53] among different strategies, we aimed to facilitate participants' understanding of their data under different conditions [20], so they could switch to different strategies throughout their practice.

Reflection & Iteration.
On day 7 of each week, the app prompted participants to reflect (Fig. 1. 13-16) on their strategy [5,6], which resembles the review-results stage of Karkar et al.'s self-experimentation framework [51]. Participants first evaluated the keywords (Fig. 1. 13a) they had assigned to determine whether each of those aspects helped them better complete their plans. They rated each keyword as "helpful" (colored green) or "not helpful" (colored yellow). They then rated their satisfaction with the current strategy (Fig. 1. 14b) on a scale of 1-7 [5,6,50,105]. We recognized that users all lived within their individual contexts [74,75] and perceived life experience subjectively [45,83]. Thus, we did not set up a standard for rating satisfaction. We hoped that users could evaluate their experience using their own criteria.
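A weekly reflection record along these lines might be sketched as follows; the field names are our assumptions, not Planneregy's actual schema:

```typescript
// Sketch of a weekly reflection record: each keyword judged helpful or not,
// plus a 1-7 satisfaction rating. Field names are illustrative assumptions.
type Reflection = {
  keywordHelpful: Record<string, boolean>; // e.g. { morning: true, outdoor: false }
  satisfaction: number;                    // 1 (low) to 7 (high), by the user's own criteria
};

// Validate a rating before saving it, mirroring the 1-7 scale described above.
function isValidRating(satisfaction: number): boolean {
  return Number.isInteger(satisfaction) && satisfaction >= 1 && satisfaction <= 7;
}
```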
Their evaluation was saved to the database and presented to them on both the collection page (Fig. 1. 11a) and the strategy selection page (Fig. 1. 16a). In the next step, the app asked them what they would like to do for the next week. Participants were offered three options: continue with the same strategy, change to a previous strategy, or start a new strategy from scratch (Fig. 1. 15a). If participants wanted to go with a previous strategy, the strategy selection page would pop up to let them select one. If they chose either to continue the strategy or go back to a previous one, the app automatically planned the same activities on the same days of the week as they had specified previously. If they chose to start a new strategy, the app let them create plans, pick keywords, and name the strategy from scratch. After making the decision, they were directed to the first planning page (Fig. 1. 1-4), where participants could still modify the strategy by changing, deleting, or adding plans. If they picked a previous strategy for another week, we added a "return" icon (Fig. 1. 1a) to its name to indicate that it was a repeated strategy. From there, they started a new week of planning, tracking, and reflection.

Figure 1: The Planneregy app allows participants to reflectively iterate on their physical activity plans on a weekly basis. 1-4: On day 1, participants plan physical activities for the following 7 days. They use keywords (e.g., morning exercise, moderate exercise) to summarize the common characteristics of those plans. They bundle those plans into one "strategy" and name it. 5-12: From day 1 to day 7, participants report how they completed those planned activities on a daily basis. We provided four ways to visualize their completion information: 1) completed the activity exactly as planned, marked green; 2) completed a different activity at the same time, marked yellow; 3) completed the same or a different activity at a different time, marked yellow (the originally planned activity is marked light yellow); 4) failed to complete any activities, marked gray. 13-16: On day 7, participants reflect on the current strategy by first evaluating keywords and then rating the strategy. Participants are then asked to iterate on their strategies by: 1) continuing with the same one; 2) changing to a previously used strategy; or 3) creating a brand-new strategy. Note that the screenshots in this figure were captured from the first author's personal use and do not represent any participant's data.
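The automatic re-planning when a strategy is continued or revisited could be sketched as carrying each plan to the same day of the week, one week later. This is illustrative; the real app's date handling may differ:

```typescript
// Sketch of week-to-week carry-over: each plan is re-planned on the same
// day of the week, 7 days later. (Hypothetical, not the app's actual code.)
type DatedPlan = { activity: string; date: Date; time: string };

function rollForwardOneWeek(plans: DatedPlan[]): DatedPlan[] {
  return plans.map(p => {
    const next = new Date(p.date);
    next.setDate(next.getDate() + 7); // +7 calendar days preserves the weekday
    return { ...p, date: next };
  });
}
```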

User Study
We conducted a 42-day user study, during which participants were asked to use the Planneregy app to plan and track physical activities, as well as iterate on their planning strategies. Each participant attended three interviews: at the beginning of the study, at the end of the third week (a brief check-in), and at the end of the study. The three interviews were designed to better understand participants' reflective iteration experiences with less recall bias. This study was approved by the first author's institution under an exempt IRB review.

Participants & Recruitment.
We recruited participants through the department's mailing list and flyers at the first author's institution. People who were interested in participating first filled out a screening survey where they indicated their satisfaction with their current physical activity level [5,77,105] on a scale of 1-7. Participants were invited to the study using the following criteria:
• Adults aged 18-55.
• People who currently live in the U.S.
• People who don't have any disabilities, injuries, or health concerns that may prevent them from performing regular, moderate physical activities.
• People who have access to a smartphone running iOS 10.0 or higher.
• People who have a Google account.
• People who are unsatisfied with their current physical activity level (a rating of 1-4 on the 7-point satisfaction scale) [77,105].

17 participants were recruited. One participant (P14) withdrew from the study after the first interview. We removed P14's data but did not reassign new codes to the rest of the participants. Of the remaining 16 participants who completed the study, 3 were male and 13 were female (Fig. 3). All participants were either undergraduate or graduate students at the first author's institution (7 undergraduate students, 7 master's students, and 2 Ph.D. students). This student population has relatively good digital literacy [42], which potentially helped them quickly adapt to complicated personal informatics systems such as Planneregy. School settings have also been commonly used in prior studies [12] to investigate individuals' physical activity and routines. As college students, participants constantly faced new situations in their lives (e.g., major exams, spring break), providing a good opportunity to explore how the reflective iteration framework could help them navigate those temporary disruptions and life changes.

Compensation.
Participants who completed all 6 weeks of study activities, including submitting 6 questionnaires and attending 3 interviews, received a total of $120. Participants were compensated $10, $20, and $30 for attending the initial, middle check-in, and exit interviews, respectively. Additionally, we asked them to submit a brief weekly questionnaire, which included open-ended questions such as "Briefly talk about your experience with the Planneregy app this week." Participants received the questionnaire via email at the end of each week and were expected to complete it the same day they received it. Completing each questionnaire earned them $10. In this way, participants were compensated only for submitting the weekly questionnaires and attending interviews, regardless of their usage of the Planneregy app. We informed participants that no compensation was tied to the use of the system [20].

Instructions.
We ran a few pilot tests before the study with lab members (none of whom was involved in the later user study). We realized that the system and the framework had a learning curve. To help navigate participants through the usage of the app and the study procedures, we included an onboarding tutorial when participants opened the app for the first time. The onboarding screen presented participants with the following instructions:
• "How to use this app: a) Create physical exercise plans once a week. b) Report your plans on a daily basis. c) Reflect on your records at the end of each week and try to improve them."
• "Success Criteria: Find a planning strategy that fits well with both your physical capabilities and daily routines."
We set this general success criterion with respect to the diversity in participants' physical activity capabilities and routines. We hoped this broad criterion could encourage participants to explore creative ways of performing physical activities that worked best for them and reflect on the results using their own criteria. To motivate participants to exercise more [5,24,27,69,86], we included a progress bar (Fig. 1. 5e) on the tracking page to indicate how far they were from reaching the recommended 150 minutes of weekly exercise [78]. Other than those, no further instruction was given to participants regarding how they were expected to plan and perform regular physical activities.

Figure 2: We adopted a Data-Driven Retrospective Interview [93] approach to generate in-depth insights. In doing so, we tailored interviews for each participant to help them recall the nuances in their data. By leveraging the researcher version of the Planneregy app, researchers were able to go through participants' data before the interview. Based on this review, researchers added interview questions to address data points that they found interesting (e.g., repeating the same strategy for multiple weeks but rating it low). During the interview, participants were asked to share their screens, so both the researcher and the participants were able to refer to specific data points during the walkthrough of the data. Researchers kept the researcher version of the Planneregy app open on the side, so they could keep digging into participants' data even during the interview. In this way, researchers were able to better understand participants' narratives and ask follow-up questions properly.
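The progress-bar value described above could be computed along these lines; a minimal sketch, not the app's actual implementation:

```typescript
// Fraction of the recommended 150 weekly exercise minutes completed so far,
// capped at 100% for display in a progress bar. (Illustrative sketch.)
function weeklyProgress(minutesLogged: number[], goalMinutes = 150): number {
  const total = minutesLogged.reduce((sum, m) => sum + m, 0);
  return Math.min(total / goalMinutes, 1);
}
```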
At the end of the first interview, the researcher shared the screen of the app and walked participants through its usage (i.e., creating strategies, completing daily reports, reflecting on and evaluating outcomes, and improving). During the walkthrough, the researcher provided a scenario of how a person with limited understanding of his or her physical activity level and routine could use the app to identify a preferable weekly exercise schedule. Researchers held a brief check-in meeting with participants at the end of the third week of the study to address any confusion regarding using the app. Additionally, participants could click the question icon (Fig. 1. 1b, 5f) on each page of the app to view detailed instructions for using its functions.
The team was aware of the risk that participants might try aggressive physical activity strategies that could be harmful to their bodies. Thus, participants were informed in the consent form that they should only participate in moderate-intensity aerobic activities. A list of such activities was provided [78].

3.2.4
Interviews. We conducted three interviews with each participant to learn from their experience performing regular physical exercise before and during the study. All interviews were conducted remotely [53]. The first and last interviews were semi-structured and were recorded. The second interview, conducted at the end of the third week, was a brief check-in to see if participants had any trouble using the app and was not recorded.
After using the Planneregy app for 42 days, participants attended the last interview (Fig. 2.). The last interview was informed by a Data-Driven Retrospective Interviewing (DDRI) approach [93]: we tailored the questions for each participant based on the data they reported in the Planneregy app. To prepare each participant's interview questions, we created a researcher version of Planneregy, which allowed researchers to view participants' data (i.e., physical activity plans, keywords, strategies) at any time. The researcher version of the Planneregy app shared the same interface as the one that participants used, except that all participants' calendar events were removed. Prior to each interview, researchers used this tool to go through the participant's data. If they noticed any interesting data points (e.g., not completing any morning exercise but still marking morning as a helpful aspect of the strategy), they added questions to the interview protocol asking the participant to explain. During the interview, we had participants share the app's screen and walk through their experience by referring to the data. When walking through their experience or being asked about particular data, participants could refer to their records in the Planneregy app. Meanwhile, researchers had the researcher version of Planneregy open on their end, so they could browse participants' data to confirm their narratives and prompt them with new questions.
3.2.5 Data Analysis. The interviews with participants were video and audio recorded and later transcribed. Researchers performed thematic analysis using in vivo coding [87]. To address our research questions, themes were primarily grouped around participants' planning, routine, reflection, and iteration experiences. By leveraging the researcher version of the Planneregy app (Fig. 2.), researchers referred back to participants' data (Fig. 3.) to better understand and inspect their narratives. When browsing through participants' records, interesting data points further inspired examination of the interview transcripts, which helped reveal nuances in their experiences. We looked particularly at commonly mentioned themes and performed secondary coding and grouping (e.g., facilitators of iteration, supportive information in reflection). The authors of this paper then engaged in a collective discussion about thematic data points, evaluating whether they led to valid findings.
Additionally, researchers manually went through the transcripts and summarized the characteristics of each participant's practice with the app (e.g., iterating on strategies, evaluating strategies, keyword preferences). From this analysis, we identified different types of participants (e.g., Fig. 3: initiating new behavior change plans vs. initiating behavior change plans by leveraging current routines) regarding how they initiated and iterated on behavior change plans.

FINDINGS
During the study, participants created a total of 48 unique strategies, 203 keywords (including duplicates), and 434 physical activity plans (Fig. 3.). Under those strategies, the average exercise time per week was 155 minutes (min=4, max=330). Including the activities that participants completed differently from what they planned, they completed 72% of their reported plans (min=4%, max=100%). Nearly 50% of the keywords were rated as helpful. On average, each participant created 3 unique strategies (min=2, max=5) and 27 plans (min=5, max=62). Most participants (n=11) changed their strategy after the first week. P6 and P7 maintained the same strategy for the longest duration, five weeks. In general, each week's strategy (including repeated ones) contained 5 plans (min=0, max=12) and 2 keywords (min=0, max=9). The average completion rate for each week's strategy was 61% (min=0, max=100%). Over half of the participants (n=9) had increased their average weekly physical activity level by the end of the study.
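To make the reported measures concrete, a completion rate like those above can be computed from plan records roughly as follows. This is an illustrative sketch under assumed field names, not the analysis code used in the study.

```python
# Illustrative sketch (assumed record format, not the study's actual code):
# each plan record marks whether the participant reported completing it,
# including activities completed differently from what was planned.

def completion_rate(plan_records):
    """Fraction of reported plans marked completed, as a percentage."""
    if not plan_records:
        return 0.0
    completed = sum(1 for p in plan_records if p["completed"])
    return 100.0 * completed / len(plan_records)
```

For instance, a week with five plans of which three were completed yields a 60% weekly completion rate.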

Supporting Participants in Creating Behavior Change Plans by Bundling Physical Activity Plans into Strategies
While traditional n-of-1 trials usually start with forming hypotheses [51], our system allowed participants to first specify behavior change plans without a clear assumption regarding the effect of their physical activity plans. In this study, bundling physical activities into strategies seemed to scaffold participants' practice of creating behavior change plans by unpacking the complexity of physical exercise routines.
4.1.1 Bundling Physical Activity Plans into Strategies. The Planneregy app started by asking participants to specify a series of physical activity plans they wanted to perform in the next 7 days. Here we offer P7's case as an example to showcase how the Planneregy app helped participants bundle physical activities into strategies (Fig. 4.). P7 chose to start a brand new strategy at the end of the study (Fig. 3.). When planning out those activities, P7 started with the idea of doing regular walking from 6 pm to 6:30 pm, and then extended this idea to several days of the week that she thought would work, modifying them a little: "I'm planning on doing walking still every day at 6 to 6:30. [then] I'll add this for Friday, Saturday, Monday, Tuesday, Wednesday. And then I want to do it late during the days that I feel a little free. So that might be, that might be Saturday [since] Saturday I'm more free. I might do like 30 more minutes. So I'll do Saturday one more. And then I know I'm free on Wednesday, so I'll also click Wednesday and do the same thing." After specifying 8 physical activity plans, P7 picked 2 keywords from the sample keywords list along with another 3 self-defined keywords: "I just added two keywords. Again, 'outdoor' and 'light exercise'. And 'warm weather' and 'mindfulness' and 'little challenge'." In the last step, P7 named the strategy. By following this procedure, P7 specified a strategy named "Afternoon walk part 3," which had 8 physical activity plans summarized by the keywords "outdoor," "light exercise," "warm weather," "mindfulness," and "little challenge."
In many cases, participants tried to test and reflect on multiple variables all at once. For example, P5 tried doing activities in different places in his house to find a good spot: "I was doing it, like my workouts, in different areas of the house. I just found a space that worked." In other cases (n=12/16), participants' practices were combinations of different aspects. P6 started with testing both activity types and locations, namely outdoor walking and indoor elliptical workouts: "I think I was walking. […] So walking would be an outdoor activity, while an elliptical workout would be indoors. So I was experimenting with which one I would prefer." When creating strategies, P2 started with different activities while taking her mental wellness into account: "I thought it was important for me to also do some activity in this week, which is why there's like cardio today, strengths today, and rowing on Monday. […] Because I wanted to relax, there's like gardening and meditation on like Tuesday and Wednesday."
Figure 3: Participants completed 70% of their planned activities and rated 50% of the keywords as helpful. 9 of 16 participants had an increase in their weekly average physical activity level compared to what they reported before the study.
4.1.3 Keywords Supported Unpacking the Complexity of Physical Exercise Routines. We proposed keywords as a way to help participants summarize the characteristics of their plans when bundling them into a single strategy. When initiating physical activity routines that involved multiple aspects, participants commonly (n=11/16) used keywords (e.g., morning exercise, light exercise) to summarize a routine whose effect they were unsure of. P10 believed keywords helped her better describe her strategies: "[The most helpful function] probably is the keywords. I liked those [keywords]. It was a nice way to categorize what I was doing." Keywords were also used by participants (n=7/16) to deliver additional information regarding their activities. P6 used "Podcast" as one of her keywords since she liked to listen to podcasts while doing workouts: "I also labeled podcasts because I thought if I was listening to podcasts while on the elliptical I would be less boring and I would make it a habit to do the workout."
Figure 4: Screenshots from P7's planning and tracking screens, showing how participants bundled physical activities into strategies. 1) P7 first specified 8 physical activity plans on different days in the next 7 days. 2) She then picked 5 keywords to summarize the common characteristics of those plans: "Light exercise," "Outdoor," "Warm Weather," "Mindfulness," and "Little Challenge." 3) On the last planning page, she bundled those plans and keywords into one strategy and gave it a name. By clicking the "Start" button, she was directed to the tracking page, where the key information regarding the strategy was listed, and started tracking and reporting her plans.
In doing so, some participants (n=7/16) recognized keywords not only as a way to summarize their activities but also as a way to reinforce their understanding of their strategies. P7 felt keywords helped her summarize what she had planned so far and made her more committed: "I said all the keywords were helpful because I knew that this was something that I was looking for. I needed things like light work, casual, something mindful. […] I didn't like to set a huge goal for myself, but something light and casual, and that's why I thought the keywords helped me to commit to the plan." P8 thought keywords were helpful in reminding her of the way of doing exercise from a high level: "They helped you remember, like, what the plan actually was because if you just have an afternoon exercise plan, you may not really remember what it is, but if you have like light exercise and outdoor, you can just remember it."
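The "bundling" structure participants worked with, a named strategy grouping a week's plans with summarizing keywords, can be sketched as a minimal data model. All class and field names here are our own assumptions for illustration; the paper does not publish the app's code.

```python
# Minimal sketch of the strategy/plan/keyword bundle (names are assumptions).
from dataclasses import dataclass, field

@dataclass
class Plan:
    day: str       # e.g., "Saturday"
    activity: str  # e.g., "walking"
    start: str     # e.g., "18:00"
    minutes: int   # planned duration

@dataclass
class Strategy:
    name: str
    plans: list = field(default_factory=list)
    keywords: list = field(default_factory=list)

# P7's example from the text: walking plans bundled under one named strategy.
s = Strategy(name="Afternoon walk part 3",
             keywords=["outdoor", "light exercise", "warm weather",
                       "mindfulness", "little challenge"])
for day in ["Friday", "Saturday", "Monday", "Tuesday", "Wednesday"]:
    s.plans.append(Plan(day=day, activity="walking", start="18:00", minutes=30))
```

The point of the bundle is that reflection then operates on the strategy as a whole (its name and keywords), rather than on each individual plan.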

Iteration Allowed Participants to Update Their Knowledge to Better Coordinate with Life Changes and Temporary Disruptions
While Karkar's self-experimentation framework [51] aims to assist people in testing out what works for their situation, there is no such clear answer in the context of identifying preferable physical activity routines, due to the constantly changing nature of life. Thus, participants needed to keep updating their knowledge and change their routines accordingly. We found that the iteration feature could potentially support participants in choosing feasible routines in different situations.

Iterating the Same Strategy Helped Generate New Knowledge and Develop Physical Activity Routines
Most participants (n=14/16) applied the same strategy more than once (Fig. 3.). By iterating on the same strategy across a variety of aspects (e.g., activity type, time, activity intensity), participants (n=5/16) were able to gain a better understanding of their physical activity routines. P6 used her second strategy five times and found it helped her better understand the patterns of her daily routines: "I repeated the week three routine for pretty much the rest of the testing, and that has worked out. Then, from week three to week six, I had a better understanding of how I could fit in daily or every other day cardio activities." After figuring out a strategy, many participants (n=8/16) simply stuck to it, as mentioned by P7: "I repeated the plan pretty much the last few weeks. […] because after the first few trials and errors, I learned that this is the one that works for me. [I just wanted to] stick to the one that works for me and then make sure that that kind of got into my habit." P12 also tended not to change the strategy when there were no anticipated changes in her life: "I think the next two weeks will look pretty similar for me. So, like, that's why I was thinking of just doing the same routine for the next two weeks." With the same strategy, participants (n=9/16) also tried to improve it gradually from different perspectives (e.g., increasing intensity, adding new aspects). When repeating the same strategy, P2 tried to increase the intensity of the activities she was doing: "I wanted to eventually maybe replicate the moderately relaxed one, or maybe go to something that's even more aggressive or, you know, physical activity heavy for the next week."
As P7 got used to the current strategy, she wanted to add some new aspects, including new activities and routines, to make it more fun: "I feel like I got used to the afternoon walk strategy. I feel like I can also now add something new, like some new routines, to make it more fun and exciting. But I wouldn't necessarily be off of my afternoon walk routine because I really enjoy it. I'm gonna add a few more stuff in it. […] So if I were to name the plan, it'll be like afternoon walks, plus weight." Under the same strategy, P17 tweaked her plans to make them work better for her schedule: "I've moved it [the strategy] a little, from week to week. But for the most part, it's been the same for the past couple of weeks, and I have kind of just been putting the walking more in the middle of the day."

Flexibility in Changing Strategies Helped Participants Better Coordinate Plans to Meet Life Changes and Temporary Disruptions
Our participants, college students, faced either life changes (i.e., a transformation of the state of life [81]) or temporary disruptions (i.e., temporary events that disrupt individuals' normal routines [30,31,105]) during the study (n=10/16, see Fig. 3.). To coordinate with those changes, all participants created more than one strategy (min=2, max=5). Changing the strategy means adopting one that differs from the previous strategy in how the plans were structured (e.g., morning vs. afternoon, light exercise vs. intensive exercise). In a few cases (n=4/16), participants switched back to a strategy that they had used before.
Facing changes in their lives (e.g., having exams, being sick, traveling) was the most commonly mentioned (n=8/16) reason for participants to change their strategies. When they encountered life changes or temporary disruptions, what worked before would no longer be practical, requiring participants to update their strategy, as mentioned by P10: "Originally, I think I tried jogging in the evening, and then I tried stretching in the morning, and that worked decently well. […] But then I got to this weekend and was like, This is not working; I need to try something different." P7 had some health issues in the middle of the study while trying to maintain regular physical activities: "the first two weeks everything was morning routine because it worked out for me. But then I changed to afternoon walks, mostly because I had some health issues. […] And in a way, I wanted to kind of force myself to be regular about being active. […] So after making a compromise by changing the plan from morning to afternoon, even though I was physically sick sometimes, […] I still made sure that I would follow a routine." For participants (n=3/16) who changed their lifestyle slightly, some aspects of their physical activity routine that had previously tested as not working started to work, leading to a change of strategies. P12 used to do workouts in the afternoon, but recently she started waking up early: "I think earlier I would go in the afternoon just because I wouldn't wake up this early. […] Since spring break, I started waking up pretty early, so I have a lot more time in the morning. So I planned everything to be pretty early in the morning." After temporary disruptions, participants' lives returned to their normal state, and the routine knowledge tied to the previous context became applicable again. Thus, some participants (n=4/16) changed back to the strategies that worked before. P16 only changed his strategy during the spring break, and it took a while for him to return to his routine after the break: "[For the spring break week] I didn't put gym and morning [as keywords] because it was just not true. There was no gym I could go to during the vacation."

High-Level Reflection Helped Steer Participants' Iteration
Different from traditional n-of-1 trials, where people wrap up experimentation once they get a confident result and change their behavior accordingly [51], participants' behavior change happened simultaneously with their iteration on physical activity routines. We found that high-level reflection further helped steer the direction of participants' iteration and guide their creation of future strategies.

High-Level Reflection Helped Reinforce the Gained Knowledge.
To evaluate a strategy, the system first guided participants to evaluate its different aspects separately and then to rate the strategy as a whole. When asked to evaluate previous strategies, most participants (n=14/16) examined strategies from the different aspects summarized by their keywords. When P10 was walking through her previous strategies, she was able to attribute failures in following her plans to specific aspects (i.e., intensity and timing): "So the first week I thought maybe it's because it's in the evening, so it's not working. […] And then, by the second week, I realized that doing anything super intensive was not gonna work. So that's why moderate exercise is not working." Some participants (n=5/16) compared variables within the same aspect to find what was working. For example, P5 found that his strategies worked better when they were planned for the afternoon: "I think I planned to do it both in the afternoon and in the morning, but it ended up being more in the afternoon." Participants adopted intuitive and flexible ways of evaluating keywords. Instead of using any statistical or scientific methods, P10 evaluated keywords simply based on how she felt: "I guess [my way of evaluating keywords] was kind of arbitrary. It was just kind of based on how I felt about them and if I thought making a change in them would make a difference. Yeah, it wasn't super scientific or anything." Beyond reaching a binary "working" vs. "not working" conclusion, participants adopted different criteria in evaluating those keywords. P17 evaluated keywords not only based on how successful they were regarding specific aspects of her strategy but also on how they improved the quality of her exercises: "My impression of the keywords was to mark them as helpful if they helped me have a good success rate in finishing the activities I set out to do. But like the keyword 'more hours', it's not like adding more hours helped me to be more consistent, but I do think it improved the quality of the experience." P7 came up with her own keyword, "mindfulness", to check whether she had been reflective when doing exercise: "So mindfulness means I wanted to have that walk to be more reflective, like thinking about what I've been through and just kind of having that reflection of the day. But I guess for this week, I didn't really feel like it; I just wanted to be more casual."
Participants found that evaluating keywords helped them reinforce the knowledge they learned through practice. P6 got a better idea of the helpful aspects of her strategy by evaluating keywords: "I found [keywords] to be pretty helpful. […] For example, I would tag a strategy as flexible or morning light exercise. So seeing those keywords pop up after each week's review has reinforced the idea that oh maybe I do like exercising in the morning and like being flexible about different types of workouts." P2 thought that evaluating strategies with keywords helped make her thoughts concrete: "So before I used the app, all these thoughts were like in my head."
After evaluating the keywords, participants were asked to rate their satisfaction with the previous strategy. For many participants (n=11/16), this served as a decision point for how they would iterate on strategies. After walking through the current strategy, P13 felt unsatisfied with it and wanted to create a new one: "Obviously I'm very unsatisfied with this week, so I'm just gonna rate it as one. […] I try to predict what my week will look like, the week after. […] So I'll choose a different strategy." To rate their overall satisfaction with a strategy, most participants (n=12/16) referred to their completion rate, while a few adopted other evaluation criteria, including energy level (n=2/16), routine integration (n=2/16), and flexibility (n=1/16).
After the reflection, participants reported knowledge gains regarding the aspects that did or did not work for them. During the interviews, many participants (n=10/16) walked us through how they applied that knowledge to the iteration of strategies. P10 tweaked her strategy by evaluating keywords so she could continue with the working aspects and change the non-working parts: "When I was making them (strategies), […] I would look at the suggestions [of keywords]. Because I would basically just review the week and then immediately go into making my next week one. So I would have fresh in my mind what didn't work the past week out of the keywords that I selected but didn't work, and the keywords I selected that did. So I would go into the next week and try and like maintain the ones that did work and try and change the ones that didn't." After P3 found that he kept missing jogging plans, he changed them to walking activities: "[Why I added more walking to the new strategy is] just because I was missing out on the jogs so frequently, so I thought maybe I can substitute them with walking because that's something that's easier for me to do." We further found that encouraging participants to view and reflect on their strategies through different keywords and relevant aspects helped steer participants' (n=7/16) iterations on their behavior change plans in a positive way. Since some aspects could be summarized by a few keywords (e.g., morning, afternoon, evening), when one option proved unhelpful, participants naturally started to test the rest. When P10 found that exercising in the morning was not working, she started exercising in the evening: "In the past two weeks, I've been mainly scheduling stretching in the mornings. But because that wasn't working, I thought I would try putting things in the evenings as well, just to try something different."
Some keywords indicated alternative conditions (e.g., gym vs. home workout), so participants could test the alternative when the initial condition did not work. P6 first tried exercising at the gym. Once she found it wasn't working, she switched to home workouts: "And I think it was weeks one, two, and three; I was experimenting a little bit. I may have started off thinking that I wanted to use the treadmill or electrical machine at the gym to do my cardio, but I soon realized that that's not flexible enough for me. Sometimes I don't have time to go to the gym for cardio. So I switched to more like walking workouts and at home workouts."
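The weekly reflection loop participants describe, rating each keyword as helpful or not, rating overall satisfaction, then deciding whether to repeat, tweak, or replace the strategy, can be sketched as a simple decision function. The threshold and return labels here are hypothetical; as reported above, participants applied their own, often qualitative, criteria.

```python
# Hypothetical sketch of the reflect-then-iterate decision (not the app's logic).
def next_iteration(keyword_ratings, satisfaction):
    """keyword_ratings: {keyword: True if rated helpful}; satisfaction: 1-5 scale."""
    helpful = [k for k, ok in keyword_ratings.items() if ok]
    if satisfaction >= 4:
        return "repeat"        # strategy works: stick with it (e.g., P7)
    if helpful:
        return "tweak"         # keep helpful aspects, change the rest (e.g., P10)
    return "new_strategy"      # nothing worked: start a different strategy (e.g., P13)
```

The middle branch captures the pattern reported above: participants preserved the keywords that worked and replaced the ones that did not.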

DISCUSSION
In this study, we demonstrate the potential of an iterative, reflection-based framework to assist people in gaining self-knowledge and improving their physical activity plans. We found that participants were able to create behavior change plans without necessarily having a clear idea of which factors (e.g., activity type, intensity, timing) would contribute to their physical activity level. We also found that a high-level summary could help participants unpack the complexity of life scenarios (i.e., identifying preferable exercise routines). Findings further suggest that iteration supported participants in developing and modifying their planning strategies to coordinate with life changes and temporary disruptions. Lastly, we found that the weekly reflection served as a facilitator of iteration, as it helped participants update their self-knowledge and guided the direction of iteration. In this section, we discuss these findings and provide insights for designing future personal informatics systems.

Supporting the Creation of Behavior Change Plans by Unpacking Life Scenario Complexities
Prior work has argued for the importance of providing simple self-experimentation frameworks and technology to better support novice users [27,29,49-51,57]. In light of this literature [27,29], we designed the Planneregy app to scaffold this practice. It allowed participants to bundle their physical activity plans into a strategy and summarize the strategy with keywords. The app also had features that encouraged participants to perform different physical activity strategies and to reflect on their outcomes in an iterative manner.
In accordance with an earlier study [31], we found that participants who had been physically inactive for a while lacked knowledge of the factors that contribute to a physical activity routine. Additionally, identifying preferable ways of doing physical activities usually requires participants to test multiple facets (e.g., morning indoor light exercise vs. afternoon gym exercise), making it difficult to evaluate these aspects separately. Choe et al. [20] found that individuals may initiate self-experimentation without any assumptions but come up with variables to test throughout the practice. Facing this ambiguity in initiating strategies, many participants created initial behavior change plans by leveraging their existing knowledge [29] of their routines. Aligned with prior quantified-selfers' practice [20], the rest of the participants created initial behavior change plans without a clear understanding of the factors (e.g., activity type, intensity, timing) that might influence their physical activity routines. At the beginning, they started by planning random activities to gain self-knowledge. Thus, when designing personal informatics systems, flexibility is an important consideration to account for individual differences in users' knowledge and practice.
Similar to prior studies [9,59], participants expressed the need to break complicated tracking problems down into multiple facets. We found that keywords [92] helped participants summarize what they planned at a high level, so they could test and evaluate different aspects separately. Using the Planneregy app, participants bundled the plans they made for one week into strategies and iterated on them on a weekly basis. In this process, keywords served as design elements that delivered additional information regarding users' strategies. They could also reinforce users' understanding of the essence and nuances of those strategies. Keywords also made participants more mindful of what they tried and what they learned. This suggests that high-level summaries may be needed to help individuals unpack the complexity of the problem. Apart from using keywords, future research might explore other innovative ways to serve a similar goal.
As mentioned earlier, we did not enforce a conventional self-experimentation design (e.g., [27,29,49-51,57]). When engaging with Planneregy to iterate on their behavior change plans, participants did not perform self-experimentation with randomization, nor did they gain statistical insights from their records. Still, the reflective iteration framework was inspired by self-experimentation. Thus, we encourage future researchers to investigate applying the insights from this study to the design of future self-experimentation systems. For example, when participants find it difficult to form hypotheses, researchers may ask them to first specify multiple keywords to stand for behavior change variables.

Supporting Iteration Across Different Life Scenarios
Prior work suggests the need for iteration in self-experimentation practice [26,27], where individuals can revise or restart their experiments. Inspired by this work, we designed the Planneregy app as a tool that supports individuals in iterating on weekly physical activities as bundles (referred to as "strategies"). Unlike previous studies [11,28,29,50,89,103] where participants iterated through the same experiment to gain one-time knowledge, participants in this study were able to constantly update their knowledge to coordinate with life changes and temporary disruptions.
Research shows that iteration can help individuals improve their behavior change plans [56,57]. We reexamined this claim in the context of identifying preferable physical exercise routines. Participants in this study iterated on strategies to test the aspects that they were unsure about. Similar to the "discovery phase" in self-reflection [60], iteration on a single strategy helped participants gain individual knowledge [20,57]. Additionally, we provide nuance on how participants gradually improved their physical exercise routines by iterating on the same strategy. As observed in the "maintenance phase" of individuals' self-discovery processes [60], participants tended not to change an established routine easily when there was no anticipated change in their lives. Instead, they tried to preserve the overall structure of their routine while tweaking parts of it (e.g., increasing the intensity of activities, adding new aspects). Future work may further consider ways to assist individuals in performing consecutive experiments that iterate on gained knowledge, which is applied to their life practices simultaneously.
Reflective iteration can be conducted in constantly changing environments. While prior studies suggest considering context in individuals' tracking [20], we further address the influence of context and argue that iteration can be used to cope with changing contexts. Our participants, as college students [12], faced constant changes in their lives [81,83], causing the physical activity knowledge gained within certain contexts [71] to no longer work. Such life changes and disruptions, referred to as "breakdowns" by Baumer [13], usually provoked reflection and motivated participants to change strategies to cope with new situations. Faced with changing lives [16], there are no eternal truths in the knowledge gained through iterations. Thus, future designs of such systems may consider storing the outcomes of iterations with their specific contexts annotated, so users can switch to corresponding practices when the context changes. In this study, strategy keywords were found to help participants annotate additional information and reinforce their understanding of strategies, suggesting that keywords might serve as contextual cues. Participants' naming conventions for their strategies (P2: "Moderate Relax," "Pre-Exam Week," "Post-Exam Week") also contained information about the circumstances in which a strategy was performed. Such contextual cues can help participants identify a previous strategy to resume when facing temporary disruptions or life changes. Future designs may also investigate novel ways to help users navigate their records to identify the plans that fit their current circumstances.

Designing Reflection as a Facilitator of Iteration
Kant's philosophy [48] has shaped the way reflection has been investigated in psychological, educational, and epistemological contexts [13]. It has been conceived as a way of making sense of uncertainties and building situational awareness [91]. Fleck et al. discussed developing technology to support bottom-up reflection [36], suggesting that reflecting on everyday experiences can facilitate healthy behavior change. In this study, we reinforced weekly reflection [9] to help participants gain insights from their physical activity planning records. Prior work shows that grouping planning records around shared characteristics helps individuals evaluate similar records as a whole [105]. In this study, participants were able to use keywords to evaluate different aspects of their strategies separately. Studies suggest individuals should be allowed to customize the parameters they track [9,25]. During this study, participants created different keywords to address conditions they wanted to test and came up with different criteria to evaluate them. Instead of letting participants determine their success statistically [27][28][29][85], we asked them to evaluate different aspects of their strategies as "helpful" or "not helpful". Participants reported that this function was intuitive to use and enabled them to evaluate different aspects of their plans. While prior work suggested using annotation to facilitate reflection [36], we found that evaluating keywords helped reinforce the knowledge that users gained and was leveraged by participants in creating new strategies and iterating on old ones. We did not set a specific success criterion but simply asked participants to identify a physical activity planning strategy that worked for their physical capabilities and daily routines. Thus, participants evaluated the outcome against their own qualitative goals. Still, as prior literature suggests [72], future research may investigate ways to assist users in translating their qualitative goals into quantitative ones or explore more personalized and specific goal-setting methods to motivate them [2,65,86].
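The keyword evaluation mechanism described above can be sketched as a small data model. This is a hypothetical illustration, not the actual Planneregy implementation: the `Strategy` class, its method names, and the example keywords are our own assumptions, chosen to show how per-keyword binary judgments ("helpful" / "not helpful") could be recorded at the weekly reflection.

```python
from dataclasses import dataclass, field

@dataclass
class Strategy:
    """A bundle of weekly physical activity plans, summarized by keywords.

    Hypothetical sketch of the reflective iteration data model; names and
    structure are illustrative assumptions, not the Planneregy source.
    """
    name: str
    keywords: list[str]
    # Per-keyword binary judgments gathered at the weekly reflection:
    # True = "helpful", False = "not helpful", absent = not yet evaluated.
    evaluations: dict[str, bool] = field(default_factory=dict)

    def evaluate(self, keyword: str, helpful: bool) -> None:
        if keyword not in self.keywords:
            raise ValueError(f"unknown keyword: {keyword}")
        self.evaluations[keyword] = helpful

    def helpful_keywords(self) -> list[str]:
        return [k for k in self.keywords if self.evaluations.get(k)]

    def unhelpful_keywords(self) -> list[str]:
        return [k for k in self.keywords if self.evaluations.get(k) is False]

# Example: a week of morning workouts, evaluated at the day-7 reflection.
s = Strategy("Morning starter",
             ["morning exercise", "moderate intensity", "outdoor"])
s.evaluate("morning exercise", True)
s.evaluate("moderate intensity", True)
s.evaluate("outdoor", False)
print(s.helpful_keywords())    # ['morning exercise', 'moderate intensity']
print(s.unhelpful_keywords())  # ['outdoor']
```

The binary judgment keeps the reflection step low-effort: each keyword becomes an independently evaluable facet of the strategy rather than requiring a single statistical verdict on the whole week.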
Researchers have noted the importance of iteration in self-experimentation [26,56,57]. We show that reflection, as a cyclic process [16], can further facilitate iteration. The reflection process helped participants update their knowledge. With a better understanding of what works and what does not, participants could improve their strategies by preserving the working aspects and replacing those that do not work. Studies have pointed out that a self-experimentation system should also account for post-outcome steps [26,50], once the result has been revealed through experimentation. Whereas prior studies suggest that systems should allow individuals to either restart or revise their self-experimentation [27,50], we further show that the reflection process can help steer people's future iteration practices. In particular, we found that using keywords to summarize various aspects of the routines helped limit the choices that participants wanted to try, thus making alternative conditions more intuitive. A similar mechanism could be considered in future system designs to facilitate reflection by limiting individuals' choices so as to reduce cognitive burden.
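The iteration step described above, preserving working aspects and swapping out those that did not work, can be sketched as a simple function. This is a hypothetical sketch, not the app's implementation: `iterate_strategy`, its inputs, and the example alternatives ("home gym" for "outdoor") are our own illustrative assumptions.

```python
def iterate_strategy(keywords: dict[str, bool],
                     alternatives: dict[str, str]) -> list[str]:
    """Build next week's keyword set from this week's reflection.

    Hypothetical sketch: keep keywords judged helpful (True), swap each
    unhelpful keyword (False) for a pre-listed alternative condition,
    and drop unhelpful keywords that have no alternative.
    """
    next_keywords = []
    for keyword, helpful in keywords.items():
        if helpful:
            next_keywords.append(keyword)
        elif keyword in alternatives:
            next_keywords.append(alternatives[keyword])
    return next_keywords

# "outdoor" did not work this week; try "home gym" as the alternative.
print(iterate_strategy(
    {"morning exercise": True, "outdoor": False},
    {"outdoor": "home gym"},
))  # ['morning exercise', 'home gym']
```

Restricting each swap to a short list of alternatives mirrors the finding that limiting choices makes alternative conditions more intuitive and reduces cognitive burden.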
In real-life situations, people often do not think in a binary way of "yes" or "no" [50]. Baumer also suggested moving from quantitative to qualitative ways of reflection when the problem space is complicated or ill-structured [13]. To evaluate how physical exercise works for them, individuals may adopt different criteria [38]. Thus, we suggest that future designs of such personal informatics systems incorporate a high-level reflection component [16] and allow individuals to evaluate different aspects of their behavior change plans with flexibility. While prior work argues for using reflection to facilitate individuals' decision-making processes [1,4,59,105,106], we show that the reflection process also helps elicit strategy iteration. In this manner, evaluating a strategy becomes a decision point for participants to decide what to do next. Apart from keywords, future researchers may explore other means to help facilitate individuals' iterations during the reflection process.

Limitations and Future Work
This qualitative study revealed insights for designing reflective iteration systems that help individuals establish and improve their physical activity routines. However, it has limitations regarding the sample, duration, and application domain that future research can address.
First, our sample was a convenience sample of university students. This population usually has good digital literacy [42], meaning that they could quickly learn and adapt to the Planneregy app and its reflective iteration framework. Still, a few participants expressed confusion about iterating strategies during check-in interviews. Researchers and designers can develop low-burden and intuitive interfaces to streamline the onboarding process. Apart from incorporating reflective iteration into mobile apps, future researchers can also explore creative ways to implement this framework, such as using paper diaries [46] or tangible objects [47], to meet users' needs. Other alternatives include designing reflective iteration technology for vulnerable populations [33] via a capacity-based design approach [104] or based on people's prior experience [14]. Future work on developing reflective iteration systems could also consider design approaches that accommodate populations with low technology literacy, such as older adults [98] and members of under-resourced communities [17].
In this study, we only applied the reflective iteration framework to promote individuals' physical activity. However, we see possibilities for applying such a framework to other domains such as obesity [37], chronic conditions [73], sleep [28,29], and irritable bowel syndrome [50,67]. For instance, users with sleep issues could utilize such a framework to explore different ways to alter their sleeping habits (e.g., wearing earplugs, listening to audiobooks [29]) and evaluate the outcome using their own criteria. When facing life changes (e.g., traveling, changes in working hours), users could choose to resume a previous sleeping strategy.
Participants reported during interviews that they gained self-knowledge about their physical activity routines. However, the compensation structure rewarded participants for staying in the study (i.e., they earned more money if they completed it). Researchers should be aware that these financial incentives might have affected participants' engagement. Future studies may investigate ways to minimize this effect so as to observe more naturalistic app usage.
Lastly, the study lasted 42 days, which is a relatively short period of time. This study may not capture all types of life changes and temporary disruptions in individuals' lives outside of the college semester. Whether individuals can actively engage in this continuous reflective iteration over a longer period of time (e.g., six months, one year) remains unknown. Many participants reached a maintenance stage [60,64] later in the study, during which they simply followed the strategy that worked for them instead of changing it. At this stage, participants were less involved in reflection and used the records only to ensure they were on track. In the long run, this planning and tracking may make users feel burdened [84]. Future designs may consider offering users an exit point once they are satisfied with their behavior change plans, to achieve a happy abandonment [21]. To support long-term use, the system could enable a low-effort tracking mode (e.g., reporting only failure moments of behavior change plans) when users reach the maintenance stage, while allowing them to return to an active mode when they have new behavior change goals.
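The proposed low-effort tracking mode could be sketched as a simple mode switch. This is a hypothetical design sketch, not an existing Planneregy feature: the `Mode` and `Tracker` names and the exception-only logging rule are our own assumptions about how such a maintenance mode might work.

```python
from enum import Enum

class Mode(Enum):
    ACTIVE = "active"            # full daily plan-and-report, weekly reflection
    MAINTENANCE = "maintenance"  # low effort: log only failure moments

class Tracker:
    """Hypothetical sketch of a tracker with a low-effort maintenance mode."""

    def __init__(self) -> None:
        self.mode = Mode.ACTIVE
        self.log: list[tuple[str, bool]] = []  # (plan, completed)

    def report(self, plan: str, completed: bool) -> None:
        # In maintenance mode, successfully completed plans are not logged;
        # only deviations from the working strategy are recorded.
        if self.mode is Mode.MAINTENANCE and completed:
            return
        self.log.append((plan, completed))

    def set_new_goal(self) -> None:
        # A new behavior change goal returns the user to active tracking.
        self.mode = Mode.ACTIVE

t = Tracker()
t.mode = Mode.MAINTENANCE
t.report("Tue run", True)    # completed as planned: not logged
t.report("Thu swim", False)  # failure moment: logged
print(t.log)  # [('Thu swim', False)]
```

Logging only exceptions keeps the maintenance stage nearly effortless while preserving the signal (failures) that would indicate the strategy needs another round of iteration.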

CONCLUSION
In this study, we build on the self-experimentation literature and propose a reflective iteration framework to facilitate people's behavior change goals. We examined this framework by incorporating it into a mobile application, Planneregy, and carrying out a 42-day user study. Findings from this study demonstrate that bundling physical activity plans and reflectively iterating on them can help individuals identify and improve their preferred physical activity routines, even when facing complicated life scenarios. The results offer design implications for flexible personal informatics systems.

Figure 1: The Planneregy app allows participants to reflectively iterate on their physical activity plans on a weekly basis. 1-4: On day 1, participants plan physical activities for the following 7 days. They use keywords (e.g., morning exercise, moderate exercise) to summarize the common characteristics of those plans, then bundle the plans into one "strategy" and name it. 5-12: From day 1 to day 7, participants report daily on how they completed the planned activities. We provided four ways to visualize their completion information: 1) completed the activity exactly as planned, marked green; 2) completed a different activity at the same time, marked yellow; 3) completed the same or a different activity at a different time, marked yellow (the originally planned activity is marked light yellow); 4) failed to complete any activities, marked gray. On day 7, participants reflect on the current strategy by first evaluating keywords and rating the strategy. Participants are then asked to iterate on their strategy by: 1) continuing with the same one; 2) changing to a previously used strategy; or 3) creating a brand new strategy. Note that the screenshots in this figure were captured from the first author's personal use and do not represent any participant's data.
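The four completion outcomes and their display colors from the caption can be written down as a lookup table. This is an illustrative sketch of the mapping the caption describes; the outcome identifiers and function name are our own assumptions, not the app's code.

```python
# Completion outcomes and their display colors, per the Figure 1 caption.
# The string keys are our own illustrative identifiers.
COMPLETION_COLORS = {
    "as_planned": "green",           # completed exactly as planned
    "different_activity": "yellow",  # different activity, same time
    "different_time": "yellow",      # same/different activity, other time
    "original_slot_moved": "light yellow",  # the vacated planned slot
    "not_completed": "gray",         # failed to complete any activity
}

def color_for(outcome: str) -> str:
    """Return the display color for a completion outcome (default gray)."""
    return COMPLETION_COLORS.get(outcome, "gray")

print(color_for("as_planned"))     # green
print(color_for("not_completed"))  # gray
```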

Figure 2: We adopted a Data-Driven Retrospective Interview [93] approach to generate in-depth insights. In doing so, we tailored interviews to each participant to help them recall the nuances in their data. By leveraging the researcher version of the Planneregy app, researchers were able to go through participants' data before the interview. Based on this review, researchers added interview questions to address data points they found interesting (e.g., repeating the same strategy for multiple weeks but rating it low). During the interview, participants were asked to share their screens, so both the researcher and the participant could refer to specific data points during the walkthrough of the data. Researchers kept the researcher version of the Planneregy app open on the side, so they could keep digging into participants' data even during the interview. In this way, researchers were able to better understand participants' narratives and ask appropriate follow-up questions.

Figure 3: 10 of 16 participants mentioned some sort of life change or temporary disruption during the study, which caused them to change their planning strategies. Half of the participants started with a new exercise routine, while the rest started with their current exercise routine. 11 of the 16 participants changed their strategy after the first one. By the end of the study, the participants (3 male and 13 female) had created 48 distinct strategies with 203 keywords and 434 physical activity plans. Participants completed 70% of their planned activities and rated 50% of the keywords as helpful. 9 of 16 participants had an increase in their weekly average physical activity level compared to what they reported before the study.