A Tool for Capturing Smartphone Screen Text

Context sensing on smartphones is often used to understand user behaviour. Amongst the many available sensors, the collection of text is crucial due to its richness. However, previous work has been limited to collecting text only from keyboard input, or intermittently collecting screen text indirectly by taking screenshots and applying optical character recognition. Here, we present a novel software sensor that unobtrusively and continuously captures all screen text on smartphones. We conducted a validation study with 21 participants over a two-week period, where they used our software on their personal smartphones. Our findings demonstrate how data from our sensor can be used to understand user behaviour and categorise mobile apps. We also show how smartphone sensing can be enhanced by using our sensor in conjunction with other sensors. We discuss the strengths and limitations of our sensor, highlighting potential areas for improvement and providing recommendations for its use.


INTRODUCTION
We present a new smartphone sensor that captures all screen text. This sensor can make a valuable contribution to the already vast amounts of data generated from smartphone interactions. Understanding user context via smartphone sensing [42] has in recent years led to breakthroughs in disciplines that study human behaviour, such as health [9,81,105] and education [92,102,120].
Yet, despite the rich sensor data that smartphones make available for context sensing, most of the sensed data is an ambiguous proxy for the actual behaviour of interest [35]. Smartphone sensors such as location, applications running, and accelerometer/gyroscope have been shown to correlate well with human behaviour, but must always be analysed in tandem with some alternative source of "ground truth". One crucial sensor that overcomes this limitation is the collection of screen text from smartphones, which can itself act as a basis for "ground truth" in understanding human behaviour [22].
So far, there have been two main ways of collecting text data from smartphones. One approach is to collect typing and keystroke data from smartphones [42]. Such data can provide insights into health-related aspects such as emotion [95] and stress [91], as well as infer broader context, including an individual's environment, social context, or possible distractions [4]. An important drawback of this approach is that the vast majority of screen text is not typed by users, and therefore is not captured. To tackle this, an alternative approach is to capture screenshots from mobile devices with high frequency [86]. These screenshots, or Screenomes, are typically taken every few seconds, providing a detailed visual record of individuals' smartphone usage. This technique enables the analysis of broad aspects of smartphone use, but has multiple limitations due to the use of computer vision techniques to analyse and annotate the screenshots. The main drawbacks include the computational cost, scalability challenges, and transcription errors introduced by the optical character recognition analysis. Additionally, screenomes capture only the phone activity of users, omitting other contextual elements beyond the smartphone such as their location. The lack of this additional information may reduce the accuracy of contextual inference.
We overcome these limitations by developing a software sensor that continuously and unobtrusively captures all screen text on smartphones. Our approach uses Android's accessibility services to capture textual elements on-screen, format the data, and transfer this data to a database for analysis. The use of the accessibility API requires minimal data processing and energy to collect the text, does not suffer from error-prone recognition or gaps in the recordings, and offers precise data that can act as a source of ground truth in understanding human behaviour. Furthermore, the integration of our sensor within the AWARE-Light framework [106] facilitates the seamless, concurrent capturing of screen text alongside a multitude of other sensors, providing researchers with an enhanced sensing tool that allows easy access to collecting data and running studies using screen text. The capabilities of the screen text sensor and its integration in AWARE-Light introduce a novel dimension to smartphone sensing that builds on existing methods. Firstly, it captures a broader range of smartphone interactions compared to keystrokes. Keystrokes reflect only the typing activities of users and reveal little information about the context of their typed content. For instance, discerning a user's intention solely through keystrokes poses challenges due to the absence of information about the context of their searches or responses. This limitation is addressed by screen text, which captures entire conversations, offering a more comprehensive context for user interactions. Secondly, our screen text sensor is fully integrated with AWARE-Light, a software that captures data from diverse smartphone sensors such as the accelerometer and GPS. By collecting and analysing screen text alongside other sensor data, a broader context beyond the smartphone, such as user movement and location, can be inferred and correlated with their screen content. In contrast to screenomes, which provide only information about smartphone use, the integration of our screen text sensor within a broader sensor framework allows for an analysis of both digital and non-digital activities in tandem, providing researchers with a more comprehensive understanding of user behaviour.
In this paper we present a two-week validation study with 21 participants who used our software in their daily lives. The study aims to identify the strengths and weaknesses of this new sensor in realistic scenarios and demonstrate its potential in understanding human behaviour.
Our paper makes multiple contributions:
• We present a novel smartphone sensor that captures screen text, which integrates with the AWARE-Light framework.
• We demonstrate how the sensor captures and presents textual information from smartphone interactions.
• We show how this sensor can be used in tandem with other sensors for further contextual inference.
• We highlight the strengths and weaknesses of this sensor, and provide recommendations for its use in understanding behaviour.

RELATED WORK

Context-aware smartphone sensing of behaviour
Smartphone sensing has been widely used to understand human context and behaviour in various domains. The combination of smartphone sensing capabilities and their widespread usage makes them effective tools for capturing human behaviour unobtrusively [47,63]. Integrating data obtained from multiple smartphone sensors enables a comprehensive understanding of a person's surroundings, activities, and behaviours [19,30]. This understanding can be used to inform recommendations based on the user's context. Healthcare extensively utilises smartphone sensor data. For example, research in the field of digital phenotyping has shown that GPS could be employed to assess mental health states [16,18,31,74,78,89,109,111,125], sociability [13,46,51,112], overall health status and well-being [8,36,57,64,76], and personality [58]. GPS tracking enables the study of movement as a health indicator, such as understanding that people suffering from depression tend to travel less [53], whereas people who travel more may have larger social networks [97], which in turn increases happiness [27]. Similarly, smartphone accelerometers have been studied to classify physical activity occurrence [7,54,65-67,96,114,118] and sleep [50,79], both informative of lifestyle. Accelerometer data captures precise movement patterns and abnormalities in physical behaviour. Additionally, smartphone app usage and communication data can be used to inform context in health, such as monitoring student behaviour and performance [21,34,110], and mental health [12,40,83,94]. These software sensors provide insights into people's interactions with different content types, aiding in the design of health recommendations and interventions.
Smartphone sensing strategies have also been extensively explored in education, often incorporating multiple sensors. WoBaLearn leverages the smartphone's light sensor and microphone to deliver teaching content in work-based learning, tailoring support to learners' individual educational needs, characteristics, and circumstances [127]. Another approach utilises GPS data in a personalised context-aware recommendation learning system, notifying learners about nearby learning materials based on their current location [120]. Similarly, CALMS employs ontologies that utilise GPS data and additional contextual information, such as academic profiles and time, to offer context-relevant academic content. This approach has shown positive effects on student grades and user satisfaction [38]. In a flipped classroom environment, Louhab et al. [73] collected information from learners' mobile devices, including available software, screen size, battery level, and Internet connectivity, to determine the optimal format for delivering course content. Students using the context-aware application expressed high satisfaction with the content delivery.
Despite the availability of numerous smartphone sensors and the data they provide, combining these sensor data yields only an approximate measure of an individual's smartphone usage. This limitation arises from the inherent constraints in capturing precise content from existing smartphone sensors. Although user context can be inferred from these sensor data, the inability to accurately interpret specific user actions introduces the risk of making erroneous assumptions about their context. This limitation presents a significant challenge in achieving a comprehensive understanding of users' smartphone interactions, necessitating further research to enhance the precision and accuracy of context inference in smartphone sensing applications.

Text corpora from smartphones
In addition to utilising existing smartphone sensors, analysing textual screen content on smartphones offers another avenue for understanding user context. The proliferation of text-based communication platforms and applications has generated a vast amount of textual data, which can provide valuable insights into users' contexts [15,33,121]. User-generated text, such as social media posts, can provide information about the user's personality traits, such as extraversion or agreeableness [5]. Additionally, the characteristics of written text can serve as indicators of demographics, such as younger individuals using more feeling-related words when describing artwork [3]. Therefore, analysing textual information, including browsing content and communications, holds the potential to enhance our understanding of users' behaviour and preferences.
Keystroke sensing can be used to analyse how individuals type on their smartphones, aiming to infer information about their context and behaviour. Tahir et al. [103] demonstrated that employing machine learning methods on keyboard character input and keystroke data achieves high accuracy in identifying emotions, such as happiness, sadness, and anger, from short text inputs. The number of negative words typed by a user has shown a strong correlation with increased perceived stress and reduced sleep duration [25], suggesting that the sentiment expressed in the text generated by an individual may serve as a predictor of their well-being. Furthermore, keystroke analysis has captured variances in semantic content across different texting platforms. For instance, based on the Linguistic Inquiry and Word Count (LIWC) 2022 dictionary, it was found that people prefer to share content such as books and songs, and discuss leisure activities on Facebook, whereas they were more task-oriented when communicating via SMS [71]. Analysing text generated from keyboard typing also allows for measuring mistakes made while typing and the timing between keystrokes. These metrics have been suggested to be indicators of stress, as individuals experiencing stress tend to type faster and make more errors [29]. Similarly, the accuracy and speed of typing on a smartphone keyboard can provide insights into abnormal upper extremity motor coordination, eye-hand coordination, and manual dexterity, which can be relevant in assessing certain conditions or diseases [48].
While keystroke sensing offers insights into how individuals produce textual content, it will not in general provide information about the types of content they consume or interact with. To address this, smartphone screenomes have emerged as a tool for capturing all textual interactions. Screenomes consist of sequential high-frequency screenshots taken every 5 seconds, capturing individuals' day-to-day digital experiences and facilitating analysis of their engagement with the digital environment [82]. The Screenomics framework studied screenomes extensively, employing optical character recognition (OCR) and image analysis to extract text and image content from the screenshots, respectively [82,85]. Textual information can be extracted from each screenshot with an accuracy of 74% using OpenCV for image pre-processing and Tesseract for OCR [85]. This allows for a comprehensive understanding of human behaviour on smartphones, including content viewing patterns across various categories and platforms [85], content engagement at different times of the day [22], and task-switching behaviours [119], among other directions of exploration. Screen-Life Capture is an open-source application that reduces the burden of collecting screenome data, allowing researchers to easily run their own screenome studies [122]. It provides the flexibility for participants to start or stop screen capturing as desired, ensuring their comfort and control over data tracking.
Extracting textual content from smartphone screenshots poses challenges due to the content being displayed in multiple fonts and font sizes, resulting in resource-intensive and often inaccurate extraction processes. For instance, Chiatti et al. [28] reported that icons can be incorrectly identified as characters, such as the Bluetooth icon being interpreted as a "$" sign. The additional time and computational requirements limit the scalability of studying screenomes for a larger population over an extended period, and the inaccuracies in text recognition can compromise the precision of observed trends. Additionally, the 5-second interval between screenome screenshots may miss shorter interactions such as scrolling through text, changing a song, or checking notifications. Given the decreasing attention spans of humans leading to content consumption changing more rapidly [101], the omission of even brief interaction windows could significantly impact the overall understanding of an individual's context.
To address these challenges, we developed a lightweight sensor to capture textual content on smartphone screens. Our sensor captures textual information directly from the screen in real-time, eliminating the need for additional machine learning processing. Data is collected whenever the sensor is activated, enabling the capture of even brief interactions. Compared to state-of-the-art methods, this approach improves both the accuracy and rate of data capture while reducing computational demands. We demonstrate the technology and capabilities of our sensor in the following sections.

METHODOLOGY

The Screen Text Sensor
We have developed a sensor to continuously capture screen text on Android smartphones. The sensor has been designed to be part of the AWARE-Light smartphone sensing system [106]. Built upon the AWARE framework [42], AWARE-Light enables the collection of data from various smartphone sensors, including keystrokes, screen status (locked/unlocked, on/off), and geolocation data. The application features a user interface that facilitates study sign-up, displays participant information, and provides a list of the sensors being captured once the participant joins the study. With the participant's consent, AWARE-Light passively captures data from the selected sensors without requiring further input or intervention from the participant. By default, the data is transmitted to our own MySQL database in the cloud. In keeping with AWARE-Light's open-source availability, our sensor is also openly available for other researchers to use and conduct studies. The sensor works in a similar way to screen-readers, but it stores the screen text instead of vocalising it. Specifically, we rely on Android's accessibility service, which notifies our software whenever the user interface updates. This can happen when users actively scroll on their phone, type, unlock their screen, or simply when the application updates its content.
When our software is notified of a screen update, we receive a data object that contains a tree representation of all on-screen UI elements. From this, we are able to extract the text of each UI element, the screen (x,y) coordinates of the UI element, and the precise timestamp when this was shown on the screen. Because our software is notified of all screen updates, it is often the case that although the screen has updated, the actual text has remained the same. This can lead to a large amount of duplicate data. We therefore introduced a filtering mechanism that discards screen updates if the text labels remain unchanged.
Due to the way AWARE-Light operates, the data from our sensor is "flattened" into a single long string. Every time the screen updates, we generate and store this string in a database. Because the information originally comes in the form of a tree, we also store special markers in the string to reflect the delimitation of UI elements.
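The capture, filter, and flatten steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the sensor's actual Android code: the `Node` class, the `collect_phrases`, `flatten`, and `ScreenLog` names, and the `text@left,top,right,bottom` layout are hypothetical; only the tree input, the unchanged-text filter, and the `||` delimiter come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom, in pixels


@dataclass
class Node:
    """Hypothetical stand-in for one UI element in the accessibility tree."""
    text: Optional[str]
    bounds: Box
    children: List["Node"] = field(default_factory=list)


def collect_phrases(node: Node) -> List[Tuple[str, Box]]:
    """Depth-first walk that keeps only the elements carrying text."""
    found = []
    if node.text:
        found.append((node.text, node.bounds))
    for child in node.children:
        found.extend(collect_phrases(child))
    return found


def flatten(root: Node) -> str:
    """Join every phrase and its bounding box into one delimited string.

    The "text@left,top,right,bottom" layout is an assumed format.
    """
    parts = []
    for text, (left, top, right, bottom) in collect_phrases(root):
        parts.append(f"{text}@{left},{top},{right},{bottom}")
    return "||".join(parts)


class ScreenLog:
    """Stores a screen only when its text labels differ from the previous one."""

    def __init__(self) -> None:
        self.rows: List[Tuple[float, str]] = []
        self._last_labels: Optional[List[str]] = None

    def on_update(self, timestamp: float, root: Node) -> bool:
        labels = [text for text, _ in collect_phrases(root)]
        if labels == self._last_labels:
            return False  # screen redrawn but text unchanged: discard
        self._last_labels = labels
        self.rows.append((timestamp, flatten(root)))
        return True
```

In this sketch the filter compares only the text labels, so a redraw that moves an element without changing its text is also discarded; whether the real sensor behaves this way is not stated in the text.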

Textual features
For this paper, we developed some further terminology to describe the data collected by this sensor. This is partly due to how the operating system provides us with the screen text data. To the best of our knowledge, there is a lack of established terminology to characterise such screen text. Hence, we propose a set of terms to describe screen text elements, which we refer to later in our analysis.
A screen refers to a single snapshot of the user's phone screen at any time. Each screen contains all text content present on the phone screen, which may include news articles, comment text, labels, or links in the navigation bar. The text for each screen is represented as one data entry, where it is concatenated into a single long string and delimited. A new screen is captured and stored only if its text content is different from the previous screen. Fig. 1 provides examples of screens and their contents. The operating system also provides us with the application that is associated with the screen (as a package name).
A phrase is a chunk of text within a screen. Defined within Android as a text node, this is the text that belongs to a UI element. A phrase can contain multiple words, and the length of a phrase can vary substantially since different applications use text differently. Phrases are visually distinct on screen, as demonstrated in Fig. 1. Our sensor does not attempt to make any inference regarding the relationship between phrases, and simply delimits them according to what the operating system provides us.
For each phrase, we record its screen (x,y) coordinates as provided by the operating system. Specifically, the coordinates are measured in pixels, and we record the top-left and bottom-right coordinates of the rectangular bounding box of the phrase. Fig. 2 demonstrates how each captured phrase and its position can be used to reconstruct the layout of text content on a screen.
Finally, all phrases in a screen are concatenated into one text string along with their positions. Each phrase and its position is delimited using double pipes (||) to facilitate individual analysis.
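To illustrate how such a delimited entry might be consumed during analysis, the sketch below splits one screen string back into phrases with bounding boxes and orders them top-to-bottom, left-to-right, in the spirit of the reconstruction in Fig. 2. Only the `||` delimiter is taken from the description; the `text@left,top,right,bottom` layout inside each chunk, and the `Phrase`, `parse_screen`, and `reading_order` names, are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Phrase(NamedTuple):
    text: str
    left: int
    top: int
    right: int
    bottom: int


def parse_screen(row: str) -> List[Phrase]:
    """Split one stored screen string into phrases with their bounding boxes.

    Assumes each chunk looks like "text@left,top,right,bottom" (hypothetical).
    """
    phrases = []
    for chunk in row.split("||"):
        if not chunk:
            continue
        # rpartition keeps any "@" inside the phrase text itself intact
        text, _, box = chunk.rpartition("@")
        left, top, right, bottom = (int(value) for value in box.split(","))
        phrases.append(Phrase(text, left, top, right, bottom))
    return phrases


def reading_order(phrases: List[Phrase]) -> List[Phrase]:
    """Order phrases top-to-bottom, then left-to-right, by their top-left corner."""
    return sorted(phrases, key=lambda p: (p.top, p.left))
```

A phrase whose own text contains `||` would be split incorrectly by this sketch; a real parser would need whatever escaping the stored format defines.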

Text metrics
We use a number of metrics to quantify patterns in the text data we have collected. We consider the total number of screens as a measure of "information update". This number depends on how often participants' phones update the information they display.
We also use the number of phrases per screen as a measure of "information density", and we use it to quantify the volume of information. There are multiple ways to quantify information density, including word count or character count. However, we decided to use phrases per screen because it captures the underlying complexity of the UIs that participants interact with.

Figure 2: Reconstruction of a screen and its phrases. The original screen is shown on the left, while the reconstruction is on the right.
We also use the metric of phrase difference to quantify "information churn", or a measure of how much the screen has changed. The phrase difference between two sequential screens is defined as the set of unique phrases appearing in only one of the two screens. Therefore, given we have screen $s_1$ with a set of phrases $P_1$ and screen $s_2$ with a set of phrases $P_2$, we can calculate the phrase difference $D$ using the following set difference formula:

$D = (P_1 \cup P_2) - (P_1 \cap P_2)$

For example, the phrase difference between a screen with the phrases ["A", "B"] and another screen with ["B", "C", "D"] contains 3 phrases, as each phrase in the set {"A", "C", "D"} appears in only one screen. Fig. 3 compares two screens and visualises their phrase difference. We use set difference as our measure because duplicate phrases within a screen are uncommon, hence discarding duplicates will have a negligible impact on measuring overall content change between screens.

We note that the metrics we present are independent. An application may have dense or sparse screens, which may update frequently or not, and those updates can bring about small or large changes.
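The phrase difference can be computed directly with set operations. The short sketch below mirrors the definition, union minus intersection, and reproduces the ["A", "B"] versus ["B", "C", "D"] example; the function name `phrase_difference` is ours and is not part of the sensor.

```python
from typing import Iterable, Set


def phrase_difference(previous: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Phrases that appear in exactly one of two sequential screens."""
    a, b = set(previous), set(current)  # duplicates within a screen are dropped
    return (a | b) - (a & b)


churn = phrase_difference(["A", "B"], ["B", "C", "D"])
print(sorted(churn), len(churn))
```

This is the symmetric difference of the two sets, so Python's built-in `a ^ b` gives the same result.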

Privacy implications
Collecting data from smartphone usage comes with privacy implications given the sensitive nature of the personal user data that is being collected. The need to accommodate user privacy preferences and secure user data is particularly salient with the screen text sensor, given the highly sensitive and revealing nature of its data. To address privacy concerns, we designed our screen text sensor to be easily disabled and re-enabled at the participants' discretion. The software also displays a persistent notification that informs the user whenever data collection is taking place, ensuring that participants are aware that their phone usage is being studied. These controls give participants control over the data they choose to share and when they share it, and the constant notification of data collection reminds them to disable any sensors they do not want tracked. Additionally, we implemented Android's built-in password-detection method to identify instances where participants were entering or viewing passwords. In such cases, we do not capture the corresponding screen text data. For studies that do not require all screen text, our sensor allows researchers to specify only certain applications to collect data from and exclude others. In doing so, participants have the assurance that sensitive information from specific apps or activities will not be recorded. This selective approach to data collection not only preserves participant confidentiality but also encourages broader participation in studies, as individuals may be more inclined to contribute data when they have control over which aspects of their smartphone interactions are being captured. While these measures may result in minor data loss, they are essential for minimising participant burden and safeguarding their privacy. Beyond the privacy and security features of AWARE-Light and the screen text sensor, it is incumbent upon study administrators to implement a setup that ensures the secure transmission and storage of screen text data.

EVALUATION
We conducted a field study to collect a comprehensive and realistic dataset comprising smartphone screen text and sensor data. Throughout a two-week study period, we collected textual data displayed on participants' smartphone screens. We also collected a range of additional sensor data to further inform our understanding of user context, including applications, battery, Bluetooth, communication (calls and SMS), keyboard, location, network, notifications, proximity, screen state, and Wi-Fi data. Participants used their smartphones in a natural way over the two-week period. We also used Experience Sampling to understand participants' activities, and questionnaires to gauge their privacy concerns.
The study was approved by the University of Melbourne's Office of Research Ethics and Integrity. Participants were provided with a compensation of $65 AUD upon successfully completing two weeks of data collection and a debriefing questionnaire.

Participants
We recruited 21 participants (10 male, 11 female) for our study (demographics listed in the Appendix). The participants were between the ages of 18 and 54, with a median age group of 25 to 34 years. Participants had a highest education level of current university student (n = 3), Bachelor's degree (n = 10), Master's degree (n = 7), or Doctoral degree (n = 1), with a median annual household income of $40,000 to $49,999 AUD. Participants reported a median daily phone usage time of 3 to 4 hours. We considered only English-speaking participants who regularly use their smartphones. We recruited only participants using Android devices (given that AWARE-Light runs only on Android), with an Android version from 9 to 13. Participants used a variety of devices from different manufacturers: Samsung (n = 10), Google (n = 5), OnePlus (n = 2), Huawei (n = 1), Motorola (n = 1), OPPO (n = 1), and Xiaomi (n = 1).

Experience Sampling Method
To gain a deeper understanding of user context, we collected Experience Sampling Method (ESM) responses from participants. ESM involves measuring participants' behaviour, thoughts, and feelings during their day-to-day activities through short questionnaires answered throughout the day [107]. These questionnaires are completed by users in their actual environment, more closely replicating natural behaviour compared to controlled lab studies [80,107]. The ESM questionnaire we employed aimed to capture user behaviour that may not be readily apparent in the sensor data, thus enabling a more comprehensive inference of context.
To understand this context, we asked two questions:
(1) Where are you right now?
Participants were presented with seven multiple-choice options (Fig. 4) and asked to select one that best represented their current location. The options included Home (Indoors), Home (Outdoors), School/Work (Indoors), School/Work (Outdoors), Outdoors (Not at home), Travelling in a vehicle, or Other. Participants who chose "Other" were further prompted to provide a brief description of their specific location.
(2) What are you doing right now?
Participants provided free-text responses (Fig. 5) by describing the primary activity they were engaged in during the five minutes preceding the questionnaire. Participants were asked to provide a single sentence to capture this information.
We designed our questionnaire to be brief and easy to complete to minimise participant fatigue. We aimed to avoid burdening participants with frequent or lengthy responses, as such requirements can cause annoyance [45]. Given our focus on capturing context from participants' self-reports in real-time, it was important for them to spend minimal time on answering the questions to ensure timely responses. The first question utilised predefined categories that represent common situations encountered in adults' daily lives [70]. This approach allowed participants to select from general location types without providing specific details. The second question asked participants to summarise their current activity in a single sentence. Categories were not provided for this question, as it is challenging to categorise the wide range of possible activities in which participants may be engaged.

Study procedure
Participants initially expressed their interest in the study by completing an expression of interest form. Following this, they received a Plain Language Statement that provided detailed information about the study, along with a consent form to be completed. Upon returning the consent form, participants were provided with an instruction sheet outlining the process of setting up the application on their smartphones, joining the study, and completing the ESM questionnaires. A three-day testing period followed to ensure compatibility of participants' smartphones with AWARE-Light and accurate capture of sensor data. Participants whose smartphones were incompatible due to accessibility setting issues (n = 2) or experienced force-closing of the application (n = 1) were excluded from the study. Following the testing period, 21 participants proceeded with the remainder of their two-week study period.
In addition to passive data collection, participants were prompted to complete ESM questionnaires five times a day, with two-hour intervals between 10 a.m. and 6 p.m. These questionnaires were delivered via notifications that remained visible for 15 minutes in the smartphones' notification menu. Considering that users tend to activate their phone screens within 15 minutes of inactivity [41], they were likely to encounter the questionnaire before the notification expired, even if it was received shortly after their screen was turned off. If the notification expired, participants were unable to access the questionnaire until the next scheduled time. The questionnaires were scheduled at the same time each day to capture potential changes in participant behaviour during consistent time periods across different days.
Upon completing the two-week study period, participants were asked to fill out a debriefing questionnaire. This questionnaire aimed to gather feedback on participants' experience during the study, including the ease of using the application, participants' comfort level with data collection from each sensor, reasons for turning off any sensors, and overall feedback on the study (see the Appendix for details). The study concluded with participants being reimbursed upon completion of this questionnaire.

RESULTS
In our study, we gathered a total of 7,004,867 screens, 135,414,272 phrases, and 4,264,737,110 characters. The most active participant viewed 1,076,069 screens and 13,941,219 phrases, while the least active participant had 25,200 screens and 158,555 phrases. On average, the time between screen updates was 0.87 seconds. The most "dense" screen recorded comprised 5,245 phrases and was 127,593 characters long. The screen text table has a total size of 12.69 GB, with each participant generating an average of 43.16 MB of screen text data per day. In contrast, we collected only 32,199,892 characters in keyboard input, which represents 0.76% of the characters captured from screen text. This highlights the significantly richer interactions that can be captured within screen text compared to keystroke data.
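As a quick consistency check, the derived figures above can be recomputed from the reported totals. The sketch below assumes decimal units (1 GB = 1000 MB) and that all 21 participants contributed the full 14 days; both are our assumptions rather than statements from the study, and the variable names are ours.

```python
# Totals as reported in the text
SCREENS = 7_004_867
PHRASES = 135_414_272
SCREEN_CHARS = 4_264_737_110
KEYBOARD_CHARS = 32_199_892
TABLE_SIZE_GB = 12.69
PARTICIPANTS = 21
STUDY_DAYS = 14  # assumption: every participant contributed the full two weeks

# Keyboard input as a percentage of all captured screen characters
keyboard_share_pct = 100 * KEYBOARD_CHARS / SCREEN_CHARS

# Average data volume per participant per day, assuming 1 GB = 1000 MB
mb_per_participant_day = TABLE_SIZE_GB * 1000 / (PARTICIPANTS * STUDY_DAYS)

# Overall mean information density (phrases per screen) across the dataset
mean_phrases_per_screen = PHRASES / SCREENS

print(f"{keyboard_share_pct:.2f}% keyboard share")
print(f"{mb_per_participant_day:.2f} MB per participant per day")
print(f"{mean_phrases_per_screen:.2f} phrases per screen")
```

Under these assumptions the first two values round to the 0.76% and 43.16 MB reported above.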

Screen text and user behaviour
Various techniques, including NLP methods (e.g., keyword extraction, sentiment analysis, named entity recognition, topic modelling), corpus linguistics (e.g., phraseology, lexicogrammar, register, formal language usage), and statistical analyses, can be employed to analyse text. Our focus is to present the applicability of these approaches to the collected data. The presented measures are aggregated across hours and days of the week, encompassing interaction frequency (total screens), information density (phrases per screen), and text sentiment.
Firstly, examining screen text patterns over time reveals individual user characteristics and general phone usage trends. Across all participants, the lowest number of screens is captured from 3 a.m. to 5 a.m., constituting 2.06% of total screens, while the highest occurs from 8 p.m. to 10 p.m., comprising 20.37% (Fig. 6a). While this trend is generally consistent, individual behaviours may differ. For instance, P5 peaks at 5 a.m. (9.03%) and from 6 p.m. to 8 p.m. (21.46% total) but shows minimal usage during daytime from 8 a.m. to 11 a.m. (3.92% total) (Fig. 6b). Examining daily trends (Fig. 7a), Thursday consistently records a higher number of screens than other days from 1 p.m. onward, and Friday peaks at 2 a.m. Individual patterns are discernible as well; for instance, P2 exhibits alternating increases and decreases in screens viewed from morning to evening, a pattern consistent across each day (Fig. 7b).
For all participants, the average number of phrases per screen, reflecting information density, peaks at 7 a.m., averaging 27.49 phrases. This is 2.17 times higher than the lowest average at 5 a.m., which stands at 12.68 phrases (Fig. 8a). This overarching trend is further elucidated for each participant. For instance, P12 exhibits a peak at 7 a.m. with an average of 49.15 phrases, and a similar peak at 10 p.m., averaging 48.19 phrases. Interestingly, this participant views an average of 4.87 phrases per screen from 12 a.m. to 6 a.m., constituting less than 10% of their peak at 7 a.m. (Fig. 8b). Exploring the hourly average number of phrases per screen across different days of the week reveals no noticeable similarities.
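The hourly measures reported above (share of total screens and average phrases per screen) can be derived with a simple aggregation. The following is a minimal Python sketch, not the analysis code used in the study; the input format (a Unix timestamp in seconds paired with the list of phrases on that screen) and the use of UTC hours are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_metrics(screens):
    """Aggregate screens by hour of day.

    `screens` is an iterable of (unix_timestamp_seconds, phrases) pairs,
    where `phrases` is a list of strings. This record layout is hypothetical.
    """
    screen_counts = defaultdict(int)
    phrase_totals = defaultdict(int)
    for timestamp, phrases in screens:
        # Assumption: hours are taken in UTC; a real analysis would
        # convert to each participant's local time zone first.
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
        screen_counts[hour] += 1
        phrase_totals[hour] += len(phrases)

    total_screens = sum(screen_counts.values())
    return {
        hour: {
            "share_of_screens": screen_counts[hour] / total_screens,
            "phrases_per_screen": phrase_totals[hour] / screen_counts[hour],
        }
        for hour in sorted(screen_counts)
    }
```

Grouping by weekday as well as hour would only require extending the dictionary key to a (weekday, hour) pair.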
To showcase the potential for further linguistic analysis, we conduct sentiment analysis on all screen text using the VADER (Valence Aware Dictionary and sEntiment Reasoner) model [52]. After pre-processing phrases by removing stop words, we concatenate all phrases of a screen into one string, tokenize the string, and apply the VADER model. The sentiment score ranges from -1 (most extreme negative) to 1 (most extreme positive), with 0 indicating neutral sentiment. The lowest average text sentiment occurs between 12 a.m. and 5 a.m., while sentiment remains stable for the rest of the day. All average hourly sentiments are positive (Fig. 9a). Some participants deviate substantially from this overall trend; for instance, P15 engages with content of positive sentiment mostly from 3 a.m. to 10 a.m. Overall, we observe that the average sentiment generally decreases from 12 a.m. to 5 a.m. on each day, then increases at 6 a.m. and fluctuates little throughout the day (Fig. 10). This trend is consistent across each day.

Screen text and app behaviour
Analysing screen text from various apps can help classify them based on the type and content of text they display. Using Android's package naming system, each unique package name is linked to a specific app. We employ the app categorisation dataset by Schoedel et al. [93] to group apps into categories like social media, transportation, and gaming. Apps not in this dataset are labelled as "NA". We demonstrate categorising apps based on "density" (phrases per screen) and "dynamics" (phrase difference). Additionally, we present aggregated sentiment analyses for each app, exploring commonly viewed content and its variations between users. These analyses highlight the semantics in screen text, offering insights into app usage and user behaviour.
The average number of phrases per screen and the average phrase difference for each app show a strong positive correlation, r(392) = .63, p < .001 (Fig. 11a). Some apps deviate from this trend.
Based on the VADER model, a sentiment score of below -0.05 is considered negative, between -0.05 and 0.05 is considered neutral, and above 0.05 is considered positive. Out of the 394 apps that participants used, 263 apps have a positive average sentiment, 108 apps have a neutral average sentiment, and 23 apps have a negative average sentiment. The SEEK app (job finding) has the highest average sentiment, at 0.96, whereas the Samsung Safety Information app has the lowest average sentiment, at -0.95 (Fig. 12a).
Each app is grouped into one of 20 categories based on its primary functionality, with the remaining apps grouped as "NA". Out of the 21 app categories, 19 categories have a positive average sentiment, and 2 categories have a neutral average sentiment. Career-related apps have the highest average sentiment, at 0.75, while Time-related apps have the lowest average sentiment, at -0.05 (Fig. 12b).
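The sentiment thresholds above translate directly into a labelling rule that can be applied to per-app or per-category averages. The sketch below is illustrative only: it assumes that compound scores have already been computed (e.g., by a VADER implementation), and the (package name, score) row format is hypothetical.

```python
from collections import defaultdict


def sentiment_label(score):
    """Map a compound score in [-1, 1] to a label using the thresholds
    quoted in the text: below -0.05 negative, above 0.05 positive."""
    if score > 0.05:
        return "positive"
    if score < -0.05:
        return "negative"
    return "neutral"


def mean_sentiment_by_app(rows):
    """Average pre-computed scores per app.

    `rows` is an iterable of (package_name, compound_score) pairs;
    this layout is an assumption made for the example.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for package_name, score in rows:
        totals[package_name] += score
        counts[package_name] += 1
    return {name: totals[name] / counts[name] for name in totals}
```

Applying `sentiment_label` to each value returned by `mean_sentiment_by_app` would yield counts of positive, neutral, and negative apps of the kind reported above.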
Identifying the most frequently occurring words within each app or app category allows for content categorisation and understanding individual preferences. Table 1 displays the top 10 highest-occurring words across all participants for a specific app and app category, along with a comparison between two participants (P8 and P11) for the same app. For instance, the SEEK app, a job-searching application, presents content about teaching and lecturing jobs, as well as jobs related to accounting and cybersecurity. Health-related apps contain information related to exercise (cal, km, distance, fitbit) and cardiovascular health (heart, rate), along with sleep. P8 predominantly uses the Spotify app (music) to find Taylor Swift's music, while P11 primarily views content related to rock music, including the Foo Fighters, Bon Jovi, and Led Zeppelin.
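A word-frequency ranking of this kind can be produced with a short counting routine. The following Python sketch is one possible approach rather than the procedure used in the study; the simple lower-casing tokeniser and the caller-supplied stop-word set are assumptions.

```python
import re
from collections import Counter


def top_words(phrases, n=10, stop_words=frozenset()):
    """Return the n most frequent words across a collection of phrases.

    Tokenisation here is a plain regular expression over lower-cased
    text, which is an illustrative simplification.
    """
    counts = Counter()
    for phrase in phrases:
        for word in re.findall(r"[a-z0-9']+", phrase.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts.most_common(n)
```

Calling the function once per app, per category, or per participant gives the separate rankings needed for a comparison between users of the same app.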

Screen text and other sensors
We further explore how our screen text data can be integrated with other sensor data to understand context from different perspectives.

Screen text across geographic locations.
We analyse geographic variations in screen text to uncover smartphone usage patterns in physical spaces. Using latitude and longitude data, we group locations within a 10-metre radius, reducing visual clutter while maintaining data precision. Fig. 13a, 13b, and 13c illustrate our findings.
Comparing a park (middle-bottom of the map) and a business area (left of the map), we note fewer recorded locations in the park, indicating higher activity in the business area. In the park, participants generally view neutral content (Fig. 13a). Conversely, the business area exhibits diverse sentiments, with positive clusters near the cinema and bar. Despite a higher screen count (Fig. 13b), the park sees fewer phrases per screen (Fig. 13c), suggesting frequent phone use during park traversal with minimal content consumption. In the business area, patterns are less distinct, but notable screen counts occur at the bar, restaurant, and road intersections.
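Grouping location fixes within a 10-metre radius can be done in several ways, and the exact procedure is not specified here. As one hedged illustration, the Python sketch below uses a greedy pass in which each point joins the first existing group whose centre lies within the radius, with distances computed by the haversine formula; both choices are assumptions rather than the method used in the study.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def group_locations(points, radius_m=10.0):
    """Greedily assign (lat, lon) points to groups.

    A point joins the first group whose centre (its first member) is
    within `radius_m`; otherwise it starts a new group.
    """
    groups = []
    for lat, lon in points:
        for group in groups:
            if haversine_m(lat, lon, *group["centre"]) <= radius_m:
                group["members"].append((lat, lon))
                break
        else:
            groups.append({"centre": (lat, lon), "members": [(lat, lon)]})
    return groups
```

Per-group measures such as screen count, phrases per screen, or average sentiment can then be computed over the screens recorded at each group's member locations.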

Screen text and ESM.
We analyse responses from the second question of our ESM questionnaires, categorising them into 11 categories detailed in the Appendix. For each ESM response, we retrieve the participant's screen text from the five minutes prior to receiving the questionnaire, allowing us to study the relationship between their activity and screen content.
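Retrieving the screens from the five minutes before each questionnaire amounts to a time-window lookup. A minimal Python sketch follows, assuming that a participant's screens are available as a list of (timestamp in seconds, text) pairs sorted by time; this layout and the function name are hypothetical.

```python
from bisect import bisect_left

WINDOW_SECONDS = 5 * 60  # five minutes, as described in the text


def screens_before(screens, esm_timestamp, window_seconds=WINDOW_SECONDS):
    """Return the screens recorded in the window
    [esm_timestamp - window_seconds, esm_timestamp).

    `screens` must be sorted by timestamp in ascending order.
    """
    timestamps = [timestamp for timestamp, _ in screens]
    start = bisect_left(timestamps, esm_timestamp - window_seconds)
    end = bisect_left(timestamps, esm_timestamp)
    return screens[start:end]
```

The returned slice can then be summarised per ESM category, for example by counting screens or averaging phrases per screen.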
The highest average number of screens viewed occurs during shopping, contrasting with the lowest during online searching (Fig. 17a). In terms of the average number of phrases per screen and average phrase difference, online searching and housework exhibit the highest and lowest values, respectively (Fig. 17b, 17c). Regarding sentiment, it peaks during housework and exercise, while it is lowest during socialising and rest (Fig. 17d).

Self-report data
We asked participants to provide self-report feedback on how they perceived smartphone sensing using the AWARE-Light application. Each question was answered on a scale from 1 (strongly disagree) to 7 (strongly agree); the details are provided in the Appendix. We find that it is generally easy for participants to use the app and they are comfortable using it. Participants also indicate that they tended to be more conscious of their smartphone use and were somewhat concerned about their privacy while sensing occurred. Given an opportunity in future studies, they generally would be interested in receiving daily feedback about their phone usage.
We also aimed to gain an understanding of how participants felt towards data collection from each sensor to see how our screen text sensor compares with existing sensors. Each question was answered on a scale from 1 (extremely uncomfortable) to 7 (extremely comfortable); the details are provided in the Appendix. Participants were generally comfortable with the battery, screen state, Bluetooth, proximity, Wi-Fi, network, and applications sensors, as these sensors reveal little information and cannot be used to identify users. The notifications, communication, and location sensors were perceived neutrally, as they could be used in conjunction with other sensors to gather information specific to a user. Participants felt slightly uncomfortable towards the screen text and keyboard sensors, likely due to the ability of these sensors to capture direct user interactions with their phone, which may include sensitive information.
During the study, 7 participants intermittently disabled specific sensors, while the remaining 14 participants chose to retain the default settings over the entire study period. The screen text sensor was most commonly disabled, being disabled by 4 participants for an average of 1.27 days during the study. The application and location sensors were disabled by 3 participants each, for an average of 5.08 hours and 9.33 hours, respectively. Participants were clearly told that they would not be penalised in any way if they chose to disable sensors.
Participants who disabled sensors generally did so due to privacy concerns when viewing sensitive content:
"During the time of launching certain private applications, I turned off all the sensors for a short time." - P5
Some participants who did not disable any sensors see potential benefits of smartphone sensing, or are less concerned about privacy due to also regularly using other devices:
"I did not turn off any sensors. Initially I was a bit hesitant about the sensing, and was conscious about it when I was using my phone. However, as time went on, the sensing stood out less to me and I became more comfortable with it. I guess if there was a meaningful end result to the sensing, such as detailed analyses of my app usage or phone habits, I would not mind having my data collected in this way." - P19
"I mostly use my phone for reading stuff on the internet and some messaging, but I also use my laptop a lot, so I didn't feel the need to turn off phone sensors, as I could always use my laptop for communication." - P10
We also gather general feedback from participants about the study, where they express being more aware of their smartphone use while taking part:
"In my opinion, it is a good study and encouraged me to be more aware of [my] smartphone use in everyday life." - P15

DISCUSSION
We ran this study to validate our screen text sensor and understand its behaviour "in-the-wild". We collected diverse screen text data from websites and mobile apps, which we believe is accurate and representative of actual smartphone use. Our study indicates that the data collected by the sensor is timely, precise, and complete. The data appears to be collected in real-time without any observable delays. We have visualised and analysed the collected data in a number of ways. The data appears to follow the daily, weekly, and semantic patterns that we expected without any notable anomalies. Furthermore, we find that data can be reliably associated with the applications that generate it, and our analyses show that the collected text matches our expectations.
In addition, we have shown that the sensor data can be reliably cross-referenced against any of the other sensor data that can be collected by AWARE-Light, including with ESM questionnaires. This opens up a fascinating range of possibilities for new types of experiments and field studies. At the same time, we also identify a number of challenges and limitations in deploying the sensor. We fully unpack these in our discussion.

Data richness
Unlike traditional time-based phone usage or screen time trackers, which offer only a broad indication of time spent on each app, analysing screen text provides richer insights into phone usage behaviour. Time-based trackers merely measure how long the phone screen is on and which apps are displayed [44], without considering actual phone use or idle periods. Moreover, they don't capture the level of user interaction, distinguishing between passive content consumption and active information generation. Our sensor measures total phone usage and interactions with greater precision. For instance, the time difference between screens serves as a metric for understanding the frequency of phone use. The total number of screens captured reflects the interactivity between a user and their phone, as a new screen is recorded only when the screen content changes.
Existing studies primarily focus on app usage to understand user behaviour [59,68,128,129]. However, user interactions within an app can vary significantly across categories, times of the day, and users [22], making it challenging to generalise behaviour from app usage alone. Analysing phrases allows for studying semantic meaning, understanding content engagement, and identifying patterns related to information consumption. Natural language processing can extract text sentiment to summarise the overall mood of user-interacted content. The number of phrases on a screen serves as an information density measure, indicating richer content. Examining total phrases viewed provides insights into user preferences for reading longer or shorter text. Task switching on mobile phones is often studied by analysing app switches [37,69]. However, the purpose of each app can vary, making it challenging to understand content changes as users move between screens. Examining phrase differences between screens helps discern newly generated or retained content, offering insights into user engagement with specific topics or activities that involve multiple applications. Phrase differences can also measure content diversity and change, with larger differences suggesting a broader range of content and potentially making the app more engaging for users.
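To make the distinction between newly generated and retained content concrete, the sketch below compares the phrases of two consecutive screens. It is an illustration under an assumed definition (a phrase counts as new if it did not appear on the previous screen) and should not be read as the exact phrase difference metric computed by the sensor.

```python
def phrase_difference(previous_phrases, current_phrases):
    """Split the current screen's phrases into new and retained ones.

    Assumed definition: a phrase is "retained" if the identical string
    appeared on the previous screen, and "new" otherwise.
    """
    previous = set(previous_phrases)
    new = [p for p in current_phrases if p not in previous]
    retained = [p for p in current_phrases if p in previous]
    return new, retained
```

Under this definition, the length of the first list could serve as a per-screen measure of content change, and averaging it per app would give an app-level "dynamics" value.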
Screenomes have been used to capture a rich array of smartphone interactions [85]. By capturing a screenshot every few seconds and applying OCR, screenomes can record text and image content viewed by users. Our screen text sensor captures all text within the screen's UI using Android's accessibility API, meaning that it may pick up noise due to encountering invisible text. Although OCR will not pick up invisible elements, its lower accuracy with recognising text inherently generates noise that can be difficult to identify and correct within a large dataset. Additionally, the need for screenomes to capture and perform OCR on screenshots imposes a limit on the rate of data collection and analysis, which commonly occurs in five-second intervals [82]. While the screen text sensor collects only text and does not capture images and videos, the efficiency of analysing plain text data enables real-time data collection and analysis, which can be applied to generate recommendations and interventions in situ. The screen text sensor can be effective in facilitating data collection and storage for studies on smartphone usage where text is the primary focus or there is a need for real-time analysis. On the other hand, studies that require capturing image and video content can be conducted using screenomes. Therefore, researchers should consider their study objectives when deciding on the most appropriate method.

Behaviour modelling
In our study, we showcase techniques for comprehending user and app behaviour on smartphones using screen text data.
Understanding user behaviour from smartphone use characteristics becomes feasible through screen text analysis. It sheds light on digital context, capturing sentiments and content frequency throughout the day. When integrated with other movement-based sensors or wearables, users can be profiled based on both digital interactions and physical context. Passive sensing, prevalent in digital health fields [24,98,115,124], allows unobtrusive data collection. Although existing technologies track physical behaviour, assessing mental well-being in situ remains challenging. The ubiquity of smartphone use allows screen text analysis to delve deeper into people's feelings in their natural setting. By combining screen text with other sensor data, we enhance context inference by considering multiple dimensions of behaviour. Existing sensors, such as accelerometer/gyroscope movement [84] and connectivity [26], capture physical surroundings and social context. The addition of smartphone and mobile context via screen text prompts critical questions:
• How do people's viewing habits vary across different physical locations?
• What types of content do people tend to view when commuting?
• How does proximity to others affect smartphone information consumption?
Answering these questions can advance research across fields. For instance, medical practitioners can identify patients' smartphone viewing habits and mobility information to detect anomalies [11,90]. Understanding consumer interests is vital for marketing, facilitated by screen text. Market research can analyse locations generating searches relevant to products and services [43]. Screen text analysis in education can profile students' study habits, enabling personalised learning [39,88]. Screen text attributes also allow inference of app behaviour. Tracing content across different apps reveals if certain apps are used together for specific tasks. For app developers, this insight can improve functionality by incorporating technologies from other apps or streamlining app flow for easier switching. Apps can be profiled based on displayed information and its rate of change. Social media apps may display digestible content with frequent updates, while government apps may offer informative content with less frequent changes. Comparing intended purposes with user interactions informs on-screen content delivery design.
Additionally, the sentiment of content within each app can inform usage patterns and user emotions. App usage between users can be examined to understand how the sentiment of content within one app can vary across individuals, representing differences in the types of social media or news viewed. Patterns in app-switching can be analysed in greater detail, including identifying how the sentiment of content viewed changes as users switch between apps. This could be beneficial in classifying whether certain apps are "complementary" in sentiment, where they are used consecutively to retain (e.g. using a "positive" app followed by another "positive" app) or alter mood (e.g. using a "negative" app followed by a "positive" app).

Further approaches to Screen Text analysis
Our analysis focuses primarily on validating our screen text sensor through understanding the properties of collected data. These properties and screen text metrics can inform a wide range of behaviours when examined with their corresponding semantic content. For example, analysing people's reading habits involves considering how much time and how often they interact with different types of content. More engagement, measured by the number of screens viewed, suggests a higher level of interest. Additionally, the thoroughness of a person's reading can be inferred based on the information volume and variability of the text, measured by the number of phrases per screen and phrase difference, respectively, as well as how long they spend on each screen. For instance, if a user rapidly browses through numerous screens featuring a substantial volume of content and frequent changes, it may suggest a lack of interest in the topic or a more cursory reading approach. Specifically, analysing screen text allows for studying the influence of natural reading habits on factors such as academic achievement [2,10] and social behaviour [87,113], as opposed to within a lab setting. App developers can also leverage this insight to enhance their app design. For example, they could analyse which content types maximise user retention and encourage prolonged reading. Additionally, understanding how to strategically position content to optimise scrolling and engagement can help craft an effective and user-friendly app interface.
Nevertheless, there are many possibilities for screen text analysis that extend beyond the methods we explore. Examining screen text metadata allows us to study on-screen textual features, encompassing aspects like UI structure and text sources. These features can be extended to broader contexts, facilitating exploration of phenomena such as infinite scrolling and differentiation between passively consumed and actively typed text. The semantics of screen text can be analysed for studying how people communicate within different scenarios. This may include studying workplace relations to see how people communicate with their managers in contrast to people they manage, which could be reflected in how their tone of speech changes [55]. Properties of formal versus informal communication can also be investigated, such as analysing the differences in average text length and common topics between email and text messaging. Furthermore, people's use of language in relation to particular aspects of their work, such as their perceptions of cybersecurity, can be assessed [56].
Additionally, bilingual or multilingual analysis can be conducted, aiming to understand whether people who interact with smartphone content in multiple languages view different information or perform different tasks based on the language they are using, such as reading product reviews [32] or using personal assistants [117]. These findings can help to distinguish nuances in smartphone use, allowing for greater accuracy in inferring the context of smartphone users. Broader questions about our use of language can also be investigated. For example, it is possible to study how the use of language varies across different age demographics [20], how language itself evolves (such as with the introduction of emojis [116]), and how ChatGPT-like services can be used in conjunction with our sensor to provide recommendations, summaries, and explanations [1].
While we have highlighted screen text metrics like phrase difference and sentiment, it is important to recognise the rich variety of metrics available. Each study requires an analysis of metrics tailored to the unique aspects of the studied phenomenon. For instance, in studies of workplace relations, researchers may focus on the tone of speech in texting using tools like Linguistic Inquiry and Word Count (LIWC) [104]. This approach involves designing metrics that precisely capture and differentiate between the nuances of various tones, which can inform behavioural inference. Conversely, explorations of how various age groups use language may narrow down to specific elements like slang [49] or emoji analysis [60], which could reveal distinctions between generations [23]. The broadness of our screen text sensor allows for the application of a wide range of analysis techniques and metrics, providing researchers with the flexibility to align their analyses with the needs of their studies.

User perceptions
Previous work has shown that people are willing to share data captured on their smartphones with scientists who are engaged in a worthwhile cause, and often this behaviour can be considered as a donation of data [72]. Struminskaya et al. [100] found that the most common reason for unwillingness in sharing data is due to privacy and anonymity concerns. To increase transparency and privacy in mobile sensing, platforms should provide control to users over what data they choose to share [17]. Our study sheds some light on participants' perceptions of the various sensors on their smartphones, including the screen text sensor that we have developed. On average, participants felt slightly uncomfortable with their screen text being captured. Although we did not collect passwords, some participants naturally felt more alert when viewing private content, which is consistent with previous studies [99].
However, participants appreciated the ability to disable sensors when they did not feel comfortable sharing their data and re-enable them once they had completed their current task. We note, though, that only one-third of the participants disabled any sensors during the study, demonstrating that they did not fully utilise this control [62]. We find that even though keyboard inputs are captured as part of screen text, people felt less comfortable on average with keyboard sensing. This could be due to the modality of interaction with each of these sensors influencing perception. For the keyboard sensor, "tracking" the action of typing on the keyboard may induce more caution for users as they are actively producing content. In contrast, viewing text on the phone screen may not necessarily carry this connotation, as users are generally engaging in a passive state of interaction.
The open-source nature of our sensor allows researchers to customise the tool based on their needs. We have demonstrated password-detection as one method for enhancing privacy, though there are many more that could be implemented using similar techniques based on the research context and the precision of data required. We emphasise that the range of collected data can vary based on the studied phenomenon, and often does not need to encompass all screen text. We have earlier mentioned the ability to configure AWARE-Light studies such that screen text data collection can be confined to nominated apps (or alternatively certain apps can be excluded). Beyond this configuration option, one could modify the AWARE-Light code such that filters are added, whereby only certain screen text content is collected and saved to the database. For example, studies related to fitness may capture only screen text containing keywords associated with fitness [108], which could indicate the frequency with which users engage in physical activity. Research on understanding user behaviour on Twitter could limit data collection to only the Twitter app. Despite this restriction, the screen text sensor enables the capture of various interactions, including creating, viewing, or re-posting posts. This capability enhances existing social media studies, enabling a more thorough investigation of user activity to understand individual behaviour, rather than solely analysing behaviour across all social media users [126]. Additionally, studies that do not require participant identification can further remove information from screen text data such as email addresses, phone numbers, and people's names as an additional security measure that can alleviate participant concerns [14].
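As a rough illustration of removing identifying information from collected text, the Python sketch below redacts email addresses and phone-number-like sequences with regular expressions. It is not part of AWARE-Light; the patterns are deliberately simple assumptions that will miss some formats and over-match others, and detecting people's names would require a different technique such as named entity recognition.

```python
import re

# Illustrative patterns only; real deployments would need broader,
# locale-aware rules and careful validation.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def redact(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    text = PHONE_PATTERN.sub("[PHONE]", text)
    return text
```

Such a filter could be applied either on the device before upload or as a post-processing step on the stored data, depending on the privacy requirements of the study.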

Lessons learned
Our study has provided valuable insights into the implementation and considerations associated with our novel sensor, and we aim to share key learnings for researchers intending to employ this sensor in their studies. Many of these insights were gained through extensive trial-and-error and technical troubleshooting, highlighting the need for careful consideration to ensure smooth and uninterrupted data collection.
Firstly, researchers should be mindful of the text variability displayed on smartphones, which can be quite diverse. One consideration is the database table configuration for storing screen text. We recommend using the MEDIUMTEXT or LONGTEXT data types for the screen text column due to potential encounters with large volumes of text, exceeding the limit of the default TEXT data type. This issue is particularly relevant when dealing with hidden text on web pages, where the 65,535-byte limit of TEXT may be exceeded, leading to data transfer blockages and preventing further data uploading to the database until this data is discarded. Utilising the MEDIUMTEXT (16,777,215-byte limit) or LONGTEXT (4,294,967,295-byte limit) types can address this limitation. As these limits are expressed in bytes, fewer characters fit when multi-byte characters are stored.
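To check in advance which column type a given screen would need, the stored size can be estimated from the UTF-8 encoded length of the text, since MySQL-style TEXT limits are defined in bytes. The following Python sketch is a hypothetical helper for such a check and is not part of AWARE-Light.

```python
# Maximum sizes in bytes for the MySQL-style TEXT family.
TEXT_TYPE_LIMITS = [
    ("TEXT", 2**16 - 1),        # 65,535 bytes
    ("MEDIUMTEXT", 2**24 - 1),  # 16,777,215 bytes
    ("LONGTEXT", 2**32 - 1),    # 4,294,967,295 bytes
]


def smallest_text_type(text):
    """Return the smallest TEXT-family type that can hold `text`.

    The size is measured as the UTF-8 encoded length, so characters
    such as emojis count as up to four bytes each.
    """
    size_in_bytes = len(text.encode("utf-8"))
    for type_name, limit in TEXT_TYPE_LIMITS:
        if size_in_bytes <= limit:
            return type_name
    raise ValueError("text exceeds the LONGTEXT limit")
```

For example, a screen of 127,593 single-byte characters, the longest reported in our data, would not fit in a TEXT column but would fit in MEDIUMTEXT.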
Additionally, the database table for storing screen text data should be configured using the utf8mb4 character set (4-Byte UTF-8 Unicode Encoding) to enable the capture of emojis, which is not possible with utf8mb3. For longitudinal studies or those with large sample sizes, researchers must consider database storage capacity. Our study, spanning two weeks and 21 participants, accumulated 12.69 GB in screen text data, underscoring the potential for substantial storage requirements in larger and longer-term sensing scenarios. It is advisable to calculate the total storage needed based on the study size and duration, allocating additional storage to accommodate potential outliers.
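As a back-of-the-envelope aid for this calculation, the sketch below scales the per-participant daily average observed in our study (12.69 GB over 21 participants and 14 days, i.e., about 43.16 MB per participant per day, using 1 GB = 1,000 MB) to another study size. The 25% headroom for outliers is an arbitrary illustrative value, and actual volumes will depend on the participants and apps involved.

```python
OBSERVED_MB_PER_PARTICIPANT_DAY = 43.16  # average reported in this study


def estimate_storage_gb(participants, days,
                        mb_per_participant_day=OBSERVED_MB_PER_PARTICIPANT_DAY,
                        headroom=0.25):
    """Estimate screen text storage in GB (decimal units).

    `headroom` is an assumed safety margin for heavy users; 0.25 adds 25%.
    """
    base_mb = participants * days * mb_per_participant_day
    return base_mb * (1 + headroom) / 1000
```

For a hypothetical study with 100 participants over 30 days, `estimate_storage_gb(100, 30)` gives roughly 162 GB including the headroom.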
Given that our sensor captures all screen text, researchers should be aware of the presence of non-printable ASCII characters, which may be retained using their ASCII codes (e.g. the null character is often represented as \x00). These characters, if unidentified, can cause errors during data analysis or when imported into a code editor. Precautions should be taken during the data analysis phase to screen for such anomalies and sanitise the data accordingly.
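One simple way to sanitise such data is to strip ASCII control characters before analysis. The Python sketch below is a minimal example of this step; the decision to keep tabs, line feeds, and carriage returns is an assumption that may not suit every analysis.

```python
import re

# ASCII control characters except tab (\x09), line feed (\x0a),
# and carriage return (\x0d), plus the delete character (\x7f).
CONTROL_CHARACTERS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitise(text, replacement=""):
    """Remove (or replace) non-printable ASCII control characters."""
    return CONTROL_CHARACTERS.sub(replacement, text)
```

Counting the matches of `CONTROL_CHARACTERS` per screen before removing them can also help to identify which apps produce such characters.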
To address participant discomfort regarding screen text collection, AWARE-Light provides researchers with the flexibility to configure which applications are included or excluded when capturing screen text. For instance, a study focusing on web browsing behaviour may choose to collect data solely from web browsers, excluding sensitive apps like messaging and banking apps. This selective approach can enhance participant comfort during the study.
Lastly, we acknowledge that our sensor may not be universally compatible with every Android version or smartphone make. The reliance on Android's accessibility services necessitates participant permission, but certain phones may automatically disable these services after a brief period. Although we instructed participants to disable battery optimisation, which is a primary cause for system-disabling of accessibility services, some phones do not allow for prolonged use of accessibility by third-party applications. To address this, we implemented a "trial" phase for potential participants, allowing us to monitor data uploads over a few days to confirm compatibility. This helps identify participants who may not upload data or upload incomplete data due to compatibility issues.

Limitations and Future Work
Our sensor only collects screen text provided it is available to the operating system as raw text. Unlike previous work, our sensor does not detect text within images or graphics. This potentially limits our data collection, but at the same time ensures that our sensor can be easily deployed to large numbers of participants without computational consequences.
Furthermore, our screen text sensor uses Android's accessibility services to capture text from phone screens. Therefore, we may capture hidden text within certain apps, mostly Internet browsing. For example, websites sometimes use hidden text either for accessibility purposes, or as a search engine optimisation strategy. Common methods for storing hidden text include positioning text beyond the screen limits of the page, matching the text colour with the background colour, or using a text font size of zero. Although they are invisible to users, the Android system still detects and stores these texts as they exist as on-screen text nodes. This may be misleading for analysis if a screen has large volumes of hidden text. Therefore, further work can be done to build on our screen text sensor to distinguish between visible and invisible text on each screen, such as inspecting each text node for properties influencing visibility. Similarly, we collect the rectangular boundary of each text node as its position. However, in some cases, it may be unclear where the text is located within the boundary box. Although all text is located within its boundaries, various styling such as centering, font sizes, and padding means that the text may not cover the entirety of its boundary box. Given that the boundaries of text nodes may overlap, the positions of some text may not be precisely inferred.
Additionally, some participants expressed that they changed their behaviour during the study [75]. We note that participants may have regulated their behaviour in a manner that appeared more socially desirable [123], meaning that we may not have captured the true behaviour of all participants in a natural setting. To increase participant privacy, we provided participants with control over enabling and disabling sensors. In doing so, participants who alter their behaviour or frequently disable sensors inherently reduce the ecological validity of their data, which our tool cannot prevent. To facilitate understanding of the reasons behind sensor deactivation, AWARE-Light logs instances when users enable or disable a sensor. This not only adds another valuable data point for analysis but also allows for exploring associations with other data. For instance, it can help identify patterns of behaviour preceding the users' decision to disable the screen text sensor or correlate such actions with changes observed in other sensors. Gaining further insights into why the screen text sensor is disabled can contribute to enhancing its design and better addressing privacy concerns. To mitigate behaviours that reduce ecological validity, studies employing our sensor should tailor the scope of data collection to the specific phenomena under investigation. By collecting only relevant and necessary information, clearly communicating the study's objectives to participants, and ensuring transparency in data collection methods, participants can feel more comfortable in their involvement, contributing to a more accurate representation of participants' natural behaviours.
Whilst AWARE-Light has been developed with security practices in mind, safeguarding the collected screen text data is ultimately the responsibility of those using this privacy-sensitive tool: both those who configure the data collection setup and the smartphone users who have AWARE-Light installed on their phones. For those responsible for configuring AWARE-Light instances, it is essential to set up databases so that they receive data from AWARE-Light over SSL-secured transmissions. At-rest encryption of the stored data is a further safeguard to consider. Smartphone users with AWARE-Light installed and configured to collect screen text data must be clearly informed of the nature of this sensor (including which apps screen text is being collected from) and of the option to disable the sensor if and when they do not want such data collected from their phone.
Overall, our study had a small sample size of 21 participants, the majority of whom were students and staff of our university. Therefore, the screen text we collected may largely contain themes that are representative of people working in a tertiary education environment. As our aim was to validate our screen text sensor, this may have narrowed the range of analyses we could illustrate. Collecting data from a larger and more diverse cohort would enable more reliable comparisons between participant groups and hypothesis testing. However, we believe that the size and demographics of our sample have little effect on how we validate our sensor's technological capabilities.

CONCLUSION
We have presented a study to validate our screen text sensor, which gathers text from smartphone interactions. Previous work has collected text from smartphones either by logging keyboard interactions or by taking intermittent screenshots and performing optical character recognition. Our work overcomes many limitations of these approaches by collecting screen text continuously, unobtrusively, and without the need for computer vision processing.
In this paper, we have validated the technology and capabilities of our screen text sensor in a field study with 21 participants over two weeks. We presented examples of analyses that can be conducted using screen text data, and we suggested multiple possible directions for further exploration. Given the ubiquity of screen text data, it can be effectively applied across various disciplines to investigate specific behaviours according to researchers' needs.
Hence, a wealth of opportunities exists for exploring the potential applications of screen text across different domains.

A PARTICIPANT DEMOGRAPHICS
In Table 4 we provide a summary of the participant demographics.

Figure 1: Examples of what a Screen and Phrase looks like on-screen. Each highlighted rectangle represents a single entity.

Figure 3: Illustration of how we calculate Phrase Difference between two screens. The highlighted phrases are the ones that differ between the two screens.

Figure 6: Total number of screens for each hour of the day. (a) Total number of screens for all participants. (b) Total number of screens for P5.

Figure 7: Total number of screens for each hour and day of the week. (a) Total number of screens for all participants. (b) Total number of screens for P2.

Figure 8: Phrases per screen for every hour of the day.

Figure 9: Sentiment for each hour of the day. (a) Sentiment for all participants. (b) Sentiment for P15.

Figure 10: Sentiment for each hour and day of the week (all participants).

Figure 11: Average number of phrases per screen vs. Average phrase difference. (a) All apps. Each data point represents one app. (b) All app categories. Each data point represents one app category.

Figure 13: Statistics per location in the study (selected subset).

(a) Average number of screens. (b) Average number of phrases per screen. (c) Average phrase difference. (d) Average sentiment.

Figure 16: Average statistics per Battery level. Error bars denote standard error (SE). (a) Average number of screens. (b) Average number of phrases per screen. (c) Average phrase difference. (d) Average sentiment per Battery Level.
Table 1: Top 10 words by occurrence, grouped by App and App Category.

Table 2: Self-report of perceptions when using the AWARE-Light application.

Table 3: Self-report of comfort for each sensor activation.