GazePrompt: Enhancing Low Vision People's Reading Experience with Gaze-Aware Augmentations

Reading is a challenging task for low vision people. While conventional low vision aids (e.g., magnification) offer certain support, they cannot fully address the difficulties faced by low vision users, such as locating the next line and distinguishing similar words. To fill this gap, we present GazePrompt, a gaze-aware reading aid that provides timely and targeted visual and audio augmentations based on users' gaze behaviors. GazePrompt includes two key features: (1) a Line-Switching support that highlights the line a reader intends to read; and (2) a Difficult-Word support that magnifies or reads aloud a word that the reader hesitates on. Through a study with 13 low vision participants who performed well-controlled reading-aloud tasks with and without GazePrompt, we found that GazePrompt significantly reduced participants' line switching time, reduced word recognition errors, and improved their subjective reading experiences. A follow-up silent-reading study showed that GazePrompt can enhance users' concentration and perceived comprehension of the reading content. We further derive design considerations for future gaze-based low vision aids.


INTRODUCTION
Low vision is a visual impairment that cannot be fully corrected by eyeglasses or contact lenses [70]. It encompasses a wide range of conditions, such as blurry vision, central vision loss, and peripheral vision loss, caused by cataracts, macular degeneration, glaucoma, and many other diseases. According to the WHO, at least 2.2 billion people worldwide have a near or distance vision impairment [106], and that number is projected to double in the next ten years [105].
Reading, as a major means of accessing information in everyday life, can be significantly affected by different low vision conditions. For example, people with blurry vision cannot distinguish small text, while people with tunnel vision (i.e., severe peripheral vision loss) have to scan letter by letter to perceive a single word. Different low vision aids have been designed to assist low vision people, including optical and video magnifiers for print reading [90,102] and built-in accessibility support on computers and smartphones for on-screen reading, such as screen magnification [7,64] and contrast enhancement [38]. These vision enhancements have also been incorporated into head-mounted displays (HMDs) to augment low vision people's residual vision in various visual tasks [24,112].
While conventional aids enable low vision users to read [56,69], they can also create new barriers [12,35,65,85,104]. For example, with screen magnification, low vision people still need more time to recognize words due to reduced visual span [54,85]. Visual field loss may also lead people to misidentify words because letters appear missing or distorted [104]. Moreover, screen magnification reduces a user's field of view, thereby increasing the difficulty of locating the next line when reading long passages [12,65,104].
To address these issues, we seek to improve low vision people's reading performance and experience using eye-tracking technology. Eye tracking is a promising solution due to its capability to recognize readers' low-level gaze behaviors and provide prompt assistance. Compared to conventional vision enhancements (e.g., magnification) that modify a user's full visual field, eye tracking presents the opportunity to render more targeted augmentations tailored to user behavior. Prior research has demonstrated early success in calibrating and collecting high-quality gaze data from low vision users with commercial eye trackers [104]. With the advance of eye-tracking technology, it is critical and timely to explore how to leverage this technology and design gaze-aware augmentations for low vision people.
We present GazePrompt, a gaze-aware system that provides visual and audio augmentations based on users' gaze behaviors to support low vision people in reading tasks. Inspired by the reading difficulties faced by low vision people [5,35,104], GazePrompt focuses on two features: (1) Line-Switching Support, which augments the line a user intends to read; two design alternatives are provided to accommodate low vision users' different visual abilities and preferences, including Line Highlighting along the whole line and an Arrow that points out the beginning of the line; and (2) Difficult-Word Support, which augments the word on which a user stares or hesitates for a long time; two design options are offered, including Text-to-Speech, which reads the difficult word aloud, and a Word Magnifier, which magnifies the difficult word. The features were iterated on and refined via a formative study with three low vision users. We evaluated GazePrompt with two studies: a well-controlled reading-aloud study with 13 low vision participants to evaluate the system's effectiveness quantitatively and qualitatively, and a freeform silent-reading study with another 13 low vision participants for a deeper qualitative understanding of GazePrompt's impact in a more realistic reading context (there was some participant overlap between the two studies). We seek to answer: (RQ1) Whether and how can GazePrompt improve low vision users' reading performance and behaviors (e.g., line switching time, line switching accuracy, number of misread words)? (RQ2) Whether and how can GazePrompt affect low vision users' subjective reading experience? (RQ3) What are low vision users' preferences on the augmentation design for each feature?
Our research shows that GazePrompt significantly reduced participants' line switching time and enabled more flexible page scrolling in the reading-aloud study. While the difference was not statistically significant, GazePrompt also reduced the total number of misread words across all low vision participants. The silent-reading study highlighted that GazePrompt can enhance users' concentration and perceived comprehension. Our research also revealed low vision users' different preferences on the augmentation design, from which we derive design considerations for future gaze-aware low vision aids.

RELATED WORK

Low Vision and Low Vision Aids
"Low vision" is a visual impairment that cannot be fully corrected by eye glasses or contact lenses [70].Low vision people experience a wide range of visual impairments, such as low visual acuity, visual field loss, low contrast sensitivity loss, and extreme light sensitivity [52], which lead to various visual challenges in daily activities, such as reading [55,97,112], navigating [96,111], and socializing [68,83].
Various assistive tools and technologies have been devised to support low vision people in daily tasks. Magnification is a cornerstone that supports people with low visual acuity, ranging from low-tech optical aids (e.g., handheld magnifiers and reading glasses) [102] to electronic devices, such as video magnifiers that magnify text or objects captured by a camera on a digital display [90,102]. Prior work has also integrated magnification into head-mounted displays (HMDs) to magnify real-world environments [95,112]. Many personal devices (e.g., computers, smartphones) today also provide screen magnification as a standard accessibility feature [7,64].
Despite its usefulness, magnification cannot address all reading barriers: low vision people still spend more time recognizing words due to reduced visual span [54,85], and people with visual field loss still have difficulty distinguishing similar words or identifying long words due to missing or distorted letters [104]. Moreover, magnification itself can bring new challenges [5,16,17,35,97,104]. For example, in a reading scenario, the decreased field of view reduces a user's reading speed [16,17]. The user also has to pan around with a mouse to reveal different portions of a page, making it difficult to locate the next line quickly and accurately [5,104] and increasing their cognitive load [97]. Researchers have proposed solutions to improve user experiences with screen magnification [8,10,99]; for example, Aydin et al. [8] developed an intelligent screen magnifier that optimally magnifies elements of interest in dynamic content, as identified by a video saliency model.
Besides magnification, other types of visual augmentations have been designed to compensate for different visual impairments [9,39,72,108,112]. For example, contrast enhancement is a common strategy to tackle low contrast sensitivity [18,95,112]. Modern video magnifiers (e.g., RUBY [90]) provide high-contrast color filters to enhance low vision users' visual experience. Researchers have also explored contour enhancement, which increases the contrast between objects and their background and has been effective for people with central vision loss [39,41,108]. A minified contour of a wider field can also be overlaid (a.k.a. multiplexing) onto one's central vision to expand the visual field of people with peripheral vision loss [58,74,108]. Moreover, color enhancement (e.g., changing certain colors) has been designed to improve color discrimination for people with color blindness [51,98].
In addition to image-processing-based enhancements that alter users' full field of view, researchers have started exploring more targeted visual cues based on specific contexts or tasks, such as navigation [29,39,109,110] and visual search [50,113]. For example, Zhao et al. [113] facilitated visual search tasks by rendering visual cues directly on the search targets. Fox et al. [29] explored the usability of different visual cues that highlight obstacles for low vision users in navigation. However, there has been limited research on tailored, context-aware visual augmentations for vital but challenging reading tasks. To our knowledge, the only relevant work is by Gowases et al. [31], who presented line highlighting or a pointer to label the next line, thus mitigating the loss of context caused by magnification. However, the feature's manual control method, using a mouse for magnification manipulation and a keyboard for highlight control, caused operational difficulty and increased cognitive load. Moreover, the feature was only evaluated with sighted people, without involving any low vision users.
As opposed to the manually controlled visual support methods of prior work [31], we seek to leverage eye-tracking technology to provide more intelligent, gaze-aware reading support tailored to low vision users' behaviors and intent, thus compensating for the drawbacks of existing low vision aids.

Gaze-Aware Technology
Eye tracking is a promising technique to enhance users' reading experience, and research efforts have been devoted to eye-tracking-based reading support. One important reading task is resuming from the previous reading position when switching attention between reading and other activities. Eye-tracking-based solutions have thus been developed to track and label the previous reading position [21,46,61]. For example, Mariakakis et al. [61] used eye tracking to detect when the user looks away from the phone and looks back, and highlighted the line where they left off to direct the user's attention. Besides locating lines, eye tracking has also been used to enhance comprehension [15,33,42]; for instance, Hyrskykari et al. [42] generated real-time text translation for foreign readers when difficult words were detected from readers' gaze patterns. Cheng et al. [15] designed a gaze-based reading annotation system that summarizes and shares a teacher's reading characteristics, such as reading speed, transitions between sections, and frequency of re-reading, to improve their students' reading comprehension. Moreover, beyond inferring readers' behaviors or intent, researchers have also used gaze as direct control to replace traditional input methods, such as the mouse and keyboard [14,53,91]. For example, Shakil et al. [91] developed a system that allows a user to control code navigation with gaze, such as performing "Go to Definition" by dwelling on a colored square on the side of the screen, and found that gaze control can improve code reading efficiency.
Eye-tracking-based augmentations can also benefit the reading performance of people with reading disabilities [57,93]. For example, Sibert et al. [93] proposed gaze-based auditory support that highlights words and pronounces them if the user pauses at a word for a relatively long duration. Lunte and Boll [57] designed a gaze-contingent reading assistant for children with reading difficulties that dynamically changes the color of letters according to the user's gaze position to improve their reading experience.
Besides reading, gaze-aware technology has also been designed to support other activities, such as collaborative work [89,100,103], social activities [66], and video conferencing [37,88,101]. For example, He et al. [37] developed a virtual conferencing system that conveys users' eye gaze direction to other people in a meeting through their profile picture, improving the engagement of meeting participants.
Although prior research has widely used eye tracking to enhance reading comprehension and efficiency across a variety of tasks, it has mainly focused on sighted people. Little research has investigated how this technology can be applied to assist low vision people in reading tasks.

Eye Tracking Research for Low Vision
Despite its potential, eye tracking research remains nascent in the low vision field. Compared to sighted people, low vision people may have different visual abilities, eye characteristics, and eye behaviors, which lead to low gaze estimation accuracy and high data loss in eye tracking [60,62,67,104]. As a result, eye-tracking technology has mostly been used in vision science and ophthalmology to simulate low vision conditions for sighted participants, collecting early empirical data from participants with "simulated low vision" [2,3,34,36]. The most commonly simulated condition is central scotoma (i.e., blind spots in one's visual field): by tracking a user's gaze, an artificial blind spot is rendered at the position they are looking at on a computer or HMD. For instance, to evaluate the effectiveness of a text-remapping aid for people with central vision loss (a method that re-renders blocked text around the scotoma), Gupta et al. [34] recruited 35 sighted participants who experienced a simulated gaze-contingent scotoma, finding that participants with the simulated scotoma read significantly faster with the remapping aid. Similarly, Aguilar and Castet [3] assessed a gaze-controlled magnifier by simulating a gaze-contingent scotoma for 10 sighted participants. While gaze-contingent low vision simulation allows easy data collection for preliminary test results, the findings are not guaranteed to transfer to low vision people due to the different viewing strategies between the two groups. For example, a person with central scotoma may have developed a preferred retinal locus (PRL) to replace the damaged fovea [78], which cannot be reflected by participants with a simulated scotoma.
In the field of Human-Computer Interaction (HCI), less research has investigated or leveraged eye-tracking technology for low vision people. Maus et al. [62] designed and evaluated a gaze-controlled screen magnifier with seven low vision participants and found that five of them demonstrated high data loss (>50%). Meanwhile, Ivanov et al. [44] attempted to study the gaze behaviors of people with peripheral vision loss in walking tasks via eye tracking, but 11 out of 25 participants failed calibration. Recently, Wang et al. [104] improved gaze calibration and data collection methods and obtained high-quality gaze data from low vision users that is comparable to that of sighted users, enabling researchers to investigate low vision people's gaze behaviors with a commercial eye tracker. They further analyzed and compared low vision and sighted participants' gaze data in reading tasks, revealing difficulties faced by low vision readers, such as tracking and locating a specific line and quickly identifying difficult words. Despite these early successes in collecting low vision people's gaze data, no research has explored how to design effective gaze-aware technologies to enhance their visual experiences. Our research fills this gap by designing, implementing, and evaluating GazePrompt.

GAZEPROMPT: SYSTEM DESIGN & IMPLEMENTATION
We designed and built GazePrompt, a gaze-aware system that generates visual and auditory augmentations based on a low vision user's gaze behaviors to facilitate reading. GazePrompt is a complement to existing low vision aids (e.g., screen magnification, contrast enhancement) that further enhances people's reading experience.
As such, we focus on addressing two key challenges that low vision people encounter in reading, even when using conventional low vision aids: line switching, which is especially difficult under high magnification [5,104], and difficult word recognition (e.g., visually similar words, long words) due to cut-off, missing, or distorted letters caused by vision loss [59,104]. We elaborate on our feature design for these two challenges (Section 3.1), the system implementation (Section 3.2), and our iteration with three low vision participants in a formative study (Section 3.3).

Feature Design
3.1.1 Line-Switching (LS) Support. To enable users to easily and accurately follow a line or locate the next line, we designed a line-switching support method that detects and augments the line of interest (LOI): the line that the user is reading or intends to read. Via eye-tracking data, we recognize the current line the user is focusing on as well as their line switching behaviors. We distinguish three behaviors and present augmentations accordingly: (1) when a user is following a line, the current line is augmented; (2) when the user finishes the current line and switches to the next line, the next line is augmented right away; and (3) when the user jumps to a different line (e.g., skipping lines or revisiting previous lines), the target line is augmented after the line jumping behavior stabilizes. The recognition algorithms are described in Section 3.2.3. We provide two augmentation options for different visual conditions and preferences:

Line Highlighting. Prior work has shown that highlighting can improve searching and reading performance [61,107], making it easier to locate a new line [31] and reducing cognitive load [71]. We thus highlight a LOI by changing its background color. Since low vision users usually need high contrast to read, our highlighting color is adaptive to the reading materials; by default, we use yellow (RGB [255,255,0]) to highlight black text on a white page and blue (RGB [0,0,255]) to highlight white text on a black page (Fig. 2a). We further allow users to customize the highlighting colors based on feedback from the formative study (Section 3.3).
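To make the adaptivity concrete, the following minimal Python sketch picks a default highlight color from the page background. The luminance formula (Rec. 709) and the midpoint threshold are our own illustrative assumptions, not GazePrompt's documented logic; only the yellow/blue defaults come from the paper.

```python
def default_highlight_color(background_rgb):
    """Pick a default highlight color for the line of interest.

    Light pages (black-on-white text) get yellow; dark pages
    (white-on-black) get blue, matching GazePrompt's defaults.
    The Rec. 709 luminance weights and the midpoint threshold
    are illustrative assumptions.
    """
    r, g, b = background_rgb
    luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return (255, 255, 0) if luminance > 127.5 else (0, 0, 255)

print(default_highlight_color((255, 255, 255)))  # white page -> yellow
print(default_highlight_color((0, 0, 0)))        # black page -> blue
```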
Arrow. Low vision users may not want the whole line to be highlighted, since highlighting reduces the contrast (black-white has the highest contrast) and may also distract the user [22]. We thus provide a relatively subtle design, labeling the beginning of a LOI with a high-contrast arrow, to help a low vision user locate the next line [31]. Similar to the highlighting augmentation, we use a blue arrow to create high contrast against a white background and a yellow arrow for a black background. Users also have the flexibility to customize the color (Fig. 2b).
3.1.2 Difficult-Word (DW) Support. To enable a low vision user to accurately and quickly recognize a word, we designed a difficult-word support method that detects and augments the word of interest (WOI): the word that the user is interested in but has difficulty recognizing. Based on eye-tracking data, we detect a word as a WOI when the user stares or hesitates on the same word for a long time (see implementation in Section 3.2.3). Two augmentation alternatives are provided for a WOI:

Word Magnifier. Since magnification is the most common method used by low vision people to see details, we magnify the WOI to the maximum magnification level supported by the screen magnifier to enable the user to thoroughly examine the word. The magnified version is rendered above the WOI (or below it if the word is close to the upper border of the display) to avoid blocking the reading context. The magnifier disappears when the user moves their eyes away from it. The word magnifier provides local magnification of the WOI while maintaining the global reading context, which could be useful for low vision users who do not prefer full-screen magnification (Fig. 2c).
Text-to-Speech. Besides visual augmentations, some low vision people prefer auditory feedback for complex information since it reduces their visual effort [110]. We thus designed an auditory augmentation that reads the WOI aloud to the user. A similar design has been applied to support reading for people with dyslexia and has proven effective [81,93].

Implementation
GazePrompt was implemented using a Tobii Pro Fusion (120 Hz) [79] screen-based eye tracker attached to the bottom of a computer display (24-inch, 1920 × 1200 resolution). We built the system through three steps: (1) improving the eye-tracking calibration for low vision users; (2) filtering and processing the gaze data; and (3) recognizing gaze behaviors. We describe the implementation of each step below.

3.2.1 Gaze Calibration & Collection. Low vision users experience inaccurate recognition and high data loss with eye trackers due to inaccessible calibration and data collection methods [62,67]. To address this issue, we adopted an adjustable calibration interface (i.e., the calibration target size was adjusted based on the user's visual ability) and dominant-eye-based data collection (i.e., gaze data collection focused on the user's dominant eye if there was one), following guidance from prior literature [104]. We used a 14-dot calibration interface, the maximum target number supported by the Tobii Pro SDK, to achieve high calibration granularity for low vision users (Fig. 3a), followed by a 5-dot validation interface (Fig. 3b).
Moreover, since eye tracking suffers from vertical drift (the vertical coordinate of the estimated gaze position becomes less accurate over time [13,92]), we improved the calibration process with a line-based correction after the 14-dot calibration. Specifically, we rendered a target stimulus (a white solid circle with a black dot in the center) that moved horizontally along a line from the left side of the screen to the right. We instructed the user to keep tracking the target with their gaze until it disappeared (Fig. 3c). The target was the same size as the ones used in the 14-dot calibration. The same process was repeated five times, with the vertical position of the calibrated line shifting down by 20% of the screen height each time (Fig. 3d). We collected the user's gaze data as they tracked each line and calculated the mean vertical gaze offset from the line as the vertical drift of that line. The system then interpolated the vertical drift between adjacent calibrated lines and corrected the drift across the entire screen. A validation interface with another four horizontally moving targets was then presented to evaluate the vertical correction (Fig. 3e).
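The interpolation step can be sketched in a few lines of Python. The sketch below assumes the five per-line drift estimates have already been measured; the function names are hypothetical, and linear interpolation (with the nearest line's drift reused beyond the first and last lines) is our assumption, since the text does not specify the interpolation scheme.

```python
import numpy as np

def build_drift_corrector(line_ys, line_drifts):
    """Given the vertical pixel positions of the calibrated lines and the
    mean vertical gaze offset measured on each, return a function that
    corrects the y-coordinate of any gaze sample on the screen."""
    line_ys = np.asarray(line_ys, dtype=float)
    line_drifts = np.asarray(line_drifts, dtype=float)

    def correct(gaze_y):
        # np.interp linearly interpolates between adjacent calibrated lines
        # and clamps to the end values outside the calibrated range.
        return gaze_y - np.interp(gaze_y, line_ys, line_drifts)

    return correct

# Example: five lines spaced 20% of a 1200 px screen apart, with
# illustrative drift measurements in pixels.
corrector = build_drift_corrector(
    line_ys=[120, 360, 600, 840, 1080],
    line_drifts=[-5.0, -2.0, 3.0, 8.0, 12.0],
)
print(corrector(480))  # corrected y for a raw gaze sample at y = 480
```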

3.2.2 Gaze Data Filtering and Processing. The gaze data was retrieved via the Tobii Pro SDK [80] in Python. We filtered the data by removing noise and outliers on the fly. We then converted the user's continuous gaze data into a series of fixations (i.e., short pauses of gaze during reading to process information [87]) via a real-time fixation detection algorithm [25,48] for further gaze behavior detection. We set up a Flask-SocketIO [32] server to process the gaze data and enable bi-directional, low-latency communication between the server and the system user interface.
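For illustration, below is a minimal dispersion-threshold (I-DT-style) fixation detector in Python. GazePrompt's actual algorithm follows [25,48]; the thresholds, the Fixation structure, and the grouping details here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float        # mean gaze x over the fixation (pixels)
    y: float        # mean gaze y over the fixation (pixels)
    start: float    # start time (seconds)
    end: float      # end time (seconds)

def detect_fixations(samples, max_dispersion=35.0, min_duration=0.1):
    """samples: chronologically ordered (t, x, y) gaze points, already
    noise-filtered. Consecutive samples whose summed x/y spread stays
    under max_dispersion (px) for at least min_duration (s) form a fixation."""
    fixations, window = [], []
    for t, x, y in samples:
        window.append((t, x, y))
        xs = [p[1] for p in window]
        ys = [p[2] for p in window]
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
            pts = window[:-1]  # the newest sample broke the dispersion bound
            if pts and pts[-1][0] - pts[0][0] >= min_duration:
                fixations.append(Fixation(
                    x=sum(p[1] for p in pts) / len(pts),
                    y=sum(p[2] for p in pts) / len(pts),
                    start=pts[0][0], end=pts[-1][0]))
            window = [window[-1]]  # restart from the breaking sample
    return fixations
```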

3.2.3 Gaze Behavior Recognition. We then recognized high-level gaze behaviors from the fixation sequence. Specifically, we identified the current reading line and line switching behaviors for the LS support, and we recognized hesitation on words for the DW support.
Line Identification. Considering the inherent uncertainty of eye tracking, we used a weighted voting mechanism [45] to identify the line that a user is reading based on their latest three fixations. We defined the space of a line with a bounding box that wrapped the line of text, so that a line $l$ can be defined by the top and bottom border positions of its bounding box $(l_{top}, l_{bottom})$, as shown in Fig. 4. For each fixation $f(f_x, f_y)$, we first calculated its landing line by identifying the line that had the smallest vertical distance to the fixation. The landing line thus represented the "vote" of the fixation. We then calculated the weight $w$ of the fixation: $w = \frac{1}{1 + |d|}$, where $d$ represented the normalized distance between the fixation and its landing line, so that the smaller the normalized distance, the more weight a fixation had. The normalized distance was defined as $d = \frac{f_y - c}{0.5h}$, where $c = \frac{l_{top} + l_{bottom}}{2}$ and $h = l_{bottom} - l_{top}$. Based on the vote and weight (i.e., number of votes) of each fixation, we determined the final landing line as the line that received the most votes from the latest three fixations.
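The voting scheme translates directly into code. The sketch below follows the formulas above, reusing the Fixation structure from the earlier sketch; breaking weighted-vote ties arbitrarily is our assumption.

```python
def identify_line(latest_fixations, lines):
    """latest_fixations: the last three Fixation objects.
    lines: list of (top, bottom) bounding-box borders, one per text line.
    Returns the index of the line with the highest weighted vote."""
    votes = {}
    for f in latest_fixations:
        # Landing line: the line with the smallest vertical distance to f.
        def vertical_distance(borders):
            top, bottom = borders
            if top <= f.y <= bottom:
                return 0.0
            return min(abs(f.y - top), abs(f.y - bottom))
        li = min(range(len(lines)), key=lambda i: vertical_distance(lines[i]))
        top, bottom = lines[li]
        center, height = (top + bottom) / 2, bottom - top
        d = (f.y - center) / (0.5 * height)  # normalized distance
        w = 1 / (1 + abs(d))                 # closer fixations weigh more
        votes[li] = votes.get(li, 0.0) + w
    return max(votes, key=votes.get)
```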
Line Switching Behaviors. We detected a line switching behavior via a return sweep: fast eye movements that shift focus from the end of one line to the beginning of the next [82]. Suppose $f_1(f_{1x}, f_{1y})$ and $f_2(f_{2x}, f_{2y})$ are two adjacent fixations. We determined a return sweep based on the following criteria: (1) the horizontal distance between the two fixations should be greater than a threshold $T_0$, i.e., $f_{1x} - f_{2x} > T_0$; (2) the later fixation should land on the left portion of the page, around the beginning of a line, i.e., $f_{2x} < T_1$; and (3) the two fixations should be at least one line apart vertically, i.e., $f_{2y} - f_{1y} > T_2$. Inspired by prior work on line switching detection [13], we set $T_0$ to 500, $T_1$ to one third of the text width, and $T_2$ to the line height. When no return sweep was detected, we distinguished between a line following behavior and a line jumping behavior (i.e., jumping to a different line without a return sweep). Specifically, if the line identification result remained the same, we assumed that the user was following a line; if the line identification result changed and the change remained stable for three consecutive fixations, we treated it as a line jumping behavior.
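The three return-sweep criteria combine into a single predicate; a sketch with the paper's thresholds as defaults (interpreting the 500 threshold as pixels is our assumption):

```python
def is_return_sweep(f1, f2, text_width, line_height, t0=500):
    """f1, f2: two adjacent Fixation objects, with f1 earlier than f2.
    Detects the fast end-of-line to start-of-next-line eye movement."""
    return (f1.x - f2.x > t0 and           # (1) large leftward jump
            f2.x < text_width / 3 and      # (2) lands near a line beginning
            f2.y - f1.y > line_height)     # (3) at least one line lower
```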
Difficult Word Detection. Fixation duration on a word is positively correlated with the difficulty of information processing, and both the first fixation duration and the total fixation duration on a word can indicate the difficulty of recognizing the word [43,82]. Moreover, the number of re-fixations (i.e., fixations other than the first) on a word indicates the amount of adjustment towards the optimal viewing location [82], which is also related to word recognition difficulty. Therefore, our system detected word recognition difficulty when any of the following happened: (1) the first fixation on a word was longer than a threshold $T_{f0}$; (2) the number of re-fixations on the word was greater than a threshold $T_{r1}$; or (3) the total fixation duration on the word was greater than a threshold $T_{f2}$. Note that, to facilitate real-time difficult word detection, we only considered fixations on a word in one pass (i.e., consecutive fixations on a word before leaving for other words). We empirically determined the thresholds as $T_{f0} = 500$ ms, $T_{r1} = 4$, and $T_{f2} = 1500$ ms according to the data collected in our prior work [104].
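The detection rule is a simple disjunction of the three conditions; a sketch with the paper's thresholds as defaults (expressed here in milliseconds):

```python
def is_difficult_word(first_fix_ms, n_refixations, total_fix_ms,
                      tf0=500, tr1=4, tf2=1500):
    """One-pass fixation statistics for the word currently being read.
    Defaults follow the paper: Tf0 = 500 ms (first fixation),
    Tr1 = 4 re-fixations, Tf2 = 1500 ms (total fixation duration);
    Tf0 and Tf2 are user-adjustable (see Section 3.3)."""
    return (first_fix_ms > tf0 or
            n_refixations > tr1 or
            total_fix_ms > tf2)
```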
Building upon the gaze behavior recognition, GazePrompt rendered the corresponding augmentations on a web-based reading interface. We built the interface using React [94].

Design Iteration via a Formative Study
Following a user-centered design approach [1], we conducted a formative study with three low vision participants (P1-P3 in Table 1) to iterate on the design of GazePrompt. We introduced GazePrompt's two features and their two design alternatives to participants. They then freely read short passages with our system until they were familiar with each feature. The passages were magnified and adjusted to the most suitable contrast to ensure readability for participants.
We then asked about their experience with the two features and how they wanted to improve them. We analyzed their responses qualitatively and refined our feature design accordingly:

More Color Selection for Line-Switching Support. All three participants pointed out that the default color options (i.e., yellow or blue) in the LS support were not their preferred colors; for example, P2 felt that yellow was too bright, expediting eye strain. We thus improved our system by introducing more color options. To simplify color adjustment, our color selection procedure adopted the HSL (hue, saturation, and lightness) model [107], where users can choose their preferred hue and lightness with the saturation remaining at 100% (Fig. 2d).
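Converting a user's hue/lightness choice to an RGB highlight color is straightforward with Python's standard library; a small sketch (note that colorsys takes arguments in HLS order):

```python
import colorsys

def highlight_rgb(hue_deg, lightness):
    """Map a user-selected hue (0-360 degrees) and lightness (0-1) to RGB,
    keeping saturation fixed at 100% as in GazePrompt's color picker."""
    r, g, b = colorsys.hls_to_rgb(hue_deg / 360.0, lightness, 1.0)
    return tuple(round(c * 255) for c in (r, g, b))

print(highlight_rgb(60, 0.5))   # pure yellow: (255, 255, 0)
print(highlight_rgb(240, 0.5))  # pure blue: (0, 0, 255)
```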
Adjustable Triggering Threshold for Difficult-Word Support. Participants preferred different triggering times for the DW support due to their different visual abilities and reading habits. P3 mentioned that the feature was triggered too late, while P2 felt it was triggered too frequently. Therefore, we made the fixation duration thresholds $T_{f0}$ and $T_{f2}$ adjustable, with 500 ms and 1500 ms as the starting points, allowing users to adjust each value up or down in steps of 50 ms and 250 ms, respectively.

STUDY I: READING ALOUD
We first evaluate the effectiveness of the two features in GazePrompt (RQ1) in a reading-aloud task, which helps us better control participants' focus, reducing the confounding effect of content comprehension on their gaze behaviors [104]. We also explore low vision users' subjective experiences (RQ2) and customization preferences (RQ3) when using GazePrompt. To assess the feature effectiveness quantitatively, we compare participants' reading behaviors between using GazePrompt and not using GazePrompt (the baseline). Since GazePrompt is developed as a complement to existing low vision aids (e.g., increasing font size) instead of a replacement, we allowed participants to change the font size and use contrast enhancement in both conditions. We address two sets of hypotheses: the LS support reduces participants' line switching time (H1.1) and improves their line switching accuracy (H1.2); the DW support reduces the maximum one-pass fixation time on a word (H2.1) and the number of misread words (H2.2).

4.1.1 Participants. We recruited participants from a local low vision rehabilitation service. Before a potential participant was recruited, we conducted a screening interview via phone or email to ensure they were eligible for the study. A participant was eligible if they were at least 18 years old and had low vision but were still able to use their residual vision to read. All participants completed the study without glasses. Participants were compensated $20 per hour and were reimbursed for travel expenses.
4.1.2 Procedure. We conducted a single-session study in a well-lit lab. The study lasted 1.5 to 2 hours and included the following phases:

Initial Interview & Visual Acuity Test. After obtaining participants' consent, we interviewed them about their demographic information, visual condition, challenges with daily reading, and experience with assistive technology. We then measured their visual acuity using letter-size ETDRS 1 and ETDRS 2 logMAR charts [26]. Participants were instructed to sit four feet from the eye chart; for those who could not see the largest row on the chart, we tested their visual acuity at two feet. They were asked to read chart 1 with their left eye covered and then read chart 2 with their right eye covered. We recorded the smallest line on which participants could correctly recognize at least three out of five letters. Our visual acuity test covered a range from 20/10 to 20/400; we used self-reported visual acuity for participants whose acuity fell outside this range (Table 1).
Gaze Calibration & Validation. We then conducted gaze calibration with participants, including both the 14-dot calibration and the line-based correction, as described in Section 3.2.1. Participants sat in front of a computer screen with an eye tracker. After adjusting their position to achieve a horizontal distance of 65 cm to the screen, participants were instructed to sit straight with their back touching the back of the chair to maintain that position. We then adjusted the screen height to align the participant's eye level with the center of the screen. Participants were asked to keep their body still during the study, but small head movements were allowed when necessary.
Before calibration, we adjusted the calibration target size for participants until they could locate the center of the target (the black dot) without squinting. They then completed the 14-dot calibration. A 5-dot validation followed, which collected the participants' calibrated gaze data while they stared at five targets and calculated the accuracy. Finally, participants completed the line-based correction along with a 4-line validation, where we collected their corrected gaze data while they tracked the moving target along four lines and calculated the accuracy. Based on the validation results, we decided which eye's data to use (left, right, or average) following the dominant-eye-based data collection [104], as well as whether or not to apply the line-based correction.
Tutorial & Customization. After calibration, we conducted a tutorial session to familiarize participants with GazePrompt and allow them to customize the features. First, we showed participants an example passage and adjusted the font size, font weight, and color so that they could read comfortably without squinting. We then introduced the LS and DW support in GazePrompt. For each feature, we demonstrated the two design alternatives and asked participants to freely experience the feature on the example passage. During the exploration, participants customized the feature, including the augmentation color for the LS support and the triggering time threshold for the DW support, until they were fully comfortable and familiar with it. We also asked for participants' feedback and suggestions on each design alternative. Finally, we asked participants to select their preferred design alternative for each feature to use in the following reading tasks.
Reading Tasks. Participants performed multiple trials of reading-aloud tasks in four conditions: (1) without GazePrompt as the baseline, (2) LS support only, (3) DW support only, and (4) both LS and DW support. In all conditions, participants adjusted the reading content to their preferred magnification level and contrast to simulate their daily reading setup. Participants were instructed to read aloud as quickly and accurately as possible without needing to comprehend the content. We counterbalanced the four conditions using a Latin Square [11]. Ten participants read two passages per condition, while the other three (P9, P11, P14) read only one passage per condition due to time constraints. We randomized the passages across conditions and collected participants' gaze data in all reading tasks. The passages were selected from the CLEAR corpus [20]. We first filtered for passages at a sixth-grade reading difficulty using Flesch Reading Ease [27], and then calculated the cosine similarity between passages based on their Flesch-Kincaid Grade Level [47], Automated Readability Index [47], SMOG [63], and word count. As a result, we selected 20 passages with similar difficulty. Eight passages served as the default passages in our study, and the rest were used as backup passages to handle particular situations, such as data loss due to eye tracking failure or system errors.
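To illustrate the passage-screening pipeline, here is a sketch using the third-party textstat package for the readability indices. The Flesch Reading Ease band used for sixth-grade difficulty (80-90, the conventional mapping) and the use of raw, unnormalized feature vectors are our assumptions; the paper does not specify these details.

```python
import numpy as np
import textstat  # third-party package providing standard readability indices

def features(passage):
    """Feature vector used to compare passages, per the paper's criteria."""
    return np.array([
        textstat.flesch_kincaid_grade(passage),
        textstat.automated_readability_index(passage),
        textstat.smog_index(passage),
        len(passage.split()),  # word count
    ], dtype=float)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

passages = []  # fill with CLEAR corpus passages
# Step 1: keep passages in the (assumed) sixth-grade Flesch Reading Ease band.
sixth_grade = [p for p in passages
               if 80 <= textstat.flesch_reading_ease(p) <= 90]
# Step 2: rank passage pairs by feature-vector cosine similarity.
pairs = sorted(
    ((cosine_similarity(features(p), features(q)), p, q)
     for i, p in enumerate(sixth_grade) for q in sixth_grade[i + 1:]),
    key=lambda t: t[0], reverse=True)
```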
Exit Interview. We ended the study with a semi-structured interview about participants' reading experiences with GazePrompt. Participants rated the effectiveness, distraction level, and comfort level of the reading interface in the four conditions on a scale of 1 (strongly negative) to 7 (strongly positive). We also asked them to score the perceived accuracy and learnability of the LS support and DW support, respectively. Participants then discussed their suggestions for improvement and their attitudes towards gaze control versus traditional manual control.

Analysis
We collected both quantitative and qualitative data. We first describe our quantitative analysis and then our qualitative analysis.

4.2.1 Effectiveness of LS Support.
We first evaluated the effectiveness of the LS support with three measures: (1) line switching time: the time between the last fixation at the end of the prior line and the first fixation that successfully lands on the next line, followed by a non-backward saccade along that line; we calculated participants' mean line switching time across all lines per passage. (2) Frequency of line switching deviation: we define a line switching deviation as an event in which a user finishes reading the prior line and intends to switch to the next, but their fixations accidentally land on a "wrong" line; we counted the total number of such events per passage and calculated the frequency by dividing by the total number of lines in the passage. (3) Magnitude of line switching deviation: the distance between the wrong line and the target line in each deviation; we calculated the mean deviation magnitude per passage.
We had one within-subject factor, Condition, with two levels: without GazePrompt (Baseline) and using LS support only. To investigate the effect of visual conditions, we involved two between-subject factors: VisualAcuity with two levels (Low, High), split at 20/100 in the better eye [104], and PeripheralVision with two levels (Limited, Intact), based on participants' self-reported visual field. To validate the counterbalancing, we involved another within-subject factor, Order, and found no significant effect of Order on any measure. We checked the normality of each measure using the Shapiro-Wilk test. If a measure was normally distributed, we fitted our data with a Linear Mixed-Effects (LME) model and calculated the ANOVA table to obtain p-values for the fixed effects [49]; Tukey's HSD was then used for post-hoc comparison if significance was found on an interaction between factors. Otherwise, we used Aligned Rank Transform (ART) ANOVA and the ART contrast test for post-hoc comparison [23]. We used partial eta squared ($\eta_p^2$) to calculate effect size, with 0.01, 0.06, and 0.14 representing the thresholds of small, medium, and large effects [19].
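For readers who want to reproduce this pipeline, a rough Python sketch of the normality check and the LME branch is given below; the DataFrame column names are hypothetical. Note that ART ANOVA is most commonly run with the R package ARTool, so we do not attempt it here.

```python
from scipy.stats import shapiro
import statsmodels.formula.api as smf

def analyze(df, measure="switch_time"):
    """df: a pandas DataFrame with one row per passage reading and
    (hypothetical) columns 'participant', 'condition', and the measure."""
    _, p = shapiro(df[measure])
    if p > 0.05:
        # Normally distributed: Linear Mixed-Effects model with
        # participant as the grouping (random-effect) factor.
        model = smf.mixedlm(f"{measure} ~ condition", df,
                            groups=df["participant"]).fit()
        print(model.summary())
    else:
        # Non-normal: the paper used Aligned Rank Transform (ART) ANOVA,
        # typically run with the R package ARTool.
        print("Measure is non-normal; use ART ANOVA (e.g., ARTool in R).")
```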

4.2.2 Effectiveness of DW Support.
We then evaluated the effectiveness of the DW support with two measures: (1) maximum one-pass fixation time on a word: the maximum time a user fixates on a word within the first pass, until their fixations switch to another word; and (2) number of misread words: the number of words that participants read incorrectly. We identified the misread words by comparing the reading content with the audio recordings of participants' reading-aloud tasks. Note that we ignored words that are inherently less important and thus often omitted by people when reading, such as articles, prepositions, pronouns, and helping verbs, since errors on these words do not necessarily imply visual difficulty.
We had one within-subject factor, Condition, with two levels: without GazePrompt (Baseline) and using DW support only. Similar to the analysis for the LS support, we involved two between-subject factors, VisualAcuity and PeripheralVision, to investigate the effect of visual conditions. We also had Order as another within-subject factor and found no significant effect of Order on the measures, thus validating the counterbalancing. The analysis method mirrors that of the prior section.

4.2.3 Qualitative Analysis. We video recorded the interviews and transcribed them using an online automatic transcription service. Our researchers then manually corrected the transcription errors. We analyzed the data using a standard qualitative analysis method [86]. Two researchers independently coded three sample transcripts from three participants using open coding and generated a codebook upon agreement. Each researcher then coded half of the remaining interviews based on the codebook, updating the codebook upon agreement when new codes emerged. Finally, we derived themes and categories based on the different aspects (e.g., effectiveness, user preferences) of the evaluated features.

Results

4.3.1 Gaze Data Quality. With our improved calibration procedure involving line-based correction, the mean angular distance between the estimated gaze position and the target position in the 5-dot validation was 0.79° (SD = 0.45°) viewed at 65 cm from the screen, which is about 34 pixels. This result suggests that participants' gaze data were accurate enough for GazePrompt to function normally, since the smallest font size our participants chose was 48 pixels. Moreover, the percentage of data loss was 5.85% (SD = 10.1%), much lower than the data loss (about 60%) reported in prior work [62].
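The degrees-to-pixels conversion above can be verified with basic trigonometry, assuming the 24-inch, 1920 × 1200 display described in Section 3.2 (about 0.27 mm per pixel):

```python
import math

DISTANCE_MM = 650                      # 65 cm viewing distance
DIAG_IN, W_PX, H_PX = 24, 1920, 1200   # display described in Section 3.2

diag_mm = DIAG_IN * 25.4
width_mm = diag_mm * W_PX / math.hypot(W_PX, H_PX)
mm_per_px = width_mm / W_PX            # ~0.27 mm per pixel

error_deg = 0.79                       # mean validation error
error_px = math.tan(math.radians(error_deg)) * DISTANCE_MM / mm_per_px
print(round(error_px))                 # ~33 px, consistent with the ~34 px reported
```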
4.3.2 Line-Switching Support. We report our quantitative and qualitative results on the performance of the Line-Switching (LS) Support and participants' preferences on the augmentation design.
Line Switching Time (H1.1). We found a significant effect of Condition (LME: $\chi^2$(1, N = 26) = 5.99, p = 0.014, $\eta_p^2$ = 0.27) on line switching time, indicating that the LS support enabled low vision users to locate the next line faster during line switching. We did not see any significant effect of VisualAcuity or PeripheralVision, or their interactions with Condition, on line switching time. However, we found that among the four participants whose line switching time did not improve (P5, P8, P14, P15), three had peripheral vision loss. This suggests that the LS support might not be as effective for people with limited peripheral vision, since it could be more difficult for them to notice the augmentations on the next line than for people with full peripheral vision.
Line Switching Accuracy (H1.2). We found no significant effect of Condition on the frequency of line switching deviation (ART: F(1,9) = 0.83, p = 0.39, $\eta_p^2$ = 0.084) or the magnitude of line switching deviation (ART: F(1,9) = 0.19, p = 0.67, $\eta_p^2$ = 0.021), indicating no significant improvement of the LS support on participants' line switching accuracy. This could be partially because participants adjusted their reading behaviors to adapt to the line switching difficulty without GazePrompt (see Scrolling Behavior Change below). Additionally, no significant effect of PeripheralVision or VisualAcuity, or any interactions, was found on line switching accuracy. Despite the lack of significant difference in line switching accuracy, participants generally felt the LS support effectively improved their line switching experience. Accordingly, they gave significantly higher scores for the effectiveness of the LS support (M = 6.64, SD = 0.50) than for magnification without GazePrompt (M = 4.82, SD = 1.83; Wilcoxon signed-rank test: W = 0, p = 0.008).
Scrolling Behavior Change. Since the magnified font required participants to scroll up and down the page to read the whole passage, we further looked into their scrolling behaviors. We found that participants scrolled a significantly longer distance in a single scrolling event (events separated by a pause longer than 100 ms) when reading with the LS support (ART: F(1,12) = 17.2, p = 0.001, $\eta_p^2$ = 0.59), resulting in fewer scrolling events (Fig. 5b). When not using GazePrompt, participants usually scrolled only one or two lines at a time to limit the content change, thus easing line localization during line switching (Fig. 5a). This evidence indicates that the LS support successfully reduces line localization difficulty, allowing users to scroll more flexibly when reading long paragraphs.
Preference on Augmentation Design. We identified participants' preferences between the Line Highlighting and Arrow designs. The majority of participants (10 out of 13) chose Line Highlighting (Fig. 6). Participants agreed that the LS support improved their focus on the line (P3-P6, P8-P13, P15), thus reducing cognitive load (P13) and eye strain (P4, P10). As P13 mentioned, "I think [LS support] makes reading the text easier. Because I'm not as focused on [with no support], and I'm kind of going up and down the page. It makes me focus on what I'm really reading instead of worrying about the logistics of reading, like 'Okay, where am I on the page?'" Compared with Arrow, most participants liked that Line Highlighting augments the whole line instead of just the beginning of lines (P3-P6, P8-P11, P15). This was particularly true for participants with peripheral vision loss, since they could barely notice the Arrow as they read towards the end of a line (P13, P15). P6 further commented that searching for the arrow made her eyes tired.
Despite the drawbacks of Arrow, three participants (P5, P7, P14) chose this design since it was less invasive and less distracting. P5 liked that the Arrow entered their visual field only when they needed it: "When it's on the side, I don't even know what's on the side until I go to my next line. And then I'm like, Oh, it automatically takes me there." Five participants (P4, P5, P13-P15) appreciated the flexibility of color selection in the LS support, which allowed them to customize it for their visual condition and preferences. All participants chose a color that created high contrast with the text, with seven of them choosing non-default colors. Participants' color choices are presented in the appendix (Fig. A.1). Some participants desired an even more fine-grained color palette to better accommodate their visual abilities in different lighting conditions (P15).
Potential Improvement on Design. Participants brainstormed potential improvements to the line-switching support. Instead of augmenting the whole line, P3 would like a word-level augmentation that guides them through each word as they read along a line. P9 suggested using underlining to augment the line, so that the whole line is augmented in a less distracting manner. P15 further suggested placing the arrow on both sides of the screen, which could be more useful for participants with limited peripheral vision who could not quickly notice the left side of the screen during line switching. Furthermore, two participants (P9, P13) suggested replacing the arrow with other shapes to further improve their reading satisfaction. The sense of command afforded by the arrow made reading less pleasant, according to P9: "The arrow is a command, 'Go this way.' Whereas a rectangle would just be saying, 'you're on this line.'"

4.3.3 Difficult-Word (DW) Support. We report our quantitative and qualitative results on the performance of the Difficult-Word Support and participants' preferences on the augmentation design.
Maximum One-Pass Per-Word Fixation Time (H2.1). While some participants felt the DW support made reading faster (P6, P8, P10, P11), we found no significant effect of Condition (ART: F(1,9) = 0.034, p = 0.86, $\eta_p^2$ = 0.004) on the maximum time participants fixated on a word in the first pass. This could be because most participants (10 out of 13) chose longer fixation duration thresholds ($T_{f0}$, $T_{f2}$) to trigger the DW support. We also saw no significant effect of PeripheralVision or VisualAcuity, or any interactions, on this measure.
Number of Misread Words (H2.2). We further looked into participants' reading errors. Overall, fewer words were misread when using the DW support (30) than in the baseline (39). Eighty percent of the misread words were misrecognized as words with a similar appearance, such as 'though' vs. 'through' and 'interposed' vs. 'interrupted', indicating that the errors were probably caused by visual challenges. However, we did not find a significant effect of Condition on the number of misread words (ART: F(1,9) = 1.55, p = 0.24, $\eta_p^2$ = 0.15) across participants, nor of PeripheralVision, VisualAcuity, or any interactions.
Feature Triggering. We investigated when the DW support was triggered. Participants on average triggered 17.2 word augmentations (SD = 16.5) when reading each passage with the DW support. Of the 30 misread words under the DW support, only two triggered the DW support, and both were augmented by the Word Magnifier design (P6, P11). This suggests that participants using the Word Magnifier might still face difficulty recognizing words visually.
Overall, the DW support was perceived to be accurate (M = 5.58, SD = 0.90) and effective (M = 5.45, SD = 1.51) in recognizing difficult words. However, we acknowledge that the DW support was only helpful when participants noticed the difficulty of a word; the feature would not be triggered if they thought they had recognized a word easily and correctly when they actually had not.
Preferences on Augmentation Design. We report participants' preferences on the Text-to-Speech and Word Magnifier designs. Eight participants chose Text-to-Speech (Fig. 6) because audio feedback was perceived to be faster (P3) and easier for people whose auditory senses were stronger than their visual senses (P3, P14). They felt the audio feedback was more human-like and pleasant to use (P8). As P8 explained, "It's more interactive. It's more rehabilitative. It's like a rehab guy sitting there with me helping me [read]." Compared with Word Magnifier, Text-to-Speech does not require additional eye movements to scan the magnified word (P3, P12, P15), thus reducing visual burden (P13). Despite the merits of Text-to-Speech, some participants felt audio information took longer to process than visual information (P11, P13). For a reading-aloud task, Text-to-Speech was perceived to be distracting and intrusive (P5, P7, P8, P10, P14) because it sometimes overlapped with participants' speech and therefore interrupted the reading process.

[Figure 5: (a) […] the line that she was reading when not using LS Support; (b) P7 scrolled five lines up from the bottom while her fixation went up to track the line starting with 'very' that she was reading when using LS Support. Red arrows indicate the progressing direction of fixations.]

[Figure 6: Participants' preferences on the different augmentation designs of GazePrompt.]
Participants who chose Word Magnifier felt the visual augmentation was less distracting and easier to ignore when not needed (P4, P7). For participants who had difficulty recognizing certain single letters in a word due to limited central vision, Word Magnifier could be a faster remedy than hearing the whole word. Four of the five participants who chose Word Magnifier had central vision loss, and the other had an intact visual field. P11 described how Word Magnifier helped her recognize words more quickly than Text-to-Speech: "Because the trouble I was having with the word was generally... just one or two letters... very often it was the first letter of the word. And it popped at me right away [with Word Magnifier]. So then that told me what the word was." Furthermore, instead of directly feeding the word to participants via speech, Word Magnifier facilitated a sense of agency, making users feel more independent (P11, P12). The drawbacks of Word Magnifier were mostly about the visual design: the magnification level was too high and not adjustable (P3, P8, P10, P13), and the border was too close to the magnified letters, reducing legibility (P13, P14). Moreover, the magnified word could block previous text the user wanted to revisit (P9, P10). Four participants complained that it was easy to lose track of their previous reading position when returning from the magnifier (P5, P8, P11, P12). Two participants with limited peripheral vision also found it difficult to locate the magnifier since its position fell outside of their visual field (P13, P15). In fact, no participants with limited peripheral vision chose Word Magnifier.
Potential Improvement. For Text-to-Speech, some participants suggested voice customization to make it sound more natural (P5, P8). For Word Magnifier, participants proposed an adjustable magnification level (P3) and a fixed location (e.g., at the bottom right corner of the screen) to reduce distraction (P5). P3 and P13 also suggested using different background colors and fonts for Word Magnifier to create contrast with the original text. P3 further suggested augmenting several adjacent words at a time to compensate for potential difficult-word detection errors.

4.3.4 Overall Reading Experience with GazePrompt. Participants all had positive experiences with GazePrompt. They agreed that both features were not distracting (LS: M = 2.45, SD = 1.69; DW: M = 3.82, SD = 2.23) and easy to learn (LS: M = 6.85, SD = 0.38; DW: M = 6.89, SD = 0.33), though eight participants said they would be more comfortable using GazePrompt with more practice. Six participants liked the combination (LS+DW) more than the individual features because the two features improved the reading experience from different aspects and could even augment each other. For example, P13 believed the combination of both features was the best, since the LS support improved her concentration on words, which in turn made the DW support more accurate.
When comparing the gaze-aware augmentations in GazePrompt with manually controlled augmentations (e.g., using a keyboard to switch highlighting to the next line [31]), nine participants preferred the gaze-aware method (P4, P6-P11, P13, P15) since it was faster (P7, P9, P13, P15), more natural (P13), and more accurate (P11). Due to participants' vision loss, manual control could be cognitively and visually taxing (P4, P5, P11, P13, P15). Interestingly, P8 found gaze control rehabilitative since it improved his self-awareness of how he used his eyes. According to P8, "[GazePrompt] kind of reminds me that I'm wandering, and my brain wants to correct it. So I'm really envisioning this being a very great rehabilitative tool to read." Furthermore, some participants suggested combining manual and gaze control to enable more accurate control (P10, P14). For example, P10 would like the Text-to-Speech in the DW support to be triggered at a fixated word only after pressing a button. However, two participants preferred manual control, because they were more familiar with it (P3) or because the reading distance required by the eye tracker made them feel uncomfortable (P12). Participants also brought up the potential mismatch between gaze and their mental status (P3, P9, P12, P13). As P3 explained, "So my eyes are on something but I [might be] processing something else that I've just read."

STUDY II: SILENT READING
While the well-controlled reading-aloud study (Section 4) allowed us to quantitatively examine the effectiveness of GazePrompt, it did not reflect real-world reading scenarios, where people usually read silently and focus on content comprehension. To complement Study I, we conducted a silent-reading study to understand low vision users' experiences with GazePrompt in a more realistic reading context.

Method
We recruited 13 low vision participants (P13-P25 in Table 1), including ten women and three men whose ages ranged from 30 to 85 (M = 58.6, SD = 19.6). Five participants were legally blind but had functional vision to read. The recruitment method and compensation were the same as in Study I (Section 4). All participants completed the study, except for P16, who only briefly experienced the features and provided quick feedback due to an excessively long calibration.
The study lasted 1.5 to 2 hours. The procedure was the same as in Study I (Section 4.1.2), except that the reading tasks involved silent reading, where participants were instructed to read as quickly as possible while still building a sufficient understanding of the passage. We asked two simple questions after each passage to ensure participants' basic comprehension of the reading content. In the exit interview, in addition to the same set of questions as in Study I, we further prompted for use cases where GazePrompt could be useful.
We selected reading passages from MCTest MC500 [84], a dataset of passages with multiple-choice reading comprehension questions intended for machine comprehension. The passages were fictional stories, reducing the impact of participants' prior knowledge on passage comprehension. We selected 28 passages of similar length (about 177-199 words per passage) and at a similar difficulty level (about sixth-grade level) according to the Flesch Reading Ease (FRE) score [28]. We manually checked each passage to ensure that no inappropriate content was involved. Readings were randomly selected from the 28 passages for each participant.
As silent reading may involve complex and even unexplainable gaze behaviors (e.g., fixating on a word while mentally processing previous sentences, or revisiting previous content), our analysis focused only on understanding users' experiences qualitatively. The qualitative analysis method mirrored Study I (Section 4.2.3).

Findings
While most findings in this study echoed Study I, we identified insights unique to the silent-reading tasks, including the LS support enhancing comprehension and the DW support being used as a confirmation tool. We elaborate on these unique findings below.

5.2.1 Line-Switching Support for Comprehension. Participants' opinions differed on whether the LS support facilitated comprehension. Some felt their reading comprehension improved because they could focus on the text better with GazePrompt (P13, P16, P21). One participant felt that the background color change caused by Line Highlighting during line switching was distracting, which negatively affected their reading comprehension (P22). When asked about the reading scenarios where the LS support could be helpful, participants believed it could be particularly favorable when reading long and technical passages that require a certain level of concentration (P13, P16, P23). Two participants mentioned that the LS support could be particularly useful in low-lighting conditions, where their eyes get tired more easily (P19, P22). Moreover, P15 indicated that the LS support could be more helpful when reading text with smaller line spacing. Beyond the potential improvements raised in Study I, P25 further suggested making the Arrow design blink to help him locate the line faster, given his tunnel vision (i.e., severe peripheral vision loss).

5.2.2 Difficult-Word Support for Confirmation.
Similar to LS support, some participants believed DW support had the potential to aid comprehension (P13, P19, P21). As P21 explained, "[The Text-to-Speech] helps me just to be able to focus on what I'm reading, and comprehend it without worrying like, 'Oh, I'm straining my eyes just by gazing all the time.'" Interestingly, besides using it to recognize difficult words, four participants (P16, P21, P22, P24) mentioned using the DW support as a confirmation tool for words that they could recognize but were not confident about. As P16 commented, "when it (text-to-speech) would say it, my brain went 'Yes! Yes! That is the word I just saw.'" P13 even intentionally chose a longer fixation duration threshold so she could use her gaze as an explicit input (rather than an implicit estimation) to trigger the DW support. Similar to LS support, participants would like to use DW support when reading technical passages (P14, P17, P24), especially those that involve long and unfamiliar words and names (P13-P18, P20, P23).
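To make the triggering logic concrete, below is a minimal sketch of a fixation-duration trigger of the kind that drives DW support. All names and the default threshold are hypothetical, not GazePrompt's actual implementation; raising the threshold, as P13 did, effectively turns the implicit trigger into an explicit gaze command.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    word_id: int        # index of the word the fixation landed on
    duration_ms: float  # fixation duration reported by the eye tracker

class DifficultWordTrigger:
    """Fires DW support once a word accumulates enough fixation time."""

    def __init__(self, threshold_ms: float = 1000.0):
        self.threshold_ms = threshold_ms  # user-customizable dwell threshold
        self.accumulated = {}             # word_id -> total fixation time (ms)
        self.fired = set()                # words that already triggered support

    def on_fixation(self, fix: Fixation) -> bool:
        """Returns True when the magnifier or TTS should fire for this word."""
        total = self.accumulated.get(fix.word_id, 0.0) + fix.duration_ms
        self.accumulated[fix.word_id] = total
        if total >= self.threshold_ms and fix.word_id not in self.fired:
            self.fired.add(fix.word_id)
            return True
        return False
```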

DISCUSSION
We contribute the first research exploring the design space of gaze-aware augmentations to assist low vision people with reading tasks. We built GazePrompt to support two important tasks in daily reading: line switching and word recognition. Through a well-controlled reading-aloud study (Section 4) with 13 low vision participants, we demonstrated the effectiveness of GazePrompt. We found that GazePrompt significantly reduced participants' line switching time (H1.1). While there was no significant improvement in line switching accuracy (H1.2), GazePrompt enabled more flexible scrolling behaviors, no longer requiring low vision users to scroll only one or two lines at a time to reduce line switching difficulty. For Difficult-Word support, although GazePrompt did not significantly lower the upper bound of participants' word recognition time (H2.1) nor significantly reduce the number of misread words, the overall number of misread words across all participants dropped from 39 to 30 (H2.2). A follow-up silent-reading study with another 13 low vision participants further highlighted the impact and potential of GazePrompt in a more realistic, free-form reading context, such as enhancing perceived content comprehension and enabling confirmation of uncertain words. Compared to manually controlled augmentations, participants expressed a strong preference for gaze-aware augmentations due to their low effort and rehabilitative potential.
In this section, we discuss the challenges we encountered in our technology design and implementation, and derive design implications for future gaze-aware technology for low vision people.

Eye Tracking for Low Vision
With improved gaze calibration strategies inspired by prior literature [104] and further vertical drift correction (i.e., line correction), we achieved stable and accurate eye tracking for low vision participants, as reflected in both studies. We found line correction to be particularly useful in reading tasks for people with inconsistent dot- and line-validation results. In Study I, one participant (P12) was observed to have a low dot-validation error (0.90° or 39 px) yet a high line-validation error (vertical offset: 110 px). This could be due to eye recognition issues with specific head postures, or to the user adopting inconsistent gaze behaviors when performing different tasks (fixation vs. target tracking) to accommodate their visual condition (e.g., central vision loss). Since the line calibration mimics the eye movements of reading, we were able to address this inconsistency by applying a line offset correction, thus improving eye tracking accuracy during reading tasks.
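Conceptually, the line offset correction amounts to estimating a systematic vertical error from the line-validation trials and subtracting it from live gaze samples. A minimal sketch, with hypothetical names and made-up numbers loosely modeled on the P12 case:

```python
import statistics

def estimate_vertical_offset(gaze_ys, target_ys):
    # Median vertical error between recorded gaze and the known y position of
    # each validation line; the median is robust to blinks and tracking dropouts.
    return statistics.median(g - t for g, t in zip(gaze_ys, target_ys))

def correct_gaze_y(raw_y: float, offset: float) -> float:
    # Apply the line-calibration offset to each live gaze sample.
    return raw_y - offset

# Hypothetical P12-like case: dot validation looked fine, but gaze on the
# validation lines landed roughly 110 px below the true line positions.
offset = estimate_vertical_offset([310, 412, 518, 607], [200, 300, 410, 500])
print(correct_gaze_y(412, offset))  # 303.0 -- back near the true line at y=300
```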
Despite our improvements to gaze calibration, participants still faced challenges with the inaccessible calibration procedure. For example, P1 (who had central vision loss) reported that she had developed multiple preferred retinal loci (PRLs), which hindered current calibration algorithms from learning the correct mapping between the recognized pupil position and where she actually saw. P25 had difficulty locating the targets on the screen during calibration due to his severe peripheral vision loss. As such, eight participants across the two studies reported occasional eye tracking issues when using GazePrompt. More research should be devoted to making eye tracking accessible to people with diverse visual abilities.
In addition to gaze calibration, some participants complained about the eye tracker's distance requirement, which made reading uncomfortable. During reading, participants tended to lean closer towards the screen [97,104] and slightly squint, even when instructed not to. Such behavior can invariably affect eye tracking accuracy and increase the chance of data loss. Both our work and prior work [104] suggest that future gaze-based assistive technology should consider adopting wearable eye trackers and communicating eye tracking issues promptly to the user when necessary.

Design Implications
Inspired by participants' preferences on different augmentations as well as usability issues encountered using GazePrompt, we discuss potential design implications for future gaze-aware low vision aids.
Intelligent Line Identification Mechanism. Our line identification algorithm could successfully locate the line the user intended to read in most cases when the user revisited or jumped to a new line. However, this design can be less ideal for users with severe vision loss who require an excessively long time to search for lines during reading. For example, line switching is difficult for people with severe visual field loss (e.g., P25). Because of their limited visual field, they need to backtrack along the line very slowly to make sure they are on the right track. With our line identification mechanism, this slow search behavior can be misidentified as a "line jumping" behavior, moving the augmentation to the wrong line and making the Line-Switching augmentation less usable. Prior work has used gaze data to predict users' reading comprehension [4] and their intentions in social scenarios [73], but no prior work has used gaze data to predict fine-grained, fundamental gaze behaviors in reading, let alone for low vision users. In light of this issue, future researchers should consider building a gaze behavior dataset that covers people with diverse low vision conditions, which would support the development of more intelligent and personalized algorithms to recognize low vision users' gaze behaviors in reading.
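For reference, the mechanism illustrated in Figure 4 can be sketched as a weighted vote over the latest three fixations. How the weights are derived (e.g., from fixation duration) and the box geometry below are assumptions for illustration, not a specification of GazePrompt's code:

```python
from collections import defaultdict

def identify_landing_line(fixations, line_boxes):
    """Weighted vote over the latest three fixations (cf. Figure 4).

    fixations: (x, y, weight) tuples, most recent last; the weight derivation
    (e.g., from fixation duration) is an assumption here, not specified.
    line_boxes: {line_id: (top_y, bottom_y)} for each rendered line.
    """
    votes = defaultdict(float)
    for _x, y, weight in fixations[-3:]:
        # Assign each fixation to the vertically closest line.
        nearest = min(line_boxes, key=lambda lid: abs(y - sum(line_boxes[lid]) / 2))
        votes[nearest] += weight
    return max(votes, key=votes.get)

# Figure 4's example: fixations near lines 1, 1, 2 with weights 0.2, 0.1, 0.9;
# line 2 wins with 0.9 votes against line 1's 0.3.
boxes = {1: (100, 130), 2: (140, 170)}
print(identify_landing_line([(5, 110, 0.2), (60, 118, 0.1), (30, 155, 0.9)], boxes))  # 2
```

A slow backtracking scan, as described above, keeps generating fixations away from the true line, so the vote can flip to a neighboring line; a personalized model trained on low vision gaze data could suppress such spurious switches.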
Customization and Adaptation to Users' Reading Habits. GazePrompt provided a certain level of customization for its visual augmentations. Participants could adjust the color for Line-Switching support and the fixation duration threshold for Difficult-Word support, two important parameters that made GazePrompt usable for low vision participants. However, participants wanted additional customization options. For example, they would like the augmentations to adapt to their reading habits, such as reading speed (e.g., P8). Even with our experimental fixation duration adjustment, some participants still experienced under- or over-sensitivity that increased the number of false positives. Since low vision people experience diverse visual conditions, it is important to provide sufficient and fine-grained customization to maximize the efficacy of gaze-aware augmentations designed for this population. Moreover, in the field of HCI, adaptive user interfaces have been studied to deliver smooth and convivial user experiences [6,30,40]. Similar ideas can be applied to adapt system parameters to each low vision user's unique reading habits, as sketched below.
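As one illustrative (and entirely hypothetical) direction, the DW fixation threshold could track a user's observed reading pace rather than remain fixed:

```python
class AdaptiveThreshold:
    """Adapts the DW fixation-duration threshold to a user's reading pace.

    Illustrative sketch: keeps an exponential moving average (EMA) of per-word
    fixation durations and holds the trigger a fixed multiple above it, so
    slower readers are not flooded with false positives.
    """

    def __init__(self, initial_ms=1000.0, alpha=0.1, multiplier=2.5):
        self.alpha = alpha                     # EMA smoothing factor
        self.multiplier = multiplier           # margin above typical dwell time
        self.ema_ms = initial_ms / multiplier  # seed so threshold starts at initial_ms

    def observe(self, fixation_ms: float) -> None:
        # Update the running estimate of this user's typical per-word dwell time.
        self.ema_ms = (1 - self.alpha) * self.ema_ms + self.alpha * fixation_ms

    @property
    def threshold_ms(self) -> float:
        return self.multiplier * self.ema_ms
```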
Gaze-Aware Technology Requires High Gaze Control Ability. While most participants were optimistic about the gaze-aware augmentations, several pointed out potential issues low vision users may have when using gaze as an input modality. GazePrompt, as a system fully controlled by users' gaze, provided high responsiveness, enabling targeted assistance in dynamic reading processes. However, using the eyes as input for a long time can cause eye strain (e.g., P21). Moreover, some low vision users who cannot effectively control their eye movements due to certain visual conditions (e.g., nystagmus) could not use their eyes as the system requires. Given low vision users' difficulty with hand-eye coordination, and the strain caused by using gaze as the sole means of manipulation, an integration of gaze control and manual control should be considered to overcome the issues of either interaction modality. For example, gaze can be used for selection, and manual control for simple manipulation (e.g., tapping, pressing a button) [75,76]. As such, the convenience and accuracy of eye tracking can be preserved without incurring significant visual or physical stress. Prior work has shown that combining gaze control and manual control can improve task completion efficiency without causing significant eye discomfort for sighted participants [76,77]. Future research should comprehensively investigate low vision users' experience with gaze and manual control to derive new interaction paradigms for low vision users. Moreover, since low vision users manifest a wide range of visual abilities, the degree of gaze control involved should be customizable.
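A hybrid of this kind might look like the following sketch, where gaze only nominates a word and an explicit key press commits the action; all names here are hypothetical:

```python
from typing import Callable, Optional

class GazePlusButtonReader:
    """Gaze selects the candidate word; an explicit key press triggers TTS.

    Sketch of the hybrid modality discussed above: no dwell threshold is
    needed, so prolonged deliberate staring (and the eye strain it brings)
    is avoided.
    """

    def __init__(self, speak: Callable[[str], None]):
        self.speak = speak                     # injected TTS callback
        self.gazed_word: Optional[str] = None  # word currently under gaze

    def on_gaze(self, word: str) -> None:
        # Called by the eye tracker whenever gaze lands on a word.
        self.gazed_word = word

    def on_button_press(self) -> None:
        # Manual confirmation: read the currently gazed word aloud.
        if self.gazed_word is not None:
            self.speak(self.gazed_word)

# Usage with a stand-in "speaker":
reader = GazePlusButtonReader(speak=print)
reader.on_gaze("ophthalmology")
reader.on_button_press()  # prints: ophthalmology
```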

Limitations
Our research has limitations. First, although we involved 23 participants across the two studies for system evaluation, we only had 13 participants in each study given the difficulty of recruiting low vision people (e.g., due to limited mobility). Therefore, the power of our statistical analyses is limited. Future work is needed to fully examine the potential of gaze-aware reading aids with a sufficient number of participants representing different visual conditions. Second, we did not compare the effectiveness of gaze-aware reading augmentations with manually controlled counterparts during the evaluation. Participants' responses about manually controlled reading augmentations were thus all based on their prior experience with conventional input methods (e.g., keyboard and mouse), which might not reflect their true preferences. Future work should compare gaze-aware low vision aids with manually controlled aids to draw conclusions about the control modality more objectively.

CONCLUSION
In this paper, we presented GazePrompt, a reading aid system that provides gaze-aware augmentations for low vision users. Our user studies with 23 participants showed that GazePrompt improved participants' line-switching and difficult-word recognition performance, and could potentially enhance comprehension. Participants discussed their preferences regarding the augmentation designs of GazePrompt and their attitudes towards gaze-aware versus manually controlled reading augmentations. Based on our quantitative and qualitative findings, we discussed eye tracking challenges for low vision users and derived design implications for future gaze-aware low vision aids.

Figure 2: GazePrompt interfaces. (a) The Line Highlighting augmentation of Line-Switching Support; (b) the Arrow augmentation of Line-Switching Support; (c) the Word Magnifier of Difficult-Word Support; (d) more color options for Line-Switching Support.

Figure 3: Calibration & Validation interfaces. (a) 14-dot calibration interface; (b) 5-dot validation interface; (c) an illustration of the sliding target for line-based calibration (the target moves from the left side to the right side of the screen once activated; the white arrow is for demonstration only); (d) 5-line calibration interface; (e) 4-line validation interface.

Figure 4: An example of line identification. The green bounding boxes represent the space of two lines: L1 and L2. Orange dots represent fixations and blue line segments represent saccades. The line IDs closest to the latest three fixations are 1, 1, and 2, with weights 0.2, 0.1, and 0.9, respectively. The final landing line is line 2 because it receives the most votes.

Figure 5: GazePrompt changed participants' scrolling behavior. Orange dots represent fixations and orange line segments represent saccades. (a) Without LS Support, P7 scrolled two lines up from the bottom while her fixation moved up to track the line starting with 'Matthews' that she was reading; (b) with LS Support, P7 scrolled five lines up from the bottom while her fixation moved up to track the line starting with 'very' that she was reading. The red arrow indicates the progressing direction of fixations.

Figure A.1: Participants' preferences for the two features of GazePrompt. The first column shows participants' preferences for the Line-Switching support design along with their text settings. The second column shows participants' preferences for the Difficult-Word support, where the magnifying glass icon represents the Word Magnifier and the speaker icon represents Text-to-Speech.
H1: The Line-Switching (LS) support can improve a low vision user's line-switching performance.
H1.1: Low vision people switch lines faster with LS support than with the baseline.
H1.2: Low vision people switch lines more accurately with LS support than with the baseline.
H2: The Difficult-Word (DW) support can improve a low vision user's word recognition performance.
H2.1: Low vision people's maximum time spent on a word is lower with DW support than with the baseline.
H2.2: Low vision people make fewer word recognition errors with DW support than with the baseline.

4.1.1 Participants. We recruited 13 low vision participants (P3-P15, 9 female and 4 male), whose ages ranged from 30 to 87 (M = 64.3, SD = 14.4). Three participants (P5, P8, P15) were legally blind but still had sufficient functional vision to read visually. Table 1 details participants' visual conditions. No participants had conditions other than low vision that caused reading difficulties.