Unveiling Health Literacy through Web Search Behavior: A Classification-Based Analysis of User Interactions

More and more people are relying on the Web to find health information. Challenges faced by individuals with low health literacy in the real world likely persist in the virtual realm. To assist these users, our first step is to identify them. This study aims to uncover disparities in the information-seeking behavior of users with varying levels of health literacy. We utilized data gathered from a prior user experiment. Our approach involves a classification scheme encompassing events during web search sessions, spanning the browser, search engine, and web pages. Employing this scheme, we logged interactions from video recordings in the user study and subjected the event logs to descriptive and inferential analyses. Our data analysis unveils distinctive patterns within the low health literacy group. They exhibit a higher frequency of query reformulations with entirely new terms, engage in more left clicks, utilize the browser’s backward functionality more frequently, and invest more time in interactions, including increased scrolling on results pages. Conversely, the high health literacy group demonstrates a greater propensity to click on universal results, extract text from URLs more often, and make more clicks with the mouse middle button. These findings offer valuable insights for inferring users’ health literacy in a non-intrusive manner. The automatic inference of health literacy can pave the way for personalized services, enhancing accessibility to information and education for individuals with low health literacy, among other benefits.


INTRODUCTION
Online health information is widely sought by a diverse range of users. Notably, 55% of European citizens aged 16-74 reported seeking health information online, with Finland reporting the highest participation at 80% and Bulgaria the lowest at 36% [8]. In the United States, 74.4% of adults sought health information online in 2017 [9]. Asian countries show a comparable picture [50].
Seeking health information online fosters increased participation and involvement in medical decision-making [20,34,36], has been associated with behavioral changes [20], and empowers patients [34,36]. Research indicates that well-informed patients exhibit higher compliance, better adherence to treatment [34], and improved health outcomes [36]. Additionally, such patients tend to utilize healthcare services more efficiently, resulting in reduced health costs [36].
However, despite the growing prevalence of online health searches, certain users encounter frustration during their search sessions [48]. Moreover, the reading level required to comprehend the retrieved health information is often high [52]. Recognizing the diverse user base engaged in online health searches underscores the need for search engines to adapt to users' specificities, aiming to alleviate difficulties that may arise during the search process. Among the user characteristics that can impact the success of health searches, health literacy stands out.
While lacking a universally agreed-upon definition [31], health literacy, in every interpretation, encompasses the competencies to access, understand, and apply health-related information. The recognition of users grappling with low health literacy is a gateway to exploring tailored strategies for providing targeted search support.
Individuals with lower health literacy often have more health problems, use health services more frequently, and incur higher healthcare costs [1]. This scenario is exacerbated in patients managing chronic conditions like diabetes and asthma, where adherence to instructions is paramount and continuous [17]. Given the enumerated benefits of online health information searches, mitigating disparities in information access is crucial.
Traditionally, health literacy is measured through instruments that are unsuitable for a search environment because of their intrusiveness. We hypothesize that users with varying health literacy levels exhibit distinct seeking behaviors during online health searches. Furthermore, we posit the possibility of automatically deducing users' health literacy through their interactions with the browser, search engine, and result pages in the online search process.
We introduce a detailed classification scheme for interactions during web searches to scrutinize whether these interactions differ based on the user's health literacy. Employing this scheme, we analyzed video recordings from a user experiment, manually annotating all interaction events during the search session. Subsequently, we used descriptive and inferential analyses to compare the behaviors of the low and high health literacy groups.

BACKGROUND AND STATE OF THE ART
The following subsections describe the concept of Health Literacy and work that focuses on the influence of expertise on web information-seeking behavior.

Health Literacy
There is no consensus on the definition of health literacy, and both its meaning [14,31] and the way it is measured continue to evolve. Definitions differ on whether they include numeracy, health decision-making, and communication skills. For this work, health literacy is "the degree to which individuals can obtain, process, understand and communicate health-related information necessary to make informed health decisions" [33].
Low health literacy has two costs: economic costs to society and health system expenses linked to the human burden of diseases [35]. Recent surveys reveal that 22% of Americans possess basic health literacy, while 14% fall below the basic level [19]. In the European landscape, nearly half of the population grapples with inadequate (12%) or problematic (35%) health literacy [15].
Various instruments measure health literacy, each possessing distinct capacities and administration times. Among the widely employed methods are the Rapid Estimate of Literacy in Medicine (REALM) [6], the Test of Functional Health Literacy in Adults (TOFHLA) [41], S-TOFHLA (short version of TOFHLA) [3], Newest Vital Sign (NVS) [51], Short Assessment of Health Literacy for Spanish-speaking Adults (SAHLSA) [21], and Medical Term Recognition Test (METER) [44].
METER was the instrument used in the user experiment analyzed in this study. It comprises forty medical words and thirty non-words that mimic medical terms. While reading the list, participants are asked to mark the words they are confident are genuine terms. If the numbers of correctly identified words and non-words are at least 35 out of 40 and 18 out of 30, respectively, the participant has an adequate health literacy level. Otherwise, the participant has an inadequate level of health literacy.
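The threshold rule above can be written as a short function. This is a minimal sketch: the function name `meter_level` and its parameter names are illustrative and not part of the instrument, and the thresholds are taken directly from the text.

```python
def meter_level(correct_words: int, correct_non_words: int) -> str:
    """Classify health literacy from METER counts, using the thresholds stated in the text.

    Both parameter names are hypothetical: `correct_words` counts genuine words
    correctly marked (0-40) and `correct_non_words` counts non-words correctly
    left unmarked (0-30).
    """
    if not 0 <= correct_words <= 40:
        raise ValueError("correct_words must be between 0 and 40")
    if not 0 <= correct_non_words <= 30:
        raise ValueError("correct_non_words must be between 0 and 30")
    # Both thresholds must be met for an adequate level.
    if correct_words >= 35 and correct_non_words >= 18:
        return "adequate"
    return "inadequate"
```

Under this rule, a participant who meets only one of the two thresholds is classified as inadequate.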

The influence of expertise on web information-seeking behavior
Studies of web information-seeking behavior concentrate on a variety of goals and approaches.

2.2.1 While seeking general information. Various studies delve into the impact of expertise on search behavior and query formulation, with others concentrating on predicting expertise from search behavior and personalizing retrieval experiences based on expertise.
The methodologies employed range from user experiments to log analyses.
The assessment of users' expertise takes diverse forms, including specific recruitment [4,13], users' self-assessment of their knowledge in study domains/topics [11,18,23,32], specific expertise assessment methods [7,52,54,56], or a combination of the latter two approaches [22,57]. Recruitment can be done in the domains of the search tasks [13] or can involve selecting participants who are experienced searchers in those domains [4]. When not done at a recruitment level, expertise can be evaluated through quizzes [7], asking users to judge their understanding/familiarity with terms/concepts from a thesaurus [22,56,57] or asking users to self-assess their expertise on specific domains. One log analysis study considered a user an expert if they had visited at least 100 domain pages over a period [52].
Studies exhibit variations in their approaches to analyzing user behavior. Regarding query formulation, research may focus on the use of operators [2], the length and number of queries [2,7,11,13,22,32,52,56,57], query types and terms (narrow or broad) [32,54], the terminology employed in queries (technical or lay) [52], the count of unique queries, and the number of terms in common with the task description [22,32]. The (re)formulation of queries was also considered in its different types, such as generalization, specification, substitution, regression, synonym, elaboration, backtracking, plural making, broadening, and refining [2,11,32,54].
In the realm of Search Engine Results Page (SERP) behavior, studies zero in on total time and the number of accessed SERPs [32,56], the rank position [56], and the click count on the results [4,32]. When delving into webpage behavior, researchers consider the number of visited pages [7,32,52], total time spent [7,32], and reading time [18]. At the session level, considerations include the number of actions [56], first dwell time, and total dwell time [7,23,32,56].
Diverse metrics measure search success across studies [18,56]. Some use the ratio of saved documents among viewed documents [18], while others consider the number of relevant documents [56]. Precision metrics such as Mean Average Precision (MAP) [56] also come into play.
Users with high expertise tend to be more successful in search [4,18,52]. Regarding time, they spend less time on web pages and query formulation [7,18]. Their preferences lean towards technical web pages [52], and they engage in a higher volume of search queries [56], often comprising more meaningful words and synonyms [57]. While some studies suggest that expert users formulate longer queries [52,56,57], others suggest shorter queries [7]. Experts display a more varied vocabulary [22], employ domain-specific terms by establishing lexicons in each domain [52], and exhibit a higher level of detail and sophistication in their search attempts [11].
Conversely, the low expertise group demonstrates less flexibility and efficiency in their search strategies, particularly in selecting concepts [11,13,54]. They are inclined to formulate longer queries [13,32] and heavily rely on task descriptions [2,22]. Regarding query reformulation, this group tends to engage in conversions between plural and singular forms, repeat search terms, reuse prior search terms, and maintain the same basic structure [11]. Their usage of Boolean operators is higher, although many operators are implicitly present in the search [2]. This group is also associated with a higher frequency of spelling errors [54].
2.2.2 While seeking health information. The influence of expertise is also explored in web searches within the health domain. Some studies examine the impact of (e-) health literacy or medical expertise on web information search behavior [5,24,43,48,53,55]. Others delve into understanding query formulation behavior, considering factors such as the users' expertise, health literacy, topic familiarity, or domain knowledge [25-27,46].
In user experiments, expertise is primarily assessed through specific instruments, including the Digital e-Health Literacy instrument [5], the eHealth Literacy Scale (eHEALS) [43], the health literacy measure tool from the European Health Literacy Survey (HLS-EU-Q) [48], and the health literacy instruments METER [24,25], Newest Vital Sign [43], and SAHLSA [26,27]. There is also a study that asks users to self-rate their health literacy on a three-point scale [46], one that assesses the community-based health literacy (CBHL) score [55], and another that considers as experts those who have used a specialist medical search engine [53].
Studies scrutinizing query behavior employ diverse metrics, including the type of terminology used [25,26], the length and number of queries [5, 25-27, 46-48, 53, 55], the type of reformulation [5], the utilization of Boolean operators [27], the incorporation of suggestions [25], and the presence of new terms and spelling errors [25,27,48]. In the context of SERP, researchers focus on the number of visits, time spent, click count, and reading time [5,43,56]. The analysis of webpage behavior involves the number of visited web pages and the total time spent [5,53,55]. At the session level, the time spent on the task is a parameter under consideration [43,46].
Experts visit more technical websites and invest more time in their searches [53]. They conduct more searches, formulate longer queries, and employ a more specialized vocabulary [25,46,48,53]. Conversely, novices encounter challenges in achieving equivalent search success [27,55]. They gravitate towards consumer-oriented websites [53], and some studies suggest that this group is associated with a longer dwell time on web pages, taking more time to read and focus on the page [55]. In contrast, other studies propose that novices spend less time on result pages and pay less attention [24]. Regarding queries, novices have more difficulty formulating queries, often employing less technical terminology [27].

METHODOLOGY
This study uses data collected in a previously conducted user experiment [28]. Following the formulation of a classification scheme delineating events within web search interactions, we applied this scheme to log the interactions occurring in the video recordings of the user study. Then, we analyzed these event logs in a descriptive and inferential manner.

Research questions
Most of the works exploring the influence of expertise on information-seeking behavior focus on query (re)formulation and use relatively coarse measures to analyze interactions with SERP and Result Pages. Here, we intend to study more fine-grained interactions within these pages and the browser while also analyzing query (re)formulation behavior. Our research questions are:
RQ1: Do the LHL and HHL groups interact differently with the browser while searching for health information?
RQ2: Do the LHL and HHL groups interact differently with SERP while searching for health information?
RQ3: Do the LHL and HHL groups interact differently with Result Pages while searching for health information?
RQ4: Do the LHL and HHL groups have different query formulation behaviors while searching for health information?
RQ5: Do the LHL and HHL groups have different query reformulation behaviors while searching for health information?
Drawing insights from the findings of the previously outlined studies, we formulate the following hypotheses.
RH4.1: LHL users dedicate more time to query formulation.
RH4.2: LHL users formulate fewer queries during a search session.
RH5.1: LHL users exhibit a higher frequency of repeating queries within a search session.
RH5.4: LHL users formulate entirely new queries less often while reformulating in a search session.

User experiment
The user experiment involved twenty participants aged 21 to 35 whose native language was not English. Most (95%) were students specializing in Computer Science, Multimedia, and Design. Slightly under half of the participants (45%) reported rarely searching the web for health information, while the remaining 55% did so occasionally. Participants' health literacy was measured using the METER instrument, previously translated into and validated for the users' native language [39]. The application of this instrument revealed that eight participants (40%) possessed a level of literacy deemed inadequate. These eight users were assigned to the Low Health Literacy (LHL) group and the other 12 to the High Health Literacy (HHL) group.
After the initial assessment of health literacy and completing a demographic and search habits questionnaire, users were tasked with answering search queries without consulting external sources and refraining from guessing the answers. This step ensured that all participants were unfamiliar with the topics associated with the search tasks.
Subsequently, participants engaged in 10 search tasks, with five focusing on asthma and the other five on nutrition. These search tasks were exploratory and were defined based on questions included in translated and validated versions [40,45] of existing questionnaires [12,42] on these topics. For instance, a search task prompt was "Please indicate three diseases or health problems related to low fiber intake". Users were free to use any search engine, and there was no time limit for task completion. The topic of the search tasks was rotated among users, with half beginning with asthma-related tasks and the other half starting with nutrition-related ones.
Each search session was recorded in video format, including audio. While users were not explicitly instructed to think aloud, some opted to verbalize their thoughts during the search session. Queries submitted by users were logged. For this study, only 19 out of the 20 participants' recordings were considered due to issues with one video. As this excluded video belonged to an HHL user, the final analysis included 8 LHL users and 11 HHL users.

Application of the Classification Scheme
Each user study participant is associated with a video recording, a chronological log of formulated queries, and their level of health literacy (low or high).
For every user, we generated a file to log details about the user (user ID, literacy group), the task undertaken (task ID), specific events (timestamp, category ID, event ID), Search Engine Results Page (SERP) interactions (SERP ID, click rank), and Result Page (RP) engagements (RP ID, RP view). The IDs for SERP and RP are automatically generated based on the corresponding event ID, ensuring a systematic and organized approach to tracking user interactions throughout the study.
To identify users and tasks, we used sequentially assigned numbers. The event identifiers align with those proposed in the classification scheme. In SERP events, we also log the SERP identifier to facilitate aggregation based on individual SERPs. The assignment of a new SERP ID is triggered if the preceding event pertains to query formulation; otherwise, the SERP ID remains unchanged. It's worth noting that, with this assignment algorithm, two views of a SERP with the same Uniform Resource Locator (URL) generate distinct SERP IDs, a distinction that does not pose an issue for this study. In the case of a SERP event involving a click (left-click, middle-click, or right-click with open new tab) on a featured snippet, organic result, or universal result, we also log the rank position of the clicked result. The computation of rank position excludes paid results and specific universal results, such as those leading to Google Scholar articles.
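The SERP ID assignment rule described above can be sketched as follows. This is an illustrative reading of the rule, not the tooling used in the study: the dictionary keys `category` and `is_query_formulation`, the category strings, and the choice to start numbering at 1 are all assumptions.

```python
def assign_serp_ids(events):
    """Return one SERP ID per event (None for events outside the SERP).

    `events` is a chronological list of dicts with two hypothetical keys:
    "category" (e.g. "SERP", "RP", "Browser") and "is_query_formulation" (bool).
    """
    serp_ids = []
    current_id = 0
    previous = None
    for event in events:
        if event["category"] == "SERP":
            # A SERP event that directly follows a query formulation opens a new SERP;
            # otherwise the SERP ID remains unchanged.
            if previous is not None and previous["is_query_formulation"]:
                current_id += 1
            serp_ids.append(current_id)
        else:
            serp_ids.append(None)
        previous = event
    return serp_ids
```

Because the rule looks only at the preceding event, returning to an earlier SERP with the back button keeps the current ID, while resubmitting the same query produces a new one, which matches the behavior noted in the text.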
In RP events, we register the page's identifier for the same reason we do it in SERP events. This identifier is automatically assigned in sequential order whenever the preceding event simultaneously qualifies as both a SERP event and a click event (any of the three types mentioned above). Additionally, we recorded whether the page was viewed or not. This decision was made due to instances where participants did not view some of the clicked results. There were situations where users opened results in different tabs but did not view those tabs, or they clicked on a result but returned to the SERP before the result page fully loaded.
Subsequently, we enhanced the information mentioned above with two additional elements: the duration of each event and the time to the first SERP click. Both aspects were automatically computed. The event duration is the time difference, in seconds, between the current event and the subsequent one. The time to the first SERP click is the time difference, in seconds, between the query formulation event that prompts a new SERP and the first click event (left-click, middle-click, or opening a new tab after a right-click).
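The two derived measures defined above can be computed from a chronological event log along the following lines. This is a sketch under stated assumptions: the event names in `CLICK_EVENTS`, the dictionary keys, and the handling of the last event (which has no successor) are hypothetical and not specified in the text.

```python
# Hypothetical names for the three click types described in the text.
CLICK_EVENTS = {"LeftClick", "MiddleClick", "RightClickNewTab"}


def event_durations(timestamps):
    """Duration of each event: seconds between it and the subsequent event.

    Assumption: the last event has no successor, so its duration is None.
    """
    durations = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if timestamps:
        durations.append(None)
    return durations


def times_to_first_serp_click(events):
    """Seconds from each query formulation to the first SERP click that follows it.

    `events` is a chronological list of dicts with hypothetical keys "timestamp"
    (seconds), "category", "event", and "is_query_formulation". Queries that are
    never followed by a SERP click contribute no value.
    """
    times = []
    query_time = None  # latest query formulation still waiting for its first click
    for event in events:
        if event["is_query_formulation"]:
            query_time = event["timestamp"]
        elif (
            query_time is not None
            and event["category"] == "SERP"
            and event["event"] in CLICK_EVENTS
        ):
            times.append(event["timestamp"] - query_time)
            query_time = None  # later clicks on the same SERP are not "first" clicks
    return times
```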
For the analysis of query (re)formulation, we recorded information in a separate file. This file includes details about the user, task, and query. In addition to the query itself, we logged its sequentially assigned ID, the number of terms it comprises, its language, and its terminology (lay or medico-scientific). In this work, a query is deemed medico-scientific if it contains at least one medico-scientific term; otherwise, it is considered lay. The assessment of query terminology involved the collaboration of a health professional.
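The query-level rule above (a query is medico-scientific if at least one of its terms is) can be illustrated with a short sketch. In the study the judgment involved a health professional; here a hypothetical set of medico-scientific terms stands in for that judgment, and splitting on whitespace is a simplifying assumption.

```python
def query_terminology(query, medico_scientific_terms):
    """Label a query "medico-scientific" if any of its terms is in the given set.

    `medico_scientific_terms` is a hypothetical, lowercase vocabulary standing in
    for the health professional's term-level assessment.
    """
    terms = query.lower().split()  # simplification: terms are whitespace-separated
    if any(term in medico_scientific_terms for term in terms):
        return "medico-scientific"
    return "lay"


def query_length(query):
    """Number of terms in the query, under the same whitespace assumption."""
    return len(query.split())
```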

SEARCH INTERACTIONS CLASSIFICATION SCHEME
We devised this classification scheme to systematically analyze fine-grained information-seeking events. The scheme applies to web searches across various search engines, manually or (semi) automatically. The scheme is organized into three main categories: Browser, Search Engine, and Result Pages (RP), corresponding to the specific areas where events unfold. The Search Engine Home Page and Search Engine Results Page (SERP) categories are parts of the search engine and encompass the Query Bar category. Each category has distinct events, as detailed in Table 1. The Browser category encompasses general browser events commonly utilized during search sessions, and events associated with query formulation in the address bar. Interaction with the search engine can occur within the home and results pages. The category related to the home page comprises two events: ImFeelingLucky and PrivacyReminder. The privacy reminder event is also included in the SERP category.
In SERP interactions, we consider different types of results: organic results, paid results, featured snippets, universal results, and the knowledge panel [37,38]. We also consider the interaction with SERP tools (e.g., translator). The event InteractR2 includes interactions that do not entail leaving the SERP to access a result page. For example, in People Also Ask, a universal result, users can click on multiple queries for more information without leaving the SERP. Similarly, scrolling through various images in a carousel does not require clicking on one. This event can occur in all results except the organic one, which can only be clicked. Within the Home Page and SERP categories, there is a category that groups all events that can happen in the query bar of these pages.
The event names are intended to be self-explanatory. Table 1 events in italics are linked to query formulation. In such instances, we can further characterize the query based on its length (number of terms), language, terminology (lay or medico-scientific), the presence of orthographic errors, and the existence of typos (e.g., caused by finger lapses). Additionally, each query reformulation is analyzed based on predefined types, as outlined in Table 8.

RESULTS
We present the results in two main sections. The first compares the occurrence of events, individually and aggregated by area of occurrence, in both health literacy groups. The second section delves into analyzing and comparing users' query formulation and reformulation behavior. Our analytical approach involves computing descriptive statistics and employing one-tailed hypothesis tests to assess the significance of differences between low and high health literacy users. We utilize the t-test when its assumptions (normality of the distributions and homogeneity of variance) are met, and resort to the Mann-Whitney U test when they are not. In our reporting of results, we use *** for p < 0.01 to signify strong evidence against the null hypothesis (H0), ** for p < 0.05 to denote moderate evidence, and * for p < 0.1 to indicate weak evidence [10]. Results with p ≥ 0.1 indicate insufficient evidence [10]. In every table, bold is used to indicate the largest value.
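The reporting convention above maps a p-value to an asterisk marker. A minimal sketch of that mapping follows; the function name `significance_marker` and the empty string for insufficient evidence are illustrative choices, while the thresholds are those stated in the text.

```python
def significance_marker(p_value):
    """Map a p-value to the asterisk notation used when reporting results."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < 0.01:
        return "***"  # strong evidence against H0
    if p_value < 0.05:
        return "**"   # moderate evidence
    if p_value < 0.1:
        return "*"    # weak evidence
    return ""         # insufficient evidence (p >= 0.1)
```

For example, the p-value of 0.28 reported for the number of queries per session falls in the last branch and receives no marker.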

Occurrence and duration of interaction events
To compare the low and high health literacy groups, we examined the number of occurrences of each event within the classification scheme. These serve as our first-level measures and are detailed in Tables 2, 3 Beyond first-level measures, we computed second-level measures based on the former. These include metrics such as time spent scrolling the page (aggregating scrolling events), the total number of clicks (summing click events), time to the first RP click (calculated as the time of RP click minus the time of SERP load), and query-related measures (accumulating address bar and search engine box events). These second-level measures are italicized in Table 3 and Table 4 for reference.
Aggregation is performed by SERP or RP whenever we are in the presence of an event of these categories. In other categories, we aggregate by session. In either case, we always end up aggregating by the user. We consider a session "a series of queries submitted by a user and related interactions during an episode of interaction between the user and the Web search engine around a single topic" [16]. Following this definition, we consider that each of the ten tasks generates a separate search session.
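The session-level aggregation described above (count per session, then aggregate by user) might look like the sketch below. The record layout `(user_id, task_id, event)` and the use of the mean as the per-user aggregate are assumptions; the text does not specify the exact procedure. Sessions are inferred from the records, so a session with no logged events at all would not be counted.

```python
from collections import defaultdict
from statistics import mean


def mean_events_per_session(records, event_name):
    """Average number of occurrences of `event_name` per session, for each user.

    `records` is an iterable of hypothetical (user_id, task_id, event) tuples,
    where each task corresponds to one search session.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user_id, task_id, event in records:
        # Touch every session seen so that sessions without the event count as 0.
        counts[user_id][task_id] += int(event == event_name)
    return {user: mean(per_session.values()) for user, per_session in counts.items()}
```

The same pattern would apply to SERP- or RP-level aggregation by keying on the SERP or RP identifier instead of the task identifier.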
Notably, there were no Search Engine Home Page interactions. The same holds for other events from the classification scheme with no occurrences. We opted to present non-significant results because they show the distribution of events per session/SERP/RP and suggest potential trends for exploration in future work.
In browser interaction (Table 2), Tab selection emerges as the most popular event in both groups, occurring, on average, more than once per session. Closing tabs is popular in both groups, but users with higher health literacy do so significantly more frequently. While both groups often use the address bar for query submissions, this is more prevalent in the HHL group. Both groups commonly employ the find functionality of the browser. Notably, the second most popular event in the LHL group is the use of the backward button, with a significantly higher average than the HHL group, suggesting potential difficulties in information-seeking for the former users. Selecting text from the URL is also prevalent in the HHL group, significantly more than in the LHL group.
SERPs are the focal point of most interactions during a search session. As anticipated (Table 3), result clicking and scrolling are the most popular events in both groups. The LHL group engages significantly more in left-click actions, both in general and specifically in organic results. Conversely, users with higher health literacy exhibit a significant inclination toward middle-click actions, both overall and in organic results. Additionally, they perform more left-clicks in universal results. It's worth noting that all right-clicks were employed to open the result in a new tab. Users generally tend to scroll more downward, spending an average of over 3.5 seconds scrolling in a SERP.
Apart from the measures outlined in Table 3, we computed the second-level measures detailed in Table 6. The LHL group maintains a significantly higher average number of SERPs with clicks per session. Furthermore, low health literacy users took longer to click the first result.
In RP interactions, as shown in Table 4, it is noteworthy that users with low health literacy scroll significantly more, both in frequency and duration, than their counterparts. The findings in Table 6 further emphasize that LHL users interact significantly more with result pages regarding the number and duration of events.
The formulation of queries is a pivotal stage in the interaction between users and retrieval systems, occurring either in the search engine query bar or directly in the browser's address bar. Table 5 provides the average number of events for query formulation in the query bar and browser address bar per session. Interestingly, the LHL group is associated with a significantly higher number of interactions related to query formulation. To gain a more comprehensive understanding of query interactions, we computed additional measures outlined in Table 6. These results align with the previous findings, indicating that users with lower health literacy have significantly more interactions related to query formulation.

Query (Re)Formulation Behavior
When examining the nature of queries and reformulations, as depicted in Table 7 and Table 8, we initially compared the average number of queries (2.69 in HHL versus 2.51 in LHL, p-value=0.28) and their reformulations (1.69 in HHL versus 1.51 in LHL, p-value=0.28) per session between both groups, finding no significant differences. In Table 7, we characterized all submitted queries, either new or reformulated. Interestingly, users from the HHL group formulated a significantly higher number of queries with typos.
As seen in Table 8, the HHL group engages in significantly more reformulations by adding words to the previous query, substituting terms with synonyms, and changing plural forms to singular ones. Conversely, the LHL group employs significantly more totally new reformulations, characterized by having no common terms with the previous query.
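The "totally new" reformulation type, defined above as sharing no terms with the previous query, lends itself to a simple check. The sketch below is illustrative only: the function name is hypothetical, and terms are compared after lowercasing and whitespace splitting, which ignores plural forms and synonyms that the other reformulation types in Table 8 would capture.

```python
def is_totally_new(previous_query, new_query):
    """True if the new query shares no terms with the previous query.

    Simplification: terms are lowercase, whitespace-separated tokens.
    An empty new query is not counted as a reformulation.
    """
    previous_terms = set(previous_query.lower().split())
    new_terms = set(new_query.lower().split())
    return bool(new_terms) and previous_terms.isdisjoint(new_terms)
```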

DISCUSSION OF RESULTS
In this discussion, we focus on the significant differences, summarized in Table 9. The asterisks denote the strength of statistical significance, with the position of the asterisk indicating the group with the larger value.

Interactions with the browser
In addressing the first research question, "Do the LHL and HHL groups interact differently with the browser while searching for health information?", notable differences were observed in browser interactions. The LHL group used the backward button more often, suggesting that these users tend to open a result and then use the back button to return to the SERP. On the other hand, the HHL group exhibited a higher frequency of selecting text from the URL, closing tabs, generally using the address bar, and introducing new queries in the address bar. It is plausible that not all users realize that queries can be submitted in the browser's address bar. As greater health literacy is associated with greater perceived ease of use of health information technology [30], that may be why LHL users use the search engine query box less often.

Interactions with SERP
Addressing the second research question, "Do the LHL and HHL groups interact differently with SERP while searching for health

Interactions with Result Pages
Result Pages (RP), the focus of our third research question, were the stage where the most significant differences between the LHL and HHL groups were observed, and where these differences were more pronounced. The LHL group exhibited a higher occurrence of events on Result Pages, and these events also lasted longer. This aligns with the notion that individuals with lower health literacy tend to spend more time on web pages, possibly due to taking more time to read and focus on the content [55]. Interestingly, a study by Lopes and Ramos [24] reached a different conclusion, suggesting that LHL users paid less attention to Result Pages. Additionally, the LHL group showed more occurrences and longer durations of scrolling up, scrolling down, and general scrolling events. This could indicate a different interaction pattern with the content on Result Pages, possibly related to the effort to understand and extract information.
In the analysis of search sessions, we observed that LHL users tend to use online translator tools to seek the English translation of search terms. This behavior could be attributed to language comprehension challenges and may impact how they engage with health information online.

Query formulation behavior
In addressing the fourth research question, "Do the LHL and HHL groups have different query formulation behaviors while searching for health information?", several insights emerged. The LHL group demonstrated a tendency to interact more in both the browser address bar and the search engine query bar, with a more pronounced preference for the latter. However, despite this, the LHL group submitted fewer queries, a finding that, while not statistically significant, may still indicate a distinctive behavior in how they engage with search interfaces.
In terms of query characteristics, both groups showed a preference for queries in their native language, but the HHL group had a higher frequency of queries in English. Additionally, the HHL group had more typographical errors in their queries. Perhaps these users type more quickly and therefore make more mistakes in their searches. This finding contradicts previous research by Wildemuth [54], which suggested that insufficient domain knowledge is associated with a higher frequency of errors. It also challenges the research hypothesis (RH4.4) that posited LHL users would make more orthographical errors.

Query reformulation behavior
In addressing the last research question, "Do the LHL and HHL groups have different query reformulation behaviors while searching for health information?", several interesting observations emerged. The LHL group exhibited a higher frequency of reformulations without terms in common with the initial query, suggesting a pattern of entirely revamping the search query. This behavior may indicate challenges or lack of success in the initial search attempts. Surprisingly, this contradicts the research hypothesis RH5.4 and findings from a study that suggested novices tend to retain the same basic structure in their reformulations [11].
Conversely, the HHL group engaged in more reformulations where they added words, substituted terms with synonyms, and changed plural forms to singular ones. This behavior could indicate difficulties in query formulation usually associated with LHL users [26]. This finding challenges the hypothesis RH5.3 and the abovementioned study [11]. Additionally, there was no evidence to support the initial research hypotheses RH5.1, supported by Hembrooke et al. [11], and RH5.2, supported by Aula [2].
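The reformulation types contrasted above can be illustrated with a simple term-overlap comparison between consecutive queries. The sketch below is an assumption-laden simplification, not the coding scheme applied in the study: the function name `classify_reformulation` and its labels are hypothetical, terms are split on whitespace only, and synonym substitution and plural-to-singular changes are lumped under a single "substitute" label because detecting them would require language-specific resources.

```python
def classify_reformulation(previous: str, current: str) -> str:
    """Label a reformulation by comparing the term sets of two queries.

    Labels (illustrative, not the study's categories):
      'repeat'       same terms, possibly reordered
      'new'          no terms in common with the previous query
      'add_terms'    all previous terms kept, at least one added
      'remove_terms' only previous terms kept, at least one dropped
      'substitute'   some terms kept, some replaced
    """
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())

    if prev_terms == curr_terms:
        return "repeat"
    if not prev_terms & curr_terms:
        return "new"
    if prev_terms < curr_terms:   # strict subset: terms were only added
        return "add_terms"
    if curr_terms < prev_terms:   # strict superset: terms were only removed
        return "remove_terms"
    return "substitute"
```

Under these assumptions, `classify_reformulation("diabetes symptoms", "insulin dosage")` returns `"new"`, the pattern observed more often in the LHL group, whereas `classify_reformulation("diabetes symptoms", "diabetes symptoms children")` returns `"add_terms"`.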

Implications
Technology can reduce barriers caused by the lack of health literacy [49]. The potential applications of understanding how health literacy influences online health information-seeking behaviors are substantial. This exploratory work provides valuable insights into behaviors that may serve as indicators of health literacy differences, enabling the development of non-intrusive methods for deducing users' health literacy levels. This could pave the way for personalized services tailored to individuals with low health literacy.
One practical application involves adapting the list of documents retrieved by search engines based on health literacy levels. For users with low health literacy, promoting documents with more readable content could enhance their understanding and decision-making. Additionally, personalized services could generate simplified versions of search results, making it easier for users to evaluate and select relevant information. Automated tools like the HealthTranslator extension proposed by Lopes and Sousa [29] offer an example of a service that aids in translating complex medical content into more understandable language.
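As an illustration of promoting more readable documents, the sketch below re-ranks a result list by blending a relevance score with a readability score. It rests on several assumptions that are not part of this work: readability is approximated with the Flesch Reading Ease formula, which was designed for English text; syllables are counted with a rough vowel-group heuristic; and the names `flesch_reading_ease`, `rerank_by_readability`, and the `weight` parameter are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease; higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


def rerank_by_readability(results, weight=0.5):
    """Sort (title, relevance, text) tuples by a blended score.

    `relevance` is assumed to lie in [0, 1]; `weight` controls how much
    readability counts relative to relevance (an illustrative choice).
    """
    def score(item):
        _, relevance, text = item
        # Clamp the readability score to [0, 100] and rescale to [0, 1].
        readability = min(max(flesch_reading_ease(text), 0.0), 100.0) / 100.0
        return (1 - weight) * relevance + weight * readability

    return sorted(results, key=score, reverse=True)
```

A service that infers low health literacy could raise `weight` for that user, while leaving the ranking unchanged (`weight=0`) for others. For non-English content, a readability measure calibrated for the relevant language would be needed.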
Services providing automated simplification or summarization of content could prove beneficial at the result page level, improving comprehension and facilitating the selection of useful links. These advancements have the potential to significantly enhance the search efficiency and effectiveness for individuals with low health literacy, contributing to a more inclusive and accessible digital health landscape.

CONCLUSIONS
In this paper, we analyze the web information-seeking behavior of users with different levels of health literacy. We found that the low health literacy group made more query reformulations with entirely new terms, made more left clicks, used the browser's backward functionality more often, and spent more time interacting with the results pages, including scrolling. On the other hand, the high health literacy group clicked more on universal results, made more middle clicks, and selected text from the URL more often. In future work, we would like to analyze these data as a regression problem with behavioral indices explaining health literacy.

, 4, and 5. The Scroll[Down/Up]Start and Scroll[Down/Up]Finish events were consolidated into Scroll[Down/Up] since they shared the same values.
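One way paired start and finish events could be collapsed into a single event with a duration is sketched below. The log format, a time-ordered list of `(timestamp_seconds, event_name)` tuples, and the function name `consolidate_scrolls` are assumptions made for illustration; the study's own consolidation simply merged rows that shared the same values.

```python
def consolidate_scrolls(events):
    """Merge Scroll[Down/Up]Start/Finish pairs into single scroll events.

    `events` is a time-ordered list of (timestamp_seconds, event_name).
    Returns a list of (name, start_timestamp, duration_seconds) tuples,
    where name is 'ScrollDown' or 'ScrollUp'. Non-scroll events and
    Finish events without a preceding Start are ignored.
    """
    open_starts = {}  # direction -> timestamp of the pending Start event
    merged = []
    for timestamp, name in events:
        for direction in ("Down", "Up"):
            if name == f"Scroll{direction}Start":
                open_starts[direction] = timestamp
            elif name == f"Scroll{direction}Finish" and direction in open_starts:
                start = open_starts.pop(direction)
                merged.append((f"Scroll{direction}", start, timestamp - start))
    return merged


# Invented toy log, for illustration only.
toy_events = [
    (0.0, "ScrollDownStart"),
    (2.5, "ScrollDownFinish"),
    (3.0, "LeftClick"),
    (4.0, "ScrollUpStart"),
    (5.0, "ScrollUpFinish"),
]
scrolls = consolidate_scrolls(toy_events)
```

Because every Start has exactly one matching Finish, counting either event type yields the same number of occurrences, which is consistent with reporting a single Scroll[Down/Up] row.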

Table 1: Interaction events by category. Events in italic are associated with query formulation.

Table 2: Average number (#) of browser interaction events per session.

Table 3: Average number (#) and duration (in seconds) of events per SERP.

Table 4: Average number (#) and duration (in seconds) of events per result page.

Table 5: Average number (#) of query bar interaction events per session.

Table 7: Query characterization. All averages are computed per session, except the first one.
tabs simultaneously. Although not statistically significant, the HHL group showed a higher occurrence of SERPs without clicks, suggesting that they engage in more reformulations or gather information directly from the SERP.