Social Determinants of Health and ER Utilization: Role of Information Integration during COVID-19

Emergency room (ER) admissions are the front door for the utilization of a community's health resources and serve as a valuable proxy for a community health system's capacity. While recent research suggests that social determinants of health (SDOH) are important predictors of patient health outcomes, their impact on ER utilization during the COVID-19 pandemic is not well understood. Further, the role of hospital information integration in moderating the impact of SDOH on ER utilization has not received adequate attention. Utilizing longitudinal claims data from a regional health information exchange spanning 6 years, including the COVID-19 period, we study how SDOH affect ER utilization and whether effective integration of patient health information across hospitals can moderate their impact. Our results suggest that a patient's economic well-being significantly reduces future ER utilization. The magnitude of this relationship is significant when patients are treated at hospitals with high information integration but is weaker when patients receive care at hospitals with lower levels of information integration. In contrast, patients' family and social support can reduce ER utilization when they are treated at hospitals with low information integration. In other words, different dimensions of SDOH are important in low versus high information integration conditions. Furthermore, predictive modeling shows that patient visit type and prior visit history can significantly improve the predictive accuracy of ER utilization. Our findings support efforts to develop national standards for the collection and sharing of SDOH data and their use and interpretation for clinical decision making by healthcare providers and policy makers.


INTRODUCTION
The United States spends more than $4.1 trillion a year on healthcare, more than any other country in the world, yet reports significant gaps in care, health equity, and population health outcomes [1]. Preventable hospitalizations and emergency room (ER) visits contribute to increased costs while overburdening the health system with avoidable bed occupancy and the use of scarce resources. As observed during the COVID-19 pandemic, ERs represent a gateway to a community's health resources and a pulse of its health ecosystem. Hence, it is crucial to understand the causes of ER admissions, especially in the current landscape where COVID-19 has placed a heavy burden on healthcare resources. While total ER visits declined 42% during the early stages of the COVID-19 pandemic in the U.S. [2] and 50% in Italy [3], higher proportions of ER patients sought care for COVID-19-related issues and mental and behavioral health concerns [4]. The evidence also suggests that hospitals worldwide witnessed no reduction in ER activity at the peak of the pandemic, while ER admissions attributed to factors other than COVID-19 decreased by as much as 70%.
Non-clinical factors, such as social determinants of health (SDOH), are also known to impact ER utilization. The Healthy People 2020 report defines SDOH as "conditions in the environments in which people are born, live, learn, work, play, worship, and age that affect a wide range of health, functioning, and quality-of-life outcomes and risks" [5]. Socio-economic factors, such as income, education level, employment status, and food security, can also significantly affect healthcare resource utilization and clinical outcomes [6, 7]. Further, the extent of information integration between healthcare providers to enable sharing of patient medical records has also been linked to improved health outcomes [8-10]. It is essential to develop a systematic understanding of the impact of clinical and non-clinical factors on healthcare utilization, which is vital for local decision-making related to the optimal allocation of scarce ER resources.
However, a comprehensive and clinically relevant study on the impact of clinical and non-clinical factors, based on longitudinal analysis of ER admissions data, has been limited due to a lack of availability and the fragmented nature of patient data associated with ER admissions. In this study, we aim to address this research gap by identifying non-clinical factors that may potentially influence ER admissions, especially during the COVID-19 era. Specifically, we focus on SDOH and the level of information integration of patient health data at hospitals. We empirically test our research hypotheses using data from a variety of sources, including the Integrated Care Collaboration (ICC) network, the American Community Survey (ACS), the Environmental Protection Agency Food Environment Atlas, and the American Hospital Association (AHA) IT Supplement database.
Our study contributes to the literature on the impact of COVID-19 as observed through the lens of healthcare resource utilization and the impact of SDOH under different boundary conditions of information integration. Previous studies on COVID-19 have focused mainly on outcome measures such as health exposure and mortality rates, with no emphasis on the utilization of scarce healthcare resources such as ER capacity. The utilization of critical healthcare resources such as emergency departments is an important metric that decision-makers should consider when combating the pandemic. Although prior studies have analyzed how SDOH may influence healthcare utilization, to the best of our knowledge, ours is one of the first studies to explore these relationships during the COVID-19 pandemic, which underscores the need to address deficiencies in patient access to care and health equity issues that have not been considered in prior research. Furthermore, our research underscores the role of information integration as a critical moderator of the relationship between SDOH and ER utilization and highlights the importance of different SDOH factors in high versus low information integration contexts.

BACKGROUND
Understanding ER utilization is important from several perspectives. In 2016, the U.S. spent $136.6 billion on ER visits, accounting for 5% of healthcare spending. Additionally, ER-related spending is growing at an annual rate of more than 5%, surpassing the annual growth rate of 1.4% for all healthcare spending [11]. Thus, understanding the factors related to ER use enables policies that can significantly reduce costs and free up healthcare resources. From a clinical perspective, emergency departments in the U.S. have become the primary gateway for inpatient hospitalizations, and admissions originating in ERs have steadily increased as a share of all hospital admissions: 58% of all hospital admissions originated in ERs in 2004, rising to 70% in 2018 [12]. Hence, it is imperative to understand how different factors influence patients' use of emergency care services, especially during a pandemic where the healthcare system is heavily stressed.

COVID-19
COVID-19 is an infectious respiratory disease caused by the SARS-CoV-2 virus that became a global pandemic in 2020. Symptoms of the viral infection range from a mild, common-cold-like illness to severe pneumonia and potentially fatal acute respiratory distress syndrome [13]. The healthcare industry faced immense disruption during the COVID-19 pandemic, and emergency departments were particularly affected. For example, in the United States, the Centers for Disease Control and Prevention reported a decrease of 25% to 42% in ER visits between 2020 and 2021 [2, 4]. Meanwhile, hospitals worldwide reported a significant increase in ER visits related to respiratory and mental health conditions [2, 4, 14].
Several reasons have been offered to explain the significant reduction in ER visits. First, major changes in population lifestyles, at least partially attributed to the lockdown measures across the world, may have had unforeseen health impacts. For example, due to less vehicular travel during the pandemic, fewer collisions and accidents occurred on the nation's roads and highways, which can partially explain the precipitous drop in ER visits. COVID-19 lockdowns and face mask use also reduced the spread of other infectious diseases (such as influenza and malaria). These lifestyle changes may have reduced ER visits due to fewer patients seeking care for treatment of such infectious diseases [15].
Furthermore, school closures during 2020 and part of 2021 also contributed to a dramatic decrease in pediatric ER visits [16]. Recent evidence suggests that lower ER utilization is associated with the fear of contracting COVID-19 at hospitals, as reflected in a significant reduction in patient treatments for severe conditions such as cancer and heart attacks. Such patients still needed professional care but refused to visit hospitals, resulting in a sharp increase in mortality related to such conditions [17].
The composition of ER patients changed dramatically during the pandemic, as evidenced by a sharp reduction in non-COVID patients and a concomitant increase in COVID-19 patients and patients with mental health issues [18]. COVID-19 patients need to be treated separately from other patients due to the infectious nature of the disease, resulting in a greater workload for ER clinicians and significant changes in ER operations and capacity management [19]. Hence, conclusions based on studies conducted prior to COVID-19 may not be generalizable to ER resource utilization during the pandemic. The unique dataset deployed in our study allows us to specifically explore the impact of SDOH and patient health information integration on ER utilization, before and during the COVID-19 pandemic.

Social Determinants of Health
There has been extensive research in the public health domain that links social, economic, and psychological factors to population health. SDOH, as defined by the World Health Organization in 2003, include non-medical factors that can potentially manifest as the root cause of healthcare inequities, including a patient's social gradient, stress, early life, social exclusion, work, unemployment, housing, social support, addiction, food, and transportation [20]. Based on this definition, two theories have been proposed to understand the relationship between patient socio-economic status and health status. Life course theory considers the disease state of individuals based on their past social status. Specifically, the theory attributes disease development to complex interactions between biological factors and non-biological factors such as the socio-economic status of individuals [21, 22]. In other words, social determinants of patient health are likely to be based on a person's prior social experiences, which, in turn, impact their overall physiological and mental well-being [23].
In contrast, the salutogenic model considers social factors that can create and maintain good health. The central construct of this model, the sense of coherence, links life stressors to individual and community health status. The model posits that stressors, such as medical conditions, are a common and unavoidable feature of human existence, and individuals and communities with a stronger sense of coherence possess a higher capacity to deal with these stressors and are better able to maintain their health status [22, 24]. There is no well-accepted taxonomy or set of criteria that can help decide which socio-economic factors adequately represent social determinants of health, with subjective assessments being the predominant way of identifying SDOH in the relevant literature [25].
Recent research has often focused on specific social determinants of interest and studied their potential effect on ER admission risk. For instance, McCarthy et al. [26] demonstrated that employment status and housing insecurity are associated with a greater need for ER care among Medicaid beneficiaries, while Davis et al. [27] observed that social determinants such as food, housing, and financial insecurity, lack of access to transportation, and psychosocial needs are related to higher ER utilization. For pediatric patients, unmet needs associated with housing, utilities, food, and transportation are correlated with a higher likelihood of ER visits [28]. In this study, we focus on three SDOH factors (patient economic well-being, family and social support, and access to healthy food) and their impact on ER utilization during the COVID-19 pandemic. Specifically, we argue that lack of healthy food, unstable housing, and financial strain are correlated with a lower likelihood of insurance coverage and access to regular medical care and medicines, shedding further light on the mechanisms underlying these relationships. To the best of our knowledge, none of these factors has been studied in the context of the COVID-19 pandemic using a large, longitudinal dataset based on encounter-level patient data; this represents a significant research gap that our study strives to address.

Information Integration
While the economic benefits of inter-organizational information sharing in commercial settings have been thoroughly explored in the information systems, operations management, and strategy literature [29-31], the benefits of inter-organizational information sharing in healthcare have been harder to establish. Health information exchanges (HIEs) enable inter-organizational information sharing between hospitals and clinics by providing a local or regional platform through which healthcare professionals can securely share patients' medical information electronically. While the extant literature has discussed the potential benefits of participating in HIEs [32, 33], there is a lack of data-driven evidence on the impact of hospital-level information integration on patient health outcomes, especially related to ER utilization [34, 35].
Overall, the evidence on the impact of information integration on clinical health outcomes has been mixed. While a few prior studies have identified a reduction in ER length of stay, rates of duplicate testing, and Medicare spending [8, 9], other studies have reported that HIE participation is not associated with a reduction in healthcare resource utilization [36-38]. HIE access by physicians has been associated with increased efficiency and cost reduction in hospital admissions, CT use, and laboratory tests [39-41]. Ayabakan et al. [42] observed that greater information sharing across hospitals was associated with lower levels of duplicate testing, which, in turn, improved operational efficiency. In a recent study, Janakiraman et al. [10] observed that HIE use was associated with lower patient readmission rates and hospital length of stay. We posit that access to patient health data, such as comorbidities and medication history, is likely to impact clinicians' treatment decisions during COVID-19, since it provides a better understanding of patients' risk of future ER utilization. For example, if physicians can access prior utilization history through an HIE and stratify patients based on their risk for COVID-19, then they may be able to better triage and decide on patients to be admitted to an ER and ensure better post-discharge coordination (hand-offs), which might reduce the level of future ER utilization. Similarly, patients with severe COVID-19 may not be able to communicate their comorbidities accurately to healthcare providers due to their debilitated health status. In such situations, information sharing between providers facilitated by HIEs is particularly useful for patient treatment as it allows care management teams to obtain patient health data, including prior medications, allergies, and comorbidities, in an efficient manner without waiting for patient input. In other words, HIEs can facilitate the interoperability of patient health data and the sharing of medical information with participating hospitals [43, 44].
To the best of our knowledge, there have not been any studies that explore the impact of hospital information integration of patient health data on ER utilization during the COVID-19 pandemic. A recent study based on ER visits to a single medical center in Arizona reported that average daily ER visits decreased by 20% during the COVID-19 pandemic compared to the same period in 2019 [45]. Similarly, Venkatesh et al. [46] conducted an extensive study of data from the Clinical Emergency Department Registry comprising 164 ED sites across 35 states from January 2019 to November 15, 2020. They observed a persistent decline in ER visits during the pandemic that was particularly pronounced for acute myocardial infarction and cerebrovascular diseases. A recent report by the Texas Department of Health and Human Services found that screening and referral for SDOH factors, especially housing and food security, was associated with greater quality outcomes among Medicaid and Children's Health Insurance Program beneficiaries [47]. However, these studies do not identify the causal factors, including SDOH data, that can explain the significant decline in ER visits during the pandemic. Specifically, extant research does not address the causal relationships between SDOH factors, the role of information integration across healthcare providers, and their impact on ER utilization. We bridge this research gap by (a) studying the impact of SDOH factors on ER utilization and (b) evaluating the moderating role of hospital information integration on the relationship between SDOH and ER utilization during the COVID-19 pandemic.

RESEARCH HYPOTHESES
In this section, we develop our research hypotheses on the impact of (a) social determinants of health, and (b) hospital information integration of patient health data on ER utilization and admission risk. We draw on several streams of literature to motivate our hypotheses.

Social Determinants of Health and ER Utilization
Drawing on life course theory and the tenets of the salutogenic model, we hypothesize that individuals with adverse socio-economic status will experience higher levels of ER utilization based on two mechanisms. First, individuals with poor socio-economic status are more likely to have limited access to primary care services and lower capacity to deal with life stressors, such as chronic medical conditions, and are often forced to utilize the ER when there is a need for medical care. Second, such individuals are also more likely to delay preventive care and often need to use the ER due to deterioration in their health status. In this study, we focus on three aspects of an individual's socio-economic status (economic well-being, family support, and access to healthy food) and study how these SDOH factors impact ER utilization.
Patients choose to utilize ERs for a multitude of reasons, with one of the major reasons being the lack of access to primary care [48]. Adverse economic well-being can severely limit an individual's ability to receive routine primary care and forces one to utilize the ER due to a lack of financial means, access to transportation, housing insecurity, and other factors [27, 28]. For example, many patients choose to utilize the ER instead of primary care because of the greater payment flexibility afforded, since emergency rooms require no payment at the time of care [49]. Furthermore, adverse economic well-being can have a negative impact on the individual's capacity to deal with stressors, such as medical conditions, when they often have other stressors associated with adverse economic status, such as housing and food insecurity. For instance, one of the significant barriers to primary care is access to appointments at primary care clinics [50]. Individuals with greater life stressors may be unable to receive care outside the ER, due to a lack of access, financial means for affordable care, and/or disparities in the availability of primary care in their local area. Hence, we hypothesize that individuals who exhibit adverse economic well-being are more likely to exhibit higher rates of ER utilization due to a lack of access to primary care.
Besides economic well-being, family social support can also impact individual health status and ER use. Prior research has observed that multi-member households, such as those formed through marriage and/or parenting, are likely to exhibit a lower incidence of negative health behaviors by facilitating greater support and social integration [51]. Having multiple household members can help care for individuals with medical needs, as family support increases their capacity to deal with life stressors through emotional support and comfort. Furthermore, loneliness may also be associated with a significant increase in ER use, especially among older adults [52]. During the COVID-19 pandemic, individuals with adverse family support conditions, such as loneliness or family conflict and housing insecurity, experienced elevated anxiety levels and depression symptoms and sustained psychological stress [53, 54]. Such mental health issues may also lead to increased ER use that may be alleviated by the presence of larger families where household members can provide social support and serve as a source of patient health information to care providers [55]. Hence, we hypothesize that individuals with adverse family support are more likely to exhibit higher rates of ER utilization.
Food insecurity, measured as a lack of access to healthy food, is often associated with significantly higher consumption of healthcare resources [56]. Prior research suggests that the prevalence of food deserts in locations that lack access to grocery stores is correlated with a greater incidence of chronic health conditions such as high blood pressure, high cholesterol, diabetes, and heart disease [57]. Patients with such chronic conditions are more likely to be frequent ER users [58, 59], accounting for 60% of ER visits in the U.S. [60]. A lack of access to healthy food is also an indicator of patients' inability to prioritize or afford routine care, which can lead to preventable ER visits, especially for people with chronic conditions [61]. Hence, we hypothesize that individuals who encounter difficulties in accessing healthy food options are more likely to incur greater levels of ER utilization. We formally posit our hypotheses on the impact of SDOH on ER utilization as follows.

H1: Social determinants of patient health are associated with greater ER utilization.
H1a: Adverse patient economic well-being is likely to be associated with greater ER utilization.
H1b: Adverse family support conditions are likely to be associated with greater ER utilization.
H1c: A lack of patient access to healthy food options is likely to be associated with greater ER utilization.

Hospital Information Integration and ER Utilization
Prior research has proposed many potential mechanisms through which sharing of patient health information by healthcare providers may affect patients' consumption of healthcare resources. Access to patient medical history, such as prior diagnosis and treatment plans, can help physicians improve their quality of care by providing more accurate diagnoses of patient conditions, ordering appropriate tests and therapies, and uncovering potential gaps in patient care [62, 63]. Improvements in care quality, in turn, should reduce the incidence of future ER visits.
Previous studies have explored the role of health data characteristics (such as privacy) and their impact on the quality of care delivered [8]. Based on an observational study of HIE users across 12 ERs and ambulatory safety-net clinics in a single metropolitan region, Frisse et al. [40] reported that users accessed the HIE for 6.8% of all encounters, with higher rates of access for repeat visits, for patients with comorbidities and patients known to have data in the exchange, and at sites providing HIE access to both nurses and physicians. Discharge summaries and test reports were the most frequently accessed data in the HIE. Healthcare providers also consistently noted that retrieving patient medical history, preventing repeat tests, comparing new results to retrieved results, and avoiding hospitalizations are important consequences of HIE use. Although 29% of HIE users reported "provided additional history" as one of the benefits of HIE use, only 5.2% of all users observed that the HIE allowed them to better understand the social component of a patient's history [40]. Recent studies have also observed that the lack of access to relevant SDOH data in clinical patient records poses a major challenge for care coordination and the continued use of HIEs for patient treatment and care management [64]. Based on a systematic review of 173 studies, Wilder et al. [65] reported that SDOH are linked to the degree of patient medication adherence and proposed that SDOH data should be considered by clinicians when ordering treatments.
Our extensive literature review indicates a paucity of research on the role of information integration between healthcare providers and its impact on the relationship between SDOH factors and patient health outcomes. We observe that effective integration of external medical records across hospitals allows physicians to share and access relevant clinical data when treating patients [66]. In high information integration settings, it is easier for providers to access patient health data previously recorded by the primary care provider and use this data for making decisions on patient care and treatment coordination. For instance, timely access to prior radiology and lab results through the integration of care records can provide physicians with useful information on prior medical history and improve the quality of decision-making related to patient diagnosis and tests. Hence, the role of SDOH factors related to patient economic well-being is likely to be more salient, since information about patients' employment status and residence location is easily accessible when hospitals can share such information with each other (i.e., in high information integration settings). Such social data on patient employment history, access to health insurance, and geo-location may allow clinicians to provide more personalized care to patients, such as referrals to specialists and pharmacies in the same geographical area, as well as ensuring that appropriate follow-up care, such as home visits, is provided to patients with lower economic well-being.
However, in low information integration conditions, access to patient medical and family history is limited and care providers are unable to access patient health and/or social data through HIEs and other systems. Such patient health information is especially critical in hospitals with low information integration, where it may not be possible for providers to access patients' medical history, including allergies, medication records and test results from prior visits, in a timely manner. In such settings, social support from caregivers and family members may provide physicians with relevant patient health data in a timely manner [67]. For instance, family members may be able to better understand patient symptoms and their medication history and allergies and share this information with their healthcare providers. Family and social support may also allow patients to better comply with treatment protocols and adhere to medication guidelines established by care providers. In other words, family/social support is likely to serve as a substitute for the lack of information access to patient health records in low information integration settings.
Access to SDOH data is especially critical during the pandemic, since the COVID-19 virus has been known to disproportionately affect people with limited access to healthcare resources and economic means (such as the ability to work remotely in knowledge-intensive professions). For example, if clinicians were able to gain timely access to patient health records, especially for patients with chronic diseases, then they could make informed decisions related to their care, since such patients are likely to be at greater risk of intubation and severe respiratory illness if not treated in a timely manner. Furthermore, the availability of SDOH data related to patients' professional occupation(s) as well as family support can provide clinicians with better options to care for at-risk patients at home, instead of admitting them to the ER. COVID-19 patients with existing comorbidities, such as diabetes and asthma, may have different disease trajectories than similar patients without such comorbidities, requiring potentially different care treatments such as greater use of oxygen and ventilators to mitigate breathing difficulties [68]. Having information on such comorbidities through integrated information (from outside sources such as HIEs) can help physicians make timely and more accurate treatment decisions.
Hence, we argue that the level of hospital information integration is likely to moderate the impact of SDOH on patient health. To this end, we propose the following hypotheses:

H2: The level of hospital information integration moderates the impact of SDOH factors on ER utilization.
H2a: Greater patient economic well-being is more likely to be associated with lower ER utilization when information integration is high.
H2b: Availability of adequate family support is more likely to be associated with lower ER utilization when information integration is low.
H2c: Food insecurity (or lack of access to healthy food) is more likely to be associated with greater ER utilization when information integration is low.
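
As a generic illustration of what a moderation hypothesis such as H2 implies statistically, the sketch below writes a logit model with an interaction term. It is not the authors' specification: the symbols $\mathit{ER}_i$ (a binary ER-use indicator for encounter $i$), $\mathit{SDOH}_i$ (one SDOH factor), $\mathit{II}_{h(i)}$ (the information integration level of the treating hospital), and $X_i$ (controls) are notation assumed here for exposition only.

```latex
\[
\operatorname{logit}\,\Pr(\mathit{ER}_{i}=1)
  = \beta_0 + \beta_1\,\mathit{SDOH}_{i} + \beta_2\,\mathit{II}_{h(i)}
  + \beta_3\,\bigl(\mathit{SDOH}_{i}\times \mathit{II}_{h(i)}\bigr)
  + \gamma^{\top}X_{i}
\]
\[
\frac{\partial\,\operatorname{logit}\,\Pr(\mathit{ER}_{i}=1)}{\partial\,\mathit{SDOH}_{i}}
  = \beta_1 + \beta_3\,\mathit{II}_{h(i)}
\]
```

Under this assumed form, moderation corresponds to $\beta_3 \neq 0$: the association between the SDOH factor and ER use depends on the level of information integration, which is the pattern H2a through H2c describe for each factor.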

RESEARCH METHODOLOGY
Next, we describe the specification of our research model and the data and variables utilized to operationalize our model.

Data Collection
We obtained our research data from the ICC, a regional health information exchange in central Texas, which manages a comprehensive dataset of more than six million clinical encounters across 37 healthcare institutions during the years 2015-2020, collected from approximately 200,000 unique patients. For each patient, the dataset contains patient-specific demographic information and their residential zip code. For each patient encounter, the dataset includes associated diagnoses, procedures, and payer (insurance) information. Based on the unique identifiers for both patients and encounters, we can track the entire visit history of every patient throughout the 6-year study period across multiple healthcare institutions to study the drivers of ER utilization.
We also collected data measuring the socio-economic status of patients from the ACS conducted by the U.S. Census Bureau and the Food Environment Atlas (FEA) provided by the U.S. Environmental Protection Agency. ACS provides survey-based data at the zip code level for yearly demographic variables such as education, income, employment status, and housing characteristics. The FEA provides measurements of food environment factors, such as grocery
For each clinical encounter in the ICC dataset, data collected from ACS, FEA, and AHA were matched based on the location of the encounter, the residence zip code of the patient, and the year of the encounter. A 1-year lag was introduced during the data integration process to ensure appropriate identification. Table 1 provides relevant descriptive statistics on selected variables from the 55,957 clinical encounters utilized in our study; the selection process is described in the following sections.
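
The matching step can be illustrated with a minimal Python sketch. It assumes that the 1-year lag means an encounter in year t is paired with zip-code-level data measured in year t-1; the text does not state the direction of the lag explicitly. The table `acs_by_zip_year`, the function `lagged_sdoh`, the field names, and all values are hypothetical placeholders, not data from the study.

```python
from typing import Optional

# Hypothetical zip-code-level SDOH table keyed by (zip_code, survey_year).
# Values are illustrative placeholders, not figures from the study.
acs_by_zip_year = {
    ("78701", 2018): {"avg_household_size": 2.1},
    ("78701", 2019): {"avg_household_size": 2.2},
}


def lagged_sdoh(zip_code: str, encounter_year: int, lag_years: int = 1) -> Optional[dict]:
    """Return the SDOH record measured `lag_years` before the encounter year.

    Assumption: the lag runs backward (year t encounter, year t-1 SDOH).
    Returns None when no record exists for that zip code and lagged year.
    """
    return acs_by_zip_year.get((zip_code, encounter_year - lag_years))


# A hypothetical 2020 encounter is matched to the 2019 zip-code record.
encounter = {"patient_zip": "78701", "year": 2020}
matched = lagged_sdoh(encounter["patient_zip"], encounter["year"])
```

Keying the lookup on the pair (zip code, lagged year) keeps the lag explicit in one place, so the same function could be reused with a different lag for sensitivity checks.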

Variable Description
4.2.1 Dependent Variable. For each clinical encounter, we are interested in the relationship between the explanatory variables of interest and future ER admission risk. Clinical encounters are defined as any patient encounter with a healthcare provider that occurs in an inpatient or ER setting. Specifically, we define the dependent variable for each clinical encounter as the incidence of an ER visit within 30 days after the current clinical encounter. Clinical encounters with an ER
We adopt the specific dependent variable definition from the prior literature that often focuses on ER visits within 30 days of the index encounter [69-71]. In addition to ER visits within 30 days after the index encounter, we also examined ER visits within 90 days to validate the results under a different time window setting.
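
The dependent variable definition can be sketched in a few lines of Python. The function name `er_visit_within` and the example dates are hypothetical, and the boundary handling (visits strictly after the index date, up to and including day 30) is an assumption, since the text does not specify how same-day or day-30 visits are treated.

```python
from datetime import date, timedelta


def er_visit_within(index_date: date, er_dates: list, window_days: int = 30) -> int:
    """Return 1 if any ER visit falls within `window_days` after the index encounter.

    Assumption: the window is (index_date, index_date + window_days],
    i.e. same-day visits are excluded and the last day is included.
    """
    window_end = index_date + timedelta(days=window_days)
    return int(any(index_date < d <= window_end for d in er_dates))


# Hypothetical patient: index encounter on 1 March 2020, one later ER visit on 15 May 2020.
index_encounter = date(2020, 3, 1)
later_er_visits = [date(2020, 5, 15)]

flag_30 = er_visit_within(index_encounter, later_er_visits, window_days=30)
flag_90 = er_visit_within(index_encounter, later_er_visits, window_days=90)
```

The same function covers the 90-day validation window by changing `window_days`; in this made-up example the visit occurs 75 days after the index encounter, so it counts under the 90-day window but not the 30-day one.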

Independent Variables. Next, we describe the main independent variables of interest in our model, including the treatment variable (i.e., COVID-19 infection), as well as other variables that comprise the SDOH and health information integration constructs. Prior studies on the relationships between SDOH attributes and patient health outcomes have been inconclusive [25]. To utilize a wide range of socio-economic variables available in the ACS and FEA data and ensure unbiased econometric estimation, we deployed factor analysis to generate factors that represent the social determinants of health. Factor analysis also helps to reduce the number of SDOH variables into fewer factors to improve interpretability and detect hidden relationships within the data [72, 73].
Table 2 shows the results of factor analysis, which indicates the presence of three factors based on seven SDOH variables, with a patient's residential zip code as the unit of analysis. The values shown are the factor loadings, which represent the extent to which the variables are correlated with their respective factor. The first factor has high loading values for the percentage of population that is not below the poverty line, average work hours per week, and percentage of the population that does not rely on public transportation for commuting. We consider this factor a measure of Economic well-being. The second factor, which we denote as Family support, exhibits high loadings on two variables: income to home value ratio and average household size. The third factor has high loadings on two SDOH variables, the percentage of low-income population with access to grocery stores and the number of convenience stores per thousand population. We denote this factor as a measure of Access to healthy food. The three SDOH factors are then calculated based on the factor loadings shown in Table 2 and included in the primary econometric estimation model.
The five variables obtained from the AHA IT data, which represent different measures of health information integration, are shown in Table 3. These variables measure how patient health information from external sources is integrated and used by healthcare providers at the focal hospital.
In this respect, our study is more broadly generalizable compared to recent research that focuses exclusively on the use of HIE data. Although these variables were originally measured on a five-point scale (1 = yes, 2 = sometimes yes, 3 = No, 4 = Do not know, 5 = N/A), we transformed them into binary variables where the first two responses are recorded as one, and the last three responses are recorded as zero.
Since the inter-correlations between these five variables are very high, a summative index was constructed as an overall measure of hospital information integration that measures the extent of IT integration and the use of patient health data from external sources. The summative index is calculated as an average of the five variables on a 0-1 scale and included in the econometric estimation model to test H2. The average value of the summative index is 0.495, with a standard deviation of 0.212.
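The recoding and averaging steps can be sketched as follows. The function names are hypothetical, and the example responses are illustrative rather than taken from the AHA data.

```python
from statistics import mean
from typing import Sequence

def binarize(response: int) -> int:
    """Map a five-point survey code to 0/1: codes 1 and 2 become 1; codes 3, 4, 5 become 0."""
    if response not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected survey code: {response}")
    return 1 if response in (1, 2) else 0

def integration_index(responses: Sequence[int]) -> float:
    """Average of the binarized items, giving a hospital-level index between 0 and 1."""
    return mean(binarize(r) for r in responses)

# Hypothetical hospital answering the five items with codes 1, 2, 3, 5, 1:
# three of the five items are coded as one.
index = integration_index([1, 2, 3, 5, 1])
```

With five binary items, the index can only take the values 0, 0.2, 0.4, 0.6, 0.8, and 1.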

Quasi-Natural Experiment.
To estimate the impact of COVID-19 infection on ER utilization, we deploy a difference-in-differences (DID) approach, which has been extensively used in the information systems and economics literature using natural and quasi-natural experiments [74, 75]. Under a DID approach, the treatment effect is estimated by comparing the treatment group against the control group in the pre- and post-treatment periods. This approach allows us to eliminate potential confounding effects due to unobserved factors from the treatment effect.
In our study, the treatment group is comprised of patients with COVID-19 diagnosis during the year 2020. Specifically, a list of ICD-10 codes (available from ICC) to track COVID-19 patients was used. Patients with a diagnosis code from this list, based on one or more clinical encounters during the year 2020, were included in the treatment group. By comparison, the control group consists of patients without any COVID-19 diagnosis during our study period. For each encounter, the binary variable, Treatment, takes a value of one if a patient belongs to the treatment group and a value of zero if the patient belongs to the control group. A second binary variable, Post, is assigned to each encounter. It takes a value of one if the encounter occurred in 2020 and zero if the encounter occurred before 2020. The interaction term Post × Treatment represents the impact of clinical encounters for the treatment group during the post period.
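The coding of the two indicators and their interaction can be sketched as below. The patient identifiers and the function name are hypothetical; the set of COVID-19 patients is assumed to have been derived beforehand from the ICD-10 code list.

```python
from typing import Set, Tuple

def did_indicators(patient_id: str, encounter_year: int,
                   covid_patients: Set[str],
                   post_year: int = 2020) -> Tuple[int, int, int]:
    """Return (Treatment, Post, Post x Treatment) for a single encounter."""
    treatment = 1 if patient_id in covid_patients else 0  # patient-level flag
    post = 1 if encounter_year == post_year else 0         # encounter-level flag
    return treatment, post, treatment * post

# Hypothetical treatment group containing a single patient.
covid_patients = {"p1"}
row_a = did_indicators("p1", 2020, covid_patients)
row_b = did_indicators("p1", 2018, covid_patients)
row_c = did_indicators("p2", 2020, covid_patients)
```

Note that Treatment is fixed for a patient across all years, whereas Post varies by encounter, so only 2020 encounters of COVID-19 patients have an interaction value of one.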
At the patient level, we also control for patient age, gender, ethnicity, count of ER visits, hospitalizations, and the number of outpatient visits in the previous year. At the individual encounter level, we control for the type of encounter (i.e., ER/Hospitalization/Outpatient), category of principal diagnosis, comorbidities, and payer type. Finally, we account for unobserved differences between hospitals through a location fixed effect. Our model specification is shown in Equation (1).

Propensity Score Matching
Due to the observational nature of our research data, it is not feasible to randomly assign patients into treatment and control groups as in a randomized experiment. Thus, an appropriate matching strategy is required for the quasi-natural experiment dimension of our study. Due to the overwhelming imbalance within the dataset (most patients do not have a COVID-19 related diagnosis), propensity score matching was used to match patients in the treatment group to similar patients in the control group based on their age, gender, ethnicity, medical history, and insurance type [76].
We deployed a one-to-many matching approach, resulting in a treatment group of 3,071 COVID patients and a control group of 10,314 matched patients with no COVID-19 related diagnosis. Patients across the two groups experienced a total of 55,957 clinical encounters across 16 hospitals during our study period from 2015 to 2020. Such clinical encounters include inpatient visits as well as visits to ER clinics within hospitals. Figure 1 validates the parallel trend assumption for the DID approach, which indicates that the control and treatment groups exhibited similar ER admission trajectories before COVID-19.
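To make the one-to-many idea concrete, the sketch below performs greedy nearest-neighbor matching on precomputed propensity scores. The matching ratio `k`, the caliper, matching without replacement, and all identifiers and scores are assumptions for illustration; the study does not report these details.

```python
from typing import Dict, List

def match_one_to_many(treated: Dict[str, float], controls: Dict[str, float],
                      k: int = 3, caliper: float = 0.05) -> Dict[str, List[str]]:
    """Greedy nearest-neighbor matching on a precomputed propensity score.

    Each treated patient receives up to k controls whose score lies within
    `caliper`; each control is used at most once (no replacement).
    """
    available = dict(controls)
    matches: Dict[str, List[str]] = {}
    for t_id, t_score in sorted(treated.items()):
        # Rank the remaining controls by distance to the treated patient's score.
        ranked = sorted(available, key=lambda c: (abs(available[c] - t_score), c))
        chosen = [c for c in ranked if abs(available[c] - t_score) <= caliper][:k]
        for c in chosen:
            del available[c]
        matches[t_id] = chosen
    return matches

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.70}
controls = {"c1": 0.29, "c2": 0.33, "c3": 0.50, "c4": 0.71, "c5": 0.68}
pairs = match_one_to_many(treated, controls, k=2, caliper=0.05)
```

Control `c3` remains unmatched because its score is far from both treated patients, which is how matching discards dissimilar controls from an imbalanced pool.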

Control Function Estimation
While we utilize numerous control variables to control location-specific, patient-, and encounter-level effects, one may still argue that the SDOH factors and information integration are subject to potential endogeneity concerns, especially due to reverse causality. For example, patients with a higher risk of ER visits may be likely to suffer from serious chronic health problems that result in such patients having limited job opportunities and being less economically well off. To address potential endogeneity concerns, we adopted a control function approach in the estimation model [77]. The control function approach separates the correlation between potential endogenous explanatory variables and unobserved factors affecting the dependent variable through a two-stage regression process utilizing additional instrumental variables. In the first stage regressions, we estimate the residuals v as specified in Equation (2).
The instrumental variables, FactorsIV SDOH, represent the weighted average values of the corresponding SDOH factors based on neighboring (i.e., peer) zip codes to the focal patient's residential zip code. Specifically, the relevant SDOH variables from the five nearest zip codes to the patient's residential zip code were collected for each zip code, by year. The average values of these seven variables, weighted by the population of each neighboring zip code, were used to calculate the three FactorsIV SDOH for the focal zip code-year, based on the factor loadings shown in Table 2. We argue that these instrumental variables meet the relevance and exogeneity criteria for valid IVs. First, we empirically demonstrate the relevance of the IVs in Table A1 of the Appendix: the first-stage regression results reveal that the instrumental variables have statistically significant relationships with their respective endogenous variables. Second, we argue that these instrumental variables are likely to satisfy the exogeneity condition. The SDOH data from peer zip codes that neighbor a focal patient's residential zip code are unlikely to impact the ER utilization of the focal patient, since their socio-economic status is not dependent on the neighboring zip codes.
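The population weighting used to build each instrument can be sketched for a single SDOH variable as follows. The neighbor values and populations are hypothetical, and the subsequent step of combining the seven averaged variables through the Table 2 loadings is not shown.

```python
from typing import List, Tuple

def peer_weighted_average(neighbors: List[Tuple[float, int]]) -> float:
    """Population-weighted mean of one SDOH variable over neighboring zip codes.

    Each tuple is (variable value, population) for one neighboring zip code.
    """
    total_pop = sum(pop for _, pop in neighbors)
    if total_pop == 0:
        raise ValueError("neighbor populations sum to zero")
    return sum(value * pop for value, pop in neighbors) / total_pop

# Five hypothetical nearest zip codes: (average household size, population).
neighbors = [(2.0, 10_000), (3.0, 30_000), (2.5, 20_000), (2.0, 20_000), (3.0, 20_000)]
iv_household_size = peer_weighted_average(neighbors)
```

Larger neighboring zip codes contribute proportionally more to the instrument, so here the result (2.6) sits above the unweighted mean of 2.5.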
Finally, the estimated residuals from Equation (2), v, are incorporated into the second stage regression, as specified in Equation (3). The control function approach estimators, β s, are free of endogeneity concerns related to the explanatory variables of interest (Wooldridge 2015). Hence, we estimate the model specified in Equation (3).
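For readers unfamiliar with the technique, a generic two-stage control function setup can be written as below. This is a sketch of the general method, not the exact specification in Equations (2) and (3): the link function G, the control vector X, the regressor vector W (Post, Treatment, the SDOH factors, and their interactions), and the indexing by patient i, zip code z, and year t are all assumptions.

```latex
% Generic control-function sketch; functional form and notation are assumptions.
\begin{align*}
\text{Stage 1:}\quad
  & \mathit{Factor}^{SDOH}_{k,zt}
    = \alpha_k + \gamma_k\,\mathit{FactorIV}^{SDOH}_{k,zt}
      + \delta_k^{\top} X_{it} + v_{k,zt},
    \qquad k = 1, 2, 3 \\
\text{Stage 2:}\quad
  & \Pr\left(\mathit{ER30}_{it} = 1\right)
    = G\!\left(\beta^{\top} W_{it} + \rho^{\top}\hat{v}_{zt}
      + \delta^{\top} X_{it}\right)
\end{align*}
```

Including the first-stage residuals in the second stage absorbs the part of each SDOH factor that is correlated with unobserved determinants of ER utilization, and a test of whether ρ equals zero serves as a check for endogeneity.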

RESULTS
In this section, we present the results of DID univariate analysis as well as the control function estimation model. The univariate analysis is shown in Table 4 while the control function estimation results are shown in Table 5.

Univariate DID Analysis
Table 4 indicates that the risk of 30-day ER admission decreased by 7.5% in the post-period (compared to the pre-period) for the treatment group and 1.6% for the control group. The second difference is significant and negative (t = 5.27, p value < 0.01), suggesting that 30-day ER utilization decreased significantly in the treatment group. Similarly, the reduction in 90-day ER utilization is 10.2% more in the treatment group, compared to the control group, during the post-period. These findings suggest that COVID-19 patients incurred fewer ER visits during the post-period, without controlling for the effect of patient and encounter characteristics.
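The arithmetic behind the univariate comparison is simply the difference between the two group-level changes. The sketch below applies it to the 30-day figures reported above, under the assumption that both figures are changes expressed in percentage points; the function name is hypothetical.

```python
def diff_in_diff(treated_change: float, control_change: float) -> float:
    """Difference-in-differences: the treated group's pre-to-post change
    minus the control group's pre-to-post change."""
    return treated_change - control_change

# Reported 30-day changes: -7.5 (treatment group) and -1.6 (control group).
did_30 = diff_in_diff(-7.5, -1.6)
```

Under that assumption, the unadjusted 30-day estimate is roughly -5.9 points, which matches the negative sign of the second difference described in the text.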

Multivariate DID Analysis with Control Function Estimation
The 30-day ER utilization estimation results for all encounters are shown in columns (1) and (2) in Panel A of Table 5. The coefficient of Post is negative and significant for 30-day ER admission (coeff. = -0.319, p < 0.01), while the coefficient of Post × Treatment is not significant. These results suggest that, while ER utilization for the overall population decreased during the COVID-19 period (i.e., in 2020), patients with COVID-19 infection did not incur a higher risk of 30-day ER utilization compared to patients in the control group (i.e., without COVID-19).
Our results suggest that COVID-19 did not result in a greater likelihood of ER visits, despite the overall decrease in ER utilization during this period. We also observe that the coefficient of Economic well-being is significant and negative (coeff. = -0.137, p < 0.05), indicating that patients with greater economic well-being exhibit a lower risk of ER utilization. In column (2), we observe that the coefficient of Post × Family Support (coeff. = -0.115, p < 0.05) is negative and significant, which suggests that family support during the COVID-19 pandemic is associated with a reduction in ER utilization. Hence, our results provide partial support for H1, specifically H1a and H1b.
Next, we turn to Panels B and C of Table 5, which provide the control function estimation results of split-sample analysis based on high and low hospital information integration, respectively. We split the sample data into two, based on patient encounters that occur in hospitals with high (low) values of information integration, based on whether the corresponding value of the hospital information integration index is above or below its median value. In Panel B, we observe that the coefficient of Economic Well-being is significant and negative (coeff. = -0.163, p < 0.05), indicating that patients with greater economic well-being exhibit lower risk of ER utilization, when their encounters occur in hospitals with high information integration. In other words, hospitals where patient health information from external sources is easily integrated and accessible by healthcare providers are also more likely to accentuate the relationship between greater economic well-being and lower future ER utilization. We also observe that the main effects of other SDOH factors, namely Family Support and Access to Healthy Food, in Panel B are not statistically significant. The two-way and three-way interaction effects are also not significant.
Based on the split-sample analyses results in Panel C for "Low Information Integration," we observe that the coefficient of Post × Family Support is negative and significant (coeff. = -0.148, p < 0.05) in column (5), whereas it is not significant when patient encounters occur in hospitals with high information integration. Furthermore, the coefficient of Post × Family Support in column (6) is also significant (coeff. = -0.249, p < 0.05) even after accounting for the three-way interaction of COVID-19 patients. This result suggests that greater family support during COVID-19 is associated with lower ER utilization and may compensate for a lack of health information integration between hospitals, since families/caregivers may be able to share patient health information with their care providers.
Overall, our results underscore the differential impact of health information integration on ER utilization due to COVID-19. Specifically, we find that patient economic well-being significantly reduces ER utilization when hospital information integration is high, whereas family support reduces ER utilization when information integration is low. Furthermore, greater family support can compensate for the lack of information integration by allowing families and caregivers to play a more active role in patient care management and coordination during the COVID-19 pandemic. Our estimation results demonstrate how non-clinical factors influence ER utilization. While COVID-19 is not associated with a significant increase in ER utilization (despite the general decrease in the overall population), SDOH factors such as economic well-being and family support have a significant impact on future ER utilization, and this relationship is moderated by the level of hospital information integration. The other SDOH factor, Access to Healthy Food, is not significantly associated with ER utilization. Hence, our results provide partial support for H2, specifically for H2a and H2b.

Robustness Analysis
Next, we conduct and report several robustness tests based on an alternate measure of ER utilization using the frequency of ER visits, as well as expanding the window of measuring ER utilization to 90 days. As a robustness check, we estimated the frequency of 30-day ER utilization for the same set of encounters utilizing negative binomial distribution (NBD) regressions. Instead of the binary dependent variable used in the previous section, the dependent variable for the NBD regressions is the count of ER admissions that occur within 30/90 days of the encounter of interest. Table 6 provides the NBD estimation results for 30-day ER admissions, where Panel A shows the results for all ER encounters, and Panels B and C represent the split-sample analyses based on high and low information integration, respectively. Our results suggest that while the overall frequency of ER admissions declined during our study period, the positive coefficient of the "Post × Treatment" variable in Table 6 indicates that the frequency of ER utilization attributed to COVID-19 patients actually increased in 2020.
The estimation results in Table 6 are generally similar to our earlier results in Table 5. Specifically, as shown in Panel B, we observe that the SDOH factor, Economic Well-Being, is negative and significant in the NBD estimation only when patient encounters occur in hospitals with high information integration. However, patients with greater family support are likely to incur fewer ER admissions (coeff. = -0.099, p < 0.01), and this relationship is significant only when the encounters occur in hospitals with low information integration (coeff. = -0.192, p < 0.01). This negative relationship is consistent even after accounting for three-way interaction effects and suggests that family support is a critical factor in reducing ER visit frequency during the COVID-19 pandemic. Overall, our robustness tests using NBD regressions are similar to the earlier control function results for ER utilization and provide partial support for H1 and H2 with respect to the role of Economic Well-being and Family Support under high and low information integration conditions, respectively. We also observe consistent results for 90-day ER utilization and visit frequency, as shown in Tables A2 and A3 in the Appendix.

Predictive Analytics of ER Utilization
Next, we train and test several predictive analytics models to predict the incidence of 30-day ER admissions based on our sample data.
Besides studying the association between COVID-19, SDOH factors, and health information integration, we are also interested in studying how such factors can help predict ER utilization in the future. We developed several machine learning models to predict future 30-day ER utilization using the same independent variables of interest and controls as our econometric models. Instead of analyzing the relationships between these variables using explanatory models, we predict future 30-day ER utilization using SDOH variables. In other words, we implement and test several predictive analytics models for ER utilization to evaluate the robustness of our explanatory models.
Our entire data consisting of over six million patient encounters were divided into training and test datasets using an 80-20% split and analyzed using machine learning (ML) methods including random forest, support vector machines (SVM), AdaBoost, XGBoost, and regularized logit regressions. Up-sampling was performed to create a balanced training dataset, since only 8% of all visits are associated with an ER visit within 30 days of a clinical encounter. Table 7 demonstrates the performance of our ML methods, based on an average of five training and testing resampling runs. We observe that boosting techniques, such as XGBoost (row 1) and AdaBoost (row 2), result in the greatest area under the curve (AUC) value as well as the highest F1 score. Specifically, XGBoost achieves an AUC value of 0.8037 and an F1 score of 0.7280.
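The balancing step and the F1 metric can be sketched without any ML library. The helper names, the random duplication strategy, and the toy data (100 rows with 8 positives, mirroring the 8% base rate) are assumptions for illustration and are not the study's actual pipeline.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[Tuple[float, ...], int]  # (feature vector, 30-day ER label)

def upsample_minority(rows: List[Row], seed: int = 0) -> List[Row]:
    """Duplicate randomly chosen minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:
        raise ValueError("need at least one row of each class")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra

def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy training partition: 8 positives out of 100 rows.
train = [((float(i),), 1 if i < 8 else 0) for i in range(100)]
balanced = upsample_minority(train)
```

Up-sampling is applied to the training partition only, so that the test partition keeps the original class balance and the reported metrics reflect realistic prevalence.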
Feature importance analysis based on the XGBoost model suggests that two categories of variables account for more than 90% of the gains achieved by the XGBoost model (Gains is the metric used to measure the quality of the decision trees created by XGBoost; we refer interested readers to the original paper by Chen and Guestrin [78] for a detailed explanation). The two categories of variables are encounter type of current patient visit (i.e., ER/Hospitalization/Outpatient) and count of visits in the previous year. Hence, we developed and tested a parsimonious model, utilizing six variables based on these two data types, and the corresponding performance metrics are shown in row 8 of Table 7. The six variables include (a) a binary indicator for the current visit being an outpatient visit, (b) a binary indicator for the current visit being an inpatient visit, (c) a binary indicator for the current visit being an ER visit, (d) count of a patient's outpatient visits in the previous year, (e) count of a patient's inpatient visits in the previous year, and (f) count of the patient's ER visits in the previous year. We also developed and tested a more parsimonious model utilizing only two variables: a binary indicator of whether the current patient encounter is an ER visit, and a count of the patient's ER visits in the previous year. The performance of this model is shown in row 9 of Table 7 using the XGBoost model.
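The two parsimonious feature sets can be illustrated with a short sketch. The encounter-type labels, function names, and the example patient are hypothetical; only the list of six and two variables follows the description above.

```python
from typing import Dict, List

ENCOUNTER_TYPES = ("Outpatient", "Inpatient", "ER")  # assumed category labels

def six_features(encounter_type: str, prior_counts: Dict[str, int]) -> List[int]:
    """Features (a)-(f): three current-visit indicators followed by three
    prior-year visit counts, in the order outpatient, inpatient, ER."""
    if encounter_type not in ENCOUNTER_TYPES:
        raise ValueError(f"unknown encounter type: {encounter_type}")
    indicators = [1 if encounter_type == t else 0 for t in ENCOUNTER_TYPES]
    counts = [prior_counts.get(t, 0) for t in ENCOUNTER_TYPES]
    return indicators + counts

def two_features(encounter_type: str, prior_counts: Dict[str, int]) -> List[int]:
    """Most parsimonious set: current-visit ER indicator and prior-year ER count."""
    return [1 if encounter_type == "ER" else 0, prior_counts.get("ER", 0)]

# Hypothetical patient currently in the ER, with 4 outpatient and 2 ER visits last year.
history = {"Outpatient": 4, "ER": 2}
full_row = six_features("ER", history)
small_row = two_features("ER", history)
```

Either feature vector could then be passed to any classifier; the appeal of the smaller set is that both inputs are available at the point of care without linking external SDOH data.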
Comparing the performance metrics reported in rows 8 and 9 to those in rows 1 to 7, it is apparent that the parsimonious models perform on par with, if not better than, the ML models that utilize all independent variables. The most parsimonious model with only two independent variables achieves an AUC value of 0.7705 and an F1 score of 0.723, which is only marginally lower than the performance achieved by the full XGBoost model. The lift curves, as shown in Figure 2, underscore the similarity between the full XGBoost and parsimonious models. Furthermore, the F1 score of the most parsimonious model is comparable to the full AdaBoost model and is superior to other ML models utilizing the full range of independent variables. Our findings suggest that future ER admission risk can be predicted effectively based on accessing data on patients' prior ER utilization history and visit type of their current clinical encounter.
We also tested alternative splits of test and training data that indicate that our results are consistent. The performance of the full XGBoost model and the two parsimonious models for predicting future 30-day ER admissions in 2019/2020 is similar to the ones shown in Table 7, when utilizing visits from 2015 to 2018 as the training set.

DISCUSSION
We now discuss the implications of our results on the interplay between SDOH and health information integration.

Significance of SDOH Factors
We hypothesized that adverse conditions of economic well-being, family support, and lack of access to healthy food should significantly increase the risk of future ER admissions. However, our results provide only partial support for the impact of SDOH factors. Specifically, we observe that patient economic well-being is associated with a significant decrease in ER utilization and that this relationship is statistically significant only when the encounters occur in hospitals with high information integration of patient health data. Furthermore, we find that a high degree of family social support is associated with lower ER utilization, plausibly because such support increases patients' capacity to deal with stressors. The impact of family support is strongly significant in low information integration settings where patient health data may not be readily available to physicians and care providers.
While extant research suggests that lack of access to healthy food is associated with adverse health conditions, we observe that controlling for patient comorbidities can mitigate its impact on ER utilization [57]. In other words, access to healthy food options may not directly impact ER admission risk. Instead, it may indirectly impact ER utilization by reducing the likelihood of other comorbidities such as diabetes and obesity. Since we control for several patient-specific characteristics in our models, including patient comorbidities, it is reasonable that no significant relationship between ER utilization and access to healthy food was observed.

Differences between Explanatory and Prediction Models
Our findings based on explanatory and predictive analysis paint a complex picture that necessitates different actions from different stakeholders. Our explanatory analyses suggest that COVID-19 patients with greater family support are likely to exhibit a lower risk of ER utilization, especially when their clinical encounters occur at hospitals with low levels of health information integration. From a policy-making perspective, our results suggest that relevant decision-makers should implement policies that address health inequities associated with non-clinical factors such as SDOH. For instance, a policy directive may include referring patients to hospitals that have a high level of information integration based on providers' use of HIEs and other information access. The prior literature has reported that in 2017, only 50% of the hospitals in the U.S. integrated patient medical information from outside sources into their local EHR [34]. Our research provides strong incentives for hospitals to integrate patient health data from external data sources and encourage physicians to utilize such data efficiently. It also provides greater support for efforts to create national standards for the collection and utilization of SDOH data within electronic health records by healthcare providers [64].
However, for hospital managers and other stakeholders who are only interested in predicting ER visits, our results suggest that utilizing only two variables, an indicator of whether the current encounter is an ER visit and the count of the patient's ER visits in the previous year, can effectively predict their future ER utilization. Healthcare providers and policy makers can achieve a reasonable understanding of future ER utilization by analyzing the trajectory of patients' prior ER utilization history, notwithstanding the impact of the COVID-19 pandemic. Hence, the explanatory model is complementary to our predictive model, since it highlights the underlying mechanisms that explain the relationship between SDOH and ER utilization.

CONCLUSIONS
In this study, we investigate how COVID-19 diagnosis, social determinants of health, and hospital information integration can affect ER utilization based on a longitudinal dataset collected from a regional health information exchange in central Texas. Our study examines the impact of SDOH on the likelihood of ER admissions as well as the moderation impact of hospital information integration. By utilizing unique, longitudinal data on patient-level ER admissions collected during the COVID-19 pandemic, we observe that patient economic well-being is associated with a lower likelihood of future ER utilization when patients receive care at hospitals with greater information integration.
Furthermore, greater family support during the COVID-19 pandemic is associated with a significant reduction in patient ER utilization in hospitals with lower information integration. The differential impact of SDOH on ER utilization under different IT integration settings provides a new dimension to study the role of SDOH and the boundary conditions under which they are likely to significantly impact ER visits. Our findings provide new insights into the complex relationships among SDOH factors, health information integration, and clinical outcomes. Furthermore, our predictive analytics results suggest that COVID-19 patients are more likely to incur an ER visit after a previous ER encounter. We find that a parsimonious model utilizing only two variables that measure patients' prior ER utilization history can predict their future ER utilization.
As the healthcare industry moves away from a fee-for-service model toward value-based care, reductions in ER use are likely to be associated with reduced healthcare spending. Our findings on the association between family support and ER utilization encourage the creation of more social services to provide emotional support to patients. Our results related to the salient impact of hospital information integration on ER utilization should expedite HIE use by hospital providers and encourage national efforts to implement uniform data standards for the collection of SDOH data to address health inequities. Such changes should provide financial incentives to hospitals and payers alike and free up scarce healthcare resources.
To the best of our knowledge, our study represents one of the first attempts at empirically analyzing how social determinants of health and hospital information integration can impact healthcare resource utilization based on data from the COVID-19 era. Our research is unique for several reasons. First, prior empirical analyses on the role of SDOH on health outcomes were conducted using data collected before the COVID-19 pandemic. Our dataset includes clinical encounters during the COVID-19 period, allowing us to test such relationships after the immense disruption to the healthcare industry during the pandemic. Specifically, due to dramatic changes in patient characteristics and ER utilization during the pandemic, the findings from prior studies conducted before the pandemic may not be generalizable to the COVID-19 era. Hence, our study based on the impact of COVID-19 on ER utilization is important in its own right as it provides a better understanding of the role of SDOH factors and health information integration in mitigating ER utilization. Second, ours is also one of the first studies to deploy a robust quasi-experimental research design, based on control and treatment samples, to estimate the impact of COVID-19 and SDOH factors on ER utilization. Third, our study complements explanatory models using machine learning methods to better understand the factors that predict ER utilization.

Limitations and Future Research
Several limitations of our study should be acknowledged. First, we do not measure SDOH at the individual patient level. Instead, we use aggregate zip code data (at the population level) to serve as relevant proxies for social determinants of patient health in our econometric analyses. While we are limited to aggregate measurements at the zip code, county, and hospital levels, we believe that having more personalized measurements at the block level or for individual patients will allow us to improve the precision of our analyses as we move toward personalized medicine. Indeed, the results of our empirical analyses imply the need to collect additional patient-level SDOH data to accurately track the relationships between other social and non-clinical determinants of patient health outcomes. Second, our analysis focuses on the risk of all-cause ER admissions, while future research may focus on disease-specific studies of ER utilization. Nevertheless, we believe that

Table 1. Descriptive Statistics of Patient Encounter Characteristics and SDOH Variables
neighborhood community characteristics, that may influence food choices and access to healthy food at the county level. Data on the extent of information integration of patient health information at hospitals in the ICC dataset was collected from the AHA IT Supplement. The database provides survey-based measurements of hospital-level use of health IT gathered from U.S. hospitals between 2014 and 2019.

Table 2. Factor Analysis of SDOH Variables
visit recorded within 30 days of the discharge date are coded as one, while all other encounters are assigned a value of zero for the dependent variable.

Table 3. Measures of Hospital Information Integration

Table 7. Predictive Models of ER Utilization