Social determinants of health in electronic health records and their impact on analysis and risk prediction: A systematic review

Corresponding Author: Min Chen, Department of Information Systems and Business Analytics, College of Business, Florida International University, 11200 SW 8th Street, Miami, FL 33199, USA; min.chen2@fiu.edu

Received 2020 Feb 15; Revised 2020 Jun 10; Accepted 2020 Jun 20.

Copyright © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Abstract

Objective

This integrative review identifies and analyzes the extant literature to examine the integration of social determinants of health (SDoH) domains into electronic health records (EHRs), their impact on risk prediction, and the specific outcomes and SDoH domains that have been tracked.

Materials and Methods

In accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, we conducted a literature search in the PubMed, CINAHL, Cochrane, EMBASE, and PsycINFO databases for English language studies published until March 2020 that examined SDoH domains in the context of EHRs.

Results

Our search strategy identified 71 unique studies that are directly related to the research questions. Of the included studies, 75% were published since 2017 and 68% were U.S.-based. Of the reviewed articles, 79% integrated SDoH information from external data sources into EHRs, and the remainder extracted SDoH information from unstructured clinical notes in the EHRs. Among studies using external area-level SDoH data, all but 1 reported that these data contributed minimally to the performance of the predictive models. In contrast, studies that incorporated individual-level SDoH data reported improved predictive performance for various outcomes such as service referrals, medication adherence, and risk of 30-day readmission. We also found little consensus on the SDoH measures used in the literature and in current screening tools.

Conclusions

The literature provides early and rapidly growing evidence that integrating individual-level SDoH into EHRs can assist in risk assessment and predicting healthcare utilization and health outcomes, which further motivates efforts to collect and standardize patient-level SDoH information.

Keywords: social determinants of health, electronic health records, behavioral determinants, social factors, systematic review, risk prediction

INTRODUCTION

Social determinants of health (SDoH) are “conditions in which people are born, grow, live, work, and age,” 1 and they involve “the complex, integrated, and overlapping social structures and economic systems that are responsible for most health inequalities.” 2 , 3 Healthy People 2020 organizes SDoH around 5 key domains: (1) economic stability, (2) education, (3) health and health care, (4) neighborhood and built environment, and (5) social and community context. 4 As population health becomes an important focus of health care delivery, SDoH are increasingly seen as critical factors for identifying potential upstream drivers of poor outcomes and higher costs. 5 , 6 While these are often conflated with social risk factors, which are disadvantageous social conditions that may result in poor health outcomes, 7 this review addresses social and behavioral determinants that affect everyone. With SDoH information, it is anticipated that health systems and professionals can classify the complexity of their patients, identify appropriate interventions to meet various needs, and transform care with integrated services and community partnerships to improve health outcomes and reduce health disparities, while saving costs. 8–11

The digitization of clinical records presents a new opportunity to integrate SDoH into electronic health records (EHRs) to enhance care delivery and population health. 12 , 13 The 2009 U.S. Health Information Technology for Economic and Clinical Health Act incentivized the adoption of EHRs throughout the country. Today, virtually all hospitals and nearly 9 in 10 office-based physicians have adopted an EHR. 14 , 15 With widespread adoption of EHRs, policy is now shifting toward the use of EHR technology in a meaningful manner. Meaningful use criteria were implemented by the Medicare and Medicaid EHR Incentive Programs (now known as the Promoting Interoperability Programs) in 3 time-bound stages. Beginning in 2011, the first 2 stages emphasized capturing data (eg, patients’ medical history, medication orders, vital signs, laboratory results, radiology reports, physician and nurse notes) and optimizing clinical workflows, respectively. The third stage, in 2017, called for all hospitals and eligible healthcare professionals to demonstrate continuous quality improvement of care and elimination of healthcare inequality across all groups. 16 , 17 The integration of SDoH into EHR systems will be central for healthcare institutions to meet Stage 3 objectives and avoid reductions to their Medicare payments for failing to do so. 13 Increasingly, health institutions and clinicians are exploring how to capture data related to social determinants in their EHRs and how to incorporate SDoH-related referral and intervention into routine care, with the goal of assessing their quality performance and managing the health of not only individual patients but also populations. 18–21

EHRs systematically collect clinical information about patients such as medical history, vital signs, laboratory tests and results, and medication orders. Nonclinical determinants of health can manifest in structured data elements such as age, race, ethnicity, and diagnosis codes (eg, homelessness). Some EHRs also have a few lifestyle domains, such as preferred language and smoking and alcohol use, in a structured format. 22 Information on selected environmental and social domains such as housing, social support, and financial resource strain may also be extracted from EHRs’ unstructured data (eg, free-text physician and nurse notes). 22 Yet, EHR-derived SDoH data are not sufficient to constitute a complete and accurate set of SDoH domains, and many social and behavioral determinants that may influence health and mortality are not captured. Expanding EHRs beyond traditional clinical information to include SDoH data requires identifying what SDoH data to collect, how and when to collect them, and the extent to which the collected data can be used in risk prediction and interventions to improve outcomes. One example is collecting accurate and complete data on patients’ living arrangements and economic stability (2 major domains of SDoH) to predict the risk of 30-day hospital readmission. 23

Prior literature reviews have explored the effectiveness of interventions targeting SDoH 24–26 or the evidence relating to screening for SDoH in clinical care settings 27 ; however, none have explicitly examined the integration of SDoH into EHRs for the purposes of risk prediction and associated analytics. Other systematic reviews have focused on a specific domain of SDoH such as food insecurity, 28 on interventions to improve SDoH among specific disadvantaged groups, 29 or on specific health outcomes, such as type 2 diabetes, 30 pregnancy among young people, 31 and adult all-cause mortality. 32 Noteworthy results from these systematic reviews illustrate a dearth of generalizable high-quality evidence on the impact of SDoH interventions in the context of population and public health. In perhaps the closest analog to our work, Golembiewski et al 33 conducted a rapid review that was limited to U.S.-based articles published between January 2010 and April 2018 and characterized the extent to which existing research has combined nonclinical data derived from external sources with different clinical datasets. Owing to the rapid growth in adoption of EHRs and the pressing need for meaningful use of EHRs as a tool to improve health outcomes, it is essential to better understand the types of determinants, data sources, and measures used effectively in the new context of prediction-assisted, EHR-enabled care delivery.

The purpose of this study is to review and analyze the currently available literature to determine whether SDoH can affect health outcomes through risk prediction and targeted intervention, and which SDoH domains have been tracked in the context of EHRs. We are interested in examining quantitative evidence regarding how SDoH may affect various outcomes that have important implications for healthcare cost and quality, such as disease diagnosis, use of healthcare services, referral and other interventions targeted at SDoH, and risk of ER visits and hospital admission or readmission. To inform efforts to create national standards for representing SDoH information in EHRs, we also analyze the sources and tools used in the studies to collect and screen domains related to SDoH.

MATERIALS AND METHODS

Search strategy

We followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines 34 and the PRISMA-Equity 2012 extension for systematic reviews with a focus on health equity 35 to conduct our systematic review. With the help of a university librarian, we created search strategies to find published studies that utilized SDoH in the EHR context. The search strategies combined standardized terminologies, keywords, and MeSH (Medical Subject Headings). We aimed to be broad and inclusive by including variant wordings for SDoH, such as socioeconomic factors, behavioral factors, nonclinical determinants, and health disparity. To verify our coverage of all relevant articles, we also compiled an inclusive list of social and behavioral determinants of health measures to search for in the literature, based on the conceptual frameworks proposed by the World Health Organization and the U.S. Centers for Disease Control and Prevention and on the nonclinical determinants of health measures used in an existing literature review. 33

Table 1 shows the domains and dimensions of the health determinants we cover in the review, and the specific measures of these domains and dimensions can be found in Supplementary Table S1.

Table 1.

Examples of search domains and dimensions

Domains | Dimensions
Economic stability | Income, financial resource, employment, basic needs
Neighborhood and built environment | Transportation, neighborhood, living arrangements, food access, environmental conditions
Health and health care | Insurance status, behavioral health, mental health
Education | Education, language
Social and community context | Race/ethnicity, social connections, other status, marital status

Data sources and searches

We queried PubMed (1948-present), CINAHL (1937-present), PsycINFO (1998-present), EMBASE (1947-present), and Web of Science (1965-present) through July 15, 2019, and updated the search on March 31, 2020. We intentionally made the search query as broad as possible to retrieve as many results as possible related to the research questions posed in this systematic review. A total of 6687 articles were identified by searching the databases using different variations of EHRs combined with the standardized SDoH terminologies and keywords representing specific social and behavioral determinants of health (Supplementary Table S1). Duplicate records were identified using the automatic duplicate finders in the Mendeley 36 and Rayyan 37 online review software. We removed 361 duplicates using Mendeley and identified 26 more duplicates using Rayyan, leaving 6300 unique studies for further screening.

Study selection

To be eligible for inclusion, published research had to meet all of the following 4 criteria:

(1) Peer-reviewed research with full text published in English.
(2) Integrated SDoH information into EHRs.
(3) Examined the impact of integrating SDoH into EHRs.
(4) Quantitative analysis.

Articles were excluded in 2 stages: a title and abstract review (6157 excluded) and then a full-text review of 143 articles (74 excluded). Studies excluded from the title and abstract screening included (1) systematic reviews/meta-analyses; (2) opinions, commentaries, perspectives, vision articles, guidelines, and protocol articles; (3) qualitative studies; (4) exploratory, conceptual studies; (5) studies prior to 2000; and (6) non–English language studies. During the full-text screening, we excluded conference posters and studies that did not focus on examining the impact of integrating SDoH into EHRs on analysis and risk prediction. This resulted in 69 articles for inclusion. We then used a snowballing technique whereby we searched the reference lists of included articles to identify potentially missed studies, and we also searched the gray literature. We repeated this process on each additional article until no additional articles worthy of inclusion could be identified. This manual search process added 2 more studies to the pool. At each stage, we resolved conflicts in inclusion or exclusion through discussion until the 3 independent reviewers reached consensus. Figure 1 presents our search and screening process.

Figure 1.

Flowchart of literature search and screening process. EHR: electronic health record; SDoH: social determinants of health.
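
The counts reported for the search and selection steps can be checked with simple arithmetic. The following is a minimal Python sketch of that check; the numbers are taken from the text, while the variable names and the stage labels are illustrative assumptions and not part of the review protocol.

```python
# Counts reported in the text; variable names are illustrative only.
records_identified = 6687
duplicates_mendeley = 361
duplicates_rayyan = 26

# Unique records remaining after both rounds of duplicate removal.
unique_records = records_identified - duplicates_mendeley - duplicates_rayyan

# Stage 1: title and abstract screening.
excluded_title_abstract = 6157
full_text_reviewed = unique_records - excluded_title_abstract

# Stage 2: full-text screening.
excluded_full_text = 74
included_from_databases = full_text_reviewed - excluded_full_text

# Manual (snowballing and gray literature) additions.
added_by_manual_search = 2
total_included = included_from_databases + added_by_manual_search

flow = {
    "Records identified": records_identified,
    "Unique records screened": unique_records,
    "Full-text articles reviewed": full_text_reviewed,
    "Included from database search": included_from_databases,
    "Total included": total_included,
}
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Each intermediate value matches the figure given in the text (6300, 143, 69, and 71, respectively).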

Data extraction and synthesis

From each selected article, we extracted a list of data elements determined by the research team, including year of publication, country of origin, data source, sample size, level of SDoH measures used, how SDoH information was integrated into EHRs, outcome measures, study method, study purpose, findings, and limitations. Given the heterogeneity within included studies, as well as the lack of standardized or consistent reporting of SDoH domains and outcome measures, meta-analysis was not possible. Therefore, we used narrative synthesis to integrate our findings into descriptive summaries for SDoH and their impact on analysis and risk prediction.
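
One way to picture the extraction step is as a fixed record per study plus simple tallies that feed a descriptive summary. The Python sketch below is an assumption about how such a sheet could be represented, not a description of the tooling the research team used; the class name, field names, helper function, and the placeholder record are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ExtractedStudy:
    """One row of a hypothetical extraction sheet (fields mirror the listed data elements)."""
    year_of_publication: int
    country_of_origin: str
    data_source: str
    sample_size: int
    sdoh_measure_level: str        # e.g. "individual" or "area"
    sdoh_integration_method: str   # e.g. "external data source" or "clinical notes"
    outcome_measures: List[str]
    study_method: str
    study_purpose: str
    findings: str
    limitations: str


def tally_by(studies: List[ExtractedStudy], attribute: str) -> Dict[str, int]:
    """Count studies per value of one extracted attribute, for descriptive summaries."""
    return dict(Counter(str(getattr(study, attribute)) for study in studies))


# Placeholder record with invented values, included only to show the shape.
example = ExtractedStudy(
    year_of_publication=2019,
    country_of_origin="United States",
    data_source="placeholder data source",
    sample_size=1000,
    sdoh_measure_level="area",
    sdoh_integration_method="external data source",
    outcome_measures=["30-day readmission"],
    study_method="placeholder method",
    study_purpose="placeholder purpose",
    findings="placeholder findings",
    limitations="placeholder limitations",
)
print(tally_by([example], "sdoh_measure_level"))
```

Tallies of this kind support descriptive counts only; they do not pool effect sizes, which is consistent with the choice of narrative synthesis over meta-analysis.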

RESULTS

Overall characteristics of the reviewed studies

We have summarized the characteristics of the 71 included studies in Table 2. Overall, the literature on integrating SDoH into EHRs is quite new but growing rapidly. All included studies were published between 2008 and 2020, with the majority from 2017 to 2020 (53 studies, or 75%). Most of the studies (68%) were from the United States, and all but 2 studies were from developed countries (Figure 2). There are 2 major approaches reported in the studies to acquire SDoH data: (1) merging SDoH information from external data sources into EHRs and (2) extracting SDoH information from unstructured clinical notes in the EHRs. A total of 56 of the 71 (79%) studies merged SDoH information from external data sources into EHRs, and the most frequently used external data sources were the publicly accessible American Community Survey (ACS) and the U.S. Census. While both the ACS and census data provide neighborhood-level SDoH information, the ACS provides more up-to-date information about the social and economic needs of the community at the census-tract or ZIP code level every 5 years and releases estimates at the regional, state, and county levels every year. 38 Other studies that merged SDoH into the EHRs used commercial databases such as Nielsen Prime Location 39 and the Esri Business Analyst Premium product, 40 or initiated their own patient-level health surveys, as in Wagaw et al, 41 or a community information system, as in Comer et al. 42
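
The first approach, merging area-level SDoH information into EHR records, amounts to a lookup on a shared geographic key. The following is a minimal Python sketch of that idea, assuming the key is a ZIP code; the function name, field names, ZIP codes, and all values are invented for illustration and do not come from the ACS, the U.S. Census, or any reviewed study.

```python
from typing import Dict, List

# Hypothetical area-level table keyed by ZIP code; all values are made up.
area_sdoh: Dict[str, Dict[str, float]] = {
    "00001": {"median_household_income": 52000, "pct_below_poverty": 14.2},
    "00002": {"median_household_income": 87000, "pct_below_poverty": 6.5},
}

# Hypothetical patient records as they might be exported from an EHR.
patients: List[dict] = [
    {"patient_id": "P1", "zip_code": "00001"},
    {"patient_id": "P2", "zip_code": "00002"},
    {"patient_id": "P3", "zip_code": "99999"},  # no area-level match
]


def link_area_sdoh(records: List[dict], area_table: Dict[str, Dict[str, float]]) -> List[dict]:
    """Attach area-level SDoH values to each record by ZIP code (a left join)."""
    linked = []
    for record in records:
        area_values = area_table.get(record["zip_code"])
        merged = dict(record)  # copy so the input records are not modified
        merged["area_sdoh_matched"] = area_values is not None
        if area_values is not None:
            merged.update(area_values)
        linked.append(merged)
    return linked


for row in link_area_sdoh(patients, area_sdoh):
    print(row)
```

In this sketch every patient in the same ZIP code receives identical SDoH values, which is what distinguishes area-level measures from individual-level measures collected from each patient.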