Assessing the Accuracy of Cardiovascular Data in Electronic Health Records

Norrina Allen, PhD, assistant professor of Preventive Medicine in the Division of Epidemiology, was the principal investigator of the study in Circulation.

In a new study published in the journal Circulation, Northwestern Medicine investigators identified similarities and differences between cardiovascular data pulled from electronic health records (EHRs) and data collected in a traditional cohort study.

The findings illustrate both the potential benefits and limitations of using clinical EHR data for epidemiological research, especially as EHRs become standard in healthcare systems and are increasingly leveraged for major research programs, such as the “All of Us” precision medicine initiative.

“Electronic health records are going to be a valuable resource for research as we go forward, but no one has really looked at how reliable and valid the data that is being included is,” said senior author Norrina Allen, PhD, MPH, assistant professor of Preventive Medicine in the Division of Epidemiology. “This study is exciting because it’s the first time that we’ve been able to look at the accuracy of EHRs, and examine how we can leverage their strengths.”

Faraz Ahmad, MD, MS, a fellow in advanced heart failure and transplant cardiology, was the first author of the study.

To explore the validity of the two sources, the investigators examined the health data of individual patients who were included in both HealthLNK — an electronic database of close to 3 million Chicago area residents from six large health systems — and the Multi-Ethnic Study of Atherosclerosis (MESA), a traditional cohort study.

Because the MESA data were collected specifically for cardiology research under a rigorous protocol, the investigators could use them as a “gold standard” against which to compare the clinical EHR data.

The investigators specifically compared measures of patient demographics, cardiovascular risk factors — such as blood pressure and BMI — and cardiovascular events.

They discovered that the EHRs had a significant amount of missing data for basic demographic characteristics, such as ethnicity, as compared to MESA.

After excluding the missing data, however, the investigators found good agreement between HealthLNK and MESA on gender, race/ethnicity, age and BMI measurements. They found differences between the two databases in individuals’ blood pressure measures, prevalence of risk factors and cardiovascular events.

“We have to understand the limitations of EHRs and take advantage of its strengths in order to increase our understanding of heart disease,” Allen explained. “We now know how good the data are and we know what type of research we can use it for, and we can use that to ask questions of the data. This allows us to expand our research into electronic health records.”

The authors also note the need for continued research to understand the quality of data from electronic records — especially across different health systems and environments — and how to use diverse data sources to improve detection and classification.

“Our findings suggest that combining traditional methods of epidemiological research with electronic health record data may lead to a more complete picture of an individual’s health over time,” said Ahmad, who also conducted the research under faculty mentor and co-author Abel Kho, MD, MS, associate professor of Medicine in the Division of General Internal Medicine and Geriatrics and of Preventive Medicine in the Division of Health and Biomedical Informatics, and director of the Center for Health Information Partnerships (CHiP). “Our next step is to expand the linkage between traditional cohorts and electronic data research networks to other sites across the country.”

The Circulation paper was also co-authored by Philip Greenland, MD, the Harry W. Dingman Professor of Cardiology; Kiang Liu, PhD, professor of Preventive Medicine in the Division of Epidemiology and of Medicine in the Division of General Internal Medicine and Geriatrics; Marc Rosenman, MD, MS, associate professor of Pediatrics and a faculty member of CHiP; Daniel Fort, PhD, MPH, adjunct assistant professor of Preventive Medicine in the Division of Health and Biomedical Informatics; and Cheeling Chan, MS, a biostatistician in the Department of Preventive Medicine.

The research was supported by contracts HHSN268201500003I, N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168 and N01-HC-95169 from the National Heart, Lung, and Blood Institute and by grants UL1-TR-000040 and UL1-TR-001079 from the National Center for Research Resources. Research reported in this publication was supported, in part, by the National Institutes of Health (NIH)’s National Center for Advancing Translational Sciences grant number UL1TR001422. Ahmad was supported by the National Heart, Lung, and Blood Institute of the NIH under award number T32HL069771 and by a 2015 Research Fellowship Award from the Heart Failure Society of America.

Kho reports that he is a co-founder and equity holder of Health DataLink, LLC, with $0 current value.