When I moved from working as an engineer into healthcare during the early 90s, I was shocked by the inability of healthcare providers to capture or use data to improve care delivery. Now, almost 30 years later, we do a much better job of capturing clinical data, thanks to the widespread implementation of electronic medical record (EMR) systems. However, the health data available electronically remains grossly underutilized because it is difficult to integrate data from various sources to gain better insights. Key areas that are fundamental to improving population health, and that should be addressed using integrated data sets, include: (i) identifying unique patients; (ii) creating an inventory of health providers; (iii) linking patients to their providers; and (iv) better understanding each person’s social circumstances that influence health behaviors. Addressing each of these areas requires clinical data as well as data from medical claims and information collected outside of the healthcare system that describes behavioral and social context.
Clinical Data: Clinical data is generated when patients are seen within a healthcare setting and the provider uses an EMR system to document information about a single encounter. Clinical data usually includes the medical issue driving the visit along with demographic data, medications, procedures, and allergies. Ideally, a medical problem and/or diagnosis would be documented along with the patient’s family history and selected social factors like smoking. Much of the information documented in the EMR is unstructured, making extraction difficult. Furthermore, there are hundreds of different EMRs, and individual health systems often deploy multiple instances of them, which limits the ability to collect this data across different providers. EMR vendors have not been required to make their systems interoperable and are incentivized to keep their data models proprietary so that changing vendors is harder. Despite these limitations, EMR data often represents the best and most rapidly available data for tracking patient health outcomes over time and the quality of medical care provided by physicians. This is particularly true for large healthcare systems that use a single instance of an EMR across multiple care locations.
Claims Data: Claims data is collected primarily by healthcare payers, including Medicare, Medicaid, and commercial insurance companies. This data is generated by physicians, hospitals, pharmacies, and other agencies associated with patient care. The data includes codes that describe medical diagnoses, procedures, medications, and medical equipment across all providers that a patient sees while they are covered by an insurer.
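Because claims span every provider a patient sees under one insurer, they are a natural starting point for linking patients to their providers. The following is only a minimal sketch of that idea, not a description of how any payer actually does it: the record layout, the field names (`patient_id`, `provider_id`, `diagnosis_code`), the sample values, and the "most claims wins" attribution rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical claim records; field names and values are illustrative only.
claims = [
    {"patient_id": "P1", "provider_id": "DR_A", "diagnosis_code": "E11.9"},
    {"patient_id": "P1", "provider_id": "DR_A", "diagnosis_code": "I10"},
    {"patient_id": "P1", "provider_id": "DR_B", "diagnosis_code": "E11.9"},
    {"patient_id": "P2", "provider_id": "DR_B", "diagnosis_code": "J45.909"},
]


def providers_by_patient(claims):
    """Count how many claims each provider filed for each patient."""
    counts = defaultdict(Counter)
    for claim in claims:
        counts[claim["patient_id"]][claim["provider_id"]] += 1
    return counts


def most_frequent_provider(claims):
    """Attribute each patient to the provider with the most claims.

    Assumed rule for this sketch: ties are broken alphabetically by
    provider id so the result is deterministic.
    """
    attributed = {}
    for patient, counter in providers_by_patient(claims).items():
        attributed[patient] = min(counter, key=lambda p: (-counter[p], p))
    return attributed


if __name__ == "__main__":
    print(most_frequent_provider(claims))
```

A real attribution scheme would likely weigh visit type, recency, and provider specialty rather than a raw claim count, but the grouping step stays the same.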