A Whole New World of Data

Challenges and opportunities with electronic health records

David Pierce and Ella Young

As the health care industry continues to evolve, so must the actuaries who serve it. As more revenue is tied to value-based care that focuses on improving patient outcomes, risk is effectively being shifted from payers to providers. As this shift occurs, actuaries who have traditionally supported payers will need to adapt their tools and expertise to better support the provider space. One area with the potential to be disruptive for actuaries as they reach out to providers is the type of data available to them. As actuaries engage more with providers, providers will want to make use of the clinical, billing and other data available in their own electronic health records (EHRs), in addition to claims data provided by a payer. Actuaries who practice in the health care space will need to become familiar with the potential gains and challenges of this data in order to grow the presence of the field in the new health care payment landscape.

New data, old and new challenges

EHR data presents actuaries with a new domain to demonstrate their value and expertise. Actuaries have long been experts in using administrative claims data, a skill built up over decades of practice and educational standards.

The first new challenge actuaries will need to overcome is accessing the data. Interoperability of EHRs continues to be a top story at health care industry events and in newsletters. This presents actuaries with the opportunity to help providers extract the most value out of the data in their EHRs, along with the challenge of learning new techniques and standards. Adding to this challenge is that different EHR vendors provide different mechanisms to access the data, and at varying costs to the providers.

The most common methods likely to be encountered for data extraction are Health Level 7 (HL7) messages, flat file data extracts or a custom application programming interface (API). Unlike claims data, which has been standardized over years of practice, EHR data feeds vary by vendor and version, and not all EHRs are created equal. Not only is every EHR feed different, but the structure of the feed also differs from that of claims data, and it requires a different approach to use.
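
To give a feel for how such a feed differs from a claims file, the following is a minimal sketch of reading one pipe-delimited HL7 version 2 message in Python. The sample message, the sending and receiving application names, and the function names are invented for illustration; real feeds vary by vendor and version and usually warrant a dedicated HL7 library rather than hand-rolled parsing.

```python
# Illustrative only: a made-up HL7 v2 admission message (not from any real system).
SAMPLE_ADT = "\r".join([
    "MSH|^~\\&|EHR_APP|CLINIC_A|RECEIVER|ACTUARY|201601011230||ADT^A01|MSG0001|P|2.3",
    "PID|1||12345^^^CLINIC_A^MR||DOE^JANE||19500115|F",
    "PV1|1|I|WARD1^101^A",
])


def parse_segments(message):
    """Split an HL7 v2 message into lists of fields, keyed by segment ID."""
    segments = {}
    # Segments are normally separated by carriage returns; tolerate newlines too.
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def extract_patient(message):
    """Pull a few demographic fields from the first PID segment."""
    segments = parse_segments(message)
    msh = segments["MSH"][0]
    pid = segments["PID"][0]
    # In MSH the field separator itself counts as MSH-1, so MSH-9
    # (message type) sits at list index 8 after splitting on "|".
    message_type = msh[8]
    name_parts = pid[5].split("^") + [""]  # PID-5: family^given
    return {
        "message_type": message_type,
        "patient_id": pid[3].split("^")[0],  # PID-3: first component of the ID
        "family_name": name_parts[0],
        "given_name": name_parts[1],
        "birth_date": pid[7],  # PID-7, formatted YYYYMMDD
        "sex": pid[8],         # PID-8
    }


patient = extract_patient(SAMPLE_ADT)
```

Even in this toy form, the contrast with a claims extract is visible: one message carries several nested record types, and the meaning of each field depends on its position within its segment.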

This new structure creates an opening for actuaries to play a large supporting role in the data extraction process. Payers have maintained claims databases for years and have staff dedicated to curating and maintaining those databases. Providers, on the other hand, especially primary care providers, traditionally have not been in the position of maintaining databases. This has led to many ambulatory EHR vendors storing the EHR data either in the cloud or on their servers. This means the provider office does not need to support a complex hosting environment, but it can make it more challenging to acquire the data when there is not a local data expert at the provider’s location. Actuaries can support providers by being the data experts who can translate the raw data from the EHR into what providers need in their claims data.

Once the access issues have been fixed (which is not an easy task in some instances), the next issue to resolve is variation within the EHR data. While there is variability across different administrative claims feeds, actuaries have a general sense of what will be included and the format of the data. The health care payment systems have standardized many of the fields actuaries use in their analyses. With minor variation, an actuary knows what to expect in the diagnosis field or the Healthcare Common Procedure Coding System (HCPCS) field. This is not the case with data from an EHR.

Medical Coding Terms

Health Level 7 (HL7) is an organization that provides common standards to be used in the exchange of electronic health information. (www.hl7.org)

Healthcare Common Procedure Coding System (HCPCS) is a multilevel system of standardized codes that represent medical services and assist in processing medical claims for payment.

Logical Observation Identifiers Names and Codes (LOINC) is a standard code set used for identifying laboratory observations.

Admissions, Discharges and Transfers (ADT) messages are some of the most common HL7 messages and consist primarily of demographic information about a patient from a hospital or clinic.

Take laboratory tests and results, for example. Laboratory tests can be codified using the Logical Observation Identifiers Names and Codes (LOINC) standard; however, data feeds from EHRs often do not include LOINC codes, but instead include a human readable description of the test. Different users in the same provider clinic will use different terminology to describe the same test. For example, the hemoglobin A1c test used to diagnose and monitor diabetes can be recorded as: HbA1C, A1C, Hemoglobin A1C, etc. Similar scenarios exist for the actual lab values and other tests. Actuaries will need to understand how this variability will impact their analyses and develop methods to standardize and consolidate differences in terminology among users.
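
One simple way to consolidate such terminology is a lookup table applied after stripping case, spaces and punctuation. The sketch below assumes a hand-built synonym table and a canonical label (`hemoglobin_a1c`) chosen purely for illustration; in practice the table would be developed with clinical input and could map to LOINC codes instead.

```python
import re

# Hypothetical synonym table: keys are descriptions with case, spaces and
# punctuation removed; values are an illustrative canonical label.
LAB_SYNONYMS = {
    "hba1c": "hemoglobin_a1c",
    "a1c": "hemoglobin_a1c",
    "hemoglobina1c": "hemoglobin_a1c",
}


def normalize_lab_name(raw_description):
    """Return the canonical test name, or None if the description is unknown."""
    key = re.sub(r"[^a-z0-9]", "", raw_description.lower())
    return LAB_SYNONYMS.get(key)


def consolidate(descriptions):
    """Split raw descriptions into mapped names and ones needing manual review."""
    mapped, unmapped = {}, []
    for description in descriptions:
        canonical = normalize_lab_name(description)
        if canonical is None:
            unmapped.append(description)
        else:
            mapped[description] = canonical
    return mapped, unmapped


mapped, unmapped = consolidate(["HbA1C", "A1C", "Hemoglobin A1C", "Glyco Hgb"])
```

Descriptions that fall through to the unmapped list are as informative as the matches, because they show how much of the extract still needs review before any analysis is run.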

Additional issues of which actuaries should be aware prior to engaging with a project using EHR data are: How is EHR data linked with existing claims data? Will the needed data be available in the extract? EHR data is typically raw data, as compared with processed administrative claims data, meaning that defining who is who is not always straightforward. This is why in most cases, when linking EHR data with claims data, an actuary will need to create a master patient index (MPI) in addition to the patient identifiers in the claims and EHR data. The quality of the data contained in the EHR is something that will likely not be fully known until the extract has been evaluated. Providers have the ability to customize their EHR workflows in order to better align with how they practice medicine. This capability is a great thing for the patient and the provider, because technology should enable workflows, not make them more cumbersome. The downside to customization is that data captured in the customized workflow may not be available in the extract from the EHR. This can leave valuable data locked in the EHR because most APIs and extracts pull only specific fields that were anticipated to be used in standard workflows. Clever approaches and an intimate knowledge of EHR systems are required to work around this issue.
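
As a rough illustration of the linkage step, the sketch below builds a very simple master patient index by matching on normalized demographics. The field names (`member_id`, `mrn`, and so on), the matching rule and the sample records are all assumptions made for this example; production MPIs typically use more fields and probabilistic matching to cope with typos and name changes.

```python
from itertools import count


def match_key(record):
    """Deterministic match key built from demographics (an assumed rule)."""
    return (
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower()[:1],  # first initial only
        record["birth_date"],
        record["sex"].strip().upper(),
    )


def build_mpi(claims_members, ehr_patients):
    """Assign one MPI ID per distinct match key across both sources."""
    mpi_ids = {}
    next_id = count(1)
    crosswalk = []
    sources = (
        ("claims", "member_id", claims_members),
        ("ehr", "mrn", ehr_patients),
    )
    for source, id_field, records in sources:
        for record in records:
            key = match_key(record)
            if key not in mpi_ids:
                mpi_ids[key] = f"MPI{next(next_id):06d}"
            crosswalk.append({
                "mpi_id": mpi_ids[key],
                "source": source,
                "source_id": record[id_field],
            })
    return crosswalk


# Made-up records: the same person appears in both sources with different formatting.
claims_members = [
    {"member_id": "M001", "last_name": "Doe", "first_name": "Jane",
     "birth_date": "1950-01-15", "sex": "F"},
]
ehr_patients = [
    {"mrn": "12345", "last_name": "DOE ", "first_name": "jane",
     "birth_date": "1950-01-15", "sex": "f"},
    {"mrn": "67890", "last_name": "Roe", "first_name": "Richard",
     "birth_date": "1948-07-02", "sex": "M"},
]
crosswalk = build_mpi(claims_members, ehr_patients)
```

The resulting crosswalk ties each payer member ID and each EHR record number to a single index ID, which is the join key for any combined claims and clinical analysis.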

Expanding traditional actuarial work with EHR data

Actuaries have long used administrative claims data from a payer in their work. In health care, a payer has broader visibility than any single provider does regarding the claims experience of an individual patient. This wider view is important for risk scoring and understanding episodes of care across health care providers. Conversely, providers who have cared for patients for several years are going to have more longitudinal data on their patients, including lab values and vital signs. Additionally, EHR data can include more information than claims data for the same encounter, because typically only the data elements required to get a claim paid are coded on the claim. Another benefit of EHR data is that it can be extracted and used in real time, as opposed to the delay seen in claims data. The opportunity to link claims and clinical data from an EHR combines the best of both data sets and could be used to expand a number of offerings, including reserve analysis and analyzing quality metrics.

Actuaries rely on the holistic view of individuals from claims data to perform a number of analyses, including reserve estimation. Reserves traditionally have focused on estimating the financial liability of a payer for claims that have been incurred but not yet reported. If actuaries have access to real-time or near-real-time (e.g., daily) EHR feeds, or to admission, discharge or transfer (ADT) feeds, then the entire reserve process could be expanded and improved upon. Having access to the billing data contained in an EHR, actuaries would know about events long before they are reported in the payer data. Actuaries would be able to use up-to-date information regarding utilization of services by place of service, and then apply additional models to account for denials, severity and services for which no claims data is available. This has the potential to reduce variability in reserve models, allowing payers to have more confidence in the reserve estimate. This would be valuable to payers, but also to providers as they assume more risk.
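
The arithmetic behind such an approach might look like the sketch below, which compares encounter counts seen in the EHR with claims already received, by place of service, and prices the difference. This is not a reserving method from any standard; the function name, the denial rates and the average paid amounts are hypothetical placeholders that a real model would estimate from experience.

```python
def estimate_unreported_liability(ehr_counts, claims_counts, assumptions):
    """Price encounters seen in the EHR but not yet in claims, by place of service."""
    detail = {}
    for place, ehr_count in ehr_counts.items():
        # Encounters known from the EHR feed that have no claim yet.
        unreported = max(ehr_count - claims_counts.get(place, 0), 0)
        rates = assumptions[place]
        # Expected paid amount after allowing for denied claims.
        detail[place] = unreported * (1 - rates["denial_rate"]) * rates["avg_paid"]
    return sum(detail.values()), detail


# Hypothetical month-to-date counts and assumptions, for illustration only.
ehr_counts = {"inpatient": 12, "office": 300}
claims_counts = {"inpatient": 4, "office": 220}
assumptions = {
    "inpatient": {"denial_rate": 0.05, "avg_paid": 10000.0},
    "office": {"denial_rate": 0.10, "avg_paid": 100.0},
}
total, detail = estimate_unreported_liability(ehr_counts, claims_counts, assumptions)
```

A fuller model would layer on severity and services rendered by providers outside the EHR, which this sketch ignores.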

When providers begin taking on risk through accountable care organizations (ACOs) or capitation models, they need to think about their overall exposures and what the likely utilization outcomes will be for the performance year. Claims data alone is not ideal for this because of the timing lag discussed previously. Likewise, EHR data alone is not sufficient because it lacks the billings of other providers. Unless the providers are part of a truly integrated network, they will not have visibility into all of the services that providers outside of the group deliver to their patients. This makes the combined claims and clinical offering the ideal solution for these groups as well. By combining robust analysis of claims and EHR data, ACOs will be in a better position to estimate their end-of-year performance and evaluate potential corrective actions. As more provider groups join or form ACOs or other alternative payment model arrangements, actuaries will need to be in a position to support them.

Supporting ACO quality metrics is another area in which combining claims and EHR data adds value. In the United States, quality metrics are a continued focus for ACOs and other managed care entities, especially as the impact of the Merit-Based Incentive Payment System (MIPS), as delineated in the Medicare Access and CHIP Reauthorization Act of 2015 (MACRA), becomes effective for providers. Being able to support both claims and clinical quality metrics is going to be a must for actuaries if they are to add value in the provider consulting space. There are several quality measures that rely solely on clinical data, and the ability to calculate them and recommend improvement strategies in near real time will be a must moving forward in this space.
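
As an example of a measure that depends on clinical values rather than claims, the sketch below computes the share of patients whose most recent hemoglobin A1c result exceeds 9 percent. It is a simplified illustration, not an official measure specification: the record layout is assumed, and real specifications add denominator criteria, exclusions and rules for patients with no test on file.

```python
def poor_control_rate(lab_results, threshold=9.0):
    """Share of patients whose latest HbA1c value is above the threshold."""
    latest = {}
    for result in lab_results:
        previous = latest.get(result["patient_id"])
        # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
        if previous is None or result["date"] > previous["date"]:
            latest[result["patient_id"]] = result
    if not latest:
        return None
    flagged = sum(1 for result in latest.values() if result["value"] > threshold)
    return flagged / len(latest)


# Made-up results for two patients, for illustration only.
lab_results = [
    {"patient_id": "P1", "date": "2016-01-10", "value": 8.5},
    {"patient_id": "P1", "date": "2016-06-02", "value": 9.4},
    {"patient_id": "P2", "date": "2016-03-15", "value": 7.0},
]
rate = poor_control_rate(lab_results)
```

Because the lab values arrive through the EHR rather than through paid claims, a calculation like this can be refreshed as often as the extract is.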

Is the juice worth the squeeze?

There are many challenges and opportunities in using data from an EHR. Some of these were detailed already, but, additionally, there is the real expense to acquire the data, which varies by EHR system (monthly fees or one-time setup costs). The question of whether it’s worth the investment depends on the business situation and what other data sources are available. In addition to the question of cost, the availability of data, which has a lot to do with the national medical delivery model, needs to be considered.

U.S. Perspective

The U.S. health care system is a highly complex, decentralized system with multiple private and public payers.

As individuals move throughout the United States, both regionally and through time, their utilization data generally does not follow them. For example, when an individual qualifies for Medicare (the federal health insurance program primarily for people age 65 and older), the federal government becomes the insurance payer. The claims data for this individual begins when he or she enrolls in Medicare. None of the prior payer data is transferred or available. If individuals retain their pre-Medicare providers, however, prior longitudinal clinical data is available in the EHR system.

The lack of historical information from the claims system presents issues for analysis. For example, consider new enrollees to Medicare. If actuaries had access to their EHR data and were able to fill in the gaps in the claims records, then the analysis could more accurately reflect what is known about them.

EHR Data Composition

Electronic health record (EHR) data is primarily grouped into three types of data: structured, unstructured and images.

Structured data is data that is entered into a specified field for a specific purpose, often with prescribed options for entering the data. Smoking status is a good example of structured data, as are vital signs.

Unstructured data is all of the free text that providers enter into a patient’s chart during or after an encounter. This data typically contains what the doctor did or observations about the patient. Items that could be seen in the free text block include “counseled patient on importance of medication adherence.”

The third type of data is images. These can be radiological images or scanned pathology notes.

In addition to clinical data (labs, vitals, diagnoses, procedures, etc.), EHRs also include billing information, demographic information, medical history, allergies and immunizations for encounters that occur in that medical system.

Canadian Perspective

Many nations are struggling to find an optimal blend of structures, frameworks and payment mechanisms to ensure patient outcomes are maximized for the resources expended. All have their own challenges. One challenge that is alleviated in universal-payer systems (like the one in Canada) is data capture: the infrastructure, built in from inception, captures utilization data that is both wide and deep, yielding comprehensive, longitudinal administrative data. That is, for all citizens, data around all physician encounters is captured, including diagnoses, prescriptions ordered, prescriptions filled, and tests ordered and conducted (though often not with test results). Encounter data is captured for as long as an individual lives in the health care system jurisdiction and retained potentially forever.

In Canada, health care is a provincial responsibility. Most of the system is paid through transfers from the federal government, but it is the provinces that decide how best to use those funds to achieve federal and provincial objectives. So, if someone moves within a given province, or travels elsewhere in Canada without being away from that province for longer than the provincial plan allows, all of his or her health care utilization is captured (even utilization in other Canadian provinces). There are some gaps, such as for hospital pharmacy dispensing and physicians who are not fee-for-service (FFS), and there is variability among the provinces. However, the data is very comprehensive; much is known about trends, patterns and utilization costs by almost whatever cohort criterion one would like to apply. Accurate predictions may be made at the citizen level by leveraging this great store of information.

This is great for taxpayers, patients, providers, researchers and others, such as actuaries. These data stores mean the potential for highly accurate and robust model results. To date, few actuaries have been involved with detailed analytics work with these great data warehouses, so many opportunities exist to derive value-added information with analytics. However, it may not be so great for EHRs. Unless the additional data collected by an EHR adds significant value, the cost involved with collecting, storing, securing, extracting, linking and analyzing it may be difficult to justify. In addition, before embarking on such a journey, a clear plan must be articulated around what will be done with any resulting analysis to better the system and outcomes for patients. This has been a major obstacle. Identifying a problem (often one that is already known) is good, but if the causes cannot also be explored, or feasible improvement plans developed, then the value proposition is greatly reduced.

The universal-payer administrative data shows if and when patients visit multiple providers, and what occurs, such as prescriptions, diagnoses, test orders, emergency room (ER) visits, hospital admissions, specialist referrals, etc. This information would not be in one given provider’s EHR, as that provider would not know if a patient had been accessing other services unless that patient were to disclose it. A failure to share such information may not be intentional, as patients with cognitive issues may simply not remember these events during their next regular visit. One option for providers in universal systems is to use administrative data stores more fully to improve patient care.

This is not to say that EHRs do not add value, especially for a given provider or group of providers in running their practice versus traditional methods, such as paper charting. In addition, being able to extract additional data elements from an EHR, such as test results, in a timely way could enable much better analytics results. However, from a population health and/or utilization/actuarial analytics lens, they have a high bar to surpass to show value.


Having access to and using the clinical data from an EHR can be expensive and unwieldy, and should not be entered into without a clear objective or business reason. Data analysis for the sake of data analysis is not what actuaries are known for or how they add value. Delivering high-quality insights into critical business problems for their employers and clients is how actuaries deliver value. EHR data, when used purposefully and with an understanding of the challenges it presents, can be a significant value-add for actuarial work.

David Pierce is director of operations, PRM Analytics, at Milliman in Indianapolis.
Ella Young, PMP, MHA, CHE, CRM, BComm (Hons), is director of the care continuum and actuarial analytics for Vancouver Coastal Health.

Copyright © 2016 by the Society of Actuaries, Chicago, Illinois.