The US commercial data market is apparently full. Claims datasets cover the entire population, and deidentified electronic medical record (EMR) data covers hundreds of millions of lives. But here’s the problem: although those EMR datasets hold data on hundreds of millions of patients, they contain no data from some of the most widely used EMR software. This gap in coverage is the topic of this case study.
EMR software not included in commercial datasets is used by over 60% of Academic Medical Centers (AMCs), large Integrated Delivery Networks (IDNs) and major hospitals. This leads to some data issues.
The question is: What difference does it make having data from these institutions?
We explore a case study where patient-level EMR-equivalent data was collected from institutions that are ‘Inside’ and ‘Outside’ the existing commercial datasets. The case will demonstrate the differences in behavior seen in these groups, emphasizing the decision-making issues that the coverage gap creates.

Longitudinal patient data has a unique feature: it reflects all the activities conducted by healthcare professionals (HCPs) in managing their patients.
From before diagnosis, the medical record captures every data point, action, change and development experienced by the patient, as well as the HCP's response at each stage of the disease. Tudor Health uses a proprietary approach to capture all of these data points and convert them into a meaningful dataset that can be analyzed in almost limitless ways.
In this dataset (as in all Tudor Health datasets), the institutions at which the HCP manages patients were captured, along with the brand of EHR software in use at each institution. Knowing which EHR systems are and are not included in commercially available deidentified EMR data, Tudor Health can assign a simple flag to every patient record: is it Inside or Outside those commercially available datasets?
This case describes two specific HCP behaviors that differ significantly between the data that would be found in other datasets and the data that is unique to Tudor Health.
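The flagging step described above can be sketched in a few lines of Python. This is an illustration only, not Tudor Health's implementation: the vendor names, the `OUTSIDE_EHR_SYSTEMS` set, the record fields and the `coverage_flag` helper are all hypothetical, since the case study does not name the EHR systems involved.

```python
# Hypothetical vendor names: the case study does not say which EHR systems
# are absent from commercially available deidentified EMR datasets.
OUTSIDE_EHR_SYSTEMS = {"VendorX", "VendorY"}


def coverage_flag(ehr_system: str) -> str:
    """Return 'Outside' if the EHR system is absent from legacy datasets, else 'Inside'."""
    return "Outside" if ehr_system.strip() in OUTSIDE_EHR_SYSTEMS else "Inside"


# Illustrative patient records (not real data).
patients = [
    {"patient_id": "P001", "ehr_system": "VendorX"},
    {"patient_id": "P002", "ehr_system": "VendorA"},
]

# Attach the flag to every record without mutating the originals.
flagged = [{**p, "coverage": coverage_flag(p["ehr_system"])} for p in patients]
```

Once every record carries the flag, any downstream measure can be computed separately for the Inside group, the Outside group and the full sample.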
This case examines data for a rare, serious, life-threatening condition. In the USA, patients are managed in specialist clinics, and large IDNs and AMCs play a significant role in managing them.
Patients are treated with drugs from six different classes. Treatment includes combinations of drug classes, and some fixed combinations are also available. There was significant new product activity in the period 2022-2024, with two of the main manufacturers involved in the disease each launching a new product with significant clinical trial benefits for patient management.
Longitudinal patient data was collected on a sample of 200 patients, including the EHR system in use at the treating institution. Of these, n=82 (41%) come from institutions not found in legacy EMR datasets (Outside) and n=118 (59%) come from institutions that would normally be found Inside legacy data.
The case compares two views of the data: what would be seen inside other EMR data sources, and how that picture changes once data found only in the Tudor Health dataset is included.
Two areas of patient management are highlighted in this case study: patient-level brand market shares and the use of combination therapy.
The Tudor Health data includes much more than just these measures, but these are indicative of the strength of analysis derived from including sources not included in legacy EMR data.
One of the biggest gaps that Tudor Health data fills is measuring patient-level market shares for drugs used in the AMCs and IDNs not covered by legacy EMR data.
The Outside group has a quite different profile of brand usage. The following graph shows how including the Outside group changes the market share measure from legacy EMR data.
The data shows that these institutions have significantly different usage of half the products in this market.
The impact on decision making across branded competition is clear – not having access to the Outside group will give an inaccurate view of the market dynamics.
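To make the mechanics concrete, the sketch below shows how a patient-level share shifts when Outside patients are added to an Inside-only view. The records, brand names and the `patient_share` helper are invented for illustration and do not reproduce any figure from this case study.

```python
from collections import Counter

# Illustrative records only: (coverage flag, set of brands the patient receives).
records = [
    ("Inside", {"BrandA"}),
    ("Inside", {"BrandA", "BrandB"}),
    ("Inside", {"BrandB"}),
    ("Outside", {"BrandC"}),
    ("Outside", {"BrandA"}),
]


def patient_share(rows):
    """Share of patients receiving each brand.

    Shares can sum to more than 1 because patients on combination
    therapy are counted once for every brand they receive.
    """
    counts = Counter(brand for _, brands in rows for brand in brands)
    return {brand: n / len(rows) for brand, n in counts.items()}


# What a legacy, Inside-only dataset would report.
inside_only = patient_share([r for r in records if r[0] == "Inside"])

# The same measure once Outside patients are included.
all_patients = patient_share(records)
```

In this toy example BrandC does not appear in the Inside-only view at all, and the shares of the other brands change once the Outside patients are counted.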
The data also shows significant differences in the product class split across patients. The Outside group uses combination therapy much less frequently, with only 12% of patients being treated with 3 or more drugs (compared to 29% in the Inside group).
Monotherapy usage in the Outside group is almost double that seen in legacy data. As a result, legacy EMR data will suggest that over 60% of patients are treated with combinations, when the actual figure is 50%.
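The arithmetic behind these figures can be checked as a patient-weighted average over the two groups. The sketch below makes three assumptions that the case study does not state: the reported percentages are simple, unweighted proportions of the 200-patient sample; the 29% comparison figure refers to the Inside group; and the Inside combination rate is taken as exactly 60%, the lower bound of "over 60%". The function names are illustrative.

```python
N_INSIDE, N_OUTSIDE = 118, 82  # sample sizes reported in this case study


def pooled_rate(rate_inside: float, rate_outside: float) -> float:
    """Patient-weighted average of a rate across the Inside and Outside groups."""
    total = N_INSIDE + N_OUTSIDE
    return (rate_inside * N_INSIDE + rate_outside * N_OUTSIDE) / total


def implied_outside_rate(pooled: float, rate_inside: float) -> float:
    """Solve the weighted average for the Outside rate, given the pooled and Inside rates."""
    total = N_INSIDE + N_OUTSIDE
    return (pooled * total - rate_inside * N_INSIDE) / N_OUTSIDE


# Patients on 3 or more drugs: 29% (assumed Inside) versus 12% Outside.
three_plus_all = pooled_rate(0.29, 0.12)

# Combination therapy: 50% across all patients, with Inside assumed at exactly 60%.
combo_outside_max = implied_outside_rate(0.50, 0.60)
```

Under these assumptions, about 22% of all patients are on 3 or more drugs, and the Outside combination rate is at most about 36%. That implies monotherapy in roughly 64% or more of Outside patients against under 40% of Inside patients, which is consistent with the "almost double" description.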
Further research is needed to identify specifically why this is occurring, but understanding this dynamic would form a strong element of any communication strategy addressing this market.
Capturing data across the full spectrum of institutions is a core capability of the Tudor Health approach, and filling data gaps is one of the principal reasons why Tudor Health was founded.
This case study demonstrates that users of commercially available deidentified EMR data are missing critical differences in HCP behavior and patient treatment patterns in this rare disease. Making decisions based on the legacy datasets will result in misdirected resources and missed opportunities.
On reading this paper, if you realize that your current commercial and insight challenges could be met through the use of longitudinal patient data (or synthetic data and models built using the data), get in touch with us at Tudor Health and we can discuss a solution that meets your needs.
Equally, if you are still unsure about Market Emulation Models (MEMs), but would like to explore how they might be used in your organization, contact us to discuss how they can be applied to your commercial goals.