Considerations for emulations of randomized controlled trials using real-world data: learnings from an emulation of MONALEESA-2
Publication: Journal of Comparative Effectiveness Research
Abstract
Aim: The eligibility criteria of randomized controlled trials can exclude groups of patients indicated for the therapy post-approval and the trial population may not be generalizable to routine care. We attempted to emulate the MONALEESA-2 trial of ribociclib plus letrozole for women with advanced breast cancer using real-world data (RWD) to assess the impact of modifying entry criteria of the trial, which found a survival benefit associated with ribociclib (hazard ratio [HR] = 0.76). Materials & methods: Post-menopausal women with recurrent or metastatic breast cancer were identified in a linked electronic health records-claims database. Treatment groups were ribociclib plus letrozole or letrozole without ribociclib (referred to as letrozole). Overall survival was compared between the two groups using Cox proportional hazards models with inverse probability of treatment weights and propensity score matching. Results: There were 132,406 patients who initiated ribociclib plus letrozole or letrozole from 13 March 2017 (ribociclib approval) to 30 September 2023. After applying trial entry criteria, the sample size was 3912 patients. Compared with real-world patients, trial participants tended to be younger (>50% were <65 years old compared with 38%) and more commonly had liver or lung metastases (>50% vs <15%). Among real-world patients, those treated with ribociclib plus letrozole had higher comorbidity scores (mean Elixhauser Comorbidity Index Score 15 vs 9) and were more likely to have metastatic disease burden than patients treated with letrozole (85% vs 45%). We were unable to emulate the trial findings; all HRs in this analysis were >1. Conclusion: Real-world patients may differ from those participating in randomized controlled trials. Along with data source limitations, such as missing clinical information or incomplete capture of mortality, this can impact the ability to emulate trials in RWD. 
However, RWD is key for describing patients in routine care and for answering relevant questions following the approval of new therapies.
Plain language summary
What is this article about?
Patients who are indicated for an approved therapy may have different demographic and health conditions than those who participated in the randomized controlled trials. To assess the potential impact of relaxing overly restrictive inclusion and exclusion criteria, we attempted to emulate the MONALEESA-2 randomized controlled trial evaluating overall survival of women with advanced hormone receptor (HR)-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer treated with ribociclib plus letrozole using real-world data (RWD).
What were the results?
The women meeting all the trial criteria in addition to a baseline observability requirement were a small proportion of those initiating ribociclib plus letrozole in the real world (3%). Trial patients were younger and had more liver or lung metastasis than real-world patients. In the RWD, there were also significant differences between women receiving ribociclib plus letrozole and women receiving letrozole without ribociclib, with those treated with ribociclib plus letrozole having more metastatic disease. We were unable to emulate the trial findings.
What do these results mean?
We could not assess the impact of relaxing trial inclusion and exclusion on the main findings for overall survival because the trial results could not be replicated in this RWD source. There are important considerations for conducting trial emulations, including differences in patient populations, data source limitations, and study design challenges. Ongoing research in RWD is needed after approval to understand drivers of treatment decisions and relevant questions beyond those addressed by the pivotal trial(s).
Randomized controlled trials (RCTs) are considered the gold standard for assessing treatment safety and efficacy; however, their inclusion and exclusion criteria often lead to trial populations that differ significantly from real-world populations. The emphasis on achieving internal validity in RCTs and accruing a sufficient number of outcomes to power the study can come at the expense of external validity, leading to limited generalizability of RCT findings to real-world clinical practice [1–4]. This concern has gained notable attention in oncology. In 2016, the American Society of Clinical Oncology (ASCO) and the advocacy organization Friends of Cancer Research (Friends) launched an effort to further expand eligibility criteria for cancer clinical trials to increase patient recruitment, ideally leading to more rapid advances in cancer treatment [4].
It is well established that RCTs enroll more restricted patient populations than those treated in the real world. An estimated 17–21% of patients cannot enroll in clinical trials mainly due to overly restrictive eligibility criteria [5]. This ultimately results in lower patient accrual and an underrepresentation of older adults, racial/ethnic and sexual/gender minorities, and patients with comorbidities [6]. For instance, a study published in 2022 found that exclusions for comorbidities from pancreatic cancer clinical trials disproportionately excluded Black patients from enrolling [7]. More restrictive trial inclusion/exclusion criteria can limit the generalizability of trial results. One example is the exclusion of high-risk patients from oncology RCTs [8]. This lack of representation can subsequently limit discussions between patients and their oncologists regarding risk-benefit trade-offs among treatment options, potentially delaying access to lifesaving care.
Recognizing this, many oncology stakeholders are calling for more inclusive RCTs. Calls from the FDA, ASCO, and Friends emphasize the need to simplify and broaden eligibility criteria, which underscores the broader recognition of the need for further inclusivity in clinical trials [4,5,9]. In an evaluation by ASCO, 56% of surveyed clinicians agreed that some criteria are too stringent. Still, agreement could not be reached on the removal of specific criteria [10].
Real-world data (RWD) plays a key role in assessing the generalizability of RCT participants and the external validity of RCTs. First, the demographics and clinical characteristics of RCT patients can be compared with real-world patients meeting the trial criteria to assess how similar patients in the trial are to real-world patients. Second, trial emulation, using the original trial inclusion/exclusion criteria, can be conducted in RWD to assess effectiveness and how effectiveness may change when some trial criteria are relaxed. There are many considerations when conducting these types of analyses, including data source and study design. To advance the science and methodology of assessing trial representativeness and conducting trial emulation, here we describe the challenges in emulating, in RWD, the Study of Efficacy and Safety of LEE011 in Postmenopausal Women with Advanced Breast Cancer (MONALEESA-2), a phase III trial of ribociclib plus letrozole in women with advanced breast cancer. We discuss our learnings to document them for other researchers in the trial emulation space, not for interpretation of the results.
Materials & methods
Study design
This cohort study used secondary, de-identified, individual-level patient data. The MONALEESA-2 trial was selected because of several factors favoring its replication in RWD: use of an active control arm (letrozole), since placebo arms are difficult to mimic in RWD; an outcome of overall survival (OS), which is objectively measurable in routine clinical practice, unlike other oncology-specific outcomes such as progression; and a sufficiently historical US FDA approval date, allowing for follow-up in the data given the expected data lag in RWD. This analysis was granted Institutional Review Board exemption.
Data source
A data feasibility assessment was conducted to determine the most fit-for-purpose data for this analysis using the SPIFD2 framework [11]. The feasibility assessment was based on data manuals, data dictionaries, published validation analyses and discussions with data vendors. Patient-level data samples were not analyzed as part of the assessment. After comparing five US-based data sources, Optum’s de-identified Market Clarity Data (Optum® Market Clarity) was selected on the basis of sample size, data elements available and administrative considerations such as cost and contracting timelines.
Optum® Market Clarity contains linked data from claims and electronic health records (EHR) for over 1.8 million US patients. All patients within Optum® Market Clarity have events from EHR and a subset have information from claims data. The database comprises enrollees of commercial health plans, Medicare, Medicaid and uninsured patients who received healthcare across all 50 states. EHR data are sourced from integrated delivery networks consisting of ambulatory only facilities and hospital networks and includes clinical data from laboratories. The EHR data contain drug information from prescriptions written or medications administered in the outpatient setting. A subset of patients has prescription fill information from external outpatient pharmacy dispensing records. This dataset contains a combination of structured and natural language processed unstructured data (e.g., provider notes) from the EHR. The database provides a comprehensive, multipayer view of the demographics, diagnoses, procedures performed during outpatient visits or inpatient stays and outpatient prescription records. Death information is provided at month-year granularity and is aggregated from the EHR, claims, obituary data sources, death master file and Centers for Medicare & Medicaid Services.
A subset of Optum® Market Clarity patients with clinical and provider notes as well as other unstructured documents from large health systems suitable for extraction are included in oncology-specific data tables. The Optum enriched oncology dataset is a group of tables that can supplement the EHR dataset. It contains additional cancer-specific information on a subset of the EHR population that have at least one solid or nonsolid tumor diagnosis. In addition to structured data obtained from medical records, the enriched oncology tables provide additional clinical information extracted from provider notes using natural language processing (NLP) machine learning models (for example, tumor progression, histology, etc.). Variables obtained from NLP-derived fields included biomarker status and Eastern Cooperative Oncology Group (ECOG) performance status. Please see Supplementary Tables A1 & A2 for more information on each variable.
Optum oncology data initiatives include enriching the data by extracting essential information from the oncology patient’s medical records and making it usable for researchers. Specific oncology concepts important in understanding the progression of the disease are often not available in structured formats. The derivation of valuable data elements from the unstructured data within EHRs is carried out by a collaborative effort of Optum’s data scientists, clinical experts and product executives.
Quality controls for the enriched oncology dataset are as follows. Once completed, each concept within the enriched oncology dataset goes through a manual validation process on a brand-new sample of notes, and precision and recall scores associated with the NLP algorithms are calculated based on that sample (our minimum precision requirement is 80%). Once the concept meets this requirement, the NLP-derived data goes through the following curation steps before inclusion in the enriched oncology deliverable: concept normalization (transforming the raw values into more standardized options for ease of interpretation, while ensuring no loss of information or context), exclusion of data output with little clinical context or value, de-duplication of records that provide the same result on the same note date, and creation of a data dictionary that includes lookup tables and value breakdowns for all variables.
Patient selection
First, an RWD population was identified in Optum® Market Clarity. All trial inclusion and exclusion criteria were applied where possible (except those deemed to be related to safety outcomes based on clinical expertise) to identify a patient population reflecting that of the MONALEESA-2 trial (Figure 1). The detailed approach for operationalizing the eligibility criteria employed in MONALEESA-2 in RWD is described in Supplementary Tables A1 & A2.

Figure 1. Inclusion and exclusion criteria for MONALEESA-2 and real-world-data population.
ECOG: Eastern Cooperative Oncology Group; HR: Hazard ratio; OS: Overall survival; RCT: Randomized controlled trial; RECIST: Response Evaluation Criteria in Solid Tumor; RWD: Real-world-data.
Women were required to have evidence of advanced breast cancer defined as recurrent or metastatic breast cancer based on two claims-based algorithms [12,13] between 13 March 2017 (the approval date of ribociclib in the US) [14] and 30 September 2023 (the Optum® Market Clarity data cut end date). The date of the first prescription for ribociclib or letrozole was assigned as the index date (see Exposure section below for more detail on capture of treatment status). Patients were required to be postmenopausal women at least 18 years old on the index date. Because of missingness, patients with no evidence of hormone receptor (HR)-negative histology were considered HR-positive (i.e., no HR test results at all or no HR-negative tests), and patients with no evidence of human epidermal growth factor receptor 2 (HER2)-positive disease were considered HER2-negative. Women were required to have an ECOG performance status of 0 or 1, which was operationalized as no evidence of ECOG greater than 1. The number of missing values for each variable is reported.
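The "no evidence of" rules in this paragraph can be illustrated with a minimal sketch. The function name, argument names and the string labels "positive"/"negative" are hypothetical and are not taken from the study code; the sketch only shows how missing results default to eligibility under such a rule.

```python
from typing import Optional


def meets_biomarker_and_ecog_criteria(
    hr_results: list[str],
    her2_results: list[str],
    max_ecog: Optional[int],
) -> bool:
    """Return True when no disqualifying evidence is recorded.

    Missing data (empty result lists, max_ecog of None) counts as eligible,
    mirroring the 'no evidence of' operationalization described in the text.
    The input encoding is an assumption made for illustration.
    """
    hr_negative_seen = "negative" in hr_results      # any HR-negative result
    her2_positive_seen = "positive" in her2_results  # any HER2-positive result
    ecog_above_one = max_ecog is not None and max_ecog > 1
    return not (hr_negative_seen or her2_positive_seen or ecog_above_one)
```

Under this rule a patient with no biomarker tests and no ECOG record is retained, which is why the amount of missing data matters when comparing the resulting cohort with trial participants.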
In addition to the eligibility criteria applied in MONALEESA-2, an observability criterion was applied to increase the accuracy of the measurements of baseline characteristics in both populations. Patients were required to have 180 days of continuous claims enrollment or EHR activity prior to and including the index date, allowing for 30-day gaps.
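One possible reading of the observability criterion is sketched below. It assumes coverage is supplied as a list of (start, end) date spans (a single EHR encounter would be a one-day span), that the 180-day window includes the index date, and that a gap at either edge of the window is treated like any other gap. These are illustrative assumptions; the study's actual implementation is not described at this level of detail.

```python
from datetime import date, timedelta


def has_baseline_observability(spans, index_date, lookback_days=180, max_gap_days=30):
    """Check coverage of the lookback window, tolerating gaps up to max_gap_days.

    spans: iterable of (start_date, end_date) tuples, inclusive on both ends.
    """
    window_start = index_date - timedelta(days=lookback_days - 1)
    # Keep only spans that overlap the lookback window, in chronological order.
    relevant = sorted((s, e) for s, e in spans if e >= window_start and s <= index_date)
    covered_through = window_start - timedelta(days=1)  # last day known to be covered
    for start, end in relevant:
        uncovered_days = (start - covered_through).days - 1
        if uncovered_days > max_gap_days:
            return False
        covered_through = max(covered_through, end)
    # Days between the last covered day and the index date must also fit the allowance.
    return (index_date - covered_through).days <= max_gap_days
```

For example, two enrollment spans separated by 18 uncovered days would pass, while spans separated by 60 uncovered days would not.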
Exposure & outcome
The exposure of interest was the new initiation of ribociclib in combination with letrozole, defined as a prescription written, medication administration, medical claim, or prescription claim for ribociclib and letrozole, within 60 days of one another, with no prior prescriptions/claims of either medication. The comparator group was patients who had at least one prescription/claim for letrozole and no prescriptions/claims for ribociclib in the following 60 days (referred to throughout this manuscript as ‘letrozole’). Patients were allowed to initiate other treatments following the index date, which is consistent with another MONALEESA-2 trial emulation [15]. The outcome of interest was OS.
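The exposure and comparator definitions can be sketched as a small classification function. It assumes each patient has been reduced to the date of her first-ever ribociclib record and first-ever letrozole record (so the "no prior prescriptions/claims" condition holds by construction), and it assumes that a patient who started ribociclib more than 60 days before letrozole belongs to neither group. The function and group labels are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def classify_exposure(
    first_ribociclib: Optional[date],
    first_letrozole: Optional[date],
    window_days: int = 60,
) -> Tuple[Optional[str], Optional[date]]:
    """Return (group, index_date), or (None, None) if the patient fits neither group."""
    if first_letrozole is None:
        return None, None  # both groups require letrozole
    if first_ribociclib is not None:
        days_apart = abs((first_ribociclib - first_letrozole).days)
        if days_apart <= window_days:
            # Combination therapy: index date is the earlier of the two records.
            return "ribociclib_plus_letrozole", min(first_ribociclib, first_letrozole)
    if first_ribociclib is None or first_ribociclib > first_letrozole:
        # No ribociclib within the 60 days following letrozole initiation.
        return "letrozole", first_letrozole
    return None, None  # ribociclib preceded letrozole by more than the window
```

Note that a letrozole patient who adds ribociclib after day 60 stays in the letrozole group under this definition, consistent with patients being allowed to initiate other treatments after the index date.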
Patient characteristics
Patient characteristics from the MONALEESA-2 trial population were identified from published study findings [16]. Demographic characteristics (age, race, region) are reported. Age was measured at index date while race and region are summarized as the most frequently recorded value in a patient’s records. Clinical characteristics including previous treatments and metastases based on diagnosis codes were measured in the 365 days prior to and including index date based on claims and EHR records. Some characteristics were measured in the RWD population despite not being captured in the trial to better describe a patient's disease severity.
Statistical analysis
To assess the generalizability of the trial participants, the impact of each inclusion/exclusion criterion was determined by the proportion of patients excluded at that point in the attrition table and by comparing the final sample size to the number of patients who initiated one of the exposures of interest. The distributions of demographic and clinical characteristics were compared between the ribociclib plus letrozole trial population and the RWD population using absolute standardized differences (ASDs). An ASD > 0.10 was considered a meaningful difference [17]. ASDs were also used to compare the RWD exposure and comparator groups.
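For readers who want to reproduce the ASDs, a minimal sketch follows. It uses the common pooled-variance formulas for a binary characteristic and for a continuous characteristic; the text does not state the exact formula used, so this is an assumption, although it does reproduce the published values checked below.

```python
import math


def asd_proportions(p1: float, p2: float) -> float:
    """Absolute standardized difference between two proportions."""
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    if pooled_sd == 0:
        return 0.0  # both proportions are 0 or 1: no variability to standardize by
    return abs(p1 - p2) / pooled_sd


def asd_means(mean1: float, sd1: float, mean2: float, sd2: float) -> float:
    """Absolute standardized difference between two means."""
    return abs(mean1 - mean2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)


# Age <65 years: 184/334 trial ribociclib patients vs 40/106 RWD ribociclib patients.
age_asd = asd_proportions(184 / 334, 40 / 106)
# Elixhauser score in RWD: 14.74 (SD 9.43) vs 9.37 (SD 9.77).
elixhauser_asd = asd_means(14.74, 9.43, 9.37, 9.77)
print(round(age_asd, 2), round(elixhauser_asd, 2))  # 0.35 0.56
```

Both values exceed the 0.10 threshold and match the ASDs reported for these characteristics in the baseline tables.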
To estimate the association between ribociclib plus letrozole treatment and OS, Cox proportional hazards models were fit comparing the ribociclib plus letrozole arm to the letrozole group. To control for confounding, inverse probability of treatment weighting (IPTW) and 1:1 propensity score matching were used. The following characteristics were included in the propensity score model: age, race, ethnicity, region, ECOG performance status, progesterone receptor (PgR) status, Elixhauser Comorbidity Score, evidence of ECG, prescription/claim for opioids and previous antineoplastic treatments. These variables were included to adjust for demographic differences, general health status and disease burden/characteristics which may be associated with treatment assignment and OS. For categorical variables with missing values, missing was considered a level of the variable (i.e., a missing indicator was created). Patient characteristics were considered balanced if the post-weighting/post-matching ASDs were <0.10 [17].
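As an illustration of the weighting step, the sketch below turns fitted propensity scores into inverse probability of treatment weights. The text does not say whether stabilized weights or a particular target estimand were used, so the average-treatment-effect form with optional stabilization shown here is an assumption; the propensity scores themselves would come from a separately fitted model (for example, a logistic regression on the covariates listed above).

```python
def iptw_weights(treated, propensity, stabilized=True):
    """Compute inverse probability of treatment weights.

    treated: sequence of 0/1 indicators (1 = ribociclib plus letrozole).
    propensity: sequence of estimated P(treated = 1 | covariates), strictly in (0, 1).
    """
    n = len(treated)
    p_treated = sum(treated) / n  # marginal probability of treatment
    weights = []
    for t, ps in zip(treated, propensity):
        if t:
            numerator = p_treated if stabilized else 1.0
            weights.append(numerator / ps)
        else:
            numerator = (1 - p_treated) if stabilized else 1.0
            weights.append(numerator / (1 - ps))
    return weights


def weighted_proportion(values, weights):
    """Weighted proportion of a 0/1 characteristic, used for post-weighting balance checks."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

The weights would then be supplied to a weighted Cox model, and post-weighting ASDs would be recomputed from weighted proportions to check the <0.10 balance criterion.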
Post hoc analyses
Following the initial findings, several post hoc analyses were undertaken to explore and contextualize the results. After noting the baseline differences between the ribociclib plus letrozole and letrozole groups, two additional analyses were undertaken: the creation of historical control groups and a subgroup analysis of patients with metastatic disease based on the presence of a diagnosis code for secondary malignancy. First, to evaluate the impact of using other potential comparator groups, two additional time frames were used to select letrozole patients: those who initiated letrozole from 1 January 2013 through 2 February 2015 (prior to the approval of palbociclib, the first cyclin-dependent kinase 4/6 [CDK4/6] inhibitor approved in the US; Historical Letrozole Group 1) and those who initiated letrozole from 3 February 2015 through 12 March 2017 (after palbociclib approval but prior to ribociclib approval; Historical Letrozole Group 2) (Figure 2). The goal of the historical controls was to replicate the treatment landscape prior to ribociclib approval.

Figure 2. Relevant time periods for patient selection.
OS: Overall survival; RCT: Randomized controlled trial; RWD: Real-world-data.
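The three letrozole comparator windows can be expressed as a simple lookup on the index date. The dates are those given in the text; the function name and group labels are hypothetical.

```python
from datetime import date
from typing import Optional

# (label, first eligible index date, last eligible index date), inclusive bounds.
COMPARATOR_WINDOWS = [
    ("historical_letrozole_group_1", date(2013, 1, 1), date(2015, 2, 2)),
    ("historical_letrozole_group_2", date(2015, 2, 3), date(2017, 3, 12)),
    ("contemporaneous_letrozole", date(2017, 3, 13), date(2023, 9, 30)),
]


def letrozole_comparator_group(index_date: date) -> Optional[str]:
    """Return the comparator group for a letrozole index date, or None if out of range."""
    for label, start, end in COMPARATOR_WINDOWS:
        if start <= index_date <= end:
            return label
    return None
```

Because the windows are contiguous and non-overlapping, each letrozole initiator falls into at most one comparator group.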
Second, given the discrepancy between the proportion of patients with metastases in the two arms and because evidence of metastasis was not included in the propensity score model, a subgroup analysis of those with evidence of metastatic disease was conducted. The goal of the subgroup analysis was to adjust for uncontrolled confounding in disease stage, which would be associated with OS, between the comparator groups. Finally, to remove any potential immortal time bias resulting from patients in the ribociclib plus letrozole arm having to live long enough to receive two treatments compared with only requiring evidence of one treatment in the letrozole arm, the analysis was limited to patients with at least 60 days of follow-up, with follow-up time starting on day 61 (ribociclib plus letrozole n = 97 and letrozole n = 3687).
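The landmark restriction described for the immortal time analysis can be sketched as follows. It assumes follow-up is stored as days from the index date with a death indicator, and it assumes that patients whose follow-up ends on or before day 60 are dropped (the handling of a patient with exactly 60 days is not specified in the text).

```python
def apply_landmark(records, landmark_days=60):
    """Restrict to patients still under follow-up after the landmark and reset the clock.

    records: iterable of (followup_days, died) tuples measured from the index date.
    Returns tuples with follow-up re-expressed as days since the landmark.
    """
    kept = []
    for followup_days, died in records:
        if followup_days <= landmark_days:
            continue  # follow-up ended before day 61: excluded from the analysis
        kept.append((followup_days - landmark_days, died))
    return kept


example = apply_landmark([(30, True), (60, False), (61, True), (400, False)])
print(example)  # [(1, True), (340, False)]
```

Deaths during the first 60 days are removed from both arms, so neither group can gain an advantage from the time needed to accumulate a second prescription.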
Results
Impact of inclusion/exclusion criteria
There were 132,406 patients meeting the exposure or comparator group definitions during the study period. For transparency/replicability purposes, the numbers and proportions of patients excluded for each individual criterion presented in Figure 3 reflect the order in which the criteria were applied. A different order of application may result in different numbers and proportions; however, some trends are noteworthy. After excluding patients without sufficient baseline observability or with previous use of the medications of interest, 55,189 patients remained. A small number of patients were excluded for male or missing gender, or for being less than 18 years old. The application of the criteria for recurrent or metastatic breast cancer and postmenopausal status reduced the sample by approximately 70%, which was expected given the subsequent approvals for ribociclib during our study period [18]. Other exclusion criteria that resulted in a large reduction in sample size were prior systemic anti-cancer therapy in the 1 year prior to index date, diagnosis of another malignancy within 3 years prior to index date or on index date, active cardiac disease or history of cardiac dysfunction recorded in the 1 year prior to index date or on index date, and use of prohibited medications such as cytochrome P450 3A4 or 3A5 (CYP3A4/5) inducers or inhibitors and medications with a risk to prolong the QT interval. The final patient sample consisted of 3912 patients: 106 ribociclib plus letrozole patients and 3806 letrozole patients. Approximately one-quarter of ribociclib plus letrozole patients and 12% of patients treated with letrozole had another anticancer therapy within the 60 days after index date.

Figure 3. Real-world-data population patient attrition.
*Operationalized as no evidence of HR-negative status or HER2-positive status.
**Operationalized as no evidence of ECOG >1.
CDK4/6: Cyclin-dependent kinase 4/6; ECOG: Eastern Cooperative Oncology Group; HER2: Human epidermal growth factor receptor 2; HR: Hormone receptor; RWD: Real-world-data.
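The order-dependent attrition counts described in this section can be produced with a short sequential filter. This is a generic sketch: the patient records, criterion labels and predicates are hypothetical placeholders, not the study's definitions.

```python
def attrition_table(patients, criteria):
    """Apply criteria in order and record how many patients each one removes.

    patients: list of patient records (any type the predicates understand).
    criteria: ordered list of (label, keep_predicate) pairs.
    Returns (rows, remaining), where each row is
    (label, n_excluded, pct_of_patients_entering_step, n_remaining).
    """
    remaining = list(patients)
    rows = []
    for label, keep in criteria:
        n_before = len(remaining)
        remaining = [p for p in remaining if keep(p)]
        n_excluded = n_before - len(remaining)
        pct = 100.0 * n_excluded / n_before if n_before else 0.0
        rows.append((label, n_excluded, pct, len(remaining)))
    return rows, remaining


toy_patients = [
    {"age": 70, "female": True},
    {"age": 17, "female": True},
    {"age": 50, "female": False},
    {"age": 16, "female": False},
]
toy_criteria = [
    ("Female", lambda p: p["female"]),
    ("Age >= 18 years", lambda p: p["age"] >= 18),
]
rows, final_cohort = attrition_table(toy_patients, toy_criteria)
```

Each percentage is relative to the patients entering that step, so reordering the criteria changes the per-step numbers even though the final cohort is the same. In the study itself, the final sample of 3912 out of 132,406 initiators corresponds to roughly 3% retained.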
Comparison of trial participants & RWD population
The baseline characteristics for MONALEESA-2 trial participants and patients in the RWD population after applying the full set of inclusion/exclusion criteria are presented in Tables 1 & 2. There were differences in most characteristics between the trial patients and real-world patients who received ribociclib plus letrozole. The trial participants tended to be younger than RWD patients, more than 60% of whom were 65 or older. While the distribution of race was similar when classifying patients/participants as Asian versus non-Asian, the ribociclib plus letrozole arm of the RWD population was 9.4% Black whereas only 2.5% of the MONALEESA-2 population was Black, a limitation acknowledged in the publication [16]. The distribution of US region could not be compared as the trial was multinational and only the national-level distribution was presented. While prior use of chemotherapy was similar between the two groups, prior use of tamoxifen was more common among trial participants, while nonsteroidal aromatase inhibitor use was more common among RWD patients. Larger proportions of patients in the trial were noted as having liver or lung involvement than in the RWD population. The proportion of patients with bone lesions only was approximately 20% among trial participants, whereas nearly 38% of RWD ribociclib plus letrozole patients had a diagnosis of bone metastasis without diagnosis of liver, lung, or lymph node metastasis. The proportion of trial patients with new metastatic disease was lower than the proportion of RWD patients with a diagnosis code for any metastatic disease (new or previously diagnosed) within the prior year. Large amounts of missing data in the RWD population precluded valid comparisons on ECOG, PgR status and hormone receptor status.
| Characteristic | MONALEESA-2 RCT: ribociclib + letrozole | MONALEESA-2 RCT: placebo + letrozole | RWD population: ribociclib + letrozole | RWD population: letrozole | ASD: ribociclib + letrozole RCT vs ribociclib + letrozole RWD | ASD: placebo + letrozole RCT vs letrozole + no ribociclib RWD | ASD: ribociclib + letrozole RWD vs letrozole + no ribociclib RWD |
|---|---|---|---|---|---|---|---|
| Patients, n | 334 | 334 | 106 | 3806 | | | |
| Age (<65 years, >= 65 years)† | | | | | | | |
| <65; n (%) | 184 (55.1%) | 189 (56.6%) | 40 (37.7%) | 1458 (38.3%) | 0.35 | 0.37 | 0.01 |
| >= 65; n (%) | 150 (44.9%) | 145 (43.4%) | 66 (62.3%) | 2348 (61.7%) | | | |
| Race‡ | | | | | | | |
| Asian; n (%) | 28 (8.4%) | 23 (6.9%) | 6 (5.7%) | 68 (1.8%) | 0.11 | 0.25 | 0.21 |
| Non-Asian; n (%) | 281 (84.1%) | 287 (85.9%) | 85 (80.2%) | 3160 (83.0%) | 0.10 | 0.08 | 0.07 |
| Other/unknown/missing; n (%) | 25 (7.5%) | 24 (7.2%) | 15 (14.2%) | 578 (15.2%) | 0.22 | 0.26 | 0.03 |
| Race in RWD‡ | | | | | | | |
| Caucasian; n (%) | | | 75 (70.8%) | 2803 (73.6%) | | | 0.06 |
| Asian; n (%) | | | 6 (5.7%) | 68 (1.8%) | | | 0.21 |
| Black; n (%) | | | 10 (9.4%) | 357 (9.4%) | | | 0.00 |
| Other/unknown; n (%) | | | 15 (14.2%) | 578 (15.2%) | | | 0.03 |
| Region | | | | | | | |
| Asia; n (%) | 35 (10.5%) | 33 (9.9%) | | | N/A | N/A | N/A |
| Europe; n (%) | 150 (44.9%) | 146 (43.7%) | | | | | |
| Latin America; n (%) | 7 (2.1%) | 7 (2.1%) | | | | | |
| North America; n (%) | 108 (32.3%) | 121 (36.2%) | 106 (100.0%) | 3806 (100.0%) | | | |
| Other; n (%) | 34 (10.2%) | 27 (8.1%) | | | | | |
| US Region‡ | | | | | | | |
| Northeast; n (%) | | | 20 (18.9%) | 857 (22.5%) | N/A | N/A | 0.09 |
| Midwest; n (%) | | | 46 (43.4%) | 1224 (32.2%) | | | 0.23 |
| West; n (%) | | | 13 (12.3%) | 534 (14.0%) | | | 0.05 |
| South; n (%) | | | 24 (22.6%) | 1022 (26.9%) | | | 0.10 |
| Other/Unknown; n (%) | | | 3 (2.8%) | 169 (4.4%) | | | 0.09 |
| Year of index date | | | | | | | |
| 2017; n (%) | | | 16 (15.1%) | 409 (10.7%) | N/A | N/A | 0.13 |
| 2018; n (%) | | | 12 (11.3%) | 555 (14.6%) | | | 0.10 |
| 2019; n (%) | | | 16 (15.1%) | 632 (16.6%) | | | 0.04 |
| 2020; n (%) | | | 7 (6.6%) | 636 (16.7%) | | | 0.32 |
| 2021; n (%) | | | 9 (8.5%) | 683 (17.9%) | | | 0.28 |
| 2022; n (%) | | | 19 (17.9%) | 528 (13.9%) | | | 0.11 |
| 2023; n (%) | | | 27 (25.5%) | 363 (9.5%) | | | 0.43 |
† Assessed on the index date in RWD population.
‡ Assessed as the most frequent value observed from the start to the end of available data in RWD population.
Bold values are ASDs greater than 0.10, which indicate an imbalance between the groups.
ASD: Absolute standardized difference; RCT: Randomized controlled trial; RWD: Real-world data.
| Characteristic | MONALEESA-2 RCT: ribociclib + letrozole | MONALEESA-2 RCT: placebo + letrozole | RWD population: ribociclib + letrozole | RWD population: letrozole | ASD: ribociclib + letrozole RCT vs ribociclib + letrozole RWD | ASD: placebo + letrozole RCT vs letrozole + no ribociclib RWD | ASD: ribociclib + letrozole RWD vs letrozole + no ribociclib RWD |
|---|---|---|---|---|---|---|---|
| Patients, n | 334 | 334 | 106 | 3806 | | | |
| Previous chemotherapy† | | | | | | | |
| No; n (%) | 188 (56.3%) | 189 (56.6%) | 57 (53.8%) | 1390 (36.5%) | 0.05 | 0.41 | 0.35 |
| Yes; n (%) | 146 (43.7%) | 145 (43.4%) | 49 (46.2%) | 2416 (63.5%) | | | |
| Previous hormonal agent† | | | | | | | |
| Nonsteroidal aromatase inhibitor and others; n (%) | 30 (9.0%) | 23 (6.9%) | 20 (18.9%) | 460 (12.1%) | 0.29 | 0.18 | 0.19 |
| None; n (%) | 158 (47.3%) | 162 (48.5%) | 82 (77.4%) | 3145 (82.6%) | 0.65 | 0.77 | 0.13 |
| Tamoxifen; n (%) | 146 (43.7%) | 149 (44.6%) | 6 (5.7%) | 284 (7.5%) | 0.98 | 0.93 | 0.07 |
| Surgery in previous year; n (%) | | | 23 (21.7%) | 2179 (57.3%) | N/A | N/A | 0.78 |
| Radiation therapy in previous year; n (%) | | | 9 (8.5%) | 656 (17.2%) | N/A | N/A | 0.26 |
| Max ECOG score‡ | | | | | | | |
| Max ECOG = Missing; n (%) | | | 95 (89.6%) | 3579 (94.0%) | N/A | N/A | 0.16 |
| Max ECOG = 0; n (%) | 204 (61.1%) | 202 (60.5%) | 5 (4.7%) | 149 (3.9%) | 1.50 | 1.52 | 0.04 |
| Max ECOG = 1; n (%) | 130 (38.9%) | 132 (39.5%) | 6 (5.7%) | 78 (2.0%) | 0.87 | 1.04 | 0.19 |
| PgR status§ | | | | | | | |
| Positive; n (%) | 271 (81.1%) | 278 (83.2%) | 14 (13.2%) | 516 (13.6%) | 1.86 | 1.94 | 0.01 |
| Negative; n (%) | 55 (16.5%) | 49 (14.7%) | 92 (86.7%)¶ | 111 (2.9%) | 0.58 | 0.43 | 0.15 |
| Other/missing; n (%) | 8 (2.4%) | 7 (2.1%) | | 3179 (83.5%) | 3.09 | 2.89 | 0.06 |
| Hormone-receptor status# | | | | | | | |
| ER-positive and PgR-positive; n (%) | 269 (80.5%) | 277 (82.9%) | 14 (13.2%) | 509 (13.4%) | 1.83 | 1.94 | 0.01 |
| Other/unknown/missing; n (%) | 65 (19.5%) | 57 (17.1%) | 92 (86.8%) | 3297 (86.6%) | | | |
| ECG in previous year; n (%) | | | 12 (11.3%) | 407 (10.7%) | N/A | N/A | 0.02 |
| Opioids in previous year; n (%) | | | 35 (33.0%) | 1301 (34.2%) | N/A | N/A | 0.03 |
| Elixhauser Comorbidity Index Score; mean (SD)†† | | | 14.74 (9.43) | 9.37 (9.77) | N/A | N/A | 0.56 |
| Median [IQR] | | | 15.00 [9.50, 22.00] | 10.00 [0.00, 18.00] | | | |
| Liver involvement in RCT; diagnosis of liver metastasis†† | | | | | | | |
| No; n (%) | 275 (82.3%) | 262 (78.4%) | 101 (95.3%) | 3749 (98.5%) | 0.42 | 0.66 | 0.19 |
| Yes; n (%) | 59 (17.7%) | 72 (21.6%) | 5 (4.7%) | 57 (1.5%) | | | |
| Lung involvement in RCT; diagnosis of lung metastasis in RWD†† | | | | | | | |
| No; n (%) | 181 (54.2%) | 185 (55.4%) | 95 (89.6%) | 3689 (96.9%) | 0.86 | 1.11 | 0.29 |
| Yes; n (%) | 153 (45.8%) | 149 (44.6%) | 11 (10.4%) | 117 (3.1%) | | | |
| Liver or lung involvement in RCT; diagnosis of liver or lung metastasis in RWD†† | | | | | | | |
| No; n (%) | 152 (45.5%) | 144 (43.1%) | 91 (85.8%) | 3644 (95.7%) | 0.94 | 1.39 | 0.35 |
| Yes; n (%) | 182 (54.5%) | 190 (56.9%) | 15 (14.2%) | 162 (4.3%) | | | |
| Bone lesion only in RCT; diagnosis of bone metastasis in RWD††,‡‡ | | | | | | | |
| No; n (%) | 265 (79.3%) | 255 (76.3%) | 66 (62.3%) | 3533 (92.8%) | 0.38 | 0.47 | 0.79 |
| Yes; n (%) | 69 (20.7%) | 79 (23.7%) | 40 (37.7%) | 273 (7.2%) | | | |
| Newly diagnosed metastatic disease in RCT; diagnosis of metastasis in RWD†† | | | | | | | |
| No; n (%) | 220 (65.9%) | 221 (66.2%) | 16 (15.1%) | 2102 (55.2%) | 1.21 | 0.23 | 0.93 |
| Yes; n (%) | 114 (34.1%) | 113 (33.8%) | 90 (84.9%) | 1704 (44.8%) | | | |
† Assessed from the start of available data through 366 days prior to index date in RWD population.
‡ Assessed as the maximum recorded ECOG measurement in the 90 days prior to the index date through the 90 days after the index date in RWD population.
§ Assessed as the most recent recorded positive or negative PgR interpretation in the 365 days prior to the index date through the index date in RWD population.
¶ Rows are combined for reporting due to small cell size.
# Assessed as the most recent recorded positive or negative PgR and ER interpretations in the 365 days prior to the index date through the index date in RWD population.
†† Assessed in the 365 days prior to the index date through the index date in RWD population.
‡‡ The other sites of metastasis considered were liver and intrahepatic bile duct, lung, and axilla and upper limb lymph nodes in RWD population.
Bold values are ASDs greater than 0.10, which indicate an imbalance between the groups.
ASD: Absolute standardized difference; ECOG: Eastern Cooperative Oncology Group; ER: Estrogen receptor; PgR: Progesterone receptor; RCT: Randomized controlled trial; RWD: Real-world data.
| Characteristic (RWD population) | Ribociclib + letrozole | Historical Letrozole Group 1 | Historical Letrozole Group 2 | Contemporaneous letrozole |
|---|---|---|---|---|
| Patients, n | 106 | 465 | 887 | 3806 |
| Surgery; n (%) | 23 (21.7%) | 166 (35.7%) | 458 (51.6%) | 2179 (57.3%) |
| Radiation therapy; n (%) | 9 (8.5%) | 125 (26.9%) | 177 (20.0%) | 656 (17.2%) |
| Secondary nonbreast malignant neoplasm; n (%) | 90 (84.9%) | 285 (61.3%) | 443 (49.9%) | 1704 (44.8%) |
| Secondary malignant neoplasm of bone; n (%) | 65 (61.3%) | 96 (20.6%) | 149 (16.8%) | 443 (11.6%) |
| Secondary malignant neoplasm of liver and intrahepatic bile duct; n (%) | 5 (4.7%) | 5 (1.1%) | 21 (2.4%) | 57 (1.5%) |
| Secondary malignant neoplasm of lung; n (%) | 11 (10.4%) | 11 (2.4%) | 28 (3.2%) | 117 (3.1%) |
RWD: Real-world data.
Comparison of treatment groups within the RWD population
While age was similar between the two treatment groups in the RWD population, there were differences in race and US Census region: a larger proportion of ribociclib plus letrozole patients were Asian and a larger proportion were from the Midwest compared with letrozole patients (Table 1). Index year also differed, with smaller proportions of ribociclib plus letrozole patients entering the analysis in 2020 and 2021. Ribociclib plus letrozole patients had a larger comorbid disease burden, as evidenced by a higher mean comorbidity score (Table 2), and smaller proportions had previous chemotherapy, surgery, and radiation compared with letrozole patients. A diagnosis code indicating metastatic disease in the past year was nearly twice as common among patients treated with ribociclib plus letrozole as among patients treated with letrozole (85% vs 45%).
Comparisons of key characteristics between the ribociclib plus letrozole group and the three letrozole comparator groups are shown in Table 3. Surgery and radiation therapy were consistently more common in all three letrozole groups than in the ribociclib plus letrozole group. A diagnosis of any metastasis (secondary nonbreast malignant neoplasm) was more common in the ribociclib plus letrozole group than in all three letrozole comparator groups; however, the proportion of women in the letrozole comparator groups with any metastasis decreased over time, with the highest proportion in Historical Letrozole Group 1. The same trend was observed for bone metastasis. Characteristics after weighting and matching, along with ASDs, are shown in Supplementary Tables A3 & A4 for the primary comparison of ribociclib plus letrozole versus the contemporaneous letrozole group and for the comparison within the subgroup of women with a secondary nonbreast malignancy. In the primary comparison, most characteristics were considered balanced (ASD < 0.1); however, prior treatments, comorbidity score, and diagnosis of secondary nonbreast malignancy were notably not balanced. In part, this prompted the subgroup analysis, in which we still found imbalances in certain disease-related characteristics such as prior surgical intervention and type of secondary malignancy. Characteristics for the historical letrozole groups after weighting and matching are shown in Supplementary Tables A5 & A6, and similar imbalances were noted.
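To illustrate the balance diagnostic referenced above, the absolute standardized difference for a binary characteristic can be computed from the two group proportions. The sketch below assumes the pooled-variance formula for proportions described by Austin [17]; the function name `asd_binary` is hypothetical, and the inputs are the unweighted surgery proportions from Table 3, not the weighted or matched values reported in the supplementary tables.

```python
import math


def asd_binary(p_treated: float, p_comparator: float) -> float:
    """Absolute standardized difference for a binary covariate.

    Uses the average of the two group variances p(1 - p) as the
    denominator (an assumption based on Austin's balance diagnostics).
    """
    pooled_var = (p_treated * (1 - p_treated)
                  + p_comparator * (1 - p_comparator)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1: no variance to scale by.
        return 0.0 if p_treated == p_comparator else math.inf
    return abs(p_treated - p_comparator) / math.sqrt(pooled_var)


# Prior surgery: 21.7% (ribociclib + letrozole) vs 57.3% (contemporaneous letrozole)
surgery_asd = asd_binary(0.217, 0.573)
is_imbalanced = surgery_asd > 0.10  # threshold used in the tables
```

With these unadjusted inputs the ASD is far above the 0.10 threshold, which is consistent with the imbalance in prior treatments described in the text.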
Comparisons of OS
In the MONALEESA-2 trial, a significant survival benefit was found for women treated with ribociclib plus letrozole compared with letrozole plus placebo (hazard ratio [HR] = 0.76, 95% CI: 0.63–0.93). However, in the RWD emulation, treatment with ribociclib plus letrozole was associated with an increased risk of all-cause mortality, with an HR of 3.07 (95% CI: 1.89–5.01) in the crude analysis (Table 4). The HR moved closer to the null after adjusting for baseline characteristics but was still above 1.0. The HRs were consistently >1.0 across all post hoc analyses. Kaplan-Meier curves from the primary comparisons of ribociclib plus letrozole versus contemporaneous and historical letrozole groups can be found in Supplementary Figures A1–A6. There is evidence that the proportional hazards assumption is not met, as the curves diverge as time progresses. In the MONALEESA-2 trial, a divergence of the curves between ribociclib plus letrozole and letrozole plus placebo was also seen at around 2 years, where ribociclib plus letrozole begins to look protective against all-cause mortality.
| Analysis | Model | Ribociclib + Letrozole vs Contemporaneous Letrozole, HR (95% CI) | Ribociclib + Letrozole vs Historical Letrozole Group 1, HR (95% CI) | Ribociclib + Letrozole vs Historical Letrozole Group 2, HR (95% CI) |
|---|---|---|---|---|
| Primary analysis | Crude | 3.07 (1.89, 5.01) | 2.57 (1.51, 4.38) | 2.66 (1.60, 4.43) |
| | IPTW | 2.75 (1.62, 4.69) | 2.69 (1.49, 4.86) | 2.81 (1.53, 5.16) |
| | PS-matched | 1.50 (0.76, 2.98) | 2.27 (1.12, 4.61) | 2.07 (0.99, 4.33) |
| Post hoc analysis: subgroup of patients with secondary nonbreast malignancy | Crude | 1.65 (0.94, 2.89) | 1.61 (0.88, 2.95) | 1.57 (0.88, 2.82) |
| | IPTW | 1.18 (0.65, 2.14) | 1.69 (0.88, 3.25) | 1.55 (0.78, 3.09) |
| | PS-matched | 1.31 (0.60, 2.88) | 1.37 (0.67, 2.83) | 2.61 (1.10, 6.20) |
| Post hoc analysis: starting follow-up on day 61 | Crude | 6.95 (4.87, 9.93) | 2.45 (1.42, 4.22) | 2.56 (1.52, 4.32) |
| | IPTW | 3.01 (1.74, 5.22) | 6.63 (3.40, 12.92) | 3.20 (1.71, 6.00) |
| | PS-matched | 1.44 (0.68, 3.02) | 3.35 (0.96, 11.72) | 2.32 (1.10, 4.88) |
HR: Hazard ratio; IPTW: Inverse probability of treatment weighting; PS: Propensity score.
Discussion
In this emulation of MONALEESA-2 using RWD, we found an HR in the opposite direction of the HR observed in the trial. Despite several post hoc analyses, we were unable to replicate the survival benefit found in the RCT. We did find that several exclusion criteria resulted in large reductions in sample size, such as excluding patients with prior systemic anti-cancer therapy in the prior year, active cardiac disease or a history of cardiac dysfunction, and use of prohibited medication, but we were unable to test whether relaxing those criteria altered the effect of ribociclib plus letrozole on OS. We hypothesize that data source limitations, study design choices and lack of a valid active comparator group influenced our study findings.
It is important to note that the MONALEESA-2 trial has also been emulated in RWD using registry data from a French cohort of women with metastatic breast cancer [15]. Patients initiating ribociclib and letrozole within 4 months of diagnosis of metastatic disease were compared with patients initiating letrozole without ribociclib within 4 months of diagnosis of metastatic disease. The letrozole with ribociclib patients were a mix of patients who initiated therapy before and after ribociclib approval. That data source had more complete clinical information than Optum® Market Clarity, including ECOG for nearly half of the patients (compared with ~10% in our analysis), and the researchers used multiple imputation methods to impute missing data. Death was certified by a physician and reported to the registry or extracted from a national database, whereas death information in Optum® Market Clarity may be less complete. During a median follow-up of 75 months, treatment with ribociclib was associated with an even larger survival benefit than what was found in the trial (HR = 0.55, 95% CI: 0.39–0.79). We suspect that differences in available data elements are the primary driver of the difference in effect estimates between the registry-based analysis and our analysis; differences in the amount of follow-up time may play a role as well.
Data source limitations
One challenge in emulations is identifying the most appropriate data sources, a process in which researchers must weigh the benefits and limitations of available data to arrive at the most fit-for-purpose dataset for their research question. Considerations include availability and completeness of clinical variables, recency of data and cost of acquiring the data source [11]. Often data feasibility assessments rely on available meta-data such as data dictionaries because it is not always possible to obtain sample data or detailed counts/explorations from data vendors. This was the case in this analysis and therefore, the level of missingness for key variables was not understood until this data source was already selected. This is a major challenge in RWD when purchasing commercialized datasets. In the event more detailed data explorations are possible, they should be conducted to determine the distributions and amount of missingness in key variables.
The outcome of interest in the emulation was OS. OS is a commonly used outcome in trial emulations because tumor response outcomes are not assessed in routine clinical practice as they are in trials. Work is ongoing to develop algorithms to capture these other outcomes in EHR and other sources of RWD [19]. In this analysis, we observed an absolute probability of death of 7%, which was much lower than expected based on the trial (absolute probability of death of 54% in the ribociclib arm and 66% in the placebo group). This could be due in part to incomplete follow-up and capture of death in the sources included in Optum® Market Clarity. Complete capture of death is challenging in RWD sources like claims and EHR, especially deaths occurring outside of a hospital setting. While there are data sources with more complete death information, there are trade-offs when utilizing such data sources. For example, in the US, the National Death Index is considered the gold standard for measuring mortality; however, there is a long data lag and a time-consuming process to obtain and link the data to other real-world sources with the relevant study variables. To our knowledge, there are no publicly available estimates of the completeness of mortality information in Optum® Market Clarity.
Because only a small number of the patients included in this analysis had information in the oncology-specific tables created using NLP, we relied heavily on the EHR and claims information to operationalize the trial inclusion/exclusion criteria and patient characteristics. For example, the trial included only patients with advanced breast cancer, defined as those with “locoregionally recurrent or metastatic breast cancer not amenable to curative therapy”. In our RWD source, detailed staging information was not available; therefore, we included patients who met one of two accepted claims-based algorithms. The first defined metastatic disease as at least two records with a breast cancer diagnosis at least 30 days apart, followed by at least two records with a diagnosis of secondary nonbreast malignancy at least 30 days apart [12]. The second defined recurrent disease as meeting at least one of the following criteria: at least one record for a secondary nonbreast malignancy more than 180 days after a diagnosis of primary breast cancer, a mastectomy more than 180 days after a diagnosis of primary breast cancer, or radiation therapy after a diagnosis of primary breast cancer [13]. Therefore, it is possible that we captured a slightly different patient population than the trial population. Similarly, we noted that a large number of real-world patients were excluded from our sample because they had systemic anticancer therapy in the year before initiating ribociclib plus letrozole or letrozole. The trial excluded patients with prior systemic anticancer therapy for advanced breast cancer, except for (neo)adjuvant therapy and ≤14 days of letrozole or anastrozole treatment. Given the complexities of identifying adjuvant therapy in this RWD source, our exclusion likely removed patients who would have been trial-eligible.
We also noted high levels of missingness in biomarkers and ECOG performance status. In analyses of treated patients, lack of information on biomarker status may be less impactful given the medication's indication. In this case, ribociclib plus letrozole is only approved for HR-positive, HER2-negative cancer. With the exception of potential off-label use, we are fairly confident of the biomarker status of women treated with ribociclib but far less so for the women treated with letrozole without evidence of ribociclib treatment. For performance status, only 10% of ribociclib plus letrozole patients had ECOG recorded in the ±3 months around treatment initiation. ECOG is an important predictor of survival [20]; therefore, the substantial level of missingness in this analysis is a key limitation. Since medical record data are captured during clinical practice, they are less complete than clinical trial data regarding factors such as performance status. This highlights the importance of exploratory analyses during the feasibility stage to confirm the extent and pattern of missingness of key variables in the patient population of interest (which may be lower or higher than the general missingness reported for the data source as a whole). While some RWD sources with more complete notes or those that are oncology-specific may have better capture of ECOG status if providers record that information, real-world analyses conducted in claims data sources need to rely entirely on algorithms to predict performance status [21].
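The two claims-based definitions described above can be sketched as simple date rules. This is a minimal illustration, not the study's implementation: the function names are hypothetical, each input is assumed to be a list of record dates already extracted for one patient, "primary breast cancer diagnosis" is assumed to mean the earliest breast cancer record, and "followed by" is assumed to mean on or after that earliest record.

```python
from datetime import date


def has_two_records_apart(dates, min_gap_days=30, on_or_after=None):
    """True if at least two records exist that are >= min_gap_days apart."""
    eligible = sorted(d for d in dates if on_or_after is None or d >= on_or_after)
    return len(eligible) >= 2 and (eligible[-1] - eligible[0]).days >= min_gap_days


def is_metastatic(bc_dx_dates, secondary_dx_dates):
    """First algorithm: 2+ breast cancer records, then 2+ secondary malignancy records."""
    if not has_two_records_apart(bc_dx_dates):
        return False
    first_bc = min(bc_dx_dates)  # assumption: earliest record anchors "followed by"
    return has_two_records_apart(secondary_dx_dates, on_or_after=first_bc)


def is_recurrent(bc_dx_dates, secondary_dx_dates, mastectomy_dates, radiation_dates):
    """Second algorithm: any one of three events after the primary diagnosis."""
    if not bc_dx_dates:
        return False
    first_bc = min(bc_dx_dates)
    late_secondary = any((d - first_bc).days > 180 for d in secondary_dx_dates)
    late_mastectomy = any((d - first_bc).days > 180 for d in mastectomy_dates)
    radiation_after = any(d > first_bc for d in radiation_dates)
    return late_secondary or late_mastectomy or radiation_after


# Hypothetical patient A: repeated breast cancer and secondary malignancy records
patient_a_metastatic = is_metastatic(
    [date(2018, 1, 10), date(2018, 3, 1)],
    [date(2019, 2, 1), date(2019, 4, 15)],
)
# Hypothetical patient B: one breast cancer record, radiation five months later
patient_b_metastatic = is_metastatic([date(2018, 1, 10)], [])
patient_b_recurrent = is_recurrent([date(2018, 1, 10)], [], [], [date(2018, 6, 1)])
```

Patient B shows how the recurrent-disease rule can admit patients with no record of distant spread, which is one way the real-world cohort could differ from the trial population.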
Study design
The primary rationale for conducting this analysis was to determine if there was an impact to the HR for OS by relaxing certain trial inclusion/exclusion criteria that may be overly restrictive. Therefore, we identified patients for cohort entry at the time of drug initiation for follow-up and capture of death. With this study design, we were able to compare the patients who received ribociclib plus letrozole in the real world to those in the trial in terms of a range of characteristics, including those that were not trial entry criteria. However, we were not able to adequately answer other questions about the generalizability of the trial population to the broader group of real-world patients who would be eligible for a trial or describe what types of patients were excluded. To address this question, patients would need to be selected first based on evidence of advanced HR-positive, HER2-negative breast cancer diagnosis, rather than treatment with ribociclib. Within that group of patients, we could have assessed the prevalence of conditions that were used to include/exclude patients from the trial.
For the design of the emulation itself, the exposure group was assigned based on the evaluation of all prescriptions/claims on or within 60 days of the first evidence of ribociclib or letrozole. In studies where the primary inclusion criterion includes multiple elements (such as initiating both ribociclib and letrozole), researchers should consider immortal time bias. Immortal time is a period of follow-up during which the outcome of the study cannot occur because of the exposure definition [22]. In this case, patients who initiated both ribociclib and letrozole needed to live long enough to initiate both therapies. This could result in longer follow-up times for ribociclib plus letrozole patients, biasing the HR toward the null and potentially below 1. While we do not believe immortal time bias affected our primary analysis because of the direction and magnitude of the HR, we conducted a sensitivity analysis that required all patients to be alive on day 60 after the index date and began follow-up on day 61. The HRs after weighting and matching were similar to those in the primary analysis.
The ability of methods like IPTW and propensity score matching to control for confounding is determined by the accuracy with which confounding variables are measured and the extent to which all possible confounding variables are measured at all. The lack of availability of clinical information likely resulted in unmeasured confounding. Taking the HR of 0.76 from the MONALEESA-2 trial as the true causal effect and our HR of 3.07 from the primary model as the observed effect, we calculated an E-value of 7.5 [23,24]. This indicates that unmeasured confounders would need to have a strength of association of at least 7.5 with both treatment with ribociclib plus letrozole and with OS to result in our HR. This seems plausible given the importance of several variables that were either missing or incompletely captured in this data source. One piece of evidence supporting the presence of unmeasured confounding is that after limiting the RWD population to those with a diagnosis of secondary nonbreast malignancy, the HRs were closer to the null, indicating there was uncontrolled confounding in the overall population. Following weighting and matching, balance was not achieved on all characteristics, and these differences persisted within the subgroup analysis. While these variables could have been further adjusted for in multivariable regression, this step was not pursued given the data source limitations described above. Additional useful information includes disease stage and provider and patient preferences.
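The day-61 sensitivity analysis is a form of landmark analysis, which can be sketched as a filter plus a shift of the time origin. The code below is an illustration under stated assumptions, not the study's code: the record layout (`followup_days`, `died`) and the function name are hypothetical, `followup_days` is assumed to count days from the index date to death or censoring, and patients censored on or before the landmark are assumed to be dropped along with those who died.

```python
def apply_landmark(patients, landmark_day=60):
    """Keep patients still under observation after the landmark and
    restart their follow-up clock at the landmark."""
    kept = []
    for patient in patients:
        if patient["followup_days"] <= landmark_day:
            # Died or was censored on or before the landmark: excluded.
            continue
        kept.append({
            **patient,
            "followup_days": patient["followup_days"] - landmark_day,
        })
    return kept


cohort = [
    {"id": 1, "followup_days": 30, "died": True},    # died before day 60
    {"id": 2, "followup_days": 400, "died": False},  # censored at day 400
    {"id": 3, "followup_days": 61, "died": True},    # died on day 61
]
landmark_cohort = apply_landmark(cohort)
```

Because every retained patient has, by construction, survived the window used to assign exposure, the time needed to initiate both drugs no longer contributes outcome-free follow-up to the ribociclib plus letrozole group.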
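The E-value reported above can be reproduced with the closed-form expression from VanderWeele and Ding [24]. The sketch below assumes the HR is treated as an approximate risk ratio (reasonable when the outcome is rare) and that the non-null version is applied to the ratio of the observed HR to the assumed true HR; the function names are hypothetical, and the authors' exact computation with the cited website or R package [23] may have differed in detail.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), with RR >= 1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rr = ratio if ratio >= 1 else 1 / ratio  # invert protective estimates
    return rr + math.sqrt(rr * (rr - 1))


def e_value_vs_true(observed: float, assumed_true: float) -> float:
    """E-value for shifting an observed estimate to a non-null true value."""
    return e_value(observed / assumed_true)


# Observed crude HR of 3.07 vs the trial HR of 0.76 taken as the truth
e = e_value_vs_true(3.07, 0.76)  # about 7.5
```

The ratio 3.07 / 0.76 is roughly 4.04, and 4.04 + sqrt(4.04 × 3.04) is roughly 7.5, in line with the value in the text.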
Comparator group
In this analysis, letrozole without ribociclib was the comparator group, serving as the proxy for letrozole plus placebo. However, the approval of an effective therapy can dramatically change the landscape of treatments available to patients. This can make identifying a valid comparator group difficult. There are currently three CDK4/6 inhibitors approved for HR+, HER2- breast cancer: palbociclib (IBRANCE) approved in 2015 [25], ribociclib, and abemaciclib (Verzenio) approved in 2017 [26]. The approvals of these drugs dramatically altered the treatment landscape for advanced breast cancer and were considered “game-changers”, resulting in changes to the National Comprehensive Cancer Network treatment guidelines [27]. Patients who may be eligible for but do not receive effective therapies after their approval may not be an appropriate comparator to mimic the randomization in an RCT.
We hypothesize that the approval and introduction of these highly effective therapies for women with advanced HR-positive, HER2-negative breast cancer resulted in the substantial differences in the characteristics of RWD patients who received ribociclib plus letrozole and RWD patients who received letrozole. Differences were found in region, index year, comorbidities, previous treatments and diagnoses of metastatic disease. Differences in index year may be explained by the impact of the COVID-19 pandemic on healthcare patterns and comfort of providers in initiating certain therapies without the same level of monitoring as occurred earlier in the pandemic. Metastatic disease was more common among ribociclib plus letrozole patients, while treatment with chemotherapies and other modalities such as surgery or radiation was less common. This may indicate that patients with more advanced disease stage and poorer prognosis received ribociclib plus letrozole. In an attempt to account for this, we created two historical control groups in which patients were selected prior to the approval of any CDK4/6 inhibitors (Historical Letrozole Group 1) and prior to the approval of ribociclib (Historical Letrozole Group 2). While the historical control groups were more similar to the ribociclib plus letrozole patients than the contemporaneous letrozole patients on some characteristics, there were still important differences in the prevalence of metastatic disease burden.
Importance of RWD for generating evidence needed for healthcare decision making
Despite the challenges, trial emulation is still important for understanding best practices for robust real-world study design and may be feasible in certain circumstances. In the context of this work, the unexpected results of this analysis demonstrate the importance of conducting feasibility analyses and descriptive studies of the baseline characteristics and treatment patterns of the real-world indicated population prior to doing trial emulation. A more thorough understanding of utilization of ribociclib in the real world would have potentially led to different study design decisions or possibly the conclusion that an emulation was not possible. First, our sample size was impacted by the subsequent expanded approval of ribociclib in pre- and peri-menopausal women. The total number of ribociclib patients in initial sample counts with a prescription/claim for ribociclib included those women as well as post-menopausal women who would have been eligible for MONALEESA-2. Second, the median follow-up time within the RWD population overall was substantially shorter than in the trial population (27 vs 80 months) and was only 15 months for patients treated with ribociclib plus letrozole. The Kaplan-Meier curve included in the MONALEESA-2 publication shows that the OS benefit of ribociclib began to emerge at approximately 20 months and continued to increase with longer follow-up, as indicated by survival at 5 years and 6 years [16]. This makes it unlikely that we would have found the same HR as the trial even with better balance between the exposure groups on confounders and complete death capture. Third, the duration of therapy was also much shorter in the real world than in the trial: a median of 6 months for the ribociclib plus letrozole group and 5 months for the letrozole group compared with 20 months in the RCT. With a shorter duration of exposure in the emulation, patients may not have had enough time to experience the full therapeutic effects of the treatments.
Information about sample size, follow-up, and duration of therapy should be obtained through descriptive analyses before designing a trial emulation. Outside of trial emulations, many other relevant and important research questions can be answered in RWD. These include characterizing the patient population receiving ribociclib, describing treatment patterns of women with advanced breast cancer prior to and following the approval of ribociclib, understanding healthcare utilization and costs among women treated with ribociclib and characterizing the rates of safety events. Continuous evidence generation after approval, asking new and relevant questions with RWD, is needed.
Conclusion
Trial emulations are challenging and, in some cases, it may not be possible to evaluate the same question as a pivotal RCT of a previously unapproved therapy once that therapy has been approved. The foundational challenge results from the inevitable changes in treatment patterns that may make it difficult to identify a sufficiently comparable group of untreated patients following approval. Other significant challenges arise from data source limitations and unmeasured confounding. However, ongoing research following approval utilizing RWD is needed, whether in the context of comparative studies or in drug development where information on the current standard of care is key. The success of that research is predicated on a deep understanding of clinical practices and data source capabilities, and ultimately the identification of an appropriate control group, valid outcome ascertainment and other efforts to minimize bias.
Summary points
• The inclusion and exclusion criteria of clinical trials can lead to the enrollment of participants whose demographic and clinical characteristics differ from patients who ultimately receive the approved therapy in the real world.
• To assess the potential impact of relaxing overly restrictive inclusion and exclusion criteria, we attempted to emulate the MONALEESA-2 randomized controlled trial evaluating overall survival of women with advanced hormone receptor (HR)-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer treated with ribociclib plus letrozole using a database of electronic health records and healthcare claims.
• There were 132,406 patients who initiated ribociclib plus letrozole or letrozole without ribociclib from 13 March 2017 to 30 September 2023 in the real-world data (RWD).
• After applying trial inclusion/exclusion criteria and requiring baseline observability, the sample size was reduced to 3912 patients, including only 106 ribociclib plus letrozole-treated patients.
• Trial participants tended to be younger than real-world patients, have prior tamoxifen treatment and have liver or lung involvement.
• In the RWD, there were significant differences between women receiving ribociclib plus letrozole and women receiving letrozole without ribociclib, with those treated with ribociclib plus letrozole having more metastatic disease.
• The MONALEESA-2 randomized controlled trial found that participants treated with ribociclib plus letrozole had significantly better overall survival than participants treated with letrozole plus placebo, but we could not replicate this finding in the RWD.
• Lessons from our emulation attempt relate to the additional efforts needed to confirm feasibility before proceeding with emulation. These efforts include exploratory analyses of the selected RWD source to ensure appropriate capture of key variables and descriptive studies to understand the real-world patient population characteristics and treatment landscape prior to undertaking an emulation.
• Ongoing research in RWD is needed after approvals of new therapies to understand drivers of treatment decisions and identify and address questions about effectiveness and safety beyond those studied in the pivotal trial(s).
Author contributions
Author A Jaksa was responsible for study conception. Authors AM Kong, D Andrean, R Mamtani, R Parikh, A Jaksa and U Campbell were responsible for study design. Authors AM Kong, D Andrean, S Khan and J Choi were responsible for data analysis. All authors interpreted the study findings. Author AM Kong was responsible for drafting the manuscript. All authors critically reviewed manuscript drafts and approved of the final version of the manuscript.
Financial disclosure
This work was supported by a grant from Arnold Ventures. The authors have received no other financial and/or material support for this research or the creation of this work apart from that disclosed.
Competing interests disclosure
Authors AM Kong, D Andrean and U Campbell are employees of Aetion and hold stock in Aetion, which received funding to conduct this analysis and provides consulting services to biopharma companies. Authors SK and JC are employees of Aetion. Author A Jaksa was an employee of Aetion when the study was completed. A Jaksa is now an employee of Target RWE. A Jaksa holds stock in CVS Health, has received honoraria and meeting travel costs from the University of Illinois at Chicago and Academy of Managed Care Pharmacy, and has received support from the following organizations to attend meetings: RWE4Decisions and SUSTAIN-HTA, within the past 36 months. Author R Mamtani has received research grants from Aetion, Merck, Astellas, consulting fees from Merck, BMS, Astellas and Seagen, and payment for expert testimony from King and Spalding and McBreyer Firm within the past 36 months. Author R Parikh received a grant from Aetion. The authors have no other competing interests or relevant affiliations with any organization or entity with the subject matter or materials discussed in the manuscript apart from those disclosed.
Writing disclosure
No medical writing support or AI-assisted technologies were used.
Ethical conduct of research
The data analyzed are previously collected secondary use data that are de-identified. The analysis was granted IRB exemption. The principles outlined in the Declaration of Helsinki have been followed.
Open access
This work is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 Unported License. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Supplementary Material
File (supplementary data.docx)
References
Papers of special note have been highlighted as: • of interest
1. Blonde L, Khunti K, Harris SB, Meizinger C, Skolnik NS. Interpretation and impact of real-world clinical data for the practicing clinician. Adv. Ther. 35(11), 1763–1774 (2018).
2. Barnish MS, Turner S. The value of pragmatic and observational studies in health care and public health. Pragmatic. Obs. Res. 8, 49–55 (2017).
3. Fortin M, Dionne J, Pinho G, Gignac J, Almirall J, Lapointe L. Randomized controlled trials: do they have external validity for patients with multiple comorbidities? Ann. Fam. Med. 4(2), 104–108 (2006).
4. Kim ES, Bruinooge SS, Roberts S et al. Broadening eligibility criteria to make clinical trials more representative: American Society of Clinical Oncology and Friends of Cancer Research joint research statement. J. Clin. Oncol. 35(33), 3737–3744 (2017).
• Discusses working group consensus recommendations to improve the inclusiveness of clinical trials.
5. Kim ES, Uldrick TS, Schenkel C et al. Continuing to broaden eligibility criteria to make clinical trials more representative and inclusive: ASCO-Friends of Cancer Research joint research statement. Clin. Cancer Res. 27(9), 2394–2399 (2021).
6. Aldrighetti CM, Niemierko A, Van Allen E, Willers H, Kamran SC. Racial and ethnic disparities among participants in precision oncology clinical studies. JAMA Netw. Open 4(11), e2133205 (2021).
7. Riner AN, Girma S, Vudatha V et al. Eligibility criteria perpetuate disparities in enrollment and participation of black patients in pancreatic cancer clinical trials. J. Clin. Oncol. 40(20), 2193–2202 (2022).
8. Jin S, Pazdur R, Sridhara R. Re-evaluating eligibility criteria for oncology clinical trials: analysis of investigational new drug applications in 2015. J. Clin. Oncol. 35(33), 3745–3752 (2017).
9. Food and Drug Administration. Diversity action plans to improve enrollment of participants from underrepresented populations in clinical studies: guidance for industry. (2024). Available from: https://www.regulations.gov/document/FDA-2021-D-0789-0111
10. Kim ES, Bernstein D, Hilsenbeck SG et al. Modernizing eligibility criteria for molecularly driven trials. J. Clin. Oncol. 33(25), 2815–2820 (2015).
11. Gatto NM, Vititoe SE, Rubinstein E, Reynolds RF, Campbell UB. A structured process to identify fit-for-purpose study design and data to generate valid and transparent real-world evidence for regulatory uses. Clin. Pharmacol. Ther. 112(6), 1235–1239 (2023).
• Provides a guide to identify fit-for-purpose study design and data for researchers to describe rationale for study design/data choices and communicate those decisions transparently.
12. Chubak J, Yu O, Pocobelli G et al. Administrative data algorithms to identify second breast cancer events following early-stage invasive breast cancer. J. Natl Cancer Inst. 104(12), 931–940 (2012).
13. Liang C, Li L, Fraser CD et al. The treatment patterns, efficacy, and safety of nab (®)-paclitaxel for the treatment of metastatic breast cancer in the United States: results from health insurance claims analysis. BMC Cancer 15, 1019 (2015).
14. Food and Drug Administration. Ribociclib (Kisqali). (2017). Available from: https://www.fda.gov/drugs/resources-information-approved-drugs/ribociclib-kisqali
15. Antoine A, Perol D, Robain M et al. Assessing the real-world effectiveness of 8 major metastatic breast cancer drugs using target trial emulation. Eur. J. Cancer 213, 115072 (2024).
• Emulates eight breast cancer clinical trials using French registry data.
16. Hortobagyi GN, Stemmer SM, Burris HA et al. Overall survival with ribociclib plus letrozole in advanced breast cancer. N. Engl. J. Med. 386(10), 942–950 (2022).
• Presents the pivotal overall survival findings from the MONALEESA-2 trial.
17. Austin PC. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat. Med. 28(25), 3083–3107 (2009).
18. Food and Drug Administration. FDA expands ribociclib indication in HR-positive, HER2-negative advanced or metastatic breast cancer. (2018). Available from: https://www.fda.gov/drugs/resources-information-approved-drugs/fda-expands-ribociclib-indication-hr-positive-her2-negative-advanced-or-metastatic-breast-cancer#:~:text=On%20July%2018%2C%202018%2C%20the,NSAI%20or%20tamoxifen%20and%20goserelin
19. McKelvey BA, Garrett-Mayer E, Rivera DR et al. Evaluation of real-world tumor response derived from electronic health record data sources: a feasibility analysis in patients with metastatic non-small cell lung cancer treated with chemotherapy. JCO Clin. Cancer Inform. 8, e2400091 (2024).
• This collaborative effort assesses real-world response in multiple electronic health record data sources.
20. da Silva SHK, de Oliveira LC, da Mota e Silva Lopes MS, Wiegert EVM, Motta RST, Peres WAF. The patient generated-subjective global assessment (PG-SGA) and ECOG performance status are associated with mortality in patients hospitalized with breast cancer. Clin. Nutr. ESPEN 53, 87–92 (2023).
21. Salloum RG, Smith TJ, Jensen GA, Lafata JE. Using claims-based measures to predict performance status score in lung cancer patients. Cancer 117(5), 1038–1048 (2010).
22. Suissa S. Immortal time bias in pharmaco-epidemiology. Am. J. Epidemiol. 167(4), 492–499 (2008).
• This seminal paper describes the potential impact of immortal time bias in comparative analyses.
23. Mathur MB, Ding P, Riddell CA, VanderWeele TJ. Website and R package for computing E-values. Epidemiology 29(5), e45–e47 (2018).
24. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann. Intern. Med. 167(4), 268–274 (2017).
25. Food and Drug Administration. Palbociclib (IBRANCE). (2017). Available from: https://www.fda.gov/drugs/resources-information-approved-drugs/palbociclib-ibrance
26. Food and Drug Administration. FDA approves abemaciclib for HR-positive, HER2-negative breast cancer. (2017). Available from: https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-abemaciclib-hr-positive-her2-negative-breast-cancer
27. Giordano SH, Elias AD, Gradishar WJ. NCCN guidelines updates: breast cancer. J. Natl Compr. Canc. Netw. 16(Suppl. 5), 605–610 (2018).
Copyright
© 2025 The authors. This work is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 Unported License
History
Received: 10 April 2025
Accepted: 18 September 2025
Published online: 28 October 2025
How to Cite
Considerations for emulations of randomized controlled trials using real-world data: learnings from an emulation of MONALEESA-2. (2025) Journal of Comparative Effectiveness Research. DOI: 10.57264/cer-2025-0026
