R WE ready for reimbursement? A round-up of developments in real-world evidence relating to health technology assessment: part 25

Authors: Paul Arora and Sreeram V Ramagopalan

Publication: Journal of Comparative Effectiveness Research

Volume 15, Number 6

https://doi.org/10.57264/cer-2026-0073

Abstract

In this update, we discuss the use of real-world data in the Haute Autorité de Santé assessment of economic evaluations, review how real-world evidence is supporting regulatory approvals in multiple myeloma, and consider lessons from the failed EVOKE trials of semaglutide in Alzheimer's disease for the broader application of target trial emulation.

In France, manufacturers of innovative health products – including new drugs, vaccines and medical devices – are required to submit economic evaluations to the Haute Autorité de Santé (HAS) when their products meet certain criteria relating to claimed therapeutic benefit and expected budgetary impact [1]. These cost-effectiveness analyses (CEAs) are assessed by the Committee of Economic Evaluation and Public Health (CEESP) and inform price negotiations between the French Economic Committee of Health Products (CEPS) and manufacturers. Ghabri and Sion from HAS present the first systematic analysis of how real-world data (RWD) studies have been incorporated into the CEAs of innovative health products assessed by HAS between January 2016 and May 2023 [2]. Of the 147 CEA assessments conducted during this period, 88% (129/147) incorporated at least one RWD study, with 527 individual RWD studies identified across these submissions. Retrospective cohort studies were the most frequently employed study design, accounting for 28% of all RWD studies, followed by prospective cohorts (15%) and registries (14%). The authors found that RWD was principally used to characterize the analyzed population (35% of use cases), support external model validation (19%) and document the breakdown of comparators (16%). Notably, the use of RWD for quality-of-life estimation was low but increased after 2022. The authors also identified several methodological concerns raised in HAS assessments, including insufficient documentation of external validation of long-term survival outcomes. Approximately 8% of CEAs including RWD were invalidated due to major methodological limitations, such as the use of external comparator arms derived from RWD. The authors call for a standardized typology of RWD study designs, a transparent framework for documenting model inputs and assumptions, and greater use of RWD in reassessments to address uncertainty in initial submissions, especially for orphan drugs and advanced therapy medicinal products.
The HAS findings resonate with a broader international convergence on how real-world evidence (RWE) should be generated, assessed and reported for HTA and regulatory decision-making. As previously discussed, the number of RWE guidance documents from agencies worldwide has grown dramatically [3–5]. For manufacturers, the practical implication of these developments is clear: RWD is no longer optional in economic evaluation – it is expected. However, the HAS analysis demonstrates that the quality and methodological rigor of RWD integration remain highly variable, and that poorly documented or inappropriately applied RWD can invalidate an otherwise well-constructed submission. Manufacturers should therefore plan their RWD strategy early, align study designs with the expectations of the relevant HTA body, and ensure transparent reporting.

The importance of RWE of course extends beyond economic evaluation. Its role in supporting regulatory approval is also expanding, as illustrated by recent developments in multiple myeloma (MM). Taylor and colleagues provide a timely analysis of the role of RWE in supporting regulatory approvals for MM therapies between January 2021 and April 2025 [6]. Multiple myeloma, a rare and incurable hematologic malignancy characterized by frequent relapse and a rapidly evolving standard of care, increasingly challenges the feasibility of traditional randomized clinical trial (RCT) designs, particularly in advanced lines of therapy. The authors found that of 27 drug marketing applications approved by the US FDA and EMA during the study period, 12 (44%) used RWE to support regulatory approval. Among these, eight incorporated evidence from natural history studies and four included external comparator arms (ECAs). The use of RWE was concentrated in advanced lines of therapy, with ten of the 12 approvals involving the fourth line of therapy or higher. Two pre-authorization applications of RWE were highlighted: natural history studies – using longitudinal RWD to demonstrate limited treatment options and unmet need – and ECAs – comparing single-arm trial results with a noninterventional cohort receiving standard care in routine practice. The editorial highlights both the promise and practical challenges of RWE in this therapeutic area. A notable example of the latter comes from the 2021 BLA submission for idecabtagene vicleucel (Abecma™), where the FDA flagged concerns about pooling heterogeneous data sources and differences in follow-up time and response assessment between the trial and RWD study, ultimately concluding the RWD was inadequate for contextualizing trial outcomes. 
The multiple myeloma experience reinforces the point made in our discussion of the HAS analysis: the guidance frameworks that are converging around what constitutes good RWE are not confined to HTA submissions. The same principles of data relevance, reliability, transparent reporting and rigorous study design apply equally to regulatory submissions. Indeed, the FDA's own guidance on RWD for regulatory decision-making shares many of the same expectations articulated by HTA bodies. For manufacturers, this alignment represents an opportunity: RWE generated to a high standard for a regulatory submission can, in principle, be leveraged for subsequent HTA submissions across multiple jurisdictions. Conversely, evidence that falls short of best practice is likely to be rejected by regulators and HTA agencies alike. Planning RWE strategies early in the product lifecycle, with both regulatory and reimbursement end-uses in mind, should therefore be a priority.

Target trial emulation (TTE) has been widely discussed in previous installments of this series, as it has been explicitly endorsed by major HTA agencies and regulators as a preferred framework for generating comparative effectiveness evidence from observational data [3,7–12]. A case study that therefore warrants particularly careful examination is the application of TTE and other observational methods to the question of whether GLP-1 receptor agonists, and semaglutide specifically, might reduce the risk of dementia and Alzheimer's disease (AD). Multiple real-world studies reported compelling associations, and these findings contributed directly to the decision by Novo Nordisk to pursue large-scale phase III trials. The subsequent failure of these trials requires a closer reading of the underlying observational evidence.

The evidence linking GLP-1 receptor agonists to reduced dementia risk has accumulated across several study designs and data sources. The study that arguably catalyzed the field was published by Nørgaard and colleagues in 2022, combining two data sources: pooled individual patient data from three diabetes cardiovascular outcome trials (LEADER, SUSTAIN-6 and PIONEER 6; 15,820 patients) and a Danish nationwide registry-based cohort of 120,054 patients [13]. In the pooled RCT analysis, 15 patients randomized to a GLP-1 receptor agonist and 32 patients randomized to placebo developed dementia during a median follow-up of 3.61 years (HR: 0.47; 95% CI: 0.25–0.86). However, dementia was not a prespecified end point in any of the three trials; rather, it was ascertained through adverse event reporting. The registry component of the Nørgaard study used a nested case-control design in a cohort of diabetes patients receiving a second-line treatment, matching 4849 dementia cases to 48,506 controls on age, sex and calendar date. The result was more modest: an HR of 0.89 (95% CI: 0.86–0.93) per year of increased GLP-1 receptor agonist exposure. The authors were unable to adjust for confounders such as body mass index, smoking or physical activity.
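The width of the confidence interval in the pooled RCT analysis follows largely from the small number of dementia events. As a rough check, the sketch below approximates the hazard ratio and its interval from the two event counts alone. It assumes equal-sized arms with similar follow-up, which is an assumption of this illustration and not something the source states, and the function name is hypothetical. It is a back-of-envelope calculation, not the model used in the published analysis.

```python
import math


def approx_hr_ci(events_treated, events_control, z=1.96):
    """Approximate HR and 95% CI from event counts alone.

    Assumes equal-sized arms with similar follow-up (illustrative
    assumption), so the HR is approximated by the ratio of event counts
    and the standard error of log(HR) by sqrt(1/d1 + 1/d0).
    """
    hr = events_treated / events_control
    se = math.sqrt(1 / events_treated + 1 / events_control)
    lower = math.exp(math.log(hr) - z * se)
    upper = math.exp(math.log(hr) + z * se)
    return hr, lower, upper


# Event counts reported for the pooled RCT analysis: 15 vs 32.
hr, lower, upper = approx_hr_ci(15, 32)
print(f"HR ~ {hr:.2f} (95% CI ~ {lower:.2f}-{upper:.2f})")
```

With 47 events in total, the approximate interval runs from about 0.25 to about 0.87, close to the reported 0.25–0.86. This illustrates how much of the uncertainty comes from sparse events.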

Wang and colleagues subsequently conducted a target trial emulation using the TriNetX electronic health record platform, a federated US database covering approximately 113 million patients across 64 healthcare organizations [14]. Among 1,094,761 eligible patients with Type 2 diabetes and no prior AD diagnosis, 17,104 new users of semaglutide were separately compared with new users of seven other antidiabetic medications over a 3-year follow-up. After propensity-score matching on more than 50 baseline covariates, semaglutide was associated with a 40–70% reduction in first-time AD diagnosis, with reported hazard ratios of 0.33 compared with insulin and 0.59 compared with other GLP-1 receptor agonists. However, two features of the results warrant caution. First, the Kaplan–Meier curves began to separate within the first 30 days of treatment initiation, which likely reflects residual confounding [15], since a genuine effect on a slowly developing neurodegenerative disease would not be expected to appear so quickly. Second, the largest effect sizes were seen against comparators such as insulin (HR: 0.33) and sulfonylureas (HR: 0.31), while the smallest effect was against other GLP-1 receptor agonists (HR: 0.59) – a pattern consistent with confounding by disease severity and treatment indication rather than a specific neuroprotective effect of semaglutide.
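To make the confounding-by-indication argument concrete, the following sketch works through a purely hypothetical two-stratum example in which the drug has no effect at all. Every probability and risk in it is invented for illustration and is not taken from any of the studies discussed; the function and parameter names are likewise hypothetical. Patients with more severe disease are assumed to be more likely to receive the comparator and more likely to develop the outcome.

```python
def crude_and_stratified_rr(p_severe, p_new_given_severe, p_new_given_mild,
                            risk_severe, risk_mild, true_rr=1.0):
    """Return (crude RR, within-stratum RR) for new drug vs comparator.

    All inputs are hypothetical. `true_rr` is the causal effect of the
    new drug inside each severity stratum (1.0 means no effect).
    """
    # Joint probabilities of (treatment, severity).
    new_severe = p_severe * p_new_given_severe
    new_mild = (1 - p_severe) * p_new_given_mild
    cmp_severe = p_severe * (1 - p_new_given_severe)
    cmp_mild = (1 - p_severe) * (1 - p_new_given_mild)

    # Share of severe patients within each treatment group.
    frac_severe_new = new_severe / (new_severe + new_mild)
    frac_severe_cmp = cmp_severe / (cmp_severe + cmp_mild)

    # Average outcome risk in each group, mixing the two strata.
    risk_new = true_rr * (frac_severe_new * risk_severe
                          + (1 - frac_severe_new) * risk_mild)
    risk_cmp = (frac_severe_cmp * risk_severe
                + (1 - frac_severe_cmp) * risk_mild)

    return risk_new / risk_cmp, true_rr


# Hypothetical inputs: half of patients are severe; severe patients
# mostly receive the comparator; risk is 10% if severe, 4% if mild.
crude, stratified = crude_and_stratified_rr(
    p_severe=0.5, p_new_given_severe=0.3, p_new_given_mild=0.7,
    risk_severe=0.10, risk_mild=0.04)
print(f"crude RR = {crude:.2f}, within-stratum RR = {stratified:.2f}")
```

Under these invented inputs the crude risk ratio is about 0.71 even though the within-stratum ratio is exactly 1. The more the comparator group is enriched for severe patients, the larger the apparent benefit, which is the direction of the pattern described above.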

Tang and colleagues conducted a target trial emulation using the OneFlorida+ Clinical Research Consortium, which integrates electronic health records from approximately 17 million patients across Florida, Georgia and Alabama [16]. Among 396,963 eligible patients with Type 2 diabetes aged 50 years or older, the authors constructed three comparison cohorts: 33,858 patients in the GLP-1 receptor agonist versus other glucose-lowering drug cohort, 34,185 in the sodium-glucose cotransporter-2 (SGLT2) inhibitor versus other glucose-lowering drug cohort, and 24,117 in the head-to-head GLP-1 receptor agonist versus SGLT2 inhibitor cohort. GLP-1 receptor agonist use was associated with a 33% lower risk of Alzheimer's disease and related dementias (ADRD) compared with other second-line glucose-lowering drugs (HR: 0.67; 95% CI: 0.47–0.96). Notably, SGLT2 inhibitors showed a comparable reduction (HR: 0.57; 95% CI: 0.43–0.75), and there was no significant difference between the two drug classes (HR: 0.97; 95% CI: 0.72–1.32). The finding that SGLT2 inhibitors – which act through an entirely different mechanism – showed a similar or even greater apparent reduction in dementia risk raises important questions about whether the observed associations reflect a genuine neuroprotective drug effect or are instead driven by shared confounding factors. It is also notable that the cumulative incidence curves in the Tang study began to separate early in the follow-up period, a pattern the authors themselves acknowledged could partly reflect the fact that GLP-1 receptor agonist and SGLT2 inhibitor users were younger than comparator groups, which could lead to overestimation of the protective association.

Perhaps the most methodologically rigorous study comes from Inoue and colleagues, who used a 20% random sample of US Medicare fee-for-service beneficiaries to compare GLP-1 receptor agonists with dipeptidyl peptidase-4 (DPP4) inhibitors as active comparators in older adults with Type 2 diabetes [17]. A total of 2418 patients starting GLP-1 receptor agonists were propensity-score matched in a 1:2 ratio to 4836 patients starting DPP4 inhibitors. Notably, Inoue and colleagues provided the most transparent specification of the target trial emulation framework among the studies reviewed here. Methodological transparency is a hallmark of rigorous TTE. The authors found no clear overall difference in dementia incidence between the two groups, with a risk ratio of 0.83 (95% CI: 0.61–1.05) at 30 months.
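For readers less familiar with 1:2 propensity-score matching, the sketch below shows one common variant: greedy nearest-neighbour matching without replacement on a precomputed score. It is not the procedure used by Inoue and colleagues, whose matching algorithm is not described here. The caliper value, patient identifiers and scores are all hypothetical.

```python
def greedy_match(treated, controls, ratio=2, caliper=0.05):
    """Greedy nearest-neighbour matching on a precomputed propensity score.

    treated, controls: dicts mapping patient id -> propensity score.
    Returns {treated_id: [control_id, ...]}. Treated patients who cannot
    be given `ratio` controls within the caliper are dropped.
    """
    available = dict(controls)  # copy so the caller's dict is untouched
    matches = {}
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        # Rank the remaining controls by distance to this treated patient.
        ranked = sorted(available, key=lambda c: abs(available[c] - t_score))
        chosen = [c for c in ranked[:ratio]
                  if abs(available[c] - t_score) <= caliper]
        if len(chosen) == ratio:
            matches[t_id] = chosen
            for c in chosen:
                del available[c]  # matching without replacement
    return matches


# Hypothetical scores for two treated patients and five controls.
treated = {"T1": 0.30, "T2": 0.62}
controls = {"C1": 0.28, "C2": 0.33, "C3": 0.60, "C4": 0.65, "C5": 0.95}
matched = greedy_match(treated, controls)
print(matched)
```

In this toy example each treated patient receives two controls and the control with a score of 0.95 is left unmatched. Discarding poorly comparable patients in this way is how matching trades sample size for comparability.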

Novo Nordisk explicitly stated that the decision to pursue an AD indication was based on RWE studies, pre-clinical models and post-hoc analyses from diabetes and obesity trials [18]. In November 2025, the company announced that the phase III EVOKE and EVOKE+ trials – involving 3808 adults with early-stage symptomatic AD confirmed by amyloid positivity – did not demonstrate superiority of semaglutide over placebo in reducing disease progression, as measured by the change in Clinical Dementia Rating – Sum of Boxes (CDR-SB) [18]. The disconnect between the observational findings and the trial results is instructive, and several factors may explain it. First, and most fundamentally, there is a critical difference between the end points measured. The observational studies examined the incidence of a new dementia or AD diagnosis – essentially a prevention end point – in populations of Type 2 diabetes patients who did not have dementia at baseline. The EVOKE trials measured slowing of disease progression in patients who already had confirmed early-stage AD with amyloid pathology. Preventing the onset of cognitive decline through metabolic and vascular risk factor modification is a fundamentally different proposition from halting or reversing established neurodegenerative pathology. Second, the choice of comparator in the observational studies profoundly influences the magnitude of the observed associations. Studies comparing GLP-1 receptor agonists against insulin or sulfonylureas reported the largest reductions. When DPP4 inhibitors were the comparator, the overall effect was attenuated to non-significance. This pattern is consistent with confounding by indication. Third, the observational studies are susceptible to forms of bias that are difficult to fully address even with sophisticated analytical methods. An early separation of Kaplan–Meier survival curves can be a telltale sign of residual confounding or selection bias [15].
Fourth, the population studied in the observational analyses differs markedly from the trial population. The observational studies primarily included patients with Type 2 diabetes, many of whom were overweight or obese. The EVOKE trials enrolled patients with biomarker-confirmed amyloid pathology and early-stage symptomatic disease, regardless of diabetic status.

The EVOKE case carries important implications for the TTE framework and for the use of RWE to support investment in clinical development programs. TTE is a valuable methodological approach, but it cannot overcome data-inherent biases arising from unmeasured confounding, incomplete variable capture, or outcome misclassification. In the semaglutide–dementia literature, several of the studies were methodologically well-designed within the constraints of their data, but the fundamental question being answered – prevention of incident dementia diagnosis in diabetic populations – was different from the question posed by the EVOKE trials – slowing of progression in established, biomarker-confirmed AD. The Alzheimer's Association and others have noted that the EVOKE results do not rule out a role for GLP-1 receptor agonists in dementia prevention, as distinct from treatment of established disease [19]. The question of whether semaglutide or other GLP-1 receptor agonists might reduce the incidence of dementia in at-risk populations – the question actually addressed by the observational studies – remains open. For manufacturers considering the use of TTE to support HTA submissions, the emphasis must shift from simply generating more RWE to generating better RWE: evidence that is methodologically rigorous, transparently reported and aligned with the expectations of the decision-makers who will ultimately use it.
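One way to make the mismatch between the observational question and the trial question explicit is to write both down in a common structure and compare them element by element. The sketch below does this with a simple four-field record. The class, field names and string summaries are illustrative simplifications of the descriptions given above (using the Wang emulation as the observational example), and plain string comparison is only a crude stand-in for expert judgement.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StudyQuestion:
    """Simplified summary of the question a study answers (illustrative)."""
    population: str
    intervention: str
    comparator: str
    outcome: str


def mismatches(evidence, target):
    """List the elements on which two study questions differ."""
    return [f.name for f in fields(StudyQuestion)
            if getattr(evidence, f.name) != getattr(target, f.name)]


# Summaries paraphrased from the text; wording is illustrative only.
observational = StudyQuestion(
    population="type 2 diabetes, no prior AD diagnosis",
    intervention="semaglutide",
    comparator="other antidiabetic medications",
    outcome="first-time AD diagnosis",
)
evoke = StudyQuestion(
    population="early-stage symptomatic AD, amyloid positive",
    intervention="semaglutide",
    comparator="placebo",
    outcome="change in CDR-SB",
)

print(mismatches(observational, evoke))
```

Only the intervention matches. Population, comparator and outcome all differ, which is the sense in which the observational studies and the EVOKE trials were answering different questions.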

Financial disclosure

Author SV Ramagopalan has received an honorarium from Becaris Publishing for the contribution of this work. The authors have received no other financial and/or material support for this research or the creation of this work apart from that disclosed.

Competing interests disclosure

The authors have no competing interests or relevant affiliations with any organization or entity with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

Writing disclosure

No writing assistance was utilized in the production of this manuscript.

Open access

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/

References

Ramagopalan SV, Pannelay AJ. Access in all areas? A round up of developments in market access and health technology assessment: part 9. J. Comp. Eff. Res. 14(10), e250120 (2025).