The Evidence Base Post

From potential to practice: perspectives on the evolving use of generative AI in evidence generation and health technology assessment

  • Benjamin Bray, Stephen Duffield, Emma Clifton-Brown & Jack Said

The integration of generative AI into evidence generation and health technology assessment (HTA) is rapidly advancing as decision-makers, industry, and analytics firms explore its potential to automate laborious processes, enhance productivity, and uncover new insights. What was once a field dominated by theoretical speculation has now evolved into a dynamic area of exploration, with tangible applications demonstrating the value of generative AI in practice.

During a packed theater presentation at ISPOR Europe 2024 (Barcelona, Spain; November 17–20, 2024), experts delved into the evolving role of AI in evidence generation and HTA. Moderated by Benjamin Bray (LCP Health Analytics, UK), the session, titled “Using Generative AI Methods for Evidence Generation and Health Technology Assessment: Perspectives from NICE and Industry,” featured Stephen Duffield (National Institute for Health and Care Excellence [NICE], UK), Emma Clifton-Brown (Pfizer, UK), and Jack Said (Pfizer, UK), who shared insights from NICE and industry standpoints.


Generative AI for real-world evidence generation

In his presentation, Ben Bray leveraged his expertise in AI and machine learning (ML) to explore the transformative potential of generative AI. Focusing on tools like large language models (LLMs), he described their ability to create new content—text, images, and sounds—as a groundbreaking advancement with far-reaching implications for personal and professional use. Bray characterized generative AI as a ‘game-changer’ and emphasized that its applications are still in their infancy, with vast untapped potential yet to be realized.

Bray explored how generative AI is being applied in health economics and outcomes research (HEOR) and real-world evidence (RWE), highlighting several key use cases.

In the context of systematic literature reviews (SLRs), Bray explained that generative AI approaches human-level capabilities for certain steps but falls short in others. He emphasized that not all components of the process are equally suited to automation. Tasks requiring deep reasoning, such as developing search strategies and full-text article screening, still necessitate human oversight to maintain accuracy and reliability. However, for more manual aspects, such as abstract screening, quality assessment, and data extraction from text and tables, LLMs demonstrate a high degree of automation potential with notable accuracy.
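The division of labor Bray described, in which LLMs handle high-volume screening while humans retain oversight of lower-confidence calls, can be illustrated with a minimal sketch. This is not any vendor's actual pipeline; the model call is stubbed with a keyword rule, and in practice it would be an API request whose prompt embeds the review's inclusion and exclusion criteria.

```python
# Illustrative sketch of LLM-assisted abstract screening with a human in the
# loop. All names and thresholds are hypothetical, chosen for illustration.
from dataclasses import dataclass

@dataclass
class ScreeningResult:
    abstract_id: str
    decision: str      # "include", "exclude", or "uncertain"
    confidence: float  # model-reported confidence in [0, 1]

def stub_llm_screen(abstract_id: str, text: str) -> ScreeningResult:
    """Stand-in for an LLM call; a keyword rule mimics criteria-based screening."""
    if "randomized" in text.lower():
        return ScreeningResult(abstract_id, "include", 0.92)
    if "case report" in text.lower():
        return ScreeningResult(abstract_id, "exclude", 0.88)
    return ScreeningResult(abstract_id, "uncertain", 0.40)

def triage(results, threshold=0.8):
    """Auto-accept confident decisions; route everything else to a human reviewer."""
    automated, for_human = [], []
    for r in results:
        if r.decision != "uncertain" and r.confidence >= threshold:
            automated.append(r)
        else:
            for_human.append(r)
    return automated, for_human
```

The point of the `triage` step is exactly the caveat raised above: tasks where the model is confidently accurate can be automated, while ambiguous abstracts stay with human reviewers.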

Bray highlighted specific caveats, noting that research presented by LCP at ISPOR Europe 2024 shows that LLMs perform better at screening randomized controlled trials (RCTs) than observational studies. This performance disparity mirrors the challenges faced by human reviewers, making it a crucial consideration when applying AI tools to SLRs. Additionally, legal issues related to the use of full-text articles could present further obstacles to broader adoption. He also acknowledged ongoing challenges in areas such as report writing and more complex data extraction tasks (for example, extracting data from graphs and figures), where generative AI does not yet meet quality benchmarks.

Turning to the next use case, Bray discussed programming, particularly for real-world data (RWD) analysis and economic model building. He referenced recent advancements, such as Bristol Myers Squibb’s work using AI to automate health economic modeling, as a promising example, and predicted that the next wave of applications will focus on RWD analysis for RWE studies, with substantial progress expected within the next 12–18 months.

Addressing challenges, Bray discussed AI hallucinations—errors ranging from minor inaccuracies to critical mistakes presented as fact. To mitigate these risks, he proposed a risk-based framework to evaluate potential use cases. This framework categorizes hallucinations by their impact and frequency, helping determine when automation is appropriate. For high-impact errors, he recommended avoiding automation, while lower-risk issues could be addressed effectively through prompt engineering and testing to ensure reliability and safety.
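The impact-by-frequency logic described above lends itself to a simple decision rule. The sketch below is a hedged illustration of that idea only; the categories and wording are hypothetical and do not reproduce LCP's actual framework.

```python
# Hypothetical sketch of a risk-based framework for deciding when to automate
# a task given its hallucination risk profile. Categories are illustrative.

def automation_recommendation(impact: str, frequency: str) -> str:
    """Map a hallucination risk profile to a level of automation.

    impact: "low" or "high" (consequence of an undetected error)
    frequency: "rare" or "common" (how often the model errs on this task)
    """
    if impact == "high":
        return "avoid automation; keep task fully human-led"
    if frequency == "common":
        return "automate with mitigation (prompt engineering, testing, review)"
    return "suitable for automation with spot checks"
```

Under this framing, high-impact errors rule out automation regardless of frequency, which matches the recommendation reported above, while frequent but low-impact errors call for prompt engineering and testing rather than abandonment.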


Perspectives of an HTA body: the NICE example

Stephen Duffield provided a detailed overview of NICE's approach to integrating AI in HTA and evidence generation. NICE has published two cornerstone documents, the Statement of Intent for AI and the AI Position Statement, which guide its engagement with AI technologies. He highlighted rapid changes in UK healthcare, including NHS pressures, workforce shortages, and growing demands for personalized, data-driven care, as noted in the Sudlow Review. In this context, AI is positioned as a vital tool for addressing challenges and improving efficiency.

Duffield emphasized several AI use cases currently being prioritized by NICE.

NICE’s AI strategy, as outlined in the Statement of Intent, will focus on three priority areas:

  • Guidance for technology developers on AI-based evidence generation
  • Evaluation of AI-integrated technologies like clinical prediction models
  • Use of AI to enhance internal efficiency

Duffield explained that the statement acknowledges both the opportunities and risks of AI, and outlines a 3-year plan for an agile approach, anticipating new, potentially disruptive, use cases as technologies and infrastructure develop. The statement emphasizes collaboration with experts and stakeholders to develop evidence requirements, pilot tools, and uphold best practices and government standards while maintaining NICE’s core principles.

Focusing on AI for evidence generation, Duffield discussed NICE’s Position Statement, which outlines expectations for using AI methods to generate and report evidence considered in its evaluation programs. While recognizing the numerous potential uses, Duffield emphasized, “we do not believe in AI-exceptionalism.” The position statement therefore underscores the importance of adhering to existing regulations, good practices, and guidelines for systematic reviews, clinical evidence (including RWD), and cost-effectiveness analysis. Duffield also stressed the need for early engagement with NICE when applying novel AI methodologies, particularly those not previously used, to align with best practices and ensure methods are presented effectively.

Duffield recommended starting with explainable, widely accepted methods, noting that companies often revert to traditional approaches when the justification for novel techniques is insufficiently communicated. Certain applications, such as AI in causal inference, are more influential and therefore higher risk, and are likely to receive greater scrutiny. These are likely to require comprehensive sensitivity analyses, comparisons with alternative methods, and contextual evaluation against other relevant clinical evidence to support plausibility and reliability. The statement also advocates using tools such as the PALISADE checklist to consider when it is appropriate to use AI, and alignment with other best-practice frameworks, including NICE’s Real-World Evidence Framework, or use-specific reporting tools such as TRIPOD-AI.

NICE’s next steps involve collaborating with AI experts to identify priority areas for developing best practice principles and reporting standards. This includes addressing key considerations such as bias, transparency, reproducibility, and ethical use. Duffield noted that NICE is increasingly engaging in exploratory research and pilot projects to bring knowledge in-house, which will enhance its methods manuals and frameworks. Harmonizing standards with global regulatory bodies and HTA agencies is also a key focus, recognizing the international scope of pharmaceutical companies. Additionally, NICE aims to integrate these standards into its NICE Advice Service, creating a feedback loop to promote continuous improvement and advance best practices in AI-driven evidence generation.

To accelerate AI integration, NICE is actively engaging in several relevant pilot projects. Pilots may be supported by its HTA Lab, a policy sandbox for testing AI methods outside live appraisals, currently being used to explore the use of AI in economic modeling. Other initiatives, outside of HTA Lab, include exploring natural language processing (NLP) methods to structure unstructured data for analysis, developing good practice frameworks for AI-enabled clinical prediction models, exploring the use of LLMs for data extraction in SLRs, partnering with academic institutions on ML applications for causal inference, and contributing to the EU-funded SYNTHIA consortium on synthetic data.


Perspectives of industry: the Pfizer UK example

During a discussion with Ben Bray, Emma Clifton-Brown provided insights into Pfizer’s efforts to integrate AI into HTA and evidence generation. She highlighted that Pfizer’s value and access team is still in the early stages of adopting generative AI, exploring use cases such as generating insights from large datasets to improve decision-making. These applications aim to free up time for HEOR professionals to focus on strategic tasks such as stakeholder engagement and nuanced problem-solving, as well as mitigating errors associated with manual processes.

Clifton-Brown noted that across the industry, companies are achieving high accuracy when automating several phases of the SLR process; however, she expressed caution about fully automating decision-making, emphasizing the importance of retaining human oversight. She underscored the need for a ‘human in the loop’ to ensure that AI outputs are reliable, ethically sound, and aligned with organizational objectives.

To remain competitive, Clifton-Brown stressed that the industry must embrace AI technologies. She highlighted AI’s potential to enhance productivity and efficiency, particularly in developing HTA submissions, by reducing manual effort, minimizing errors, and optimizing resource allocation. Generative AI, she noted, can help make submissions faster, more cost-effective, and improve resource management and decision-making.

Clifton-Brown also shared three specific use cases where Pfizer is exploring AI.

  1. Real-time insights: Pfizer is analyzing past NICE submissions (with NICE’s permission) to guide future HTA strategies.
  2. Content development: Generative AI is being used to adapt NICE dossiers into SMC dossiers by tailoring shared content, effectively replicating more than half of the content. While AI struggles with complex data tables, graphs, and figures, it handles content redrafting efficiently.
  3. Systematic literature reviews: Pfizer is leveraging AI to streamline laborious, resource-intensive, and repetitive tasks such as abstract screening and data extraction, freeing analysts to focus on critical thinking and decision-making.

Future uses of AI: Outlook for 2025 and beyond

Ben Bray concluded that generative AI is currently a hot topic, with considerable interest in how it can be used in industry. Many manufacturers are looking at how AI can help automate existing manual processes in RWE and HEOR, such as systematic literature reviews, programming models, and dossier creation. The main prize here is productivity gains through automation. Although AI also has great potential to speed up current workflows or enable entirely new capabilities in the RWE and HEOR space, the big challenge for developers of these technologies is to demonstrate that generative AI can deliver reliably and to the right level of quality without running into significant technical challenges such as hallucinations.

“I hope that 2025 will be the year that AI moves from promise to scaling up of adoption in specific use cases, particularly for approaches that move beyond generic chatbots to tools which meet the specific requirements of industry and stakeholders in RWE, HTA and HEOR.”

Emma Clifton-Brown emphasized the need for upskilling teams to maximize AI’s potential, ensuring both AI experts and non-technical stakeholders acquire the skills to understand and effectively use AI. She highlighted that future applications in economic modeling and patient-reported outcomes (PROs) may help better capture the patient experience.

Stephen Duffield reflected on the significant opportunities AI offers, particularly with ‘human-in-the-loop’ systems to ensure appropriate output. “There’s a huge playground of potential use cases which are low risk or where risks are sufficiently mitigated through fully human-in-the-loop approaches. Particularly, in work that affects healthcare you often want to have someone looking over the shoulder of the AI and checking what the AI is doing,” he said. He pointed to SLRs as a potentially high-impact area for NICE, improving the number of questions that can be answered and the timeliness of NICE guidance. He emphasized the importance of focusing on achievable gains without moving too quickly into higher-risk (though often seemingly more exciting) applications.

Duffield underscored the critical role of collaboration with stakeholders in moving forwards, using transparent approaches, and early engagement with regulatory and HTA bodies like NICE to define best practices.


Authors

Ben Bray
Partner, Evidence Generation Lead, LCP, UK

Ben is a medical doctor and epidemiologist with over 15 years’ experience in RWE and health data science. He has extensive experience in advanced epidemiology methods (e.g., causal inference) and AI.


Stephen Duffield
Associate Director of Real-World Methods, NICE, UK

Stephen Duffield’s role involves the continuing development of NICE’s RWE framework, collaboration on RWE demonstration projects, and helping to transform NICE’s use of RWD across guidance products. He is also involved with upskilling individuals within and externally to the organization, contributing to training workshops and technical forums. Stephen has a degree in medicine and a PhD in public health. Previously, he worked as a clinical doctor and a guideline developer in NICE Centre for Guidelines.


Emma Clifton-Brown
Head of Access & Value, Pfizer, UK

Emma Clifton-Brown is Head of Access & Value at Pfizer UK, where she leads a team committed to demonstrating the full value of its medicines and vaccines to accelerate and optimise access to Pfizer’s breakthroughs and improve patient outcomes in the UK. Emma has worked in market access and health economics for over 15 years, across UK and global roles and numerous therapy areas. She is passionate about the role that medicines and vaccines can play in creating a healthier and more productive UK population and a healthcare ecosystem that is sustainable, improves outcomes that matter to patients, and incentivises innovation. Outside of work, Emma loves to ski, cycle and explore the wild with her husband and two children.


Jack Said
Senior HTA Manager, Pfizer, UK

Jack Said works in the Access & Value team at Pfizer UK, where he is responsible for delivering HTA evidence submissions to NICE and SMC to accelerate and optimise patient access to breakthrough medicines. He has over 7 years’ experience in market access and HEOR and has worked across several therapy areas. His role also involves developing generative AI methodologies for evidence generation, contributing to Pfizer’s vision of leading the industry in AI.


Sponsorship for this Guest Column was provided by LCP Health Analytics.