The Evidence Base Post

SYNTHIA project initiated to progress use of synthetic data for personalized medicine

  • Joanne Walker

SYNTHIA, the new public–private partnership funded by the Innovative Health Initiative (IHI), seeks to overcome the critical need for privacy-preserving data solutions in health care by developing validated tools and methods for synthetic data generation.

A new initiative, funded by the IHI, has been launched to harness the potential of synthetic data for advancing personalized health care. The project, titled SYNTHIA (Synthetic Data Generation Framework for Integrated Validation of Use Cases and AI Healthcare Applications), commenced with a kickoff meeting held on September 9–10, 2024, in Valencia, Spain. The project is coordinated by the Health Research Institute La Fe (IISLaFe), in partnership with 32 consortium members, including synthetic data generation specialists, data scientists, clinical researchers, and regulatory and policy advocates.

Synthetic data, AI-generated patient datasets that mimic real-world data (RWD), are increasingly used in healthcare research to overcome challenges such as limited access to real patient data, data privacy concerns, and the need for more representative datasets. By providing diverse datasets without compromising patient privacy, synthetic data enable researchers to train and validate machine learning models, simulate clinical scenarios, and address gaps in RWD. However, challenges persist related to the quality, trust, and responsible use of synthetic data.

SYNTHIA aims to address these issues by creating validated tools and methodologies for generating synthetic data across multiple data domains, including laboratory test results, clinical documentation, genomic data, and medical imaging. A core element of the SYNTHIA project will be a federated platform offering tailored synthetic data generation workflows; frameworks to evaluate data privacy, quality, and applicability; and labeled datasets to ensure suitability for research. The project will initially focus on six diseases where personalized medicine can improve patient care: lung cancer, breast cancer, multiple myeloma, diffuse large B-cell lymphoma, Alzheimer’s disease, and type 2 diabetes.

As highlighted by Guillermo Sanz, Scientific Director of IISLaFe and SYNTHIA Academic Lead, "Generation of efficient synthetic databases by using AI is the unique way to pursue the goals of maintaining data privacy while offering the tools to advance in precision medicine. SYNTHIA is the first IHI synthetic data project to deal with this urgent need. SYNTHIA envisions to generate databases that could be used by the European Medicines Agency to authorize the design of new single-arm clinical trials to be used for the approval of more efficient new drugs. That will undoubtedly accelerate patients' access to them and, in return, reduce the cost of those therapies."