The Evidence Base Post

Elucidata and Sapien Biosciences partner to transform biobank data into AI-ready resources for drug discovery and diagnostics

  • Katie McCool

The collaboration will convert one of the world’s largest biobank collections into harmonized, multimodal datasets to support AI-driven precision medicine, translational research, and diagnostic development.

Elucidata and Sapien Biosciences have announced a partnership to make large-scale, multimodal patient data more accessible for AI-driven research and development. The collaboration will apply Elucidata’s Polly platform to integrate, enrich, and harmonize Sapien’s extensive biospecimen and clinical data resources for advanced analytics, predictive modeling, and synthetic data generation.

Sapien Biosciences, founded in partnership with Apollo Hospitals, operates India’s first and largest commercial biobank. The collection includes more than 300,000 patient samples and over 2 million pathology specimens, covering oncology, cardiology, autoimmune diseases, inflammation, and neurology. It includes over 85,000 cancer samples, many paired with digitized histopathology, genomic data, and longitudinal clinical records. This makes Sapien one of the top ten biobanks globally and the largest integrated source of Asian patient data.

“At Sapien, we believe that deeply characterized patient data, especially from underrepresented populations like those in India, can catalyze more inclusive and effective diagnostics and therapies worldwide,” said Dr Jugnu Jain, CEO and Co-founder of Sapien Biosciences.

The first phase of the project will focus on oncology, developing AI models that infer genomic and transcriptomic insights from digitized pathology slides. By leveraging Sapien’s next-generation sequencing (NGS)-annotated cancer samples, the collaboration aims to create synthetic multimodal datasets to support research in rare cancers and enable tissue-sparing approaches where biopsy material is limited.

Dr Abhishek Jha, CEO and Co-founder of Elucidata, said, “Sapien’s scale, sample quality, and data depth make it a critical partner in the company’s mission to democratize access to high-quality, AI-ready biomedical data,” adding,

“By applying our Polly platform to Sapien’s datasets, we can bridge the gap between fragmented sample collections and next-gen AI models that accelerate target discovery and biomarker validation.”

Future phases will extend beyond oncology to include cardiovascular, autoimmune, and neurological diseases. The collaboration will focus on building synthetic clinicogenomic datasets that combine real-world data (RWD) and clinical outcomes data to support pharmaceutical research, biomarker identification, and companion diagnostic development.

“By making rich biospecimens and clinical datasets AI-ready, this collaboration has the potential to accelerate global innovation and help reduce the complexity of human biology for research and patient care,” said Dr Navjot Singh, advisor of Elucidata.io.
