White paper: Responsible AI in UK Health Data

Published

22/04/2026

By

Arun Sujenthiran, Lucia Groizard, Melissa Estevez, Cornelius Thaiss, Natalia Viani, Kathi Seidl-Rathkopf

WHITE PAPER

Responsible AI in UK Health Data:
Setting Standards for Future Use of LLMs in Clinical Data Extraction

Authors: Arun Sujenthiran, Lucia Groizard, Melissa Estevez, Cornelius Thaiss, Natalia Viani, Kathi Seidl-Rathkopf

Contributors: Maria Alvarellos, Adam Manhi, Emma Salib, Amanda White

Artificial intelligence (AI) is transforming the way researchers use electronic health records (EHRs) to generate real-world data (RWD) and real-world evidence (RWE) for clinical studies and scientific discovery. Large language models (LLMs) now make it possible to extract meaningful clinical information from unstructured EHR text at a scale and speed far beyond traditional manual abstraction. This capability has the potential to accelerate clinical research, support regulatory decisions, and improve patient care.

However, the data within EHRs are complex, inconsistently documented, and often ambiguous. Furthermore, LLMs can behave unpredictably, be sensitive to input variations, or reinforce biases present in source data. Without proper oversight, these challenges could undermine data quality, reduce the reliability of downstream analyses, and erode public and patient trust, especially when working with sensitive health data. While these risks are well recognised, the costs of limited adoption are often overlooked: because manual curation does not scale, vast amounts of longitudinal patient data would otherwise remain inaccessible. Responsible AI deployment is therefore not only a technical advance but also an ethical imperative to maximise patient benefit, underscoring the need for high-quality, transparent evaluation frameworks.

To address this gap, Flatiron Health has developed the Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) framework. This provides a structured, multi-dimensional approach to assessing the accuracy, reliability, and fitness-for-purpose of LLM-extracted clinical information. In the UK, Flatiron Health applies these principles within a robust governance approach built on standards that reflect a commitment to responsible innovation. By embedding structured, high-quality frameworks such as VALID in health research, the UK can set a global benchmark for trustworthy use of LLMs in healthcare to improve patient care and outcomes.

This white paper outlines how high-quality, well-governed health data is essential for safely developing and deploying LLMs in health research. It provides practical recommendations for ensuring that LLMs are adopted safely, responsibly, and to their full potential within the UK health data ecosystem. 

 
