Explainable AI: Decoding medical data for a smarter future
Artificial intelligence (AI) is transforming medical research, with deep learning models offering new possibilities in diagnostics and treatment planning. Deep learning has already achieved impressive results in several medical fields, including cancer detection, cardiovascular risk assessment, and brain imaging. These models excel at recognising intricate patterns within data, often surpassing human experts [Gulshan 2016, Rajpurkar 2018]. However, they frequently operate as "black boxes," generating accurate predictions without providing clear explanations. This lack of interpretability poses a serious obstacle to the adoption of deep learning models, especially in healthcare, where transparency and trust can directly affect patient outcomes.
Explainable AI
To address this issue, researchers have developed explainable AI (XAI) techniques that shed light on how AI models make predictions. Methods such as saliency maps, heatmaps, and feature attribution tools (e.g., SHAP, LIME) can help visualise the influence of different input features on model predictions [Lundberg 2017].
These visualisations allow clinicians to better understand AI-driven insights, improving trust and transparency in the decision-making process.
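To make this concrete, here is a minimal sketch of feature attribution using the open-source shap library [Lundberg 2017]. The dataset, model, and features are synthetic stand-ins rather than a real clinical pipeline; the resulting summary plot simply ranks features by how strongly they push individual predictions up or down.

```python
# Minimal sketch of Shapley-value feature attribution with the shap library.
# The dataset and model below are synthetic stand-ins, not a clinical pipeline.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic "tabular clinical" data: 500 patients, 8 numeric features
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# A generic gradient-boosting classifier stands in for a diagnostic model
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values: the contribution of each feature to each
# prediction, relative to the model's average output (log-odds for this model)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # array of shape (n_samples, n_features)

# Summary plot: features ranked by how strongly they push predictions up or down
shap.summary_plot(shap_values, X)
```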
Examples of XAI include the simulation of "what-if" scenarios as well as the disentanglement of latent features. "What-if" scenarios, that is, alternative versions of a given medical image created with generative models, can show how subtle changes in the image affect model predictions. For instance, slight modifications to the size or shape of a tumour in an MRI scan could alter the predicted diagnosis, offering clinicians a deeper understanding of the model's "reasoning" [DeGrave 2023, Lang 2021]. Similarly, disentangling latent features in medical images [Rotem 2024] improves interpretability by allowing clinicians to examine one feature at a time, simplifying the process of understanding which image properties drive the model's outcomes. XAI brings humans into the loop, allowing for a collaborative, more transparent diagnostic approach.
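As a purely conceptual illustration of the latent "what-if" idea (not the exact procedure used in the cited papers), the sketch below encodes an image into a latent vector, nudges a single latent dimension, decodes the edited code back into a counterfactual image, and records how a classifier's predicted probability shifts. The networks here are untrained stand-ins, so the printed numbers carry no meaning; the point is the shape of the workflow.

```python
# Conceptual sketch of a latent "what-if" sweep. The encoder, decoder, and
# classifier are untrained stand-ins so the script runs end to end; in practice
# they would be a trained generative model and a trained diagnostic classifier.
import torch
import torch.nn as nn

LATENT_DIM, SIDE = 16, 64
N_PIXELS = SIDE * SIDE

encoder = nn.Sequential(nn.Flatten(), nn.Linear(N_PIXELS, LATENT_DIM))
decoder = nn.Sequential(nn.Linear(LATENT_DIM, N_PIXELS), nn.Sigmoid())
classifier = nn.Sequential(nn.Flatten(), nn.Linear(N_PIXELS, 1), nn.Sigmoid())

scan = torch.rand(1, 1, SIDE, SIDE)  # stand-in for a single MRI slice

with torch.no_grad():
    z = encoder(scan)  # latent representation of the scan

    # Nudge one latent dimension (imagined here to encode, say, lesion size),
    # decode the edited code into a counterfactual image, and watch how the
    # classifier's predicted probability responds.
    feature_idx = 3
    for delta in [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]:
        z_edit = z.clone()
        z_edit[0, feature_idx] += delta
        counterfactual = decoder(z_edit).view(1, 1, SIDE, SIDE)
        prob = classifier(counterfactual).item()
        print(f"latent shift {delta:+.1f} -> predicted probability {prob:.3f}")
```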
Key advantages of XAI
AI interpretability has powerful applications in biomedical imaging, ranging from embryo quality assessment to the analysis of brain structure from MRI scans. As AI continues to revolutionise medicine, the demand for interpretability will only increase. The development of models that offer feature-specific, clinically relevant explanations is a critical step toward building trust in AI-driven healthcare solutions. These approaches not only improve transparency but can also reveal new insights that lead to better patient outcomes. Explainable deep learning approaches that do not require predefined features could be particularly important when not all relevant features are known, and could bring to the surface attributes that would otherwise be overlooked. Moreover, explanations that are easy to interpret and directly linked to the clinical decision-making process make insights actionable and can directly impact patient outcomes.
Is this the future of medical AI?
While these advancements show great promise, their impact will depend on how well they integrate into existing medical workflows. Clinicians and AI researchers alike must consider whether these techniques fit into current diagnostic pipelines and whether they will ultimately enhance trust in AI-driven predictions.
Clearly, similar interpretability-focused methods can be applied beyond the realm of clinical diagnostics, to many different areas of research, such as drug discovery and the rapidly growing field of longevity research. At Oxcitas we are committed to implementing XAI for the analysis of complex biological data in all of our domains of interest, where uncovering previously unknown factors — whether they are new drug targets or biomarkers of ageing — could lead to transformative solutions.
References
[Gulshan 2016] Gulshan V et al (2016). Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 316, 2402–2410.
[Rajpurkar 2018] Rajpurkar P et al (2018). Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practising radiologists. PLoS Medicine 15, e1002686.
[Lundberg 2017] Lundberg SM and Lee S-I (2017). A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems, 4768–4777.
[DeGrave 2023] DeGrave AJ, Cai ZR, and Janizek JD (2023). Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians. Nature Biomedical Engineering. Epub ahead of print.
[Lang 2021] Lang O et al (2021). Explaining in Style: Training a GAN to explain a classifier in StyleSpace. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 673–682.
[Rotem 2024] Rotem O et al (2024). Visual interpretability of image-based classification models by generative latent space disentanglement applied to in vitro fertilisation. Nature Communications 15, 7390.