Deep Generative Models for Healthcare: Improved Generalisation and Interpretability
TRONCHIN, LORENZO
2024
Abstract
In the realm of Healthcare 4.0, the integration of information and communication technologies and Artificial Intelligence (AI) has opened new avenues for enhancing patient care, especially for patients suffering from complex conditions such as cancer and COVID-19. These advancements have great potential to support healthcare professionals if introduced into clinical routines; for instance, they can help predict patient outcomes and facilitate early and personalised interventions. However, they face significant challenges: among them, data scarcity, privacy concerns, and the need for interpretability hinder the translation of AI models from research to clinical settings. Indeed, the robustness of AI models hinges on the availability of diverse, high-quality and privacy-preserving medical data. Additionally, the increasing complexity of machine learning methods, especially those processing multimodal data that integrate multiple data sources like imaging and clinical records, can make them appear as “black boxes” to healthcare practitioners. This thesis explores generative approaches to address these challenges, offering two main contributions to the field of AI in healthcare: the first tackles data scarcity and privacy concerns, whilst the second focuses on the need to interpret models working on multimodal data. Synthetic data generated by Generative Adversarial Networks (GANs) emerges as a viable answer to the pressing challenges of data scarcity and privacy in healthcare AI. However, although GANs excel at rapidly generating high-quality samples, they struggle to represent the full variability of the training data, falling short of comprehensive mode coverage. Indeed, employing synthetic data from GANs in downstream tasks requires capturing the full variability of the data distribution. This thesis addresses this critical limitation by introducing two approaches: LatentAugment and GAN Ensembles.
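The notion of mode coverage can be made concrete with a toy check. The sketch below is not from the thesis: the `coverage` function and the two-mode data are illustrative assumptions, in the spirit of nearest-neighbour coverage metrics, showing how a mode-collapsed generator scores lower than one that covers both modes of the real distribution.

```python
import numpy as np

def coverage(real, fake, k=3):
    """Fraction of real points whose k-NN ball contains at least one fake point."""
    # pairwise distances among real points
    d_rr = np.linalg.norm(real[:, None, :] - real[None, :, :], axis=-1)
    # radius of each real point = distance to its k-th nearest real neighbour
    # (row-wise sort; column 0 is the point itself at distance 0)
    radii = np.sort(d_rr, axis=1)[:, k]
    # distance from each real point to its nearest fake point
    d_rf = np.linalg.norm(real[:, None, :] - fake[None, :, :], axis=-1)
    return (d_rf.min(axis=1) <= radii).mean()

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))                      # real data with two modes
real[100:] += 6.0
fake_collapsed = rng.normal(size=(200, 2))            # generator stuck on one mode
fake_full = np.concatenate([rng.normal(size=(100, 2)),
                            rng.normal(size=(100, 2)) + 6.0])
print(coverage(real, fake_collapsed))  # low: the second mode is missed
print(coverage(real, fake_full))       # high: both modes are covered
```

The collapsed generator leaves half the real distribution unreachable, which is exactly the failure the thesis's two approaches target.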
LatentAugment is a novel data augmentation method that enhances the diversity and fidelity of synthetic data generated by GANs. It does so by manipulating the GAN’s latent space to force synthetic samples to better reproduce the variability of real-world medical datasets. LatentAugment allows researchers to fully realise the potential of synthetic data from a single GAN, enabling its use in downstream tasks where data availability is limited. GAN Ensembles shift the focus from a single GAN to multiple GANs. This approach stems from the hypothesis that no single GAN can fully encompass the diversity of real-world data. By solving a multi-objective optimisation problem, GAN Ensembles select the combination of GANs that yields high-quality and diverse synthetic data with minimal redundancy, ensuring that each model contributes uniquely to the ensemble. GAN Ensembles also relieve practitioners and researchers of the burden of choosing which GAN to use and determining the ideal sampling point during training, proving pivotal in applications requiring data privacy. Together, LatentAugment and GAN Ensembles represent a significant advance in overcoming the limited mode coverage of synthetic data from GANs, paving the way for wider adoption of synthetic data in scenarios plagued by data scarcity and privacy concerns. We now turn to the second contribution of this thesis. Interpreting medical findings often involves data from multiple exams or modalities, such as images and health records. Each modality provides unique insights into patient health, capturing different aspects of medical conditions. Thus, for an AI system to understand the complex mechanisms underlying disease, it must be able to interpret multimodal medical data. Hence, there is a need for multimodal networks that can effectively integrate these diverse medical data streams.
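As a rough illustration of the ensemble idea, the greedy, scalarised selection below is a hypothetical stand-in for the thesis's multi-objective optimisation. The `quality` and `overlap` scores are made up; a real system would derive them from metrics computed on each GAN's generated samples. The sketch rewards per-model quality while penalising redundancy with models already in the ensemble:

```python
import numpy as np

def greedy_ensemble(quality, pairwise_overlap, budget, alpha=1.0, beta=1.0):
    """Pick up to `budget` models maximising quality minus a redundancy penalty.

    quality          : (n,) score per candidate GAN (higher is better)
    pairwise_overlap : (n, n) redundancy between candidates' outputs
    """
    n = len(quality)
    chosen = []
    for _ in range(budget):
        best, best_gain = None, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            # scalarised objective: reward quality, penalise overlap with the ensemble
            penalty = sum(pairwise_overlap[i, j] for j in chosen)
            gain = alpha * quality[i] - beta * penalty
            if gain > best_gain:
                best, best_gain = i, gain
        if chosen and best_gain <= 0:   # stop once no candidate still helps
            break
        chosen.append(best)
    return chosen

quality = np.array([0.9, 0.85, 0.4, 0.8])
overlap = np.array([[0.0, 0.9, 0.1, 0.2],   # models 0 and 1 are near-duplicates
                    [0.9, 0.0, 0.1, 0.2],
                    [0.1, 0.1, 0.0, 0.1],
                    [0.2, 0.2, 0.1, 0.0]])
print(greedy_ensemble(quality, overlap, budget=2))  # → [0, 3]
```

Note how the second-best model (index 1) is skipped because it is nearly redundant with the first pick: each member must contribute uniquely.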
However, the efficacy of these multimodal networks in a clinical setting hinges not only on their capability to process input from various sources but also on their interpretability. This thesis leverages deep generative models to explain the decisions made by multimodal networks. We develop a deep architecture that is explainable by design, jointly learning modality reconstructions and sample classifications from tabular and imaging data. It first creates a multimodal embedded representation of the input modalities. Then, by applying a latent shift mechanism that simulates a counterfactual prediction on the embedded representation, it reveals the features of each modality that contribute most to the decision and computes a quantitative score indicating each modality’s importance. In summary, this thesis proposes methods for effectively translating AI advancements into clinical practice, addressing critical challenges of data scarcity, privacy, and interpretability in healthcare.
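The latent shift mechanism can be sketched on a toy linear model. Everything here is an assumption for illustration (random weights, linear encoders/decoders and classifier head), not the thesis architecture: we shift the joint embedding against the classifier gradient to simulate a counterfactual, decode each modality before and after, and score each modality by how much its reconstruction changes.

```python
import numpy as np

rng = np.random.default_rng(1)
d_img, d_tab, d_z = 16, 8, 4

# toy "trained" weights: per-modality encoders/decoders plus a linear classifier
W_enc_img = rng.normal(size=(d_z, d_img)) * 0.1
W_enc_tab = rng.normal(size=(d_z, d_tab)) * 0.1
W_dec_img = rng.normal(size=(d_img, d_z)) * 0.1
W_dec_tab = rng.normal(size=(d_tab, d_z)) * 0.1
w_clf = rng.normal(size=d_z)

x_img = rng.normal(size=d_img)                 # imaging modality (toy)
x_tab = rng.normal(size=d_tab)                 # tabular modality (toy)

z = W_enc_img @ x_img + W_enc_tab @ x_tab      # joint multimodal embedding
grad = w_clf                                   # d(logit)/dz for a linear head
lam = -1.0                                     # shift against the prediction
z_cf = z + lam * grad                          # counterfactual embedding

# modality importance: how much each decoded modality changes under the shift
delta_img = np.linalg.norm(W_dec_img @ z_cf - W_dec_img @ z)
delta_tab = np.linalg.norm(W_dec_tab @ z_cf - W_dec_tab @ z)
score_img = delta_img / (delta_img + delta_tab)
score_tab = delta_tab / (delta_img + delta_tab)
print(f"imaging importance {score_img:.2f}, tabular importance {score_tab:.2f}")
```

The two normalised scores sum to one, giving a quantitative per-modality importance; in the thesis the decoded differences additionally localise which features of each modality drive the decision.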
File: PhD_Tronchin_Lorenzo.pdf (open access) | Size: 8.52 MB | Format: Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/122871
URN:NBN:IT:UNICAMPUS-122871