Causal Discovery in a Multicentric Clinical Study on Endometrial Cancer

ZANGA, ALESSIO
2025

Abstract

Correlation does not imply causation. This is the mantra we will repeat throughout this thesis: from beginning to end, our objective is to learn cause-effect relationships from data and expert knowledge in the context of medicine and healthcare.

The main contributions of this work are:
• An up-to-date review of the state of the art in causal discovery, including practical considerations on general evaluation, hyper-parameter tuning and available software.
• An exploration of the effects of uncertainty and missing values on the causal discovery problem in the context of a multicentric clinical study on endometrial cancer.
• An investigation of the impact of different missingness mechanism assumptions on causal discovery techniques for the specific case study, highlighting the inconsistencies between background knowledge and learnt models when such assumptions are violated.
• A novel algorithm for federated causal discovery from multiple sources with client-specific missingness mechanisms, with both synthetic and real-world data.
• A link between causal models and decision making to optimize the outcome of interest.

This thesis is divided into three parts:
• Part I introduces the methodology shared by the other parts. In particular, Chapter 1 deals with the fundamental concepts of causal inference, such as potential outcomes, causal effects and counterfactuals, while Chapter 2 defines the problem of causal discovery and gives a detailed review of the state of the art.
• Part II reports the contributions related to the multicentric study on endometrial cancer. Namely, Chapter 3 explores the challenges and pitfalls that affect the recovery of a causal graph in the presence of missing data, Chapter 4 formalizes the problem using causal discovery and missingness graphs, and Chapter 5 proposes a novel algorithm to learn a causal graph from multiple sources in a federated learning fashion, relaxing the assumption of a global missingness mechanism.
• Part III closes the thesis with additional theoretical outlooks. Chapter 6 links causal discovery with decision making in the context of recommender systems, making it easier to understand the role of causality in expected outcome optimization.
17-feb-2025
English
causality; causal inference; causal discovery; endometrial cancer; missing data
VIVIANI, MARCO
STELLA, FABIO ANTONIO
Files in this item:
File: phd_unimib_815997.pdf
Access: open access
Size: 1.61 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/193728
The NBN code of this thesis is URN:NBN:IT:UNIMIB-193728