DOMAIN KNOWLEDGE-GUIDED LEARNING FOR ROBUST MYOCARDIAL INFARCTION DETECTION FROM 12-LEAD ELECTROCARDIOGRAMS

Ibrahimi, Silvia

Background: Myocardial infarction (MI) is one of the leading causes of morbidity and mortality worldwide. Diagnosis is primarily based on interpreting the 12-lead electrocardiogram (ECG) according to established clinical guidelines that specify alterations in ECG components. Manual ECG interpretation is time-consuming and prone to inter-observer variability, which has motivated the development of automated MI diagnosis. In this context, deep learning (DL) has emerged as a promising approach for identifying MI from 12-lead ECGs.
Challenges: Despite the encouraging results reported for MI diagnosis, DL models are hindered by several key limitations. First, existing models often overlook the electrocardiographic domain knowledge (DK) codified in clinical decision rules. Without explicit incorporation of such DK, these models may learn feature representations that are not physiologically grounded, which reduces their clinical generalisability. Second, a DL model trained on biased datasets may rely on spurious correlations associated with age rather than learning MI-relevant features. Third, prior work mostly addresses MI detection, stage classification and localisation as separate tasks and rarely integrates all three within a unified framework, so the physiological interdependencies among these diagnostic tasks remain underexploited. Finally, ensuring robustness under dataset shift remains a critical challenge in DL-based MI diagnosis: models developed and evaluated primarily on internal datasets may fail to maintain performance when applied to external populations with different demographic distributions or ECG acquisition protocols.
Objectives: The main objectives of this thesis were to improve the robustness of a DL model for MI diagnosis from 12-lead ECGs by: i) mitigating age bias in MI diagnosis; ii) incorporating DK into the DL model (DK-DL) to enhance clinically meaningful representations, and comparing its performance with a DL model trained in a standard way (B-DL) and with an implemented rule-based algorithm (RBA) that followed clinical guideline criteria; iii) introducing a multitask learning framework that simultaneously modelled MI stage classification (acute vs. prior vs. normal) and localisation by explicitly leveraging interdependencies among related MI diagnostic tasks; and iv) conducting a comprehensive evaluation of the DL models for MI diagnosis on external datasets to assess their robustness across diverse populations and ECG acquisition protocols.
Methods: An adversarial multitask learning framework was proposed to train a DL model using contrastive objectives for MI diagnosis while mitigating age-related spurious correlations. In addition, a DL training framework was designed to incorporate DK in order to perform MI detection, staging and localisation. Two strategies were proposed to inject DK into the DL model through two custom regularisation terms in the objective function that control the latent space. Specifically, the strategies aimed at: i) learning specific ECG components, such as ST-segment, Q and R wave amplitudes, and Q wave durations; and ii) reconstructing a latent space from a set of differentiable approximations of clinical rules. The three methods (DK-DL, B-DL and RBA) were developed on the PTB-XL+ dataset, and their performance was evaluated on three external datasets: CODE, MIMIC-IV and Chapman-Shaoxing.
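As an illustration of what a "differentiable approximation of a clinical rule" can look like, the following minimal sketch replaces a hard threshold test with a sigmoid and combines two such tests with a product. It is not the formulation used in the thesis: the function names, the sharpness parameter, the two-lead rule and the 0.1 mV threshold are all hypothetical placeholders chosen for the example.

```python
import math


def soft_greater_than(value, threshold, sharpness=20.0):
    """Smooth stand-in for the hard test `value > threshold`.

    Returns a score in (0, 1) that equals 0.5 exactly at the threshold.
    `sharpness` (hypothetical parameter) controls how step-like it is.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (value - threshold)))


def soft_and(a, b):
    """Product relaxation of logical AND for scores in [0, 1]."""
    return a * b


def soft_or(a, b):
    """Probabilistic-sum relaxation of logical OR for scores in [0, 1]."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def soft_st_rule(st_lead_a_mv, st_lead_b_mv, threshold_mv=0.1):
    """Illustrative rule: 'ST amplitude exceeds a threshold in both leads'.

    The threshold value is a placeholder, not a guideline criterion.
    """
    a = soft_greater_than(st_lead_a_mv, threshold_mv)
    b = soft_greater_than(st_lead_b_mv, threshold_mv)
    return soft_and(a, b)


if __name__ == "__main__":
    print(soft_st_rule(0.3, 0.3))   # both leads above threshold: close to 1
    print(soft_st_rule(0.3, -0.1))  # one lead below threshold: close to 0
```

Because every operation here is smooth, the same expressions written in an automatic-differentiation framework would yield gradients with respect to the measured amplitudes, which is the property a rule-based regularisation term needs.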
Results: The proposed adversarial multitask learning (AML) strategy effectively mitigated age-related bias, decreasing the Pearson correlation coefficient between predictions and age from 0.67 to -0.03 while maintaining an accuracy of 0.85. In addition, incorporating DK into the DL model improved performance in the MI staging and localisation tasks. Specifically, the DK-DL model outperformed both B-DL and RBA in acute MI detection, achieving a higher average recall (0.70 vs. 0.53 and 0.65, respectively). For overall MI detection (acute and prior MI combined) across all four available datasets, the DK-DL model maintained the best performance (0.91), compared with B-DL (0.89) and RBA (0.75). For MI localisation, the DK-DL model achieved mean recall values exceeding 0.84 across all anatomical regions, with major improvements in the lateral territory. These findings indicate that AML mitigated age-related bias, while DK incorporation enhanced the robustness of the DL model for MI diagnosis.
Conclusions: This thesis advances MI diagnosis by incorporating DK into DL frameworks, enabling more clinically aligned and robust ECG-based analysis. The proposed methodologies demonstrated improved generalisation and reduced reliance on spurious correlations, thereby enhancing the robustness of automated MI diagnosis across heterogeneous clinical environments. Collectively, the contributions of this thesis provide a foundation for the development of clinically generalisable DL models for the 12-lead ECG.
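The age-bias figure quoted in the Results is a Pearson correlation coefficient between model predictions and patient age. The sketch below shows how such a coefficient can be computed. The function name and the toy ages and scores are invented for illustration only and are not data or results from the thesis.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data (made up): ages in years and model output scores.
ages = [30, 40, 50, 60, 70]
age_driven_scores = [0.1, 0.2, 0.3, 0.4, 0.5]       # score rises with age
age_independent_scores = [0.3, 0.1, 0.3, 0.1, 0.3]  # no linear trend with age

if __name__ == "__main__":
    print(pearson_r(ages, age_driven_scores))
    print(pearson_r(ages, age_independent_scores))
```

A coefficient near zero, such as the -0.03 reported above, indicates that the predictions carry almost no linear dependence on age. It does not by itself rule out non-linear dependence.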

2026



Faculty/Department: Dipartimento di Informatica Giovanni Degli Antoni
Publication date: 16 June 2026
Language: English
Supervisor, Advisor or Tutor: SASSI, ROBERTO
Co-advisor, Second Reader, Co-Supervisor, Co-Tutor or Coordinators: RIVOLTA, MASSIMO WALTER; DAMIANI, ERNESTO
Publisher: Università degli Studi di Milano
Number of pages: 107
Collection: Università degli Studi di Milano

Files in this item:

File	Size	Format
phd_unimi_R14103.pdf (open access; licence: Creative Commons)	1.82 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/372546

The NBN code of this thesis is URN:NBN:IT:UNIMI-372546