Learning Concepts with the Right Semantics: Reasoning Shortcuts and Human-Machine Alignment
MARCONATO, EMANUELE
2025
Abstract
Understanding the functioning of current AI models is an urgent open problem, owing to the massive-scale deployment of deep neural networks and their black-box nature. Several works address how to explain the behavior of AI models, and interest is growing in explanations framed in terms of high-level variables, often called concepts or symbols. Leveraging concepts as a vehicle for explaining AI models makes it possible to discard irrelevant information and focus only on the semantic content of the data. This has the potential to make models more interpretable and to foster higher trust in their decision-making process. One key open problem is how to learn concepts from data such that they possess the correct semantics. This thesis analyzes this problem in depth, presenting two major contributions. The first is explaining and addressing pitfalls in learning the right concepts in the context of tasks that involve reasoning over them. These pitfalls are due to Reasoning Shortcuts, whereby models can leverage poor-quality concepts to attain correct predictions. The second contribution is a formal framework for testing the quality of the concepts learned by a model, followed by a class of models that boost concept quality by leveraging advanced representation learning techniques. Overall, the presented works contribute to a deeper understanding of the issues that complicate provably learning concepts from data, and to the design of more trustworthy AI models for future high-stakes applications.
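To make the notion of a Reasoning Shortcut concrete, here is a minimal toy sketch (an illustration, not code from the thesis): the task label is the XOR of two binary concepts, and a hypothetical extractor that learned inverted concept semantics (0 and 1 swapped) still attains perfect task accuracy, since flipping both operands leaves XOR unchanged.

```python
# Minimal illustrative sketch of a Reasoning Shortcut (hypothetical example,
# not taken from the thesis). For simplicity, the raw inputs x1, x2 are the
# ground-truth concept values themselves (in practice they would be images).
from itertools import product

def reason(c1: int, c2: int) -> int:
    """Fixed symbolic knowledge: the task label is the XOR of the concepts."""
    return c1 ^ c2

def shortcut_extractor(x1: int, x2: int) -> tuple[int, int]:
    """A concept extractor that learned inverted semantics (0 <-> 1)."""
    return 1 - x1, 1 - x2

for x1, x2 in product([0, 1], repeat=2):
    true_label = reason(x1, x2)          # label from the ground-truth concepts
    c1, c2 = shortcut_extractor(x1, x2)  # the model's (wrong) concepts
    pred = reason(c1, c2)
    print(f"inputs={(x1, x2)} concepts={(c1, c2)} pred={pred} label={true_label}")
    # Correct prediction despite wrong concepts: (1-a) ^ (1-b) == a ^ b.
    assert pred == true_label
```

Because every label is reproduced, no amount of label supervision alone can detect that the concepts carry the wrong semantics, which is exactly why the thesis argues for dedicated concept-quality tests.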
| File | Size | Format | Access |
|---|---|---|---|
| FinalReport_Marconato_pdfa.pdf | 239.34 kB | Adobe PDF | not available |
| frontespizio_firma_digitale_pdfa.pdf | 59.62 kB | Adobe PDF | not available |
| phd_thesis_marconato_final_2.pdf | 22.25 MB | Adobe PDF | open access (View/Open) |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/215662
URN:NBN:IT:UNIPI-215662