Memory, explainability, and brain alignment: towards brain-inspired explainable continual learning
PROIETTI, MICHELA
2026
Abstract
Modern neural networks achieve human-level performance on individual tasks, but they tend to forget previously learned knowledge when trained on sequences of tasks. As a model learns subsequent tasks in the sequence, it loses the ability to accurately perform the previously learned ones: a phenomenon known as catastrophic forgetting. Continual learning (CL) methods aim to mitigate this issue by balancing the network's plasticity and stability, thus limiting interference between tasks. However, most of these approaches focus on improving model performance without providing insights into what is happening internally. Our work addresses this gap by structuring and contributing to the field of eXplainable Artificial Intelligence (XAI)-guided CL, with a focus on self-interpretable approaches. First, we provide a survey of existing XAI-guided CL methods, with the goals of encouraging research on the topic, unifying benchmarks and terminology, and identifying potential research avenues. Second, we introduce new self-interpretable architectures and develop novel XAI-guided CL approaches. The presented architectures rely on human-understandable concepts or prototypes, shedding light on the networks' inner workings and providing insights into how old and new information is aggregated during CL. Third, we show how to gain insights into how new and past information is integrated in both artificial and biological neural networks. This objective is achieved by directly analyzing the alignment between the representations of the two systems through XAI. We additionally explore diverse application domains, including images, text, graphs, and reinforcement learning. Our findings demonstrate that XAI can serve a dual function: it can be applied to brain alignment to identify gaps in current computational models of cognition, and to CL to enhance performance and interpretability. Taken together, these applications lay the foundation for developing continual learners that are both interpretable and neuro-inspired. Empirically, our methods consistently outperform existing baselines in both class- and task-incremental learning, improve replay strategies in reinforcement learning, and provide novel insights into explanation drift and the role of long-range dependencies in brain–language model alignment.
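The abstract refers to prototype-based self-interpretable architectures and to replay-style continual learning without giving implementation details. The sketch below is a minimal, hypothetical illustration of those two ideas (a per-class prototype head plus a small replay buffer); it is not the architecture proposed in the thesis, and all names, dimensions, and hyperparameters are assumptions.

```python
# Illustrative sketch only (not the thesis architecture): a prototype-based
# classifier head combined with a naive experience-replay buffer. All class
# names, dimensions, and hyperparameters below are hypothetical.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeHead(nn.Module):
    """Classifies by (negative squared) distance to one learned prototype per class.

    Because each class is an explicit point in feature space, a prediction can be
    explained by inspecting which prototype the input is closest to, and forgetting
    can be tracked by watching how prototypes drift as new tasks arrive.
    """

    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, feature_dim) * 0.01)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Squared Euclidean distance between each feature vector and each prototype.
        dists = torch.cdist(features, self.prototypes, p=2) ** 2
        return -dists  # higher logit = closer prototype


def train_task(encoder, head, loader, buffer, optimizer, replay_batch=32):
    """One task of class-incremental training with a simple replay strategy."""
    encoder.train()
    head.train()
    for x, y in loader:
        loss = F.cross_entropy(head(encoder(x)), y)

        # Stability: interleave a mini-batch of stored old examples, if any.
        if len(buffer) >= replay_batch:
            bx, by = zip(*random.sample(buffer, replay_batch))
            bx, by = torch.stack(bx), torch.stack(by)
            loss = loss + F.cross_entropy(head(encoder(bx)), by)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Plasticity vs. stability trade-off: keep a small sample of the new task
        # (a real method would use reservoir sampling or herding instead).
        for xi, yi in zip(x, y):
            if len(buffer) < 500:  # hypothetical buffer size
                buffer.append((xi.detach(), yi.detach()))
```

Because each class is summarized by an explicit prototype vector, one can inspect which prototype a prediction relies on and how prototypes move across tasks, which is the kind of transparency the abstract attributes to self-interpretable CL.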
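The abstract also mentions directly analyzing the alignment between artificial and biological neural representations. One common, generic way to quantify such alignment is representational similarity analysis (RSA); the snippet below is a standard RSA sketch on synthetic data, assumed for illustration, and not the alignment pipeline used in the thesis.

```python
# Illustrative sketch only: representational similarity analysis (RSA) between a
# model's layer activations and (synthetic) brain responses to the same stimuli.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def rsa_score(model_acts: np.ndarray, brain_resps: np.ndarray) -> float:
    """Spearman correlation between the two representational dissimilarity matrices.

    model_acts:  (n_stimuli, n_features)  activations for each stimulus
    brain_resps: (n_stimuli, n_voxels)    measured responses to the same stimuli
    """
    rdm_model = pdist(model_acts, metric="correlation")   # condensed RDM
    rdm_brain = pdist(brain_resps, metric="correlation")
    rho, _ = spearmanr(rdm_model, rdm_brain)
    return float(rho)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.standard_normal((50, 256))    # hypothetical layer activations
    brain = rng.standard_normal((50, 1000))  # hypothetical voxel responses
    print(f"RSA alignment: {rsa_score(acts, brain):.3f}")
```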
| File | Size | Format | License |
|---|---|---|---|
| Tesi_dottorato_Proietti.pdf (open access) | 21.92 MB | Adobe PDF | Creative Commons |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/358418
URN:NBN:IT:UNIROMA1-358418