Memory, explainability, and brain alignment: towards brain-inspired explainable continual learning

PROIETTI, MICHELA
2026

Abstract

Modern neural networks achieve human-level performance on individual tasks, but they tend to forget previously learned knowledge when trained on sequences of tasks. As a model learns subsequent tasks in the sequence, it loses the ability to accurately perform the previously learned ones, a phenomenon known as catastrophic forgetting. Continual learning (CL) methods aim to mitigate this issue by balancing the network's plasticity and stability, thus limiting interference between tasks. However, most of these approaches focus on improving model performance without providing insights into what is happening inside the network. Our work addresses this gap by structuring and contributing to the field of eXplainable Artificial Intelligence (XAI)-guided CL, with a focus on self-interpretable approaches. First, we provide a survey of existing XAI-guided CL methods, with the goals of encouraging research on the topic, unifying benchmarks and terminology, and identifying potential research avenues. Second, we introduce new self-interpretable architectures and develop novel XAI-guided CL approaches. The presented architectures rely on human-understandable concepts or prototypes, shedding light on the networks' inner workings and providing insights into how old and new information is aggregated during CL. Third, we show how to gain insights into how new and past information is integrated in artificial and biological neural networks, respectively, by directly analyzing the alignment between the representations of the two systems through XAI. We additionally explore diverse application domains, including images, text, graphs, and reinforcement learning. Our findings demonstrate that XAI can serve a dual function: applied to brain alignment, it identifies gaps in current computational models of cognition; applied to CL, it enhances both performance and interpretability.
Taken together, these contributions lay the foundation for developing continual learners that are both interpretable and neuro-inspired. Empirically, our methods consistently outperform existing baselines in both class- and task-incremental learning, improve replay strategies in reinforcement learning, and provide novel insights into explanation drift and the role of long-range dependencies in brain–language model alignment.
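The catastrophic forgetting described above can be illustrated with a minimal sketch. This is not a method from the thesis: the two toy regression tasks, the single-weight model, and all hyperparameters below are invented purely for illustration. A model trained to convergence on task A, then trained only on task B, loses its fit to task A.

```python
import numpy as np

def train(w, X, y, lr=0.1, steps=200):
    """Plain gradient descent on mean squared error for a single weight w."""
    for _ in range(steps):
        grad = 2 * X * (w * X - y)  # per-sample dMSE/dw
        w -= lr * grad.mean()
    return w

mse = lambda w, X, y: np.mean((w * X - y) ** 2)

X = np.linspace(-1.0, 1.0, 50)
y_task_a = 2.0 * X    # task A: y = 2x
y_task_b = -2.0 * X   # task B: y = -2x, conflicting with task A

w = 0.0
w = train(w, X, y_task_a)
loss_a_before = mse(w, X, y_task_a)  # near zero after fitting task A

w = train(w, X, y_task_b)            # continue training on task B only
loss_a_after = mse(w, X, y_task_a)   # task A performance has degraded

print(loss_a_before, loss_a_after)
```

Because the tasks conflict, training on task B drags the weight away from the task A optimum; CL methods counteract exactly this kind of interference.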
29 January 2026
English
CAPOBIANCO, ROBERTO
SILVESTRI, FABRIZIO
GRISETTI, GIORGIO
Università degli Studi di Roma "La Sapienza"
Files in this item:
File: Tesi_dottorato_Proietti.pdf
Access: open access
License: Creative Commons
Size: 21.92 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/358418
The NBN code of this thesis is URN:NBN:IT:UNIROMA1-358418