Continual Incremental Language Learning for Neural Machine Translation

Resta, Michele
2024

Abstract

Since the inception of the Artificial Intelligence (AI) field, one of its main long-term objectives has been to understand and replicate intelligence in order to create systems capable of learning and behaving in a human-like manner. However, this task has proven extremely difficult for both traditional AI systems and neural approaches, largely due to the phenomenon of catastrophic forgetting: when exposed to new data, neural networks tend to erase previously learned knowledge. In this context, Continual Learning (CL) has emerged as a research field aimed at mitigating this behavior and moving towards AI systems that mimic human learning capabilities in lifelong learning tasks and environments. With the shift of Machine Translation (MT) and Natural Language Processing (NLP) systems to deep learning, these capabilities have become even more desirable, given the substantial resources required to train such models, particularly with respect to training efficiency and the transferability of knowledge. In this dissertation, we provide a practical contribution to this research area. We begin by reviewing fundamental concepts and theoretical aspects of Neural Machine Translation (NMT) and then survey prominent CL methodologies. Building on this foundation, we propose a Continual Learning framework for NMT aimed at incrementally learning multilingual translation systems. We introduce the Continual Incremental Language Learning setting as a starting point for exploring data selection strategies that enhance training efficiency when effective continual learning strategies, such as replay buffers, are used. Furthermore, we demonstrate that employing an NMT model both as a learner and as a generator of replay data is effective in mitigating performance loss during continued training, relaxing several requirements related to training data storage. Within this incremental language learning context, we empirically evaluate, through quantitative and qualitative analyses, both the classical training paradigm and the pre-training and fine-tuning paradigm, and we discuss the distinctive aspects of each when classical data-based rehearsal strategies are employed. We further extend our analysis to non-autoregressive NMT models and compare them to state-of-the-art autoregressive NMT systems. Through this work, we aim to provide a comprehensive framework and practical insights into continual learning for NMT, ultimately highlighting the needs and benefits of this learning paradigm.
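
To make the self-generated replay idea mentioned above more concrete, the following minimal Python sketch (illustrative only, not code from the dissertation; all names such as ToyNMTModel, build_replay_buffer, and continual_training are assumptions) shows a learner that, before training on a new language pair, uses its own translations of previously seen source sentences as pseudo-parallel replay data mixed with the new task's data.

"""
Minimal sketch of self-generated replay for continual NMT.
Hypothetical names; a real system would plug in an actual NMT model.
"""
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


class ToyNMTModel:
    """Stand-in for a real NMT model (e.g. a Transformer); hypothetical API."""

    def translate(self, src: str, tgt_lang: str) -> str:
        # A real model would decode a translation; here we just tag the input.
        return f"<{tgt_lang}> {src}"

    def train_step(self, batch: List[Pair]) -> None:
        # A real model would compute the loss and update its parameters.
        pass


def build_replay_buffer(model: ToyNMTModel,
                        seen_sources: List[Tuple[str, str]],
                        size: int) -> List[Pair]:
    """Generate pseudo-parallel replay pairs with the model itself,
    so no parallel data from earlier tasks needs to be stored."""
    sample = random.sample(seen_sources, min(size, len(seen_sources)))
    return [(src, model.translate(src, tgt_lang)) for src, tgt_lang in sample]


def continual_training(model: ToyNMTModel,
                       tasks: List[Tuple[str, List[Pair]]],
                       replay_size: int = 2) -> None:
    """Learn language pairs one after another, mixing each new task's data
    with self-generated replay from previously learned tasks."""
    seen_sources: List[Tuple[str, str]] = []
    for tgt_lang, new_data in tasks:
        replay = build_replay_buffer(model, seen_sources, replay_size)
        for pair in new_data + replay:
            model.train_step([pair])
        # Keep only source sentences (plus their target language)
        # for future replay generation.
        seen_sources += [(src, tgt_lang) for src, _ in new_data]


if __name__ == "__main__":
    tasks = [
        ("de", [("the cat sleeps", "die Katze schläft")]),
        ("fr", [("the cat sleeps", "le chat dort")]),
    ]
    continual_training(ToyNMTModel(), tasks)

In a real setup the stand-in model would be an actual NMT network and train_step would perform a gradient update; the point of the sketch is that only source sentences from earlier language pairs need to be retained, since their targets are regenerated on the fly by the model itself.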
Date: 6 October 2024
Language: Italian

Keywords:
catastrophic forgetting
continual learning
incremental language learning
lifelong learning
neural machine translation

Supervisor: Bacciu, Davide
Files in this record:
phd_activities_pdfa.pdf: not available, 85.41 kB, Adobe PDF
resta_thesis_final_1.pdf: open access, 4.37 MB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/216396
The NBN code of this thesis is URN:NBN:IT:UNIPI-216396