Linguistic Generalization in Transformer-based Neural Language Models

Lasri, Karim

Neural language models are commonly deployed to perform diverse natural language processing tasks, as they produce contextual vector representations of texts which can be used in any supervised learning setting. Transformer-based neural architectures have been widely adopted to this end. After being pre-trained with a generic language modeling objective, they achieve spectacular performance on a wide array of downstream tasks, which in principle require knowledge of sentence structure. As these models are not explicitly supervised with any grammatical instruction, this suggests that linguistic knowledge emerges during pre-training. The nature of this knowledge is poorly understood, as these models are generally used as black boxes. This has led to a growing body of research aimed at uncovering the linguistic abilities of such models. While this literature is abundant, the epistemic grounds of the existing methodologies are not translatable into each other, underlining the need to formulate more clearly the questions that address how linguistic knowledge is captured. Throughout the thesis, we bridge this epistemic gap by explicitly formulating the relations that hold between facets of the broader question. To do so, we adopt three levels of analysis to understand neural language models: the behavioral, algorithmic, and implementational levels. Further, we carry out a series of experiments to uncover aspects of the linguistic knowledge captured by language models.

2023

Publication date: 26 January 2023
Language: Italian
Keywords: deep learning; generalization; linguistic knowledge; natural language processing; neural language model
Supervisor, Advisor, or Tutor: Lenci, Alessandro; Poibeau, Thierry
Co-advisor, Examiner, Co-Supervisor, Co-Tutor, or Coordinators: Baroni, Marco; Lappin, Shalom; Cassell, Justine; Ettinger, Allyson
Collection: Università degli Studi di Pisa

Files in this item:

File	Access	Size	Format
dilles_report_fine_corso_Karim_Lasri.pdf	not available	165.36 kB	Adobe PDF
PhD_Manuscript_Final.pdf	open access	17.36 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/215750

The NBN code of this thesis is URN:NBN:IT:UNIPI-215750