
Tracking Linguistic Abilities in Neural Language Models

MIASCHI, ALESSIO
2022

Abstract

In the last few years, the analysis of the inner workings of state-of-the-art Neural Language Models (NLMs) has become one of the most addressed lines of research in Natural Language Processing (NLP). Several techniques have been devised to obtain meaningful explanations and to understand how these models are able to capture semantic and linguistic knowledge. The goal of this thesis is to investigate whether, by exploiting NLP methods for studying human linguistic competence and, specifically, the process of written language evolution, it is possible to understand the behaviour of state-of-the-art NLMs. First, we present an NLP-based stylometric approach for tracking the evolution of written language competence in L1 and L2 learners, using a wide set of linguistically motivated features capturing stylistic aspects of a text. Then, relying on the same set of linguistic features, we propose different approaches aimed at investigating the linguistic knowledge implicitly learned by NLMs. Finally, we propose a study investigating the robustness of one of the most prominent NLMs, i.e. BERT, when dealing with different types of errors extracted from authentic texts written by L1 Italian learners.
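The probing-task methodology mentioned in the abstract can be sketched as follows: a simple classifier (a "probe") is trained on frozen model representations to predict a linguistic property, and high probe accuracy suggests that the property is linearly decodable from those representations. The sketch below is purely illustrative and is not the thesis's actual setup: the synthetic embedding generator, the probe, and all names are hypothetical stand-ins (a real experiment would extract sentence embeddings from an NLM such as BERT and predict a real linguistic feature).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_representations(n=200, dim=16):
    """Fake 'sentence embeddings' in which one direction encodes a
    binary linguistic property (e.g. singular vs. plural subject).
    Stand-in for representations extracted from a frozen NLM."""
    X = rng.normal(size=(n, dim))
    y = (rng.random(n) > 0.5).astype(int)
    X[:, 0] += 3.0 * (2 * y - 1)  # property is linearly encoded in dimension 0
    return X, y

def train_linear_probe(X, y, lr=0.1, epochs=200):
    """Logistic-regression probe trained with plain gradient descent;
    the representations X stay fixed, only the probe's weights move."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                            # gradient of the log loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

X, y = make_synthetic_representations()
w, b = train_linear_probe(X, y)
acc = (((X @ w + b) > 0).astype(int) == y).mean()
print(f"probe accuracy: {acc:.2f}")  # high accuracy => property is decodable
```

The design choice that makes this a probe rather than a model of the task is that the classifier is deliberately weak (linear): if even a linear readout recovers the property, the information must already be present in the representations themselves.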
Date: 9 May 2022
Language: Italian
Keywords: interpretability; machine learning; neural language models; NLP; probing tasks
Supervisors: Dell'Orletta, Felice; Monreale, Anna
Files in this item:
PhD_Thesis_Miaschi.pdf (Adobe PDF, 19.06 MB, open access)
Relazione_Attivit_Dottorato_Miaschi.pdf (Adobe PDF, 53.82 kB, open access)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/215751
The NBN code of this thesis is URN:NBN:IT:UNIPI-215751