Training conversational agents to understand complex dialogues

Sucameli, Irene

Nowadays, conversational agents are attracting interest in both the academic and non-academic worlds thanks to the engaging interaction they establish with the user. However, finding valuable data to train a system that converses in as human-like a manner as possible is not a trivial task. This is even more challenging for the Italian language, for which only a few dialogue datasets are available. This thesis addresses this challenge directly, proposing JILDA (Job Interview Labelled Dialogues Assembly), a new Italian dialogue dataset for the job-offer domain, and demonstrating its practical application in training a conversational agent able to understand syntactically and semantically complex data. The JILDA dialogues, annotated with MATILDA, a new annotation tool developed in collaboration with Wluper, are used to train the Natural Language Understanding (NLU) module of a conversational agent, an essential component of any dialogue system. Three of the most recent pretrained language models (LMs) are benchmarked: Italian BERT, Multilingual BERT, and AlBERTo. Based on an analysis of the results obtained, JILDA 2.0 was developed, an updated version of the resource that represents a first step towards improving NLU for Italian dialogues. Finally, this thesis places the research topic within a broader ethical framework, considering the ethical issues that emerge in human-machine interaction, the gender biases embedded in Embodied Conversational Agents (ECAs), and their impact on modern society.

2022

Publication date: 4 May 2022

Language: Italian

Keywords: annotation tool; conversational agents; ethics and ECAs; Italian dialogue dataset

Supervisor, Advisor, or Tutor: Simi, Maria

Co-Supervisor, Co-Examiner, Co-Tutor, or Coordinators: Attardi, Giuseppe; Lenci, Alessandro

Collection: Università degli Studi di Pisa

Files in this item:

File	Access	License	Size	Format
Report_Sucameli.pdf	Open access	All rights reserved	253.95 kB	Adobe PDF
Training_conversation_agents_Sucameli.pdf	Open access	All rights reserved	3.3 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/216291

The NBN code of this thesis is URN:NBN:IT:UNIPI-216291