
Generalizing Under Data Scarcity: Enhancing the Representation Capability from Few Samples

Braccaioli, Lorenzo
2026

Abstract

The widespread adoption of deep learning in both research and industrial contexts has revealed a central limitation: many real-world applications lack large, diverse, and reliably labeled datasets. This challenge is particularly evident in domains where data acquisition is costly, error-prone, or inherently scarce, such as industrial inspection, anomaly detection, and localization. This thesis investigates how learning systems can be designed to operate effectively when only a small number of samples are available at training or inference time.

The first part of the thesis focuses on meta-learning transformers for supervised and unsupervised few-shot tasks. We explore how transformers behave when trained on structured, multi-domain datasets under controlled conditions, where train/test contamination can be explicitly avoided. By reframing few-shot learning as a sequence modeling problem, we analyze the generalization capabilities of in-context learners across domains and study how training order influences performance. We propose the GEOM framework for supervised few-shot classification and extend its principles to the unsupervised setting with CAMeLU, demonstrating state-of-the-art performance in cross-domain scenarios.

The second part of the thesis addresses the gap between academic research and real-world industrial constraints. Working with an Italian company specializing in glass inspection systems, we propose two domain-specific solutions. The first is a few-shot approach for structural glass defect classification, enabling flexible adaptation to new defect types and variations in glass materials. The second is a reconstruction-based anomaly detection pipeline for identifying irregularities in silk-screen printed patterns, where labeled data are extremely scarce.
This dissertation highlights the importance of designing models that do not rely on large-scale datasets, but instead leverage task structure, adaptation mechanisms, and data-efficient learning strategies. By bridging foundational research on meta-learning with concrete industrial use cases, the thesis demonstrates that few-shot paradigms can be robust, scalable, and practically applicable in demanding environments.
15 April 2026
English
Conci, Nicola
Università degli Studi di Trento
Trento
140
Files in this record:
File: phd_unitn_braccaioli_lorenzo.pdf
Embargo until 25/03/2028
License: All rights reserved
Size: 11.37 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/365070
The NBN code of this thesis is URN:NBN:IT:UNITN-365070