Deep learning techniques for image segmentation and anomaly detection in low data regimes

DE NARDIN, Axel

This thesis focuses on image segmentation in low-data settings. It tackles two specific problems: anomaly segmentation for industrial quality control and document layout segmentation of ancient manuscripts. For the first problem, two novel attention-based approaches are proposed: one based on the popular U-Net architecture, and the other on the more recent Vision Transformer, which has been enhanced for the task at hand with a masking module and a multi-resolution self-attention component. For document layout analysis, we introduce a few-shot segmentation framework that combines DeepLabV3+, a robust deep learning architecture for semantic segmentation, with a traditional computer vision algorithm for image binarization, while relying on a novel instance generation strategy that makes full use of the small amount of data available. Furthermore, we analyze the effects of transfer learning in this domain-specific context, showing the drawbacks of pre-training on large general-purpose datasets compared to smaller domain-specific ones. For each of the proposed approaches, we report the experimental results obtained on popular publicly available datasets for the corresponding task.

2024

Degree programme: Dottorato di ricerca in Informatica e scienze matematiche e fisiche (PhD in Computer Science, Mathematics and Physics)
Publication date: 5 March 2024
Language: English
Keywords: Image segmentation; Anomaly detection; low data learning; document analysis
Supervisor, advisor or tutor: MARCONE, Alberto Giulio; PICIARELLI, Claudio; FORESTI, Gian Luca
Publisher: Università degli Studi di Udine
Collection: Università degli Studi di Udine

Files in this item:

File	Size	Format
PhD_Thesis Revised De Nardin.pdf (open access)	2.45 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/164566

The NBN code of this thesis is URN:NBN:IT:UNIUD-164566