Deep learning techniques for image segmentation and anomaly detection in low data regimes

DE NARDIN, AXEL
2024

Abstract

This thesis focuses on the problem of image segmentation in low-data settings. In particular, it tackles two specific problems: anomaly segmentation for industrial quality control and document layout segmentation of ancient manuscripts. For the first problem, two novel attention-based approaches are proposed: one based on the popular U-Net architecture, and a second based on the more recent Vision Transformer, which has been enhanced for the task at hand with a masking module and a multi-resolution self-attention component. For document layout analysis, we introduce a few-shot segmentation framework that combines DeepLabV3+, a robust deep learning architecture for semantic segmentation, with a traditional computer vision algorithm for image binarization, while relying on a novel instance generation strategy that makes full use of the small amount of data available. Furthermore, we provide an analysis of the effects of transfer learning in this domain-specific context, showing the drawbacks of pre-training on large general-purpose datasets compared to smaller domain-specific ones. For each of the proposed approaches, we provide the experimental results obtained on popular publicly available datasets for the corresponding task.
5 March 2024
English
Image segmentation; Anomaly detection; low data learning; document analysis
MARCONE, Alberto Giulio
PICIARELLI, Claudio
FORESTI, Gian Luca
Università degli Studi di Udine
Files in this item:
PhD_Thesis Revised De Nardin.pdf (open access, 2.45 MB, Adobe PDF)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/164566
The NBN code of this thesis is URN:NBN:IT:UNIUD-164566