Study, design, and development of intelligent systems for automatic diagnostics support

SCARDIGNO, ROBERTO MARIA
2026

Abstract

This doctoral research addresses the study, design, and development of intelligent systems for automatic diagnostic support across biomedical and industrial imaging. A unified methodological framework based on deep learning underlies both domains, emphasizing robustness and reliability under limited or noisy data conditions. In the biomedical domain, a generative mask-guided architecture, named CALIMAR-GAN, is developed for metal artifact reduction in computed tomography (CT). The model preserves anatomical structures while improving realism metrics (e.g., Fréchet Inception Distance), including on real clinical data, demonstrating improved generalization compared with paired learning strategies. Building upon this foundation, a conditional generative framework (SG2Pix) is designed for photoacoustic image reconstruction directly from sinograms. The approach investigates different encoding strategies (reshaped, windowed, and Gramian Angular Field representations) as well as hybrid inputs that combine sinograms with back-projected (BP) images to embed physics-informed priors. A distinct supervised learning framework based on U-Net architectures is then proposed for quantitative photoacoustic imaging, exploiting multi-wavelength data for blood oxygenation (sO2) estimation and vascular segmentation. The results indicate that incorporating physics-informed priors improves both the reconstruction accuracy and the perceptual realism of the oxygenation maps. Finally, a real-time ultrasound (US) pipeline is implemented that streams B-mode frames from a US system to both a workstation and a HoloLens 2 headset. This framework integrates deep segmentation and automatic volumetric kidney measurements based on principal-component ellipsoid fitting, and provides hand- and voice-based interaction for clinician-in-the-loop usability.
In the industrial domain, a comprehensive survey of more than 220 studies on deep learning for surface-defect inspection is conducted. A two-dimensional taxonomy is introduced to relate recognition tasks and learning paradigms, revealing open challenges concerning data scarcity, explainability, and real-time applicability. Building on these insights, a systematic one-shot learning study is carried out, comparing a foundation model (DINOv2) with conventional CNN and ResNet18 architectures under multiple training regimes, including pure one-shot, augmented, and good-class-informed scenarios. The experiments highlight complementary strengths: ResNet18 exhibits higher robustness in genuine low-data settings, whereas DINOv2 achieves superior performance when richer supervision or contextual cues are available, confirming the potential of foundation models for adaptive and data-efficient industrial inspection. The dissertation is organized into two main parts: the first focuses on the biomedical domain, addressing artifact reduction, photoacoustic reconstruction, oxygenation estimation, and real-time US segmentation with augmented-reality visualization; the second focuses on the industrial domain, encompassing the literature survey on surface-defect inspection and experimental analyses of one-shot defect classification with foundation and convolutional architectures.
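As an illustration of one of the sinogram encodings named in the abstract, the following is a minimal sketch of the standard Gramian Angular Field construction applied to a single one-dimensional signal. It is not taken from the thesis: the function name `gramian_angular_field`, the `kind` parameter, and the idea of encoding one sensor channel of a sinogram at a time are assumptions made for the example. The signal is rescaled to [-1, 1], each sample is mapped to an angle via arccos, and the image entries are the cosine of angle sums (or the sine of angle differences).

```python
import math


def gramian_angular_field(signal, kind="summation"):
    """Encode a 1-D signal as a Gramian Angular Field (list of lists).

    Hypothetical helper for illustration; how SG2Pix applies the encoding
    to sinograms is not specified in the abstract.
    """
    values = [float(v) for v in signal]
    if not values:
        raise ValueError("signal must contain at least one sample")

    # Rescale to [-1, 1] so that arccos is defined for every sample.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0.0:
        scaled = [0.0] * len(values)  # constant signal: map to the midpoint
    else:
        scaled = [2.0 * (v - lo) / span - 1.0 for v in values]
    # Guard against rounding slightly outside the arccos domain.
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]

    # Polar encoding: each sample becomes an angle in [0, pi].
    phi = [math.acos(s) for s in scaled]

    if kind == "summation":
        # GASF[i][j] = cos(phi_i + phi_j)
        return [[math.cos(a + b) for b in phi] for a in phi]
    if kind == "difference":
        # GADF[i][j] = sin(phi_i - phi_j)
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("kind must be 'summation' or 'difference'")


if __name__ == "__main__":
    # Toy channel with three samples; a real sinogram row would be much longer.
    image = gramian_angular_field([0.0, 1.0, 2.0])
    for row in image:
        print([round(v, 3) for v in row])
```

A signal of N samples yields an N×N image, so in practice a long sensor trace would probably be downsampled or windowed before encoding; that choice is likewise an assumption here rather than a detail from the thesis.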
English
Bevilacqua, Vitoantonio
Dotoli, Mariagrazia
Carli, Raffaele
Buongiorno, Domenico
Dotoli, Mariagrazia
Politecnico di Bari
Files in this item:
File: 38 ciclo-SCARDIGNO Roberto Maria.pdf
Access: open access
License: All rights reserved
Size: 15.59 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/354363
The NBN code of this thesis is URN:NBN:IT:POLIBA-354363