Study, design, and development of intelligent systems for automatic diagnostics support
SCARDIGNO, ROBERTO MARIA
2026
Abstract
This doctoral research addresses the study, design, and development of intelligent systems for automatic diagnostic support across biomedical and industrial imaging. A unified methodological framework based on deep learning underlies both domains, emphasizing robustness and reliability under limited or noisy data conditions. In the biomedical domain, a generative mask-guided architecture, named CALIMAR-GAN, is developed for metal artifact reduction in computed tomography (CT). The model preserves anatomical structures while improving realism metrics (e.g., Fréchet Inception Distance), including on real clinical data, and demonstrates better generalization than paired learning strategies. Building upon this foundation, a conditional generative framework (SG2Pix) is designed for photoacoustic image reconstruction directly from sinograms. The approach investigates different encoding strategies (reshaped, windowed, and Gramian Angular Field representations) as well as hybrid inputs that combine sinograms with back-projected (BP) images to embed physics-informed priors. A distinct supervised learning framework based on U-Net architectures is then proposed for quantitative photoacoustic imaging, exploiting multi-wavelength data for blood oxygenation (sO2) estimation and vascular segmentation. The results indicate that incorporating physics-informed priors improves both the reconstruction accuracy and the perceptual realism of the oxygenation maps. Finally, a real-time ultrasound (US) pipeline is implemented that streams B-mode frames from a US system to both a workstation and a HoloLens 2 headset. This framework integrates deep segmentation and automatic volumetric kidney measurements based on principal-component ellipsoid fitting, and provides hand- and voice-based interaction for clinician-in-the-loop usability. In the industrial domain, a comprehensive survey of more than 220 studies on deep learning for surface-defect inspection is conducted.
A two-dimensional taxonomy is introduced to relate recognition tasks and learning paradigms, revealing open challenges concerning data scarcity, explainability, and real-time applicability. Building on these insights, a systematic one-shot learning study is carried out, comparing a foundation model (DINOv2) with conventional CNN and ResNet18 architectures under multiple training regimes, including pure one-shot, augmented, and good-class-informed scenarios. The experiments highlight complementary strengths: ResNet18 exhibits higher robustness in genuine low-data settings, whereas DINOv2 achieves superior performance when richer supervision or contextual cues are available, confirming the potential of foundation models for adaptive and data-efficient industrial inspection. The dissertation is organized into two main parts: the first focuses on the biomedical domain, addressing artifact reduction, photoacoustic reconstruction, oxygenation estimation, and real-time US segmentation with augmented-reality visualization; the second focuses on the industrial domain, encompassing a comprehensive literature survey on surface-defect inspection and experimental analyses of one-shot defect classification with foundation and convolutional architectures.

| File | Access and license | Size | Format |
|---|---|---|---|
| 38 ciclo-SCARDIGNO Roberto Maria.pdf | Open access; all rights reserved | 15.59 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/354363
URN:NBN:IT:POLIBA-354363