Study, design, and development of intelligent systems for automatic diagnostics support

Scardigno, Roberto Maria

This doctoral research covers the study, design, and development of intelligent systems for automatic diagnostic support in biomedical and industrial imaging. A unified deep learning methodology underlies both domains, with emphasis on robustness and reliability under limited or noisy data. In the biomedical domain, a generative mask-guided architecture, named CALIMAR-GAN, is developed for metal artifact reduction in computed tomography (CT). The model preserves anatomical structures while improving realism metrics (e.g., Fréchet Inception Distance), including on real clinical data, and generalizes better than paired learning strategies. Building on this foundation, a conditional generative framework (SG2Pix) is designed for photoacoustic image reconstruction directly from sinograms. The approach investigates different encoding strategies (reshaped, windowed, and Gramian Angular Field representations) as well as hybrid inputs that combine sinograms with back-projected (BP) images to embed physics-informed priors. A distinct supervised learning framework based on U-Net architectures is then proposed for quantitative photoacoustic imaging, exploiting multi-wavelength data for blood oxygenation (sO2) estimation and vascular segmentation. The results indicate that incorporating physics-informed priors improves both the reconstruction accuracy and the perceptual realism of the oxygenation maps. Finally, a real-time ultrasound (US) pipeline is implemented that streams B-mode frames from a US system to both a workstation and a HoloLens 2 headset. This pipeline integrates deep segmentation with automatic volumetric kidney measurements based on principal-component ellipsoid fitting, and provides hand- and voice-based interaction for clinician-in-the-loop usability.
In the industrial domain, a comprehensive survey of more than 220 studies on deep learning for surface-defect inspection is conducted. A two-dimensional taxonomy is introduced to relate recognition tasks and learning paradigms, revealing open challenges concerning data scarcity, explainability, and real-time applicability. Building on these insights, a systematic one-shot learning study compares a foundation model (DINOv2) with conventional CNN and ResNet18 architectures under multiple training regimes, including pure one-shot, augmented, and good-class-informed scenarios. The experiments highlight complementary strengths: ResNet18 is more robust in genuine low-data settings, whereas DINOv2 performs better when richer supervision or contextual cues are available, which supports the potential of foundation models for adaptive and data-efficient industrial inspection. The dissertation is organized into two main parts. The first focuses on the biomedical domain, addressing artifact reduction, photoacoustic reconstruction, oxygenation estimation, and real-time US segmentation with augmented-reality visualization. The second focuses on the industrial domain, comprising the literature survey on surface-defect inspection and experimental analyses of one-shot defect classification with foundation and convolutional architectures.
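The abstract names Gramian Angular Field representations as one of the sinogram encodings but does not describe how they are computed. As a rough illustration only, the sketch below shows the commonly used summation variant (GASF) applied to a single 1-D trace: the signal is rescaled to [-1, 1], each sample is mapped to an angle via arccos, and entry (i, j) of the image is cos(phi_i + phi_j). The function name, the example trace, and the choice of the summation variant and min-max normalization are assumptions made for this sketch, not details taken from the thesis.

```python
import math
from typing import List, Sequence


def gramian_angular_summation_field(signal: Sequence[float]) -> List[List[float]]:
    """Encode a 1-D signal as a Gramian Angular Summation Field (GASF).

    Hypothetical helper for illustration; not the thesis implementation.
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A constant signal has no dynamic range; map every sample to 0.
        scaled = [0.0 for _ in signal]
    else:
        # Min-max rescale to [-1, 1] so that arccos is defined.
        scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
    # Clamp against floating-point overshoot, then take the polar angle.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    # GASF[i][j] = cos(phi_i + phi_j): a symmetric n x n image.
    return [[math.cos(a + b) for b in phi] for a in phi]


# Made-up trace standing in for one transducer channel of a sinogram.
trace = [0.0, 0.5, 1.0, 0.5, 0.0]
gasf = gramian_angular_summation_field(trace)
```

Under these assumptions, each channel of length n becomes an n x n image whose diagonal encodes the rescaled amplitude (2x² - 1) and whose off-diagonal entries encode pairwise temporal relations, which is what makes the representation usable as input to an image-to-image network.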

2026



Faculty/Department: Dipartimento di Ingegneria Elettrica e dell'Informazione
Degree program: Autonomous Systems
Publication date: 2026
Language: English
Supervisor, Advisor, or Tutor: Bevilacqua, Vitoantonio; Dotoli, Mariagrazia; Carli, Raffaele; Buongiorno, Domenico
Co-supervisor, Co-examiner, Co-tutor, or Coordinators: Dotoli, Mariagrazia
Publisher: Politecnico di Bari
Collection: Politecnico di Bari

Files in this item:

File	Size	Format
38 ciclo-SCARDIGNO Roberto Maria.pdf (open access; license: all rights reserved)	15.59 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/354363

The NBN code of this thesis is URN:NBN:IT:POLIBA-354363