Advancing Generative AI for Creative Partnership, Downstream Utility, and Data-Scarce Domains

DALL'ASEN, NICOLA
2026

Abstract

This thesis explores diffusion models through three pillars: human-AI collaboration, downstream applications, and data scarcity. First, we address interactive generative processes. Current text-driven models offer limited fine-grained control. We introduce Collaborative Neural Painting (CNP), which reframes image generation as sequential stroke-based painting. This enables users to guide and collaborate with an AI iteratively, centering the human creator in the artistic process. Second, we investigate practical applications. For medical imaging, we develop MAMBO, a high-resolution mammography model that generates realistic images to augment datasets, improving breast cancer classification. Additionally, we leverage diffusion mechanics for unsupervised video anomaly detection, using reconstruction error on motion representations to identify anomalous events. Third, we confront data scarcity across several scenarios. We present CoRE, a training-free approach that enhances zero-shot classification with vision-language models (VLMs) through contextual retrieval. For few-shot learning, we propose DISEF, which combines the synthesis of diverse in-domain images for dataset augmentation with efficient VLM fine-tuning. We also introduce Chamfer Guidance, a training-free inference technique that steers models toward better distributional coverage relative to a few real examples, improving generation diversity and downstream classifier performance. Collectively, this thesis advances generative AI from a passive tool to an active collaborator, a practical problem-solver, and an effective solution in data-constrained environments.
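As a rough illustration of the quantity that gives Chamfer Guidance its name, the sketch below computes the standard symmetric Chamfer distance between a set of generated feature vectors and a few real ones. This is the textbook definition only, not necessarily the formulation used in the thesis; the function name `chamfer_distance` and the assumption that samples are already embedded as feature vectors are hypothetical choices made for the example.

```python
import numpy as np

def chamfer_distance(generated, real):
    """Symmetric Chamfer distance between two sets of feature vectors.

    Illustrative only: textbook definition, not the thesis implementation.
    generated: array of shape (n_generated, d)
    real:      array of shape (n_real, d)
    """
    # Pairwise squared Euclidean distances, shape (n_generated, n_real).
    diff = generated[:, None, :] - real[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Each generated sample to its nearest real example.
    gen_to_real = sq_dist.min(axis=1).mean()
    # Each real example to its nearest generated sample (coverage).
    real_to_gen = sq_dist.min(axis=0).mean()
    return gen_to_real + real_to_gen
```

The second term grows when some real example has no nearby generated sample, which is one way to read the "distributional coverage" that the abstract mentions.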
17-Feb-2026
English
generative
diffusion
collaborative
downstream
data scarcity
Ricci, Elisa
Wang, Yiming
Files in this item:
File: phd_thesis_dallasen_nicola_final_pdfa.pdf
Access: open access
License: Creative Commons
Size: 166.07 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/359110
The NBN code of this thesis is URN:NBN:IT:UNIPI-359110