Advancing Generative AI for Creative Partnership, Downstream Utility, and Data-Scarce Domains
DALL'ASEN, NICOLA
2026
Abstract
This thesis explores diffusion models through three pillars: human-AI collaboration, downstream applications, and data scarcity. First, we address interactive generative processes. Current text-driven models offer limited fine-grained control. We introduce Collaborative Neural Painting (CNP), which reframes image generation as sequential stroke-based painting. This lets users guide and collaborate with an AI iteratively, keeping the human creator at the center of the artistic process. Second, we investigate practical applications. For medical imaging, we develop MAMBO, a high-resolution mammography model that generates realistic images to augment datasets, improving breast cancer classification. We also leverage diffusion mechanics for unsupervised video anomaly detection, using reconstruction error on motion representations to identify anomalous events. Third, we confront data scarcity across several scenarios. We present CoRE, a training-free approach that enhances zero-shot classification with vision-language models (VLMs) through contextual retrieval. For few-shot learning, we propose DISEF, which synthesizes diverse in-domain images for dataset augmentation and combines them with efficient VLM fine-tuning. Finally, we introduce Chamfer Guidance, a training-free inference technique that steers models toward better distributional coverage relative to a few real examples, improving generation diversity and downstream classifier performance. Collectively, this thesis advances generative AI from a passive tool to an active collaborator, a practical problem-solver, and an effective solution in data-constrained environments.

| File | Access | License | Size | Format |
|---|---|---|---|---|
| phd_thesis_dallasen_nicola_final_pdfa.pdf | Open access | Creative Commons | 166.07 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/359110
URN:NBN:IT:UNIPI-359110