Harnessing the capabilities of Generative Models

MARIANI, GIORGIO
2024

Abstract

Generative models have advanced significantly in recent years, driven by the introduction of models such as Stable Diffusion, GPT-3, ChatGPT, and many others. These models are designed to learn probability distributions and to sample from them efficiently at inference time, typically conditioned on inputs such as text. Because they are trained on large volumes of unlabeled data, they acquire extensive knowledge that can be transferred to specific tasks. In this thesis, we show how they can be harnessed for a variety of tasks across different domains, including reasoning, image processing, and music generation. In particular, we explore diverse methodologies for guiding the generation process of a learned model so that it better suits the task at hand.
16 September 2024
English
RODOLA', EMANUELE
MANCINI, MAURIZIO
Università degli Studi di Roma "La Sapienza"
105
Files in this item:
File: Tesi_dottorato_Mariani.pdf
Access: open access
Size: 4.84 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/184104
The NBN code of this thesis is URN:NBN:IT:UNIROMA1-184104