Interactive and Controlled Visual Content Generation

Peruzzo, Elia
2025

Abstract

The rapid expansion of the creative economy highlights the need for generative tools that empower users to produce high-quality, engaging digital content efficiently. Although advances in deep learning have significantly reduced the barriers to content creation, generative models in the visual domain often lack intuitive control and customization mechanisms, limiting their applicability in artistic and professional workflows. This thesis explores novel methodologies to enhance user-driven interaction with generative models, with a focus on improving customization, control, and creative freedom across images and videos. We propose a series of contributions addressing these challenges. First, we introduce Interactive Neural Painting, a framework that decomposes images into sequences of brushstrokes, enabling granular control and iterative collaboration between users and generative systems. Second, we present PAIR Diffusion, a comprehensive image editing framework that integrates diverse tasks such as inpainting, object addition, and shape modification within a single model, emphasizing seamless and interactive customization. Lastly, we extend these principles to video editing with VASE, which brings object-level control to the video domain, and RAGME, which incorporates externally retrieved motion patterns to enhance the realism and controllability of generated video dynamics. Through these contributions, we advocate for a shift from the traditional input-output paradigm, positioning users as central participants in the creative process. By integrating retrieval-based mechanisms, interactive interfaces, and object-level control, this thesis bridges technical innovation with user-centric design, fostering a more inclusive and empowering creative economy.
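To make the brushstroke decomposition concrete, the minimal sketch below shows one plausible way to represent an image as an ordered list of parameterized strokes and re-render it by compositing them onto a canvas. Everything in it is an assumption made for illustration: the Stroke fields (quadratic Bezier control points, width, color), the soft-disc rasterizer, and the compositing loop are not taken from the thesis, which predicts stroke parameters with a learned model rather than hand-coding them.

    """Illustrative sketch of a brushstroke decomposition (not the thesis code)."""
    from dataclasses import dataclass

    import numpy as np


    @dataclass
    class Stroke:
        # Hypothetical parameterization: a quadratic Bezier curve plus
        # thickness and color, all in normalized canvas coordinates.
        p0: tuple          # first control point, (x, y) in [0, 1]^2
        p1: tuple          # middle control point
        p2: tuple          # last control point
        width: float       # stroke radius, as a fraction of canvas size
        color: tuple       # (r, g, b), each in [0, 1]


    def render_alpha(stroke: Stroke, size: int = 128) -> np.ndarray:
        """Rasterize one stroke into a soft alpha mask of shape (size, size)."""
        ys, xs = np.mgrid[0:size, 0:size] / size
        alpha = np.zeros((size, size), dtype=np.float32)
        # Sample points along the Bezier curve and stamp a soft disc at each.
        for t in np.linspace(0.0, 1.0, 64):
            bx = (1 - t) ** 2 * stroke.p0[0] + 2 * (1 - t) * t * stroke.p1[0] + t ** 2 * stroke.p2[0]
            by = (1 - t) ** 2 * stroke.p0[1] + 2 * (1 - t) * t * stroke.p1[1] + t ** 2 * stroke.p2[1]
            d2 = (xs - bx) ** 2 + (ys - by) ** 2
            alpha = np.maximum(alpha, np.exp(-d2 / (2 * stroke.width ** 2)))
        return alpha


    def paint(strokes, size: int = 128) -> np.ndarray:
        """Alpha-composite strokes, in order, onto a white canvas."""
        canvas = np.ones((size, size, 3), dtype=np.float32)
        for s in strokes:
            a = render_alpha(s, size)[..., None]
            canvas = (1 - a) * canvas + a * np.asarray(s.color, dtype=np.float32)
        return canvas


    if __name__ == "__main__":
        demo = [
            Stroke((0.1, 0.8), (0.5, 0.1), (0.9, 0.8), width=0.03, color=(0.8, 0.2, 0.2)),
            Stroke((0.1, 0.2), (0.5, 0.9), (0.9, 0.2), width=0.02, color=(0.1, 0.3, 0.8)),
        ]
        image = paint(demo)
        print(image.shape, image.min(), image.max())

In the interactive setting the abstract describes, a user could append, remove, or recolor entries in the stroke list and re-render the canvas at each step; a stroke-level decomposition is what makes this kind of granular, iterative control possible.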
Date: 28 March 2025
Language: English
Supervisor: Sebe, Niculae
Università degli Studi di Trento, Trento
File: phd_unitn_Peruzzo_Elia.pdf (open access, Adobe PDF, 51.91 MB)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/201129
The NBN code of this thesis is URN:NBN:IT:UNITN-201129