Latent alignment techniques enable modular policies in the context of reinforcement learning
RICCIARDI, ANTONIO PIO
2025
Abstract
Visual Reinforcement Learning is a popular and powerful framework that fully leverages recent breakthroughs in Deep Learning. However, variations in input domains (e.g., changes in background colors due to seasonal shifts) or task domains (e.g., modifying a car's target speed) can degrade agent performance, often requiring retraining for each variation. Recent advances in representation learning have demonstrated the potential to combine components from different neural networks to construct new models in a zero-shot fashion. In this dissertation, we build upon these advances and adapt them to the Visual Reinforcement Learning setting, enabling the composition of agent components to form new agents capable of handling novel visual-task combinations not seen during training. This is achieved by establishing communication between encoders and controllers from different models trained under distinct variations. Our findings highlight the promise of model reuse, significantly reducing the need for retraining and thereby cutting down on both time and computational cost.
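The composition described above (an encoder from one model driving a controller from another) can be illustrated with a minimal numerical sketch. All names here (`encoder_a`, `stitched_policy`, the linear stand-ins for the networks, the anchor set) are illustrative assumptions, not the dissertation's actual API; the latent relation is modeled as a simple rotation and the alignment is fit by least squares over paired anchor observations.

```python
# Hypothetical sketch: zero-shot "stitching" of an encoder trained under one
# variation with a controller trained under another, via a linear latent
# alignment. Linear maps stand in for the trained neural components.
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for two independently trained components.
W_enc_a = rng.normal(size=(16, 8))   # encoder A: observation -> latent space A
W_ctrl_b = rng.normal(size=(8, 4))   # controller B: latent space B -> action

# A fixed rotation plays the role of the (unknown) relation between the two
# latent spaces; in practice it is induced by differences in training.
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))

def encoder_a(obs):
    return obs @ W_enc_a

def latent_b(obs):
    # What encoder B *would* produce for the same underlying state.
    return encoder_a(obs) @ Q

# Fit the alignment T: latent A -> latent B from a few paired "anchor"
# observations available to both models, via least squares.
anchors = rng.normal(size=(64, 16))
T, *_ = np.linalg.lstsq(encoder_a(anchors), latent_b(anchors), rcond=None)

# Stitched agent: encoder A -> alignment -> controller B, with no retraining.
def stitched_policy(obs):
    return encoder_a(obs) @ T @ W_ctrl_b

# Sanity check: actions match the oracle composition through latent space B.
test_obs = rng.normal(size=(5, 16))
oracle = latent_b(test_obs) @ W_ctrl_b
assert np.allclose(stitched_policy(test_obs), oracle, atol=1e-6)
```

In this toy setting the alignment recovers the rotation exactly, so the stitched policy reproduces the oracle composition; with real nonlinear encoders the alignment is only approximate, which is precisely the regime the dissertation studies.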
File | Size | Format
---|---|---
Tesi_dottorato_Ricciardi.pdf (open access) | 10.7 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/212551
URN:NBN:IT:UNIROMA1-212551