Sequential skew-symmetric posterior approximations

DOLMETA, PATRIC
2025

Abstract

Deterministic approximations of analytically intractable posterior distributions are a common tool in Bayesian analysis. However, accurate extensions of these methods to situations in which data stream in rapidly and sequentially are still under-explored. In this thesis, we fill this gap by deriving a general and provably accurate skew-symmetric approximation of a target posterior, whose parameters can be evaluated via novel window-type estimators that make computations effectively online. This is accomplished via a specific treatment of third-order Taylor expansions around an online estimate of the maximum a posteriori (MAP). This perspective enhances scalability while ensuring accuracy improvements, both in theory and in practice, over the Laplace approximation. Following a comprehensive theoretical discussion of the conditions under which improved convergence rates to the target posterior can be achieved, we apply this new methodology to bandit problems. Our focus is on generalized linear bandits, a variant of contextual reinforcement learning in which a transformation of the expected reward is linearly predicted by a feature vector. Bayesian solutions to the reward-maximization problem in structured bandits often face severe computational challenges when updating, and eventually sampling from, the posterior distributions in the online context. We suggest a hybrid approach to Thompson sampling, leveraging a recent closed-form posterior result combined with the precise skew-symmetric approximation as an alternative to existing approaches.
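To make the setting concrete, the following is a minimal sketch of the Laplace-based Thompson sampling baseline that the thesis improves upon, instantiated for a logistic (Bernoulli-reward) generalized linear bandit. It is an illustration under stated assumptions, not the thesis's skew-symmetric method: the function names, the logistic link, and the Gaussian prior precision are all illustrative choices.

```python
import numpy as np

def laplace_posterior(X, y, prior_prec=1.0, iters=25):
    """Laplace approximation to a Bayesian logistic-regression posterior:
    Newton iterations locate the MAP, then the posterior is approximated
    by a Gaussian whose covariance is the inverse negative Hessian at the mode."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ theta))              # predicted success probabilities
        grad = X.T @ (y - p) - prior_prec * theta         # log-posterior gradient
        H = X.T @ (X * (p * (1 - p))[:, None]) + prior_prec * np.eye(d)
        theta = theta + np.linalg.solve(H, grad)          # Newton step toward the MAP
    return theta, np.linalg.inv(H)

def thompson_step(arms, X_hist, y_hist, rng):
    """One round of Thompson sampling: draw a parameter from the Gaussian
    approximation and pull the arm maximizing the sampled linear predictor."""
    mode, cov = laplace_posterior(X_hist, y_hist)
    theta = rng.multivariate_normal(mode, cov)
    return int(np.argmax(arms @ theta))
```

The skew-symmetric approximation studied in the thesis would replace the symmetric Gaussian draw with a skewed one built from third-order information at the online MAP estimate, at comparable per-round cost.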
Date: 31 January 2025
Language: English
Supervisors: DURANTE, DANIELE; PAPASPILIOPOULOS, OMIROS
Institution: Università Bocconi
Files in this item:
Thesis_Dolmeta_Patric.pdf — open access — 2.82 MB — Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/190589
The NBN code of this thesis is URN:NBN:IT:UNIBOCCONI-190589