This thesis investigates the design and analysis of distributed and delayed online learning algorithms. First, we introduce delayed online learning, where model updates rely on feedback arriving with variable delays. We study the online learning problem with curved losses and delayed feedback, designing algorithms that exploit loss curvature to achieve improved guarantees with delayed feedback. Furthermore, we demonstrate how intermediate observations can mitigate the deleterious effects of delayed feedback in settings with partial feedback (e.g., multi-armed bandits) by developing a meta-algorithm that achieves near-optimal regret with significantly reduced sensitivity to total delay. Second, we consider distributed online convex optimization over communication graphs, in which a network of agents cooperatively minimizes a global convex loss function expressed as the sum of local loss functions, using only neighbor-to-neighbor exchanges and local computation. We propose and rigorously analyze a distributed online convex optimization algorithm that accommodates both random communication and stochastic agent availability, where two agents communicate only when both are simultaneously active. We then present a unified algorithmic framework that simultaneously addresses network decentralization and delayed feedback in distributed online convex optimization. We design distributed online learning algorithms that adapt to unknown, time- and agent-varying delays while maintaining near-optimal regret guarantees. Finally, transitioning from adversarial to stochastic regimes, we extend this framework to distributed stochastic multi-armed bandit settings over random communication graphs. We derive improved regret bounds that combine the optimal centralized regret with a natural term depending on the graph's algebraic connectivity and edge probability.

DISTRIBUTED AND DELAYED ONLINE LEARNING

QIU, HAO
2025

Abstract

This thesis investigates the design and analysis of distributed and delayed online learning algorithms. First, we introduce delayed online learning, where model updates rely on feedback arriving with variable delays. We study the online learning problem with curved losses and delayed feedback, designing algorithms that exploit loss curvature to achieve improved guarantees with delayed feedback. Furthermore, we demonstrate how intermediate observations can mitigate the deleterious effects of delayed feedback in settings with partial feedback (e.g., multi-armed bandits) by developing a meta-algorithm that achieves near-optimal regret with significantly reduced sensitivity to total delay. Second, we consider distributed online convex optimization over communication graphs, in which a network of agents cooperatively minimizes a global convex loss function expressed as the sum of local loss functions, using only neighbor-to-neighbor exchanges and local computation. We propose and rigorously analyze a distributed online convex optimization algorithm that accommodates both random communication and stochastic agent availability, where two agents communicate only when both are simultaneously active. We then present a unified algorithmic framework that simultaneously addresses network decentralization and delayed feedback in distributed online convex optimization. We design distributed online learning algorithms that adapt to unknown, time- and agent-varying delays while maintaining near-optimal regret guarantees. Finally, transitioning from adversarial to stochastic regimes, we extend this framework to distributed stochastic multi-armed bandit settings over random communication graphs. We derive improved regret bounds that combine the optimal centralized regret with a natural term depending on the graph's algebraic connectivity and edge probability.
9-dic-2025
Inglese
CESA BIANCHI, NICOLO' ANTONIO
SASSI, ROBERTO
Università degli Studi di Milano
219
File in questo prodotto:
File Dimensione Formato  
phd_unimi_R13958.pdf

accesso aperto

Licenza: Tutti i diritti riservati
Dimensione 18.25 MB
Formato Adobe PDF
18.25 MB Adobe PDF Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/352540
Il codice NBN di questa tesi è URN:NBN:IT:UNIMI-352540