This thesis investigates the design and analysis of distributed and delayed online learning algorithms. First, we introduce delayed online learning, where model updates rely on feedback arriving with variable delays. We study the online learning problem with curved losses and delayed feedback, designing algorithms that exploit loss curvature to achieve improved guarantees with delayed feedback. Furthermore, we demonstrate how intermediate observations can mitigate the deleterious effects of delayed feedback in settings with partial feedback (e.g., multi-armed bandits) by developing a meta-algorithm that achieves near-optimal regret with significantly reduced sensitivity to total delay. Second, we consider distributed online convex optimization over communication graphs, in which a network of agents cooperatively minimizes a global convex loss function expressed as the sum of local loss functions, using only neighbor-to-neighbor exchanges and local computation. We propose and rigorously analyze a distributed online convex optimization algorithm that accommodates both random communication and stochastic agent availability, where two agents communicate only when both are simultaneously active. We then present a unified algorithmic framework that simultaneously addresses network decentralization and delayed feedback in distributed online convex optimization. We design distributed online learning algorithms that adapt to unknown, time- and agent-varying delays while maintaining near-optimal regret guarantees. Finally, transitioning from adversarial to stochastic regimes, we extend this framework to distributed stochastic multi-armed bandit settings over random communication graphs. We derive improved regret bounds that combine the optimal centralized regret with a natural term depending on the graph's algebraic connectivity and edge probability.
DISTRIBUTED AND DELAYED ONLINE LEARNING
QIU, HAO
2025
Abstract
This thesis investigates the design and analysis of distributed and delayed online learning algorithms. First, we introduce delayed online learning, where model updates rely on feedback arriving with variable delays. We study the online learning problem with curved losses and delayed feedback, designing algorithms that exploit loss curvature to achieve improved guarantees with delayed feedback. Furthermore, we demonstrate how intermediate observations can mitigate the deleterious effects of delayed feedback in settings with partial feedback (e.g., multi-armed bandits) by developing a meta-algorithm that achieves near-optimal regret with significantly reduced sensitivity to total delay. Second, we consider distributed online convex optimization over communication graphs, in which a network of agents cooperatively minimizes a global convex loss function expressed as the sum of local loss functions, using only neighbor-to-neighbor exchanges and local computation. We propose and rigorously analyze a distributed online convex optimization algorithm that accommodates both random communication and stochastic agent availability, where two agents communicate only when both are simultaneously active. We then present a unified algorithmic framework that simultaneously addresses network decentralization and delayed feedback in distributed online convex optimization. We design distributed online learning algorithms that adapt to unknown, time- and agent-varying delays while maintaining near-optimal regret guarantees. Finally, transitioning from adversarial to stochastic regimes, we extend this framework to distributed stochastic multi-armed bandit settings over random communication graphs. We derive improved regret bounds that combine the optimal centralized regret with a natural term depending on the graph's algebraic connectivity and edge probability.| File | Dimensione | Formato | |
|---|---|---|---|
|
phd_unimi_R13958.pdf
accesso aperto
Licenza:
Tutti i diritti riservati
Dimensione
18.25 MB
Formato
Adobe PDF
|
18.25 MB | Adobe PDF | Visualizza/Apri |
I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/20.500.14242/352540
URN:NBN:IT:UNIMI-352540