arXiv:2310.18701

Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards

Published on Oct 28, 2023

Abstract

This paper investigates the problem of generalized linear bandits with heavy-tailed rewards, whose (1+ε)-th moment is bounded for some ε ∈ (0, 1]. Although there exist methods for generalized linear bandits, most of them focus on bounded or sub-Gaussian rewards and are not well suited to many real-world scenarios, such as financial markets and web advertising. To address this issue, we propose two novel algorithms based on truncation and mean of medians. These algorithms achieve an almost optimal regret bound of O(dT^{1/(1+ε)}), where d is the dimension of the contextual information and T is the time horizon. Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches. Additionally, our mean-of-medians-based algorithm requires only O(log T) rewards and one estimator per epoch, making it more practical. Moreover, our algorithms improve the regret bounds by a logarithmic factor compared to existing algorithms when ε = 1. Numerical experimental results confirm the merits of our algorithms.
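The two estimators the abstract names are standard tools for heavy-tailed mean estimation. The sketch below is a minimal, hypothetical illustration of both on a scalar reward stream, not the paper's algorithms: the threshold schedule in `truncated_mean` and the group count `k` in `mean_of_medians` are assumptions chosen for demonstration, and the paper applies these ideas inside a generalized linear bandit, not to a raw sample mean.

```python
# Hypothetical sketch: generic truncation and mean-of-medians estimators
# for a heavy-tailed reward stream with bounded (1+eps)-th moment.
# Not the paper's code; the paper's threshold schedule may differ.
import numpy as np

def truncated_mean(rewards, eps, b=1.0, delta=0.01):
    """Clip the t-th reward at a threshold growing with t, then average.
    The schedule (b * t / log(1/delta))^(1/(1+eps)) is one common choice
    for noise with (1+eps)-th moment bounded by b."""
    rewards = np.asarray(rewards, dtype=float)
    t = np.arange(1, len(rewards) + 1)
    thresholds = (b * t / np.log(1.0 / delta)) ** (1.0 / (1.0 + eps))
    return np.clip(rewards, -thresholds, thresholds).mean()

def mean_of_medians(rewards, k):
    """Split the stream into k groups, take each group's median,
    then average the k medians ('mean of medians')."""
    rewards = np.asarray(rewards, dtype=float)
    return float(np.mean([np.median(g) for g in np.array_split(rewards, k)]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Pareto noise: heavy-tailed, with finite (1+eps)-th moment for eps < 0.8.
    samples = 1.0 + rng.pareto(1.8, size=10_000)
    print("naive mean:      ", samples.mean())
    print("truncated mean:  ", truncated_mean(samples, eps=0.5))
    print("mean of medians: ", mean_of_medians(samples, k=32))
```

On this kind of data the naive sample mean is dragged upward by rare huge draws, while both robust estimators stay close to the true mean; the mean-of-medians variant needs only the k group medians per update, which matches the abstract's point that one estimator per epoch suffices.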
