Papers
arxiv:2303.17110

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

Published on Mar 30, 2023
Authors:
,
,
,
,
,
,

Abstract

We study contextual combinatorial bandits with probabilistically triggered arms (C^2MAB-T) under a variety of smoothness conditions that capture a wide range of applications, such as contextual cascading bandits and contextual influence maximization bandits. Under the triggering probability modulated (TPM) condition, we devise the C^2-UCB-T algorithm and propose a novel analysis that achieves an O(dKT) regret bound, removing a potentially exponentially large factor O(1/p_{min}), where d is the dimension of contexts, p_{min} is the minimum positive probability that any arm can be triggered, and batch-size K is the maximum number of arms that can be triggered per round. Under the variance modulated (VM) or triggering probability and variance modulated (TPVM) conditions, we propose a new variance-adaptive algorithm VAC^2-UCB and derive a regret bound O(dT), which is independent of the batch-size K. As a valuable by-product, our analysis technique and variance-adaptive algorithm can be applied to the CMAB-T and C^2MAB setting, improving existing results there as well. We also include experiments that demonstrate the improved performance of our algorithms compared with benchmark algorithms on synthetic and real-world datasets.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2303.17110 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2303.17110 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2303.17110 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.