Metadata only
Date
2021Type
- Conference Paper
ETH Bibliography
yes
Altmetrics
Abstract
We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also adversarial attacks subject to a suitable budget C (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows C and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case C = 0, while incurring additional additive terms respectively having a linear and quadratic dependency on C in general. We present algorithm-independent lower bounds showing that these additive terms are nearoptimal. In addition, in a contextual setting, we revisit a setup of diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing C. Show more
Publication status
publishedExternal links
Book title
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)Journal / series
Proceedings of Machine Learning ResearchVolume
Pages / Article No.
Publisher
PMLREvent
Organisational unit
03908 - Krause, Andreas / Krause, Andreas
Funding
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)
19-2 FEL-47 - Robust Sample-Efficient Learning when Data ist Costly (ETHZ)
More
Show all metadata
ETH Bibliography
yes
Altmetrics