Generalized Bandit Problems
Author
Abstract
The questions addressed in this paper grew out of my work with Jeff Banks on bandit problems and their applications (Banks and Sundaram (1992a, 1992b, 1994)) and owe much to many discussions I had with him on this subject. I also had the benefit of several discussions with Andy McLennan, especially regarding the material in Sections 4 and 6 of this paper.
Related resources
The Exploration vs Exploitation Trade-Off in Bandit Problems: An Empirical Study
We compare well-known action selection policies used in reinforcement learning, such as ε-greedy and softmax, with lesser-known ones, such as the Gittins index and the knowledge gradient, on bandit problems. The latter two perform very well in comparison. Moreover, the knowledge gradient can be generalized to problems other than bandit problems.
Enhancing Evolutionary Optimization in Uncertain Environments by Allocating Evaluations via Multi-armed Bandit Algorithms
Optimization problems with uncertain fitness functions are common in the real world, and present unique challenges for evolutionary optimization approaches. Existing issues include excessively expensive evaluation, lack of solution reliability, and incapability in maintaining high overall fitness during optimization. Using conversion rate optimization as an example, this paper proposes a series...
Parametric Bandits: The Generalized Linear Case
We consider structured multi-armed bandit problems based on the Generalized Linear Model (GLM) framework of statistics. For these bandits, we propose a new algorithm, called GLM-UCB. We derive finite time, high probability bounds on the regret of the algorithm, extending previous analyses developed for the linear bandits to the non-linear case. The analysis highlights a key difficulty in genera...
Four proofs of Gittins' multiarmed bandit theorem
We study four proofs that the Gittins index priority rule is optimal for alternative bandit processes. These include Gittins’ original exchange argument, Weber’s prevailing charge argument, Whittle’s Lagrangian dual approach, and Bertsimas and Niño-Mora’s proof based on the achievable region approach and generalized conservation laws. We extend the achievable region proof to infinite countable ...
Online Linear Optimization with Sparsity Constraints
We study the problem of online linear optimization with sparsity constraints in the semi-bandit setting. It can be seen as a marriage between two well-known problems: the online linear optimization problem and the combinatorial bandit problem. For this problem, we provide two algorithms which are efficient and achieve sublinear regret bounds. Moreover, we extend our results to two gener...