Parametric Bandits: The Generalized Linear Case
Authors
Abstract
We consider structured multi-armed bandit problems based on the Generalized Linear Model (GLM) framework of statistics. For these bandits, we propose a new algorithm, called GLM-UCB. We derive finite time, high probability bounds on the regret of the algorithm, extending previous analyses developed for the linear bandits to the non-linear case. The analysis highlights a key difficulty in generalizing linear bandit algorithms to the non-linear case, which is solved in GLM-UCB by focusing on the reward space rather than on the parameter space. Moreover, as the actual effectiveness of current parameterized bandit algorithms is often poor in practice, we provide a tuning method based on asymptotic arguments, which leads to significantly better practical performance. We present two numerical experiments on real-world data that illustrate the potential of the GLM-UCB approach.
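The idea of adding the exploration bonus in reward space (on top of the predicted mean reward) rather than building a confidence set around the parameter can be illustrated with a minimal sketch. This is not the paper's exact algorithm or tuning: it assumes a logistic link, uses a ridge-regularized Newton fit as a stand-in estimator, and treats the bonus scale `rho`, the regularizer `reg`, and all function names as hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic link (assumed here; other GLM links are possible)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, reg=1.0, iters=25):
    """Ridge-regularized logistic fit by Newton's method (stand-in estimator)."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) + reg * theta
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + reg * np.eye(d)
        theta -= np.linalg.solve(hess, grad)
    return theta

def ucb_indices(arms, theta, M_inv, rho):
    """Predicted mean reward plus a bonus, both expressed in reward space."""
    mean = sigmoid(arms @ theta)
    # width[i] = sqrt(a_i^T M_inv a_i), the norm of arm i in the design metric
    width = np.sqrt(np.einsum("ij,jk,ik->i", arms, M_inv, arms))
    return mean + rho * width

def run(arms, theta_star, horizon=300, rho=0.5, reg=1.0, seed=0):
    """Simulate Bernoulli rewards and play the arm with the largest index."""
    rng = np.random.default_rng(seed)
    d = arms.shape[1]
    M = reg * np.eye(d)              # regularized design matrix
    X, y = [], []
    theta = np.zeros(d)
    pulls = np.zeros(len(arms), dtype=int)
    for _ in range(horizon):
        idx = ucb_indices(arms, theta, np.linalg.inv(M), rho)
        a = int(np.argmax(idx))
        reward = float(rng.random() < sigmoid(arms[a] @ theta_star))
        X.append(arms[a])
        y.append(reward)
        M += np.outer(arms[a], arms[a])
        theta = fit_logistic(np.array(X), np.array(y), reg)
        pulls[a] += 1
    return pulls, theta
```

Calling `run(arms, theta_star)` with an array of arm feature vectors and a hidden parameter returns the pull counts per arm and the final estimate. A constant `rho` is used only to keep the sketch short; in practice the bonus scale would typically grow slowly with the round index.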
Similar resources
The Negative Binomial Distribution Efficiency in Finite Mixture of Semi-parametric Generalized Linear Models
Introduction: Selecting an appropriate statistical model for the response variable is one of the most important problems in finite mixtures of generalized linear models. The Poisson distribution is one that causes problems in a finite mixture of semi-parametric generalized statistical models. In this paper, to overcome overdispersion and computational burden, finite ...
Provably Optimal Algorithms for Generalized Linear Contextual Bandits
Contextual bandits are widely used in Internet services, from news recommendation to advertising and Web search. Generalized linear models (logistic regression in particular) have demonstrated stronger performance than linear models in many applications where rewards are binary. However, most theoretical analyses of contextual bandits so far address linear bandits. In this work, we propose ...
Scalable Generalized Linear Bandits: Online Computation and Hashing
Generalized Linear Bandits (GLBs), a natural extension of stochastic linear bandits, have been popular and successful in recent years. However, existing GLBs scale poorly with the number of rounds and the number of arms, limiting their utility in practice. This paper proposes new, scalable solutions to the GLB problem in two respects. First, unlike existing GLBs, whose per-timestep space and...
THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)
Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...
Restless Bandits, Partial Conservation Laws and Indexability
We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being det...