Variable Selection Via Thompson Sampling
Authors
Abstract
Thompson sampling is a heuristic algorithm for the multi-armed bandit problem which has a long tradition in machine learning. The algorithm has a Bayesian spirit in the sense that it selects arms based on posterior samples of the reward probabilities of each arm. By forging a connection between combinatorial binary bandits and spike-and-slab variable selection, we propose a stochastic optimization approach to subset selection called Thompson Variable Selection (TVS). TVS is a framework for interpretable machine learning which does not rely on the underlying model to be linear. TVS brings together Bayesian reinforcement and machine learning in order to extend the reach of Bayesian subset selection to nonparametric models and large datasets with very many predictors and/or very many observations. Depending on the choice of a reward, TVS can be deployed in offline as well as online setups with streaming data batches. Tailoring multiplay bandits to variable selection, we provide regret bounds without necessarily assuming that the arm mean rewards are unrelated. We show strong empirical performance on both simulated and real data. Unlike deterministic optimization methods for spike-and-slab variable selection, the stochastic nature of TVS makes it less prone to local convergence and thereby more robust.
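The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes one Beta(1, 1) posterior per predictor, plays every arm whose posterior sample exceeds 0.5, and updates each played arm with a binary reward. The function name `thompson_variable_selection`, the parameters `threshold` and `t_cut`, and the reward rule (a least-squares t-statistic cutoff) are all hypothetical stand-ins chosen for brevity; the abstract leaves the reward open and does not require a linear model.

```python
import numpy as np

def thompson_variable_selection(X, y, n_rounds=300, threshold=0.5, t_cut=3.0, seed=0):
    """Toy multi-play Thompson sampling over predictors (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    a = np.ones(p)  # Beta parameter counting rewards of 1 (prior Beta(1, 1))
    b = np.ones(p)  # Beta parameter counting rewards of 0
    for _ in range(n_rounds):
        theta = rng.beta(a, b)                       # one posterior sample per arm
        active = np.flatnonzero(theta > threshold)   # arms played this round
        if active.size == 0:
            continue
        # Assumed reward: fit least squares on the played subset and give
        # an arm reward 1 when its |t-statistic| exceeds t_cut.
        Xs = X[:, active]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ coef
        sigma2 = resid @ resid / max(n - active.size, 1)
        cov = sigma2 * np.linalg.pinv(Xs.T @ Xs)
        t_stat = np.abs(coef) / np.sqrt(np.diag(cov) + 1e-12)
        reward = (t_stat > t_cut).astype(float)
        # Conjugate Beta-Bernoulli update, only for the arms that were played.
        a[active] += reward
        b[active] += 1.0 - reward
    return a / (a + b)  # posterior mean inclusion probability per predictor
```

Predictors whose returned posterior mean exceeds 0.5 would be reported as selected. In an online setup one could compute the reward on each incoming data batch instead of the full dataset, and for a nonparametric model the t-statistic rule would be swapped for a reward derived from that model.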
Similar resources
Variable Selection Via Gibbs Sampling
Portfolio Blending via Thompson Sampling
As a definitive investment guideline for institutions and individuals, Markowitz’s modern portfolio theory is ubiquitous in the financial industry. However, its noticeably poor out-of-sample performance, due to the inaccurate estimation of parameters, has prompted unremitting efforts to find effective remedies. One common retrofit that blends portfolios from disparate investment perspectives has r...
Stochastic Regret Minimization via Thompson Sampling
The Thompson Sampling (TS) policy is a widely implemented algorithm for the stochastic multiarmed bandit (MAB) problem. Given a prior distribution over possible parameter settings of the underlying reward distributions of the arms, at each time instant, the policy plays an arm with probability equal to the probability that this arm has largest mean reward conditioned on the current posterior di...
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling
We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel. Our theoretical analysis shows that a direct application of the sequential Thompson sampling algorithm in either synchronous or asynchronous parallel settings yields a surprisingly powerful resul...
Variable Selection by Perfect Sampling
Variable selection is very important in many fields, and for its resolution many procedures have been proposed and investigated. Among them are Bayesian methods that use Markov chain Monte-Carlo (MCMC) sampling algorithms. A problem with MCMC sampling, however, is that it cannot guarantee that the samples are exactly from the target distributions. This drawback is overcome by related methods kn...
Journal
Journal title: Journal of the American Statistical Association
Year: 2021
ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445
DOI: https://doi.org/10.1080/01621459.2021.1928514