The Budgeted Multi-armed Bandit Problem

Authors

  • Omid Madani
  • Daniel J. Lizotte
  • Russell Greiner
Abstract

The following coins problem is a version of the multi-armed bandit problem in which one has to select from among a set of objects, say classifiers, after an experimentation phase that is constrained by a time or cost budget. The question is how to spend the budget. The problem involves pure exploration only, which differentiates it from typical multi-armed bandit problems involving an exploration/exploitation tradeoff [BF85]. It is an abstraction of the following scenarios: choosing from among a set of alternative treatments after a fixed number of clinical trials; determining the best parameter settings for a program given a deadline that only allows a fixed number of runs; or choosing a life partner in the bachelor/bachelorette TV show, where time is limited. We are interested in the computational complexity of the coins problem and/or efficient algorithms with approximation guarantees.
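To make the setup concrete, here is a minimal sketch of the pure-exploration setting described above, under several assumptions that are not in the abstract: each coin is modeled as a Bernoulli arm with an unknown heads probability, every flip costs one unit of budget, and the budget is spent by a simple round-robin policy. Round-robin is only an illustrative baseline, not the authors' method, and the function name `round_robin_budgeted_selection` and its parameters are hypothetical.

```python
import random


def round_robin_budgeted_selection(head_probs, budget, seed=0):
    """Spend `budget` flips round-robin, then pick the coin that looks best.

    head_probs: true heads probabilities (hidden from the policy; used only
                to simulate flips). This Bernoulli model is an assumption.
    budget:     total number of flips allowed (unit cost per flip, assumed).
    """
    rng = random.Random(seed)
    n = len(head_probs)
    heads = [0] * n
    flips = [0] * n

    # Exploration phase: no reward is collected here, only observations.
    for t in range(budget):
        i = t % n  # round-robin allocation (illustrative baseline)
        flips[i] += 1
        if rng.random() < head_probs[i]:
            heads[i] += 1

    # Selection phase: posterior mean under a uniform Beta(1, 1) prior.
    estimates = [(heads[i] + 1) / (flips[i] + 2) for i in range(n)]
    chosen = max(range(n), key=lambda i: estimates[i])
    return chosen, flips, estimates


if __name__ == "__main__":
    chosen, flips, estimates = round_robin_budgeted_selection(
        [0.3, 0.5, 0.7], budget=30
    )
    print("flips per coin:", flips)
    print("estimates:", [round(e, 3) for e in estimates])
    print("selected coin:", chosen)
```

The question posed in the abstract is how to do better than such a fixed allocation: a policy may choose the next coin to flip based on the outcomes seen so far, and only the quality of the final selection matters.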

Similar resources

Budgeted Bandit Problems with Continuous Random Costs

We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confiden...

Finite budget analysis of multi-armed bandit problems

In the budgeted multi-armed bandit (MAB) problem, a player receives a random reward and pays a cost after pulling an arm, and cannot pull any more arms after running out of budget. In this paper, we give an extensive study of the upper confidence bound based algorithms and a greedy algorithm for budgeted MABs. We perform theoretical analysis on the proposed algorithms, and show that t...

Budgeted Learning, Part I: The Multi-Armed Bandit Case

We introduce and motivate the task of learning under a budget. We focus on a basic problem in this space: selecting the optimal bandit after a period of experimentation in a multi-armed bandit setting, where each experiment is costly, our total costs cannot exceed a fixed pre-specified budget, and there is no reward collection during the learning period. We address the computational complexity ...

Adaptive Budgeted Bandit Algorithms for Trust in a Supply-Chain Setting

Recently, an AAMAS Challenges & Visions paper identified several key components of comprehensive trust management that have been understudied by the research community [SEN13]. We believe that we can build on recent advances in closely related research in other sub-fields of AI and multiagent systems to address some of these issues. For example, the budgeted multi-armed bandit problem involves pull...

Budgeted Multi-Armed Bandit in Continuous Action Space

Multi-Armed Bandits (MABs) have been widely considered in the last decade to model settings in which an agent wants to learn the action providing the highest expected reward among a fixed set of available actions during the operational life of a system. Classical techniques provide solutions that minimize the regret due to learning in settings where selecting an arm has no cost. Though, in many...

Adaptive Budgeted Bandit Algorithms for Trust Development in a Supply-Chain

Recently, an AAMAS Challenges & Visions paper identified several key components of comprehensive trust management that have been understudied by the research community [13]. We believe that we can build on recent advances in closely related research in other sub-fields of AI and multiagent systems to address some of these issues. For example, the budgeted multi-armed bandit problem involves pulling...

Publication date: 2004