Game of Thrones: Fully Distributed Learning for Multiplayer Bandits

Authors

Abstract

We consider an N-player multiarmed bandit game in which each player chooses one out of M arms for T turns. Each player has different expected rewards for the arms, and the instantaneous rewards are independent and identically distributed or Markovian. When two or more players choose the same arm, they all receive zero reward. Performance is measured using the expected sum of regrets compared with the optimal assignment of arms to players that maximizes the sum of expected rewards. We assume that each player only knows that player's own actions and the reward that player received each turn. Players cannot observe the actions of other players, and no communication between players is possible. We present a distributed algorithm and prove that it achieves an expected sum of regrets of near-[Formula: see text]. This is the first algorithm to achieve a near order optimal regret in this fully distributed scenario. All other works have assumed that either all players have the same vector ...
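The collision rule and the regret benchmark described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it only models the setting, using a hypothetical expected-reward matrix `mu` (one row per player, one column per arm), a brute-force search for the optimal assignment, and uniformly random play as a stand-in strategy. The function names are assumptions made for illustration, and regret is computed from expected rewards rather than sampled ones.

```python
import itertools
import random


def optimal_assignment(mu):
    """Brute-force the collision-free assignment with the largest sum of
    expected rewards. mu[n][m] is player n's expected reward for arm m.
    Assumes at least as many arms as players."""
    n_players, n_arms = len(mu), len(mu[0])
    best_value, best_arms = float("-inf"), None
    for arms in itertools.permutations(range(n_arms), n_players):
        value = sum(mu[n][a] for n, a in enumerate(arms))
        if value > best_value:
            best_value, best_arms = value, arms
    return best_value, best_arms


def expected_round_reward(mu, choices):
    """Sum of expected rewards for one turn. Any arm chosen by two or
    more players yields zero reward for all of them."""
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    return sum(mu[n][a] for n, a in enumerate(choices) if counts[a] == 1)


def regret_of_uniform_play(mu, horizon, seed=0):
    """Accumulated gap to the optimal assignment when every player picks
    an arm uniformly at random each turn (illustrative baseline only)."""
    rng = random.Random(seed)
    best_value, _ = optimal_assignment(mu)
    n_arms = len(mu[0])
    regret = 0.0
    for _ in range(horizon):
        choices = [rng.randrange(n_arms) for _ in mu]
        regret += best_value - expected_round_reward(mu, choices)
    return regret


# Hypothetical example: 2 players, 3 arms.
mu = [[0.9, 0.2, 0.1],
      [0.8, 0.7, 0.3]]
best_value, best_arms = optimal_assignment(mu)
```

In this example both players prefer arm 0, so the optimal assignment has to send the second player to arm 1. Uniform random play accumulates regret linearly in the horizon, which is the behavior a learning algorithm in this setting aims to avoid.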

Similar resources

Balance of thrones: a network study on 'Game of Thrones'

How do social interactions in fictional worlds? Here we show that network theory can be used to systematically and quantitatively analyse relationships among noble houses and how the web of alliances and conflicts changes over time in the fantasy drama TV series Game of Thrones. Network analysis proved to be a powerful tool in capturing structures and dynamics of the story. Degree distribution a...

Detecting Cheaters in a Distributed Multiplayer Game

Cheating is currently a major problem in today’s multiplayer games. One of the most popular types of cheating involves having the client software render information which is not in the player’s current field of view. This type of cheating may allow a player to see their opponents through walls, or to see their opponents on a radar, or at extreme distances, when they would normally not be able t...

Asynchronous Real-time Multiplayer Game With Distributed State

Real-time multiplayer games are complex systems that often have a single point of failure and are not scalable. In this work a prototype design is created to handle node failure during game simulation. The client server paradigm is modified to construct a distributed server at each node. Propagation of gamestate is performed across nodes keeping each node up to date. Node failure is handled gra...

Game of Thrones: Accommodating Monetary Policies in a Monetary Union

In this paper we present an application of the dynamic tracking games framework to a monetary union. We use a small stylized nonlinear three-country macroeconomic model of a monetary union to analyse the interactions between fiscal (governments) and monetary (common central bank) policy makers, assuming different objective functions of these decision makers. Using the OPTGAME algorithm ...

Journal

Journal title: Mathematics of Operations Research

Year: 2021

ISSN: 0364-765X, 1526-5471

DOI: https://doi.org/10.1287/moor.2020.1051