Dynamic Programming for Partially Observable Stochastic Games

Authors

  • Eric A. Hansen
  • Daniel S. Bernstein
  • Shlomo Zilberstein
Abstract

We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterative elimination of dominated strategies in normal form games. We prove that it iteratively eliminates very weakly dominated strategies without first forming the normal form representation of a finite-horizon POSG. This is the first dynamic programming algorithm for iterative strategy elimination in these types of games. For the special case in which agents share the same payoffs, the algorithm can be used to find an optimal solution. We present preliminary empirical results and discuss ways to further exploit POMDP theory in solving POSGs.
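To illustrate the elimination step the abstract refers to, here is a minimal sketch of iterated elimination of very weakly dominated strategies in a small two-player normal-form game. It is not the paper's algorithm: it assumes the normal form is already given as explicit payoff matrices (which the paper's method specifically avoids building), and it only checks dominance by pure strategies. The names `very_weakly_dominated` and `iterated_elimination` are hypothetical, chosen for this example.

```python
def very_weakly_dominated(payoff, mine, theirs, s):
    """True if another remaining strategy does at least as well as s
    against every remaining opponent strategy (no strict gain required)."""
    return any(
        all(payoff(t, o) >= payoff(s, o) for o in theirs)
        for t in mine
        if t != s
    )


def iterated_elimination(A, B):
    """A[i][j] and B[i][j] are the row and column player's payoffs.

    Assumption: pure-strategy dominance only, strategies removed one at a
    time so that at least one copy of any set of equivalent strategies
    survives. Returns the surviving row and column indices.
    """
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # Try to remove one dominated row strategy.
        for s in rows:
            if len(rows) > 1 and very_weakly_dominated(
                lambda i, j: A[i][j], rows, cols, s
            ):
                rows.remove(s)
                changed = True
                break
        # Try to remove one dominated column strategy.
        for s in cols:
            if len(cols) > 1 and very_weakly_dominated(
                lambda j, i: B[i][j], cols, rows, s
            ):
                cols.remove(s)
                changed = True
                break
    return rows, cols


# Prisoner's dilemma: index 0 = cooperate, index 1 = defect.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
print(iterated_elimination(A, B))
```

Because very weak dominance does not require a strict improvement, the set of survivors can depend on the order in which strategies are removed; the sketch simply removes the first dominated strategy it finds.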

Related resources

Dynamic Programming Approximations for Partially Observable Stochastic Games

Partially observable stochastic games (POSGs) provide a rich mathematical framework for planning under uncertainty by a group of agents. However, this modeling advantage comes with a price, namely a high computational cost. Solving POSGs optimally quickly becomes intractable after a few decision cycles. Our main contribution is to provide bounded approximation techniques, which enable us to sca...

Solving a Two-Period Cooperative Advertising Problem Using Dynamic Programming

Cooperative advertising is a cost-sharing mechanism in which a part of retailers' advertising investments are financed by the manufacturers. In recent years, investment among advertising options has become a difficult marketing issue. In this paper, the cooperative advertising problem with advertising options is investigated in a two-period horizon in which the market share in the second period...

Point-based Dynamic Programming for DEC-POMDPs

We introduce point-based dynamic programming (DP) for decentralized partially observable Markov decision processes (DEC-POMDPs), a new discrete DP algorithm for planning strategies for cooperative multi-agent systems. Our approach makes a connection between optimal DP algorithms for partially observable stochastic games, and point-based approximations for single-agent POMDPs. We show for the fir...

Best-response play in partially observable card games

We address the problem of how to play optimally against a fixed opponent in a two-player card game with partial information like poker. A game theoretic approach to this problem would specify a pair of stochastic policies that are best-responses to each other, i.e., a Nash equilibrium. Although such a Nash-optimal policy guarantees a lower bound to the attainable payoff against any opponent, it ...

Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems

For controlled discrete-time stochastic processes we introduce a new class of dynamic risk measures, which we call process-based. Their main features are that they measure risk of processes that are functions of the history of the base process. We introduce a new concept of conditional stochastic time consistency and we derive the structure of process-based risk measures enjoying this property....

Publication date: 2004