Concurrent reinforcement learning as a rehearsal for decentralized planning under uncertainty

Authors

  • Landon Kraemer
  • Bikramjit Banerjee
Abstract

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. This assumption may not always be necessary and may make learning difficult. We propose a novel RL approach in which agents rehearse with information that will not be available during policy execution, yet learn policies that do not explicitly rely on this information. We show experimentally that incorporating such information can ease the difficulties faced by non-rehearsal-based learners, and demonstrate fast, (near) optimal performance on many existing benchmark Dec-POMDP problems.
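The core idea above — let agents exploit privileged information (e.g., the hidden state) while learning, but decay its influence so the final policy depends only on locally observable information — can be illustrated with a minimal single-agent Q-learning sketch. This is an illustrative toy, not the authors' exact algorithm: the two-state domain, the 80%-accurate observation model, and the linear decay schedule are all assumptions made for the example.

```python
import random
from collections import defaultdict

random.seed(0)

STATES = ["left", "right"]
ACTIONS = ["stay", "switch"]

def observe(state):
    # Noisy local observation: matches the hidden state 80% of the time.
    return state if random.random() < 0.8 else ("right" if state == "left" else "left")

def step(state, action):
    # Toy dynamics: "switch" toggles the hidden state; reward for ending in "right".
    next_state = state if action == "stay" else ("right" if state == "left" else "left")
    reward = 1.0 if next_state == "right" else 0.0
    return next_state, reward

q_state = defaultdict(float)   # Q over (hidden state, action): rehearsal table
q_obs   = defaultdict(float)   # Q over (observation, action): execution table

alpha, gamma, episodes = 0.1, 0.9, 2000
for ep in range(episodes):
    lam = 1.0 - ep / episodes  # weight on the rehearsal table, decayed to zero
    state = random.choice(STATES)
    for _ in range(10):
        obs = observe(state)
        # While learning, act on a blend of the privileged and local estimates.
        def blended(a):
            return lam * q_state[(state, a)] + (1 - lam) * q_obs[(obs, a)]
        if random.random() < 0.1:
            action = random.choice(ACTIONS)  # epsilon-greedy exploration
        else:
            action = max(ACTIONS, key=blended)
        next_state, reward = step(state, action)
        next_obs = observe(next_state)
        # Update each table toward its own bootstrapped target.
        target_s = reward + gamma * max(q_state[(next_state, a)] for a in ACTIONS)
        q_state[(state, action)] += alpha * (target_s - q_state[(state, action)])
        target_o = reward + gamma * max(q_obs[(next_obs, a)] for a in ACTIONS)
        q_obs[(obs, action)] += alpha * (target_o - q_obs[(obs, action)])
        state = next_state

# The policy kept for execution consults only the observation table,
# so the privileged state never has to be available at run time.
policy = {o: max(ACTIONS, key=lambda a: q_obs[(o, a)]) for o in STATES}
print(policy)
```

Because the rehearsal weight reaches zero before learning ends, the observation-only table has been trained to act without the crutch, which is the essence of rehearsal: ease early learning with information that will be withheld at execution.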


Similar articles

Rehearsal-Based Multi-agent Reinforcement Learning of Decentralized Plans

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs ...


Reinforcement Learning for Decentralized Planning Under Uncertainty (Doctoral Consortium)

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. But in real world scenarios, model parameters may not be known a priori, or may be difficult to specify. We prop...


Optimizing decentralized production–distribution planning problem in a multi-period supply chain network under uncertainty

Decentralized supply chain management is found to be significantly relevant in today’s competitive markets. Production and distribution planning is posed as an important optimization problem in supply chain networks. Here, we propose a multi-period decentralized supply chain network model with uncertainty. The imprecision related to uncertain parameters like demand and price of the final produc...


Decentralized Planning for Self-Adaptation in Multi-cloud Environment

The runtime management of Internet of Things (IoT) oriented applications deployed in multi-clouds is a complex issue due to the highly heterogeneous and dynamic execution environment. To effectively cope with such an environment, the cross-layer and multi-cloud effects should be taken into account and a decentralized self-adaptation is a promising solution to maintain and evolve the application...


Large-Scale Planning Under Uncertainty: A Survey

Our research area is planning under uncertainty, that is, making sequences of decisions in the face of imperfect information. We are particularly concerned with developing planning algorithms that perform well in large, real-world domains. This paper is a brief introduction to this area of research, which draws upon results from operations research (Markov decision processes), machine learning ...



Journal:

Volume   Issue

Pages  -

Publication date: 2013