Concurrent reinforcement learning as a rehearsal for decentralized planning under uncertainty

Authors

  • Landon Kraemer
  • Bikramjit Banerjee
Abstract

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. This assumption may not always be necessary and may make learning difficult. We propose a novel RL approach in which agents rehearse with information that will not be available during policy execution, yet learn policies that do not explicitly rely on this information. We show experimentally that incorporating such information can ease the difficulties faced by non-rehearsal-based learners, and demonstrate fast, (near) optimal performance on many existing benchmark Dec-POMDP problems.
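The core idea above — let agents exploit privileged information (e.g., the hidden state) while learning, but decay its influence so the final policy depends only on locally observable information — can be illustrated with a minimal single-agent Q-learning sketch. This is an illustrative toy, not the authors' exact algorithm: the two-state domain, the 80%-accurate observation model, and the linear decay schedule are all assumptions made for the example.

```python
import random
from collections import defaultdict

random.seed(0)

STATES = ["left", "right"]
ACTIONS = ["stay", "switch"]

def observe(state):
    # Noisy local observation: matches the hidden state 80% of the time.
    return state if random.random() < 0.8 else ("right" if state == "left" else "left")

def step(state, action):
    # Toy dynamics: "switch" toggles the hidden state; reward for ending in "right".
    next_state = state if action == "stay" else ("right" if state == "left" else "left")
    reward = 1.0 if next_state == "right" else 0.0
    return next_state, reward

q_state = defaultdict(float)   # Q over (hidden state, action): rehearsal table
q_obs   = defaultdict(float)   # Q over (observation, action): execution table

alpha, gamma, episodes = 0.1, 0.9, 2000
for ep in range(episodes):
    lam = 1.0 - ep / episodes  # weight on the rehearsal table, decayed to zero
    state = random.choice(STATES)
    for _ in range(10):
        obs = observe(state)
        # While learning, act on a blend of the privileged and local estimates.
        def blended(a):
            return lam * q_state[(state, a)] + (1 - lam) * q_obs[(obs, a)]
        if random.random() < 0.1:
            action = random.choice(ACTIONS)  # epsilon-greedy exploration
        else:
            action = max(ACTIONS, key=blended)
        next_state, reward = step(state, action)
        next_obs = observe(next_state)
        # Update each table toward its own bootstrapped target.
        target_s = reward + gamma * max(q_state[(next_state, a)] for a in ACTIONS)
        q_state[(state, action)] += alpha * (target_s - q_state[(state, action)])
        target_o = reward + gamma * max(q_obs[(next_obs, a)] for a in ACTIONS)
        q_obs[(obs, action)] += alpha * (target_o - q_obs[(obs, action)])
        state = next_state

# The policy kept for execution consults only the observation table,
# so the privileged state never has to be available at run time.
policy = {o: max(ACTIONS, key=lambda a: q_obs[(o, a)]) for o in STATES}
print(policy)
```

Because the rehearsal weight reaches zero before learning ends, the observation-only table has been trained to act without the crutch, which is the essence of rehearsal: ease early learning with information that will be withheld at execution.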


Similar articles

Rehearsal-Based Multi-agent Reinforcement Learning of Decentralized Plans

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Reinforcement learning (RL) based approaches have been recently proposed for distributed solution of Dec-POMDPs ...


Reinforcement Learning for Decentralized Planning Under Uncertainty (Doctoral Consortium)

Decentralized partially-observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. But in real world scenarios, model parameters may not be known a priori, or may be difficult to specify. We prop...


Optimizing decentralized production–distribution planning problem in a multi-period supply chain network under uncertainty

Decentralized supply chain management is found to be significantly relevant in today’s competitive markets. Production and distribution planning is posed as an important optimization problem in supply chain networks. Here, we propose a multi-period decentralized supply chain network model with uncertainty. The imprecision related to uncertain parameters like demand and price of the final produc...


Decentralized Planning for Self-Adaptation in Multi-cloud Environment

The runtime management of Internet of Things (IoT) oriented applications deployed in multi-clouds is a complex issue due to the highly heterogeneous and dynamic execution environment. To effectively cope with such an environment, the cross-layer and multi-cloud effects should be taken into account and a decentralized self-adaptation is a promising solution to maintain and evolve the application...


Large-Scale Planning Under Uncertainty: A Survey

Our research area is planning under uncertainty, that is, making sequences of decisions in the face of imperfect information. We are particularly concerned with developing planning algorithms that perform well in large, real-world domains. This paper is a brief introduction to this area of research, which draws upon results from operations research (Markov decision processes), machine learning ...



Journal:

Volume   Issue

Pages  -

Publication date: 2013