Search results for: partially observable markov decision process

Number of results: 1,776,231

1997
Ronen I. Brafman

Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture both goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve: exact methods cannot handle problems with much more than 10 states, so approximate methods must be...
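
To make the "stochastic action and sensor descriptions" above concrete, a discrete POMDP can be packaged as arrays over states, actions, and observations. The following Python sketch uses array conventions of our own choosing; the names T, Z, R and their shapes are illustrative, not taken from the cited paper:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class POMDP:
    """Discrete POMDP; array conventions are illustrative:
    T[a, s, s'] = P(s' | s, a)   (stochastic action model)
    Z[a, s', o] = P(o | s', a)   (stochastic sensor model)
    R[a, s]     = expected immediate reward for action a in state s
    """
    T: np.ndarray          # shape (|A|, |S|, |S|)
    Z: np.ndarray          # shape (|A|, |S|, |O|)
    R: np.ndarray          # shape (|A|, |S|)
    gamma: float = 0.95    # discount factor

    def __post_init__(self):
        # Every transition and observation row must be a distribution.
        assert np.allclose(self.T.sum(axis=2), 1.0)
        assert np.allclose(self.Z.sum(axis=2), 1.0)
```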

2007
Dorna Kashef Haghighi, Doina Precup, Joelle Pineau, Prakash Panangaden

We consider the problem of learning the behavior of a POMDP (Partially Observable Markov Decision Process) with deterministic actions and observations. This is a challenging problem because the observations can only partially identify the states. Recent work by Holmes and Isbell offers an approach for inferring the hidden states from experience in deterministic POMDP environments. ...
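
In the deterministic setting this abstract considers, the action and sensor models are functions rather than distributions, and the observation function is typically many-to-one, which is exactly why observations only partially identify the hidden state. A toy sketch, with all names and the dictionary encoding hypothetical:

```python
# Deterministic POMDP: transitions and observations are functions.
# `trans` maps (state, action) -> next state; `obs_of` maps state -> observation.
def step(state, action, trans, obs_of):
    next_state = trans[(state, action)]
    return next_state, obs_of[next_state]

# Two hidden states that emit the same observation cannot be told
# apart in a single step; the agent must disambiguate over time.
trans = {(0, "go"): 1, (1, "go"): 0}
obs_of = {0: "wall", 1: "wall"}       # many-to-one: observation aliasing
print(step(0, "go", trans, obs_of))   # (1, 'wall')
```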

2015
Haibo Wang, Hanna Kurniawati, Surya P. N. Singh, Mandyam V. Srinivasan

It is now widely accepted that a variety of interaction strategies in animals achieve optimal or near-optimal performance. The challenge is in determining the performance criteria being optimized. A difficulty in overcoming this challenge is the need for a large body of observational data to delineate hypotheses, which can be tedious and time-consuming, if not impossible. To alleviate this diff...

2005
Masoumeh T. Izadi, Doina Precup

Partially Observable Markov Decision Processes (POMDPs) provide a standard framework for sequential decision making in stochastic environments. In this setting, an agent takes actions and receives observations and rewards from the environment. Many POMDP solution methods are based on computing a belief state, which is a probability distribution over the possible states the agent could be in. T...
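
The belief state described here is maintained with a Bayes-filter update. A minimal sketch, assuming the T[a, s, s'] / Z[a, s', o] array conventions from the container sketched earlier (all names illustrative):

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """One Bayes-filter step: b'(s') ∝ Z[a, s', o] * Σ_s T[a, s, s'] b(s)."""
    predicted = b @ T[a]                   # push belief through the action model
    unnormalized = predicted * Z[a][:, o]  # weight by observation likelihood
    total = unnormalized.sum()             # = P(o | b, a)
    if total == 0.0:
        raise ValueError("observation impossible under this belief and model")
    return unnormalized / total

# Toy 2-state example: action 0 tends to keep the state, and
# observation 1 is more likely in state 1.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]]])
b = belief_update(np.array([0.5, 0.5]), a=0, o=1, T=T, Z=Z)
print(b)   # posterior shifted toward state 1
```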

Journal: CoRR, 2001
Paat Rusmevichientong, Benjamin Van Roy

We consider a partially observable Markov decision problem (POMDP) that models a class of sequencing problems. Although POMDPs are typically intractable, our formulation admits a tractable solution. Instead of maintaining a value function over a high-dimensional set of belief states, we reduce the state space to one of smaller dimension, in which grid-based dynamic programming techniques are effe...
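
Grid-based dynamic programming of the kind mentioned here replaces the continuous belief simplex with a finite set of grid points and interpolates values between them. The sketch below uses nearest-neighbor lookup, the crudest interpolation, and is a generic illustration rather than the paper's dimension-reduction scheme:

```python
import numpy as np
from itertools import product

def simplex_grid(n_states, resolution):
    """All belief vectors whose coordinates are multiples of 1/resolution."""
    pts = [np.array(c) / resolution
           for c in product(range(resolution + 1), repeat=n_states)
           if sum(c) == resolution]
    return np.array(pts)

def grid_value_iteration(T, Z, R, gamma, beliefs, iters=50):
    """Value iteration restricted to grid belief points.
    Off-grid successor beliefs fall back to the nearest grid point."""
    n_a, n_o = T.shape[0], Z.shape[2]
    V = np.zeros(len(beliefs))
    nearest = lambda b: np.argmin(np.linalg.norm(beliefs - b, axis=1))
    for _ in range(iters):
        V_new = np.empty_like(V)
        for i, b in enumerate(beliefs):
            q = np.zeros(n_a)
            for a in range(n_a):
                predicted = b @ T[a]
                q[a] = b @ R[a]                        # expected immediate reward
                for o in range(n_o):
                    p_o = predicted @ Z[a][:, o]       # P(o | b, a)
                    if p_o > 1e-12:
                        b_next = predicted * Z[a][:, o] / p_o
                        q[a] += gamma * p_o * V[nearest(b_next)]
            V_new[i] = q.max()
        V = V_new
    return V
```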

2014
Christopher Amato, George D. Konidaris, Leslie P. Kaelbling

Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent’s actions are primitive operations lasting exactly one time step. We address the case where each agent has macro-actions: temporally extended actions which may requ...
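
A macro-action in the option style carries an initiation condition, an internal policy over primitive actions, and a termination condition. One plausible encoding of that interface is sketched below; the names and the env_step callback are hypothetical, and this is not the paper's Dec-POMDP formalism:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

History = List[Tuple[int, int]]   # (primitive action, observation) pairs

@dataclass
class MacroAction:
    name: str
    can_start: Callable[[History], bool]  # initiation condition
    act: Callable[[History], int]         # local history -> primitive action
    is_done: Callable[[History], bool]    # termination condition

def run_macro(env_step, history: History, macro: MacroAction) -> History:
    """Execute one macro-action to termination. `env_step` is a
    hypothetical callback mapping a primitive action to an observation."""
    assert macro.can_start(history)
    while not macro.is_done(history):
        a = macro.act(history)
        o = env_step(a)
        history = history + [(a, o)]
    return history
```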

2008
Yaodong Ni, Zhi-Qiang Liu

The POMDP is considered a powerful model for planning under uncertainty. However, it is usually impractical to model real-life situations precisely with a POMDP whose parameters are exact, for reasons such as the limited data available for learning the model. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter...
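
When parameters are imprecise but bounded, each transition row is known only up to elementwise lower and upper bounds, and robust dynamic programming repeatedly solves an inner worst-case problem over those bounds. A minimal sketch of that inner step, our own construction rather than the paper's framework:

```python
import numpy as np

def worst_case_row(lo, hi, values):
    """Among distributions p with lo <= p <= hi and sum(p) = 1, return
    the one minimizing p @ values: fill mass greedily onto the
    lowest-value successors first (a standard robust-DP subroutine)."""
    assert lo.sum() <= 1.0 <= hi.sum() and (lo <= hi).all()
    p = lo.copy()
    slack = 1.0 - p.sum()
    for s in np.argsort(values):          # lowest-value states first
        add = min(hi[s] - p[s], slack)
        p[s] += add
        slack -= add
    return p

# Example: bounds around a nominal row; the adversary shifts mass
# toward the successor with the lowest value.
lo = np.array([0.1, 0.2, 0.1])
hi = np.array([0.6, 0.6, 0.6])
print(worst_case_row(lo, hi, values=np.array([0.0, 5.0, 10.0])))
# -> [0.6, 0.3, 0.1]
```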

2014
Daniel T. Barry, Jennifer Barry, Scott Aaronson

We present quantum observable Markov decision processes (QOMDPs), the quantum analogs of partially observable Marko...

Journal: Annals OR, 2015
Yanling Chang, Alan L. Erera, Chelsea C. White

The leader-follower partially observed, multi-objective Markov game (LF-POMG) models a sequential decision-making situation with two intelligent and adaptive decision makers, a leader and a follower. Each can choose actions that affect the dynamics of the system, and each selects its actions on the basis of current and past, but possibly inaccurate, state observations. The decisio...

2012
Michael L. Littman

In the field of reinforcement learning (Sutton and Barto, 1998; Kaelbling et al., 1996), agents interact with an environment to learn how to act so as to maximize reward. Two kinds of environment model dominate the literature: Markov Decision Processes, or MDPs (Puterman, 1994; Littman et al., 1995), and their partially observable counterpart, POMDPs (White, 1991; Kaelbling et al., 1998). B...
