Search results for: partially observable markov decision process

Number of results: 1,776,231

2007
Judy Goldsmith Martin Mundhenk

It is known that determining whether a DEC-POMDP, namely, a cooperative partially observable stochastic game (POSG), has a cooperative strategy with positive expected reward is complete for NEXP. It was not known until now how cooperation affected that complexity. We show that, for competitive POSGs, the complexity of determining whether one team has a positive-expected-reward strategy is com...

2009
Camille Besse Brahim Chaib-draa

We study a subclass of POMDPs, called quasi-deterministic POMDPs (QDET-POMDPs), characterized by deterministic actions and stochastic observations. While this framework is less general than the full POMDP model, it still captures a number of interesting and challenging problems and, in some cases, has interesting properties. By studying the observability available in this subclass, w...

2011
Lucas Agussurja Hoong Chuin Lau

We consider a partially observable Markov decision process (POMDP) model for improving a taxi agent's cruising decisions in a congested urban city. Using real-world data provided by a large taxi company in Singapore as a guide, we derive the state transition function of the POMDP. Specifically, we model the cruising behavior of the drivers as continuous-time Markov chains. We then apply dynamic pr...

2009
Michael R. James Satinder P. Singh

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in partially observable Markov decision processes (POMDPs). Nevertheless, one can construct counterexamples, problems in which Sarsa(λ < 1) fails to find a good policy even though one exists. Despite this, these algorithms ...

2007
Ross Glashan Tomás Lozano-Pérez

We describe a method for planning under uncertainty for robotic manipulation of objects by partitioning the configuration space into a set of regions that are closed under compliant motions. These regions can be treated as states in a partially observable Markov decision process (POMDP), which can be solved to yield optimal control policies under uncertainty. We demonstrate the approa...

2001
Bo Zhang Qingsheng Cai Jianfeng Mao Baining Guo

Uncertainty plays a central role in spoken dialogue systems. Some stochastic models like the Markov decision process (MDP) are used to model the dialogue manager. But the partially observable system state and user intentions hinder the natural representation of the dialogue state. An MDP-based system degrades quickly when uncertainty about a user's intention increases. We propose a novel dialogu...

2012
Zongzhang Zhang Michael L. Littman Xiaoping Chen

Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number for the reachable belief space, which is a set of belief points that are reachable fr...

1999
Samuel P. M. Choi Dit-Yan Yeung Nevin Lianwen Zhang

Reinforcement learning in nonstationary environments is generally regarded as an important and yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes. A mode basic...

Journal: :Cognitive Systems Research 2017
Gavin Rens Deshendran Moodley

This article presents an agent architecture for controlling an autonomous agent in stochastic environments. The architecture combines the partially observable Markov decision process (POMDP) model with the belief-desire-intention (BDI) framework. The Hybrid POMDP-BDI agent architecture takes the best features from the two approaches, that is, the online generation of reward-maximizing courses o...

2001
Bo Zhang Qingsheng Cai Jianfeng Mao Eric Chang Baining Guo

Some stochastic models like the Markov decision process (MDP) are used to model the dialogue manager. An MDP-based system degrades quickly when uncertainty about the user's intention increases. We propose a novel dialogue model based on the partially observable Markov decision process (POMDP). We use hidden system states and user intentions as the state set, parser results and low-level information as the ob...

Chart of the number of search results per year
