نتایج جستجو برای: partially observable markov decision process
تعداد نتایج: 1776231 فیلتر نتایج به سال:
It is known that determinining whether a DEC-POMDP, namely, a cooperative partially observable stochastic game (POSG), has a cooperative strategy with positive expected reward is complete for NEXP. It was not known until now how cooperation affected that complexity. We show that, for competitive POSGs, the complexity of determining whether one team has a positive-expected-reward strategy is com...
We study a subclass of POMDPs, called quasi-deterministic POMDPs (QDET-POMDPs), characterized by deterministic actions and stochastic observations. While this framework does not model the same general problems as POMDPs, they still capture a number of interesting and challenging problems and, in some cases, have interesting properties. By studying the observability available in this subclass, w...
We consider a partially observable Markov decision process (POMDP) model for improving a taxi agent cruising decision in a congested urban city. Using real-world data provided by a large taxi company in Singapore as a guide, we derive the state transition function of the POMDP. Specifically, we model the cruising behavior of the drivers as continuous-time Markov chains. We then apply dynamic pr...
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in partially observable Markov decision processes (POMDPs). Nevertheless, one can construct counterexamples, problems in which Sarsa(λ < 1 ) fails to find a good policy even though one exists. Despite this, these algorithms ...
Abstract— We describe a method for planning under uncertainty for robotic manipulation of objects by partitioning the configuration space into a set of regions that are closed under compliant motions. These regions can be treated as states in a partially observable Markov decision process (POMDP), which can be solved to yield optimal control policies under uncertainty. We demonstrate the approa...
Uncertainty plays a central role in spoken dialogue systems. Some stochastic models like the Markov decision process (MDP) are used to model the dialogue manager. But the partially observable system state and user intentions hinder the natural representation of the dialogue state. A MDP-based system degrades quickly when uncertainty about a user's intention increases. We propose a novel dialogu...
Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number for the reachable belief space, which is a set of belief points that are reachable fr...
Reinforcement learning in nonstationary environments is generally regarded as an important and yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonsta-tionary environments. The environment model, called hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes. A mode basic...
This article presents an agent architecture for controlling an autonomous agent in stochastic environments. The architecture combines the partially observable Markov decision process (POMDP) model with the belief-desire-intention (BDI) framework. The Hybrid POMDP-BDI agent architecture takes the best features from the two approaches, that is, the online generation of reward-maximizing courses o...
Some stochastic models like Markov decision process (MDP) are used to model the dialogue manager. MDP-based system degrades fast when uncertainty about user’s intention increases. We propose a novel dialogue model based on the partially observable Markov decision process (POMDP). We use hidden system states and user intentions as the state set, parser results and low-level information as the ob...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید