Search results for: partially observable markov decision process
Number of results: 1,776,231 · Filter results by year:
When parts of the states in a goal POMDP are fully observable and some actions are deterministic it is possible to take advantage of these properties to efficiently generate approximate solutions. Actions that deterministically affect the fully observable component of the world state can be abstracted away and combined into macro actions, permitting a planner to converge more quickly. This proc...
We explore equivalence relations between states in Markov Decision Processes and Partially Observable Markov Decision Processes. We focus on two different equivalence notions: bisimulation (Givan et al., 2003) and a notion of trace equivalence, under which states are considered equivalent, roughly, if they generate the same conditional probability distributions over observation sequences (where th...
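Bisimulation can be computed by partition refinement: start with all states in one block and repeatedly split blocks whose members disagree on immediate reward or on the probability of transitioning into some block. A minimal sketch on a toy MDP (states, rewards, and transition probabilities here are illustrative, not from the paper):

```python
from itertools import product

# Toy MDP: s0 and s1 behave identically, s2 is an absorbing goal.
states = ["s0", "s1", "s2"]
actions = ["a"]
reward = {"s0": 0.0, "s1": 0.0, "s2": 1.0}
# P[(s, a)] maps successor state -> probability
P = {
    ("s0", "a"): {"s2": 1.0},
    ("s1", "a"): {"s2": 1.0},
    ("s2", "a"): {"s2": 1.0},
}

def block_prob(s, a, block):
    """Probability of moving from s into a given block of states under action a."""
    return sum(P[(s, a)].get(t, 0.0) for t in block)

def refine(partition):
    """One refinement step: split blocks whose members disagree on
    rewards or on block-transition probabilities."""
    new_partition = []
    for block in partition:
        groups = {}
        for s in block:
            sig = (reward[s],
                   tuple(block_prob(s, a, b)
                         for a, b in product(actions, partition)))
            groups.setdefault(sig, []).append(s)
        new_partition.extend(frozenset(g) for g in groups.values())
    return new_partition

partition = [frozenset(states)]
while True:
    refined = refine(partition)
    if set(refined) == set(partition):
        break
    partition = refined

print(sorted(sorted(b) for b in partition))  # -> [['s0', 's1'], ['s2']]
```

Here s0 and s1 end up in the same block: they are bisimilar, so a planner may treat them as one state.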
Partially observable Markov decision processes have been widely used to model real-world decision-making problems. In this paper, we provide a method that applies a slightly different variant of them, the Mixed Observability Markov Decision Process (MOMDP), to our problem. Basically, we aim at offering a behavioural model for the interaction of intelligent agents w...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov Decision Processes (POMDPs) that require long-term memories of past observations and actions. The approach involves estimating a policy gradient for an Actor through a Policy Gradient Critic which evaluates probabilit...
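PGAC itself is more than a snippet, but the score-function gradient it builds on can be shown in a few lines. A minimal REINFORCE-style sketch (a simpler stand-in for PGAC, with illustrative reward numbers) on a two-armed bandit, where a stochastic policy parameterised by one logit is nudged along ∇ log π(a) · reward:

```python
import math
import random

random.seed(0)

theta = 0.0                      # logit for choosing arm 1
rewards = {0: 0.2, 1: 0.8}       # success probability per arm (illustrative)
alpha = 0.1                      # learning rate

for step in range(2000):
    p1 = 1.0 / (1.0 + math.exp(-theta))      # policy: π(arm 1)
    a = 1 if random.random() < p1 else 0     # sample an action
    r = 1.0 if random.random() < rewards[a] else 0.0
    # d log π(a) / d theta for a Bernoulli(sigmoid(theta)) policy
    grad_log = (1.0 - p1) if a == 1 else -p1
    theta += alpha * grad_log * r            # score-function update

print(1.0 / (1.0 + math.exp(-theta)))  # probability of the better arm grows toward 1
```

PGAC extends this idea with a critic that evaluates the actor's limited-memory stochastic policy; the sketch above omits the critic and any memory of past observations.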
Assistant robots have received special attention from the research community in recent years. One of the main applications of these robots is to perform care tasks in indoor environments such as houses, nursing homes or hospitals, and therefore they need to be able to navigate robustly for long periods of time. This paper focuses on the navigation system of SIRA, a robotic assistant for elder...
We propose a novel approach to developing a dialogue model which is able to take into account aspects of the user's emotional state and act appropriately. The dialogue model uses a Partially Observable Markov Decision Process approach, with observations composed of the observed user's emotional state and action. A simple example of route navigation is explained to clarify our approach and ...
In this paper we introduce a novel approach to continual planning and control, called Dynamics Based Control (DBC). The approach is similar in spirit to the Actor-Critic [6] approach to learning and estimation-based differential regulators of classical control theory [12]. However, DBC is not a learning algorithm, nor can it be subsumed within models of standard control theory. We provide a gen...
We describe a project, still in development, on Intelligent Tutoring Systems and optimal educational actions. Good pedagogical actions are key components in all learning-teaching schemes, and automating them is an important objective. (Research in Computing Science 111 (2016), pp. 33–45; received 2016-03-08; accepted 2016-05-05.) We propose applying Partially Observable Markov Decis...
Artificial Intelligence techniques were primarily focused on domains in which, at each point in time, the state of the world is known to the system. Such domains can be modeled as a Markov Decision Process (MDP). Action and planning policies for MDPs have been studied extensively, and several efficient methods exist. However, in real-world problems, pieces of information useful for the process of action sel...
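When the state is not fully known, a POMDP agent instead maintains a belief, a probability distribution over states, updated after each action and observation via b'(s') ∝ O(o | s') · Σ_s T(s' | s, a) · b(s). A minimal sketch of this update (state names, transition and observation probabilities here are illustrative):

```python
# Toy POMDP: a machine that is either healthy or faulty.
states = ["healthy", "faulty"]
# T[a][s][s']: transition probabilities; a faulty machine stays faulty.
T = {"run": {"healthy": {"healthy": 0.9, "faulty": 0.1},
             "faulty":  {"healthy": 0.0, "faulty": 1.0}}}
# O[s'][o]: observation probabilities in the resulting state.
O = {"healthy": {"ok": 0.8, "alarm": 0.2},
     "faulty":  {"ok": 0.3, "alarm": 0.7}}

def belief_update(belief, action, obs):
    """Bayes-filter update: predict with T, correct with O, then normalise."""
    unnorm = {
        s2: O[s2][obs] * sum(T[action][s][s2] * belief[s] for s in states)
        for s2 in states
    }
    z = sum(unnorm.values())  # probability of seeing obs; assumed > 0
    return {s2: p / z for s2, p in unnorm.items()}

b0 = {"healthy": 0.5, "faulty": 0.5}
b1 = belief_update(b0, "run", "alarm")
print(b1)  # the alarm shifts belief strongly toward "faulty"
```

Planning then happens over these beliefs rather than over the hidden states themselves, which is what makes POMDP solving so much harder than MDP solving.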