Search results for: partially observable markov decision process

Number of results: 1,776,231

2013
Håkan Warnquist, Jonas Kvarnström, Patrick Doherty

When parts of the state in a goal POMDP are fully observable and some actions are deterministic, it is possible to take advantage of these properties to efficiently generate approximate solutions. Actions that deterministically affect the fully observable component of the world state can be abstracted away and combined into macro actions, permitting a planner to converge more quickly. This proc...
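Methods like this build on the standard POMDP belief update, which tracks a probability distribution over hidden states. A minimal sketch of that update, using an invented two-state, one-action model (all matrices here are illustrative, not from the paper):

```python
import numpy as np

# Invented POMDP dynamics for illustration:
# T[a][s, s'] = P(s' | s, a), O[a][s', o] = P(o | s', a)
T = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]])}
O = {0: np.array([[0.7, 0.3],
                  [0.1, 0.9]])}

def belief_update(b, a, o):
    """Bayes filter: b'(s') ∝ O[a][s', o] * sum_s T[a][s, s'] * b(s)."""
    b_pred = b @ T[a]            # predict: marginalize over the current state
    b_new = O[a][:, o] * b_pred  # correct: weight by observation likelihood
    return b_new / b_new.sum()   # normalize to a distribution

b0 = np.array([0.5, 0.5])        # uniform prior over the two states
b1 = belief_update(b0, a=0, o=1) # posterior after taking action 0, seeing observation 1
```

Observation 1 is more likely in state 1 under this (made-up) model, so the posterior shifts mass toward state 1.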

2005
Jason D. Williams, Pascal Poupart, Steve Young


2009
Pablo Samuel Castro, Prakash Panangaden, Doina Precup

We explore equivalence relations between states in Markov Decision Processes and Partially Observable Markov Decision Processes. We focus on two different equivalence notions: bisimulation (Givan et al, 2003) and a notion of trace equivalence, under which states are considered equivalent roughly if they generate the same conditional probability distributions over observation sequences (where th...

Journal: CoRR, 2012
Pouyan Rafiei Fard, Keyvan Yahya

Partially observable Markov decision processes have been widely used to model real-world decision-making problems. In this paper, we provide a method based on a slightly different variant, the mixed observability Markov decision process (MOMDP). Basically, we aim at offering a behavioural model for the interaction of intelligent agents w...

2007
Daan Wierstra, Jürgen Schmidhuber

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov Decision Processes (POMDPs) that require long-term memories of past observations and actions. The approach involves estimating a policy gradient for an Actor through a Policy Gradient Critic which evaluates probabilit...
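PGAC itself is more involved than can be shown here; as a point of reference, the basic policy-gradient update it builds on can be sketched with plain REINFORCE on a toy two-armed bandit (the arm rewards and step size are invented, and this is not the authors' algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)              # logits of a softmax policy over 2 arms
rewards = np.array([0.2, 0.8])   # invented success probability per arm

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

alpha = 0.1
for _ in range(2000):
    p = softmax(theta)
    a = rng.choice(2, p=p)               # sample an action from the policy
    r = float(rng.random() < rewards[a]) # Bernoulli reward
    grad = -p                            # ∇_theta log π(a): softmax score function
    grad[a] += 1.0
    theta += alpha * r * grad            # REINFORCE: reward-weighted gradient step
```

Since arm 1 is rewarded more often, the logit for arm 1 grows and the policy concentrates on it; actor-critic methods such as PGAC replace the raw reward `r` with a learned value estimate to reduce variance.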

Journal: Auton. Robots, 2005
María Elena López Guillén, Luis Miguel Bergasa, Rafael Barea, María Soledad Escudero

Assistant robots have received special attention from the research community in the last years. One of the main applications of these robots is to perform care tasks in indoor environments such as houses, nursing homes or hospitals, and therefore they need to be able to navigate robustly for long periods of time. This paper focuses on the navigation system of SIRA, a robotic assistant for elder...

2006
Trung H. Bui, Anton Nijholt

We propose a novel approach to developing a dialogue model which is able to take into account some aspects of the user’s emotional state and acts appropriately. The dialogue model uses a Partially Observable Markov Decision Process approach with observations composed of the observed user’s emotional state and action. A simple example of route navigation is explained to clarify our approach and ...

2005
Zinovi Rabinovich, Jeffrey S. Rosenschein

In this paper we introduce a novel approach to continual planning and control, called Dynamics Based Control (DBC). The approach is similar in spirit to the Actor-Critic [6] approach to learning and estimation-based differential regulators of classical control theory [12]. However, DBC is not a learning algorithm, nor can it be subsumed within models of standard control theory. We provide a gen...

Journal: Research in Computing Science, 2016
Hermilo Victorio Meza, Manuel Mejía-Lavalle, Alicia Martínez Rebollar, Obdulia Pichardo-Lagunas, Grigori Sidorov

We describe a project, still in development, on Intelligent Tutoring Systems and optimal educational actions. Good pedagogical actions are key components in all learning-teaching schemes, and automating them is an important objective. We propose applying Partially Observable Markov Decis...

Journal: Intelligent Automation & Soft Computing, 2004
Giorgos Apostolikas, Spyros G. Tzafestas

Artificial Intelligence techniques were primarily focused on domains in which at each time the state of the world is known to the system. Such domains can be modeled as a Markov Decision Process (MDP). Action and planning policies for MDPs have been studied extensively and several efficient methods exist. However, in real world problems pieces of information useful for the process of action sel...
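One of the standard MDP planning methods this abstract alludes to is value iteration. A minimal sketch on an invented two-state, two-action MDP (transition matrices, rewards, and discount are illustrative only):

```python
import numpy as np

# Invented MDP: P[a][s, s'] = P(s' | s, a), R[s, a] = immediate reward
P = [np.array([[0.8, 0.2], [0.0, 1.0]]),   # action 0
     np.array([[0.5, 0.5], [0.5, 0.5]])]   # action 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(500):
    # Bellman optimality backup: Q(s, a) = R(s, a) + γ Σ_s' P(s'|s,a) V(s')
    Q = np.stack([R[:, a] + gamma * P[a] @ V for a in range(2)], axis=1)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:   # stop at convergence
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged values
```

With these numbers the greedy policy takes action 0 in state 0 and action 1 in state 1; the point of the abstract is that once the state is only partially observable, this per-state backup no longer applies directly and belief-space methods are needed.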

Chart: number of search results per publication year (click the chart to filter results by year)