Search results for: partially observable markov decision process

Number of results: 1,776,231

2003
Hajime Fujita, Yoichiro Matsuno, Shin Ishii

We formulate an automatic strategy acquisition problem for the multi-agent card game “Hearts” as a reinforcement learning (RL) problem. Since there are often many unobservable cards in this game, RL is handled approximately within the framework of a partially observable Markov decision process (POMDP). This article presents a POMDP-RL method based on estimation of unobservable state variable...
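The core state-estimation step that such card-game POMDP methods build on is maintaining a belief over the hidden cards. A minimal Python sketch of that step alone, with hypothetical names (UNSEEN, belief, observe_played) and a uniform prior; this is not the authors' algorithm:

UNSEEN = {"QS", "KH", "AH", "2C"}  # cards not yet observed in play

# Uniform prior: each unseen card is equally likely to be in the opponent's hand.
belief = {card: 1.0 / len(UNSEEN) for card in UNSEEN}

def observe_played(card: str) -> None:
    """Condition the belief on seeing `card` played: it is no longer hidden."""
    belief.pop(card, None)
    total = sum(belief.values())
    for c in belief:  # renormalize the remaining probability mass
        belief[c] /= total

observe_played("QS")
print(belief)  # mass redistributed over {KH, AH, 2C}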

2008
Rakesh Gosangi, Ricardo Gutierrez-Osuna

We present an active-perception strategy to optimize the temperature program of metal-oxide sensors in real time, as the sensor reacts with its environment. We model the problem as a partially observable Markov decision process (POMDP), where actions correspond to measurements at particular temperatures, and the agent is to find a temperature sequence that minimizes the Bayes risk. We validate ...
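The selection criterion the abstract describes can be illustrated with a myopic one-step lookahead: pick the temperature whose measurement minimizes the expected posterior Bayes risk under 0/1 loss. A hedged Python sketch with invented toy numbers (the paper's full POMDP policy is not shown; likelihoods, belief, and the temperatures are all assumptions):

import numpy as np

# Belief over K hypotheses (e.g. analyte classes); row o of likelihoods[T]
# gives P(observation o | class k) when measuring at temperature T.
belief = np.array([0.5, 0.3, 0.2])
likelihoods = {
    200: np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.8, 0.9]]),
    350: np.array([[0.5, 0.5, 0.2],
                   [0.5, 0.5, 0.8]]),
}

def expected_bayes_risk(b, L):
    """Expected misclassification probability after one measurement."""
    risk = 0.0
    for o in range(L.shape[0]):
        p_o = L[o] @ b                         # predictive probability of observation o
        posterior = L[o] * b / p_o             # Bayes update
        risk += p_o * (1.0 - posterior.max())  # 0/1-loss Bayes risk of the posterior
    return risk

best_T = min(likelihoods, key=lambda T: expected_bayes_risk(belief, likelihoods[T]))
print(best_T)  # the myopically optimal next measurement temperature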

1996
Richard Washington

This paper presents an approach to building plans using partially observable Markov decision processes. The approach begins with a base solution that assumes full observability. The partially observable solution is incrementally constructed by considering increasing amounts of information from observations. The base solution directs the expansion of the plan by providing an evaluation function ...
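A well-known approximation in the same spirit, Q_MDP, scores actions under a belief with the Q-values of the fully observable base solution; it is shown here only to illustrate how a full-observability solution can serve as an evaluation function, not as this paper's algorithm (toy numbers throughout):

import numpy as np

# Q(s, a) from solving the fully observable MDP: 3 states x 2 actions.
Q = np.array([[1.0, 0.2],
              [0.1, 0.9],
              [0.5, 0.5]])

def qmdp_action(b: np.ndarray) -> int:
    """Pick the action maximizing Q_MDP(b, a) = sum_s b(s) Q(s, a)."""
    return int(np.argmax(b @ Q))

print(qmdp_action(np.array([0.6, 0.3, 0.1])))  # -> action 0 under this belief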

2002
Samuel W. Hasinoff

In this paper, we describe how techniques from reinforcement learning might be used to approach the problem of acting under uncertainty. We start by introducing the theory of partially observable Markov decision processes (POMDPs) to describe what we call hidden state problems. After a brief review of other POMDP solution techniques, we motivate reinforcement learning by considering an agent wi...
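For reference, the standard construction that such treatments build on turns a hidden-state problem into a Markovian one over beliefs: with transition model $T(s' \mid s, a)$ and observation model $O(o \mid s', a)$, the belief $b$ is updated after action $a$ and observation $o$ by Bayes' rule,

$$b'(s') = \frac{O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)},$$

where the denominator $\Pr(o \mid b, a) = \sum_{s'} O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s)$ normalizes the update.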

1994
Anthony R. Cassandra, Leslie Pack Kaelbling, Michael L. Littman

In this paper, we describe the partially observable Markov decision process (POMDP) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments, given a complete model of the environment. The POMDP approach was originally developed in the operations research community and provides a formal basis for planning problems that have been of interest t...
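The structural fact behind exact POMDP value iteration in this line of work is that the finite-horizon optimal value function is piecewise linear and convex over the belief simplex, representable by a finite set of alpha-vectors. A minimal Python sketch of evaluating such a representation (toy vectors, not a solver):

import numpy as np

# Each alpha-vector is the value of one conditional plan, expressed per state;
# the value of a belief is the best plan's expected value: V(b) = max_a (a . b).
alphas = [np.array([1.0, 0.0]),
          np.array([0.0, 1.0]),
          np.array([0.6, 0.6])]

def value(b: np.ndarray) -> float:
    return max(float(a @ b) for a in alphas)

print(value(np.array([0.5, 0.5])))  # -> 0.6: the "middle" vector dominates here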

2010
Jeremiah T. Folsom-Kovarik, Gita Reese Sukthankar, Sae Lynne Schatz, Denise M. Nicholson

A promising application area for proactive assistant agents is automated tutoring and training. Intelligent tutoring systems (ITSs) assist tutors and tutees by automating diagnosis and adaptive tutoring. These tasks are well modeled by a partially observable Markov decision process (POMDP) since it accounts for the uncertainty inherent in diagnosis. However, an important aspect of making POMDP ...

2011
Linus Gisslén, Matthew D. Luciw, Vincent Graziano, Jürgen Schmidhuber

Traditional Reinforcement Learning methods are insufficient for AGIs, which must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Me...
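The general recipe, independent of the specific SCSC/RAAM architecture, is to fold the observation history into a fixed-size recurrent code and let a standard value function act on that code instead of the raw, non-Markovian observation. A toy Python sketch with random weights (purely illustrative; not the authors' model, and all shapes are invented):

import numpy as np

rng = np.random.default_rng(0)
W_in, W_h = rng.normal(size=(8, 3)), rng.normal(size=(8, 8))
w_q = rng.normal(size=(8, 2))  # linear Q-head over the hidden code

def encode(h, obs):
    """One recurrent step: fold the new observation into the history code."""
    return np.tanh(W_in @ obs + W_h @ h)

h = np.zeros(8)
for obs in [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]:
    h = encode(h, obs)

q_values = h @ w_q  # act greedily on the compressed history
print(int(np.argmax(q_values)))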

Journal: Operations Research, 2006
Shoshana Anily, Abraham Grosfeld-Nir

A batch production process that is initially in the in-control state can shift, with a constant failure rate, to the out-of-control state. The probability that a unit is conforming when produced while the process is in control is constant, and higher than the corresponding constant conformance probability while the process is out of control. When production ends, the units are inspected in the order they ...
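Under the stated assumptions (constant failure rate, constant conformance probabilities), the hidden quantity is the shift point, and inspection outcomes update a geometric prior over it by Bayes' rule. A hedged Python sketch of that inference only, with invented parameters q, p_in, p_out; the paper's optimal inspection policy is not reproduced:

import numpy as np

q, p_in, p_out = 0.1, 0.95, 0.5   # per-unit failure rate; conformance probs in/out of control
results = [1, 1, 0, 0]            # 1 = conforming, 0 = defective (toy batch)
n = len(results)

def likelihood(k):
    """P(results | process shifted after unit k), i.e. units 1..k made in control."""
    probs = [p_in if i < k else p_out for i in range(n)]
    return np.prod([p if r else 1 - p for p, r in zip(probs, results)])

# Geometric prior over the shift point K in {0, ..., n}; K = n means no shift occurred.
prior = np.array([(1 - q) ** k * (q if k < n else 1.0) for k in range(n + 1)])
post = np.array([prior[k] * likelihood(k) for k in range(n + 1)])
post /= post.sum()
print(post)  # posterior over when the process went out of control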

2015
Diederik M. Roijers, Shimon Whiteson, Frans A. Oliehoek

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment’s state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set...
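Linear-support methods such as OLSAR repeatedly solve scalarized problems: fixing a weight vector over the objectives collapses the vector-valued reward to a scalar one that a standard solver can handle. A Python sketch of just that scalarization step, with toy rewards (OLSAR's alpha-matrix reuse and weight-selection loop are not shown):

import numpy as np

# Per-action reward vectors over two objectives (invented numbers).
R = {"a1": np.array([1.0, 0.2]),
     "a2": np.array([0.4, 0.9])}

def scalarise(w):
    """Collapse vector rewards to scalars for weight vector w."""
    return {a: float(w @ r) for a, r in R.items()}

print(scalarise(np.array([0.7, 0.3])))  # a1 wins under this weighting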

2007
Matthijs Spaan

For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov...

[Chart: number of search results per publication year]