Search results for: partially observable markov decision process
Number of results: 1776231
We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By devel...
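The reformulation mentioned in the result above, where the state of the equivalent completely observed problem is a probability distribution over the core states, can be illustrated with a Bayes-filter belief update. This is a minimal sketch, not the paper's construction: it assumes a finite core state space (the abstract also allows countably infinite ones), and the names `belief_update`, `T`, and `O`, the array layouts, and the numbers in the example are all hypothetical.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step on a belief over a finite core state space.

    Assumed (hypothetical) layout:
      T[a][s, s2] = P(next state s2 | state s, action a)
      O[a][s2, o] = P(observation o | next state s2, action a)
    """
    # Prediction: push the current belief through the transition model.
    predicted = belief @ T[action]
    # Correction: weight each next state by the likelihood of the observation.
    unnormalized = predicted * O[action][:, observation]
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so the result is again a probability distribution.
    return unnormalized / total

# Illustrative two-state, one-action, two-observation model (made-up numbers).
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, action=0, observation=0, T=T, O=O)
```

The updated belief `b1` is itself a point in the uncountable state space of the completely observed problem, which is why the action set can stay finite while the state space does not.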
Bayesian learning methods have recently been shown to provide an elegant solution to the exploration-exploitation trade-off in reinforcement learning. However most investigations of Bayesian reinforcement learning to date focus on the standard Markov Decision Processes (MDPs). The primary focus of this paper is to extend these ideas to the case of partially observable domains, by introducing th...
Adaptive sensing involves actively managing sensor resources to achieve a sensing task, such as object detection, classification, and tracking, and represents a promising direction for new applications of discrete event system methods. We describe an approach to adaptive sensing based on approximately solving a partially observable Markov decision process (POMDP) formulation of the problem. Suc...
This is a demonstration of a voice dialer, implemented as a partially observable Markov decision process (POMDP). A real-time graphical display shows the POMDP’s probability distribution over different possible dialog states, and shows how system output is generated and selected. The system demonstrated here includes several recent advances, including an action selection mechanism which unifies ...
We present a foveated gesture recognition system that guides an active camera to foveate salient features based on a reinforcement learning paradigm. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement lear...
There are many sensing challenges for which one must balance the effectiveness of a given measurement with the associated sensing cost. For example, when performing a diagnosis a doctor must balance the cost and benefit of a given test (measurement), and the decision to stop sensing (stop performing tests) must account for the risk to the patient and doctor (malpractice) for a given diagnosis b...
We propose a novel approach, called parallel rollout, to solving (partially observable) Markov decision processes. Our approach generalizes the rollout algorithm of Bertsekas and Castanon (1999) by rolling out a set of multiple heuristic policies rather than a single policy. In particular, the parallel rollout approach aims at the class of problems where we have multiple heuristic policies avai...
We consider Incentive Decision Processes, where a principal seeks to reduce its costs due to another agent’s behavior, by offering incentives to the agent for alternate behavior. We focus on the case where a principal interacts with a greedy agent whose preferences are hidden and static. Though IDPs can be directly modeled as partially observable Markov decision processes (POMDP), we show that ...
Virtual advisors often increase sales for those customers who find such online advice to be convenient and helpful. However, other customers take a more active role in their purchase decisions and prefer more detailed data. In general, we expect that websites are more preferred and increase sales if their characteristics (e.g., more detailed data) match customers’ cognitive styles (e.g., more analyti...
This paper presents a real-time system that guides stroke patients during upper extremity rehabilitation. The system automatically modifies exercise parameters to account for the specific needs and abilities of different individuals. We describe a partially observable Markov decision process (POMDP) model of a rehabilitation exercise that can capture this form of customization. The system will ...