partially negative data

Search results for: partially negative data

Number of results: 2918678

Learning to Explore and Exploit in POMDPs

2009

Chenghui Cai Xuejun Liao Lawrence Carin

A fundamental objective in reinforcement learning is the maintenance of a proper balance between exploration and exploitation. This problem becomes more challenging when the agent can only partially observe the states of its environment. In this paper we propose a dual-policy method for jointly learning the agent behavior and the balance between exploration and exploitation, in partially observable...


Learning Partially Observable Action Models: Efficient Algorithms

2006

Dafna Shahaf Allen Chang Eyal Amir

We present tractable, exact algorithms for learning actions’ effects and preconditions in partially observable domains. Our algorithms maintain a propositional logical representation of the set of possible action models after each observation and action execution. The algorithms perform exact learning of preconditions and effects in any deterministic action domain. This includes STRIPS actions ...


Robust Person Guidance by Using Online POMDPs

2013

Luis Merino Joaquín Ballesteros Noé Pérez-Higueras Rafael Ramón Vigo Javier Pérez-Lara Fernando Caballero

The paper considers a guiding task in which a robot has to guide a person towards a destination. Robust operation requires considering uncertain models of the person's motion and intentions, as well as noise and occlusions in the sensors employed for the task. Partially Observable Markov Decision Processes (POMDPs) are used to model the task. The paper describes an enhancement on online POMDP s...


MDPs Semi-Markov decision processes Hidden Markov models Partially observable SMDPs Hierarchical HMMs

2007

Sridhar Mahadevan


Goal Achievement in Partially Known, Partially Observable Domains

2006

Allen Chang Eyal Amir

We present a decision-making algorithm for agents that act in partially observable domains which they do not know fully. Making intelligent choices in such domains is very difficult because actions’ effects may not be known a priori (partially known domain), and features may not always be visible (partially observable domain). Nonetheless, we show that an efficient solution is achievable in STR...


Discovering Abstract Concepts to Aid Cross-Map Transfer for a Learning Agent

2009

Cédric Herpson Vincent Corruble

The capacity to apply knowledge in a context different than the one in which it was learned has become crucial within the area of autonomous agents. This paper specifically addresses the issue of transfer of knowledge acquired through online learning in partially observable environments. We investigate the discovery of relevant abstract concepts which help the transfer of knowledge in the conte...


Subjective Equilibria in Interactive POMDPs: Theory and Computational Limitations

2005

Prashant Doshi

We analyze the asymptotic behavior of agents engaged in an infinite-horizon partially observable stochastic game formalized by the interactive POMDP framework. We show that when agents’ initial beliefs satisfy a truth compatibility condition, their behavior converges to a subjective ε-equilibrium in finite time, and to a subjective equilibrium in the limit. Imposing an additional assumption of mutu...


Goal Recognition over POMDPs: Inferring the Intention of a POMDP Agent

2011

Miquel Ramírez Hector Geffner

Plan recognition is the problem of inferring the goals and plans of an agent from partial observations of her behavior. Recently, it has been shown that the problem can be formulated and solved using planners, reducing plan recognition to plan generation. In this work, we extend this model-based approach to plan recognition to the POMDP setting, where actions are stochastic and states are parti...


Maintaining Predictions over Time without a Model

2009

Erik Talvitie Satinder P. Singh

A common approach to the control problem in partially observable environments is to perform a direct search in policy space, as defined over some set of features of history. In this paper we consider predictive features, whose values are conditional probabilities of future events, given history. Since predictive features provide direct information about the agent’s future, they have a number of...


Controlling Listening-oriented Dialogue using Partially Observable Markov Decision Processes

2010

Toyomi Meguro Ryuichiro Higashinaka Yasuhiro Minami Kohji Dohsaka

This paper investigates how to automatically create a dialogue control component of a listening agent to reduce the current high cost of manually creating such components. We collected a large number of listening-oriented dialogues with their user satisfaction ratings and used them to create a dialogue control component using partially observable Markov decision processes (POMDPs), which can le...

