Search results for: partially negative data
Number of results: 2918678
A fundamental objective in reinforcement learning is the maintenance of a proper balance between exploration and exploitation. This problem becomes more challenging when the agent can only partially observe the states of its environment. In this paper we propose a dual-policy method for jointly learning the agent behavior and the balance between exploration and exploitation, in partially observable...
We present tractable, exact algorithms for learning actions’ effects and preconditions in partially observable domains. Our algorithms maintain a propositional logical representation of the set of possible action models after each observation and action execution. The algorithms perform exact learning of preconditions and effects in any deterministic action domain. This includes STRIPS actions ...
The paper considers a guiding task in which a robot has to guide a person towards a destination. Robust operation requires considering uncertain models of the person's motion and intentions, as well as noise and occlusions in the sensors employed for the task. Partially Observable Markov Decision Processes (POMDPs) are used to model the task. The paper describes an enhancement on online POMDP s...
We present a decision making algorithm for agents that act in partially observable domains which they do not know fully. Making intelligent choices in such domains is very difficult because actions’ effects may not be known a priori (partially known domain), and features may not always be visible (partially observable domain). Nonetheless, we show that an efficient solution is achievable in STR...
The capacity to apply knowledge in a context different than the one in which it was learned has become crucial within the area of autonomous agents. This paper specifically addresses the issue of transfer of knowledge acquired through online learning in partially observable environments. We investigate the discovery of relevant abstract concepts which help the transfer of knowledge in the conte...
We analyze the asymptotic behavior of agents engaged in an infinite-horizon partially observable stochastic game formalized by the interactive POMDP framework. We show that when agents’ initial beliefs satisfy a truth compatibility condition, their behavior converges to a subjective ε-equilibrium in a finite time, and subjective equilibrium in the limit. Imposing an additional assumption of mutu...
Plan recognition is the problem of inferring the goals and plans of an agent from partial observations of her behavior. Recently, it has been shown that the problem can be formulated and solved using planners, reducing plan recognition to plan generation. In this work, we extend this model-based approach to plan recognition to the POMDP setting, where actions are stochastic and states are parti...
A common approach to the control problem in partially observable environments is to perform a direct search in policy space, as defined over some set of features of history. In this paper we consider predictive features, whose values are conditional probabilities of future events, given history. Since predictive features provide direct information about the agent’s future, they have a number of...
This paper investigates how to automatically create a dialogue control component of a listening agent to reduce the current high cost of manually creating such components. We collected a large number of listening-oriented dialogues with their user satisfaction ratings and used them to create a dialogue control component using partially observable Markov decision processes (POMDPs), which can le...