Search results for: markov decision process

Number of results: 1627273

Journal: :Revue d'Intelligence Artificielle 2003
Alain Dutech Manuel Samuelides

We present a new algorithm that extends the Reinforcement Learning framework to Partially Observed Markov Decision Processes (POMDP). The main idea of our method is to build a state extension, called exhaustive observable, which allows us to define a new process that is Markovian. We bring the proof that solving this new process, to which classical RL methods can be applied, brings an optimal...

2008
Blaise Thomson Milica Gasic Simon Keizer François Mairesse Jost Schatzmann Kai Yu Steve J. Young

This paper presents the results of a comparative user evaluation of various approaches to dialogue management. The major contribution is a comparison of traditional systems against a system that uses a Bayesian Update of Dialogue State approach. This approach is based on the Partially Observable Markov Decision Process (POMDP), which has previously been shown to give improved robustness in simu...

2000
Nicholas Roy Joelle Pineau Sebastian Thrun

Spoken dialogue managers have benefited from using stochastic planners such as Markov Decision Processes (MDPs). However, so far, MDPs do not handle well noisy and ambiguous speech utterances. We use a Partially Observable Markov Decision Process (POMDP)-style approach to generate dialogue strategies by inverting the notion of dialogue state; the state represents the user’s intentions, rather t...

2013
Yoichi Matsuyama Iwao Akiba Akihiro Saito Tetsunori Kobayashi

In this paper, we propose a framework for conversational robots that facilitates four-participant groups. In three-participant conversations, the minimum unit of multiparty conversation, social imbalance sometimes occurs, in which a participant is left behind in the current conversation. In such scenarios, a conversational robot has the potential to facilitate the situation as the fourth partici...

Journal: :SIAM J. Control and Optimization 2006
Diego Klabjan Daniel Adelman

Semi-Markov decision processes on Borel spaces with deterministic kernels have many practical applications, particularly in inventory theory. Most of the results from general semi-Markov decision processes do not carry over to a deterministic kernel since such a kernel does not provide “smoothness.” We develop infinite dimensional linear programming theory for a general stochastic semi-Markov d...

2003
H. Brendan McMahan Geoffrey J. Gordon

We describe applications and theoretical results for a new class of two-player planning games. In these games, each player plans in a separate Markov Decision Process (MDP), but the costs associated with a policy in one of the MDPs depend on the policy selected by the other player. These cost-paired MDPs represent an interesting and computationally tractable subset of adversarial planning proble...

2013
Chien-Cheng Huang Kwo-Jean Farn Feng-Yu Lin Frank Yeong-Sung Lin

The frequency of information security incidents has been increasing dramatically. The aim of this study is to analyze state-space reachability problems through the transition of vulnerable states after an information system's vulnerability is exposed. In this research we took the time factor into consideration to analyze the arrival time to reachable states, a problem discussed in stochastic Petri nets....

Journal: :CoRR 2016
Zhongqi Lu Qiang Yang

We report the ‘Recurrent Deterioration’ (RD) phenomenon observed in online recommender systems. The RD phenomenon is reflected by the trend of performance degradation when the recommendation model is repeatedly trained on users’ feedback from the previous recommendations. There are several reasons for recommender systems to encounter the RD phenomenon, including the lack of negative traini...

2000
Haili Song Chen-Ching Liu Robert W. Dahlgren

The bidding decision-making problem is studied from a supplier’s viewpoint in a spot market environment. The decision-making problem is formulated as a Markov Decision Process, a discrete stochastic optimization method. All other suppliers are modeled by their bidding parameters with corresponding probabilities. A systematic method is developed to calculate transition probabilities and rewards. A...

Journal: :Math. Oper. Res. 2009
Vikram Krishnamurthy Bo Wahlberg

This paper considers multiarmed bandit problems involving partially observed Markov decision processes (POMDPs). We show how the Gittins index for the optimal scheduling policy can be computed by a value iteration algorithm on each process, thereby considerably simplifying the computational cost. A suboptimal value iteration algorithm based on Lovejoy’s approximation is presented. We then show ...
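Several of the results above compute optimal policies by value iteration over an MDP. As a hedged illustration of that general technique only (not the Gittins-index algorithm of the paper above), a minimal sketch on a tiny, made-up MDP might look like:

```python
# Minimal value iteration sketch on a small hand-made MDP.
# All states, actions, probabilities, and rewards are invented
# purely for illustration.

# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (2, 0.5)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)],           1: [(0, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 2.0, 1: 0.0},
    2: {0: 0.0, 1: 5.0},
}

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality update until the value change is < tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(
            P[s],
            key=lambda a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a]),
        )
        for s in P
    }
    return V, policy

V, pi = value_iteration(P, R)
print(V, pi)
```

The same fixed-point idea underlies POMDP solvers, except that the update runs over a continuous belief space rather than a finite state set, which is why approximations such as Lovejoy's are needed there.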

Chart of the number of search results per year

Click on the chart to filter results by publication year