MDPS Analysis Software Development

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reinforcement Learning in Finite MDPs: PAC Analysis Reinforcement Learning in Finite MDPs: PAC Analysis

We study the problem of learning near-optimal behavior in finite Markov Decision Processes (MDPs) with a polynomial number of samples. These “PAC-MDP” algorithms include the well-known E and R-MAX algorithms as well as the more recent Delayed Q-learning algorithm. We summarize the current state-of-the-art by presenting bounds for the problem in a unified theoretical framework. We also present a...

متن کامل

Software Development for Simulation of Reformer Furnace

In recent years, lots of research has been done on effective usage of natural gas; the first step in these processes is conversion of natural gas to Syngas. Natural gas reforming process by refomer furnace is commonly used for syngas and hydrogen production. In this paper, a windows based software, RIPI-RefSim, is introduced. By using proper heat, mass, kinetic and thermodynamic models as w...

متن کامل

Reinforcement Learning in Finite MDPs: PAC Analysis

We study the problem of learning near-optimal behavior in finite Markov Decision Processes (MDPs) with a polynomial number of samples. These “PAC-MDP” algorithms include the wellknown E3 and R-MAX algorithms as well as the more recent Delayed Q-learning algorithm. We summarize the current state-of-the-art by presenting bounds for the problem in a unified theoretical framework. A more refined an...

متن کامل

Qualitative Analysis of VASS-Induced MDPs

We consider infinite-state Markov decision processes (MDPs) that are induced by extensions of vector addition systems with states (VASS). Verification conditions for these MDPs are described by reachability and Büchi objectives w.r.t. given sets of control-states. We study the decidability of some qualitative versions of these objectives, i.e., the decidability of whether such objectives can be...

متن کامل

Analysis of methods for solving MDPs

New proofs for two extensions to value iteration are derived when the type of initialisation of the value function is considered. Theoretical requirements that guarantee the convergence of backward value iteration and weaker requirements for the convergence of backups based on best actions only are identified. Experimental results show that standard value iteration performs significantly faster...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of the Korea Academia-Industrial cooperation Society

سال: 2014

ISSN: 1975-4701

DOI: 10.5762/kais.2014.15.9.5480