Long-Run Rewards for Markov Automata
Abstract
Markov automata are a powerful formalism for modelling systems that exhibit nondeterminism, probabilistic choices and continuous stochastic timing. We consider the computation of long-run average rewards, the most classical problem in continuous-time Markov model analysis. We propose an algorithm based on value iteration that improves on the state of the art by orders of magnitude. The contribution is rooted in a fresh look at Markov automata: we treat them as an efficient encoding of CTMDPs (continuous-time Markov decision processes) that have, in the worst case, exponentially more transitions.
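To give a feel for what value iteration for long-run average rewards looks like, here is a minimal sketch of textbook relative value iteration on a small discrete-time MDP. It is not the algorithm proposed in the paper: it assumes a unichain, aperiodic model, works in discrete time, and does not show how a Markov automaton or CTMDP would be converted into that form. The model `TOY_MDP` and the function `relative_value_iteration` are hypothetical names introduced only for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model format (an assumption for this sketch):
# state -> list of actions, each action = (reward, {successor: probability}).
MDP = Dict[int, List[Tuple[float, Dict[int, float]]]]

# Toy two-state model, invented for illustration only.
TOY_MDP: MDP = {
    0: [
        (1.0, {0: 0.5, 1: 0.5}),  # action a: earn 1, move on with probability 0.5
        (0.0, {1: 1.0}),          # action b: earn 0, move to state 1 for sure
    ],
    1: [
        (4.0, {0: 0.5, 1: 0.5}),  # single action: earn 4
    ],
}


def relative_value_iteration(
    mdp: MDP, ref: int = 0, eps: float = 1e-9, max_iter: int = 100_000
) -> float:
    """Approximate the maximal long-run average reward per step.

    Assumes the MDP is unichain and aperiodic; otherwise the iteration
    may not converge.
    """
    h = {s: 0.0 for s in mdp}  # relative (bias-like) values
    for _ in range(max_iter):
        # One Bellman backup: best immediate reward plus expected relative value.
        t = {
            s: max(r + sum(p * h[u] for u, p in succ.items()) for r, succ in actions)
            for s, actions in mdp.items()
        }
        # The per-state increments bracket the optimal average reward.
        diffs = [t[s] - h[s] for s in mdp]
        lo, hi = min(diffs), max(diffs)
        if hi - lo < eps:
            return (lo + hi) / 2
        # Subtract the reference state's value so the numbers stay bounded.
        h = {s: t[s] - t[ref] for s in mdp}
    raise RuntimeError("relative value iteration did not converge")


gain = relative_value_iteration(TOY_MDP)
```

In this toy model, always choosing action b in state 0 yields a stationary distribution of (1/3, 2/3) and hence an average reward of 8/3 per step, which is the value the iteration approaches.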
Similar resources
Modelling and Analysis of Markov Reward Automata
Costs and rewards are important ingredients for many types of systems, modelling critical aspects like energy consumption, task completion, repair costs, and memory usage. This paper introduces Markov reward automata, an extension of Markov automata that allows the modelling of systems incorporating rewards (or costs) in addition to nondeterminism, discrete probabilistic choice and continuous s...
Extending Markov Automata with State and Action Rewards
This presentation introduces the Markov Reward Automaton (MRA), an extension of the Markov automaton that allows the modelling of systems incorporating rewards in addition to nondeterminism, discrete probabilistic choice and continuous stochastic timing. Our models support both rewards that are acquired instantaneously when taking certain transitions (action rewards) and rewards that are based ...
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modelling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Performance evaluation with temporal rewards
Today many formalisms exist for specifying complex Markov chains. In contrast, formalisms for specifying rewards, enabling the analysis of long-run average performance properties, have remained quite primitive. Basically, they only support the analysis of relatively simple performance metrics that can be expressed as long-run averages of atomic rewards, i.e. rewards that are deducible directly...
Pii: S0166-5316(02)00105-0
Today many formalisms exist for specifying complex Markov chains. In contrast, formalisms for specifying rewards, enabling the analysis of long-run average performance properties, have remained quite primitive. Basically, they only support the analysis of relatively simple performance metrics that can be expressed as long-run averages of atomic rewards, i.e. rewards that are deductible directly...