Monte Carlo Value Iteration with Macro-Actions
نویسندگان
چکیده
POMDP planning faces two major computational challenges: large state spaces and long planning horizons. The recently introduced Monte Carlo Value Iteration (MCVI) can tackle POMDPs with very large discrete state spaces or continuous state spaces, but its performance degrades when faced with long planning horizons. This paper presents Macro-MCVI, which extends MCVI by exploiting macro-actions for temporal abstraction. We provide sufficient conditions for Macro-MCVI to inherit the good theoretical properties of MCVI. Macro-MCVI does not require explicit construction of probabilistic models for macro-actions and is thus easy to apply in practice. Experiments show that Macro-MCVI substantially improves the performance of MCVI with suitable macro-actions.
منابع مشابه
Monte carlo bayesian hierarchical reinforcement learning
In this paper, we propose to use hierarchical action decomposition to make Bayesian model-based reinforcement learning more efficient and feasible in practice. We formulate Bayesian hierarchical reinforcement learning as a partially observable semi-Markov decision process (POSMDP). The main POSMDP task is partitioned into a hierarchy of POSMDP subtasks; lower-level subtasks get solved first, th...
متن کاملStochastic Assessment of Voltage Sags in Distribution Networks
This paper compares fault position and Monte Carlo methods as the most common methods in stochastic assessment of voltage sags. To compare their abilities, symmetrical and unsymmetrical faults with different probability distribution of fault positions along the lines are applied in a test system. The voltage sag magnitude in different nodes of test system is calculated. The problem with the...
متن کاملConvergence Analysis of Kernel-based On-policy Approximate Policy Iteration Algorithms for Markov Decision Processes with Continuous, Multidimensional States and Actions
Using kernel smoothing techniques, we propose three different online, on-policy approximate policy iteration algorithms which can be applied to infinite horizon problems with continuous and vector-valued states and actions. Using Monte Carlo sampling to estimate the value function around the post-decision state, we reduce the problem to a sequence of deterministic, nonlinear programming problem...
متن کاملA COMPARATIVE MODEL OF EVM AND PROJECT’S SCHEDULE RISK ANALYSIS USING MONTE CARLO SIMULATION
<span style="color: #000000; font-family: Tahoma, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: justify; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; display: inline !important; float: none; backgro...
متن کاملA COMPARATIVE MODEL OF EVM AND PROJECT’S SCHEDULE RISK ANALYSIS USING MONTE CARLO SIMULATION
<span style="color: #000000; font-family: Tahoma, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: justify; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; display: inline !important; float: none; backgro...
متن کامل