State Aggregation in Monte Carlo Tree Search
Abstract
Monte Carlo tree search (MCTS) algorithms are a popular approach to online decision-making in Markov decision processes (MDPs). These algorithms can, however, perform poorly in MDPs with high stochastic branching factors. In this paper, we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning, but little attention has been paid to state aggregation in MCTS. Our main result is a performance loss bound for a class of value function-based state aggregation criteria in expectimax search trees. We also consider how to construct MCTS algorithms that operate in the abstract state space but require a simulator of the ground dynamics only. We find that trajectory sampling algorithms like UCT can be adapted easily, but that sparse sampling algorithms present difficulties. As a proof of concept, we experimentally confirm that state aggregation can improve the finite-sample performance of UCT.
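The idea of running UCT in the abstract state space while querying only a ground simulator can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy simulator `step`, the aggregation function `phi`, and the planner `uct_plan` are hypothetical names introduced here, the search keys its statistics by (depth, abstract state) rather than by full history, and it omits a separate rollout phase. The ground state is carried along each sampled trajectory, and only the visit counts and value estimates are shared among ground states that map to the same abstract state.

```python
import math
import random
from collections import defaultdict

ACTIONS = (0, 1)

def step(state, action, rng):
    """Hypothetical ground simulator. A state is (position, noise)."""
    pos, _ = state
    reward = 1.0 if action == 1 else 0.0
    pos = pos + 1 if action == 1 else pos
    # The noise component is irrelevant to rewards but creates a large
    # stochastic branching factor in the ground state space.
    noise = rng.randrange(1000)
    return (pos, noise), reward

def phi(state):
    """Hypothetical aggregation: ignore the noise component."""
    return state[0]

def uct_plan(root, aggregate, horizon=3, n_sims=200, c=1.4, seed=0):
    """Simplified UCT whose statistics are indexed by abstract states."""
    rng = random.Random(seed)
    visits = defaultdict(int)   # (depth, abstract state) -> visit count
    n_sa = defaultdict(int)     # (depth, abstract state, action) -> count
    q_sa = defaultdict(float)   # (depth, abstract state, action) -> mean return

    def ucb(node, a):
        bonus = c * math.sqrt(math.log(visits[node]) / n_sa[node + (a,)])
        return q_sa[node + (a,)] + bonus

    def simulate(state, depth):
        if depth == horizon:
            return 0.0
        node = (depth, aggregate(state))
        untried = [a for a in ACTIONS if n_sa[node + (a,)] == 0]
        if untried:
            action = rng.choice(untried)
        else:
            action = max(ACTIONS, key=lambda a: ucb(node, a))
        # Only the ground simulator is queried; no abstract model is needed.
        next_state, reward = step(state, action, rng)
        ret = reward + simulate(next_state, depth + 1)
        key = node + (action,)
        visits[node] += 1
        n_sa[key] += 1
        q_sa[key] += (ret - q_sa[key]) / n_sa[key]  # incremental mean
        return ret

    for _ in range(n_sims):
        simulate(root, 0)
    root_node = (0, aggregate(root))
    best = max(ACTIONS, key=lambda a: q_sa[root_node + (a,)])
    return best, visits

best_agg, nodes_agg = uct_plan((0, 0), phi)
best_raw, nodes_raw = uct_plan((0, 0), lambda s: s)
```

Comparing `len(nodes_agg)` with `len(nodes_raw)` shows the effect of aggregation in this toy setting: with `phi`, samples concentrate on a handful of abstract nodes, whereas with the identity mapping almost every sampled successor creates a new, rarely revisited node.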
Similar resources
Factored MCTS for Large Scale Stochastic Planning
This paper investigates stochastic planning problems with large factored state and action spaces. We show that even with moderate increase in the size of existing challenge problems, the performance of state of the art algorithms deteriorates rapidly, making them ineffective. To address this problem we propose a family of simple but scalable online planning algorithms that combine sampling, as ...
Nested Monte-Carlo Tree Search for Online Planning in Large MDPs
Monte-Carlo Tree Search (MCTS) is state of the art for online planning in large MDPs. It is a best-first, sample-based search algorithm in which every state in the search tree is evaluated by the average outcome of Monte-Carlo rollouts from that state. These rollouts are typically random or directed by a simple, domain-dependent heuristic. We propose Nested Monte-Carlo Tree Search (NMCTS), in w...
Monte-Carlo Fork Search for Cooperative Path-Finding
This paper presents Monte-Carlo Fork Search (MCFS), a new algorithm that solves Cooperative Path-Finding (CPF) problems with simultaneity. The background is Monte-Carlo Tree Search (MCTS) and Nested Monte-Carlo Search (NMCS). Regarding MCTS, the key idea of MCFS is to build a tree balanced over the whole game tree. To do so, after a simulation, MCFS stores the whole sequence of actions in the t...
Monte Carlo Action Programming
This paper proposes Monte Carlo Action Programming, a programming language framework for autonomous systems that act in large probabilistic state spaces with high branching factors. It comprises formal syntax and semantics of a nondeterministic action programming language. The language is interpreted stochastically via Monte Carlo Tree Search. Effectiveness of the approach is shown empirically.