Fully asynchronous policy evaluation in distributed reinforcement learning over networks
نویسندگان
چکیده
This paper proposes a fully asynchronous scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks. Without waiting any other node network, each can locally update its value function at time using (possibly delayed) information from neighbors. is in sharp contrast to gossip-based where pair nodes concurrently update. Even though setting involves difficult multi-timescale decision problem, we design novel incremental aggregated gradient (IAG) based algorithm and develop push–pull augmented graph approach prove exact convergence linear rate O(ck) c∈(0,1) k total number updates within entire network. Finally, numerical experiments validate that our method speeds up linearly with respect nodes, robust straggler nodes.
منابع مشابه
Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach
Wireless Sensor Networks (WSNs) are consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are number of challenges in WSNs because of limitation of battery power, communications, computation and storage space. In the recent years, computational intelligence approaches such as evolutionar...
متن کاملReinforcement Learning in Neural Networks: A Survey
In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...
متن کاملReinforcement Learning in Neural Networks: A Survey
In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...
متن کاملError Bounds in Reinforcement Learning Policy Evaluation
With the advent of Kearns & Singh’s (2000) rigorous upper bound on the error of temporal difference estimators, we derive the first rigorous error bound for the maximum likelihood policy evaluation method as well as deriving a Monte Carlo matrix inversion policy evaluation error bound. We provide, the first direct comparison between the error bounds of the maximum likelihood (ML), Monte Carlo m...
متن کاملPAC-Bayesian Policy Evaluation for Reinforcement Learning
Bayesian priors offer a compact yet general means of incorporating domain knowledge into many learning tasks. The correctness of the Bayesian analysis and inference, however, largely depends on accuracy and correctness of these priors. PAC-Bayesian methods overcome this problem by providing bounds that hold regardless of the correctness of the prior distribution. This paper introduces the first...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Automatica
سال: 2022
ISSN: ['1873-2836', '0005-1098']
DOI: https://doi.org/10.1016/j.automatica.2021.110092