Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning


Abstract

This article deals with the policy optimization problem in distributed reinforcement learning (RL), which involves a central controller and a group of learners. In particular, two typical settings encountered in several applications are considered: multiagent RL and parallel RL, where frequent information exchanges between the learners and the controller are required. For many practical systems, however, the overhead caused by these frequent communications is considerable, and it often becomes the bottleneck of overall performance. To address this challenge, a novel policy gradient approach is developed for solving distributed RL. The approach adaptively skips policy gradient communications during the iterations, and can reduce the communication overhead without degrading learning performance. It is established analytically that: i) the algorithm has a convergence rate identical to that of the plain-vanilla policy gradient, while ii) if the learners are heterogeneous in terms of their reward functions, the number of communication rounds needed to achieve a desirable learning accuracy is markedly reduced. Numerical experiments corroborate the significant communication reduction attained by the novel algorithm compared with alternatives.
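The core mechanism the abstract describes, adaptively skipping gradient uploads whose content has barely changed, can be illustrated with a minimal sketch. The function name, the innovation-based threshold test, and the controller-side averaging below are illustrative assumptions, not the paper's exact rule: each learner uploads a fresh policy gradient only when it differs sufficiently from the last copy the controller holds, and the controller averages the (possibly stale) gradients it has on record.

```python
import numpy as np

def lazy_aggregation_round(grads, last_sent, threshold):
    """One communication round with adaptive gradient skipping (illustrative).

    grads:     list of fresh policy gradients, one per learner
    last_sent: list of the gradients the controller currently holds
               (updated in place when a learner uploads)
    threshold: skip an upload when the squared change is below this
    Returns the aggregated gradient and the number of uploads performed.
    """
    uploads = 0
    for m, g in enumerate(grads):
        # Upload only if the "innovation" since the last upload is large enough.
        if np.linalg.norm(g - last_sent[m]) ** 2 > threshold:
            last_sent[m] = g.copy()
            uploads += 1
    # Controller averages whatever it has, fresh or stale.
    agg = np.mean(last_sent, axis=0)
    return agg, uploads
```

Running a second round with unchanged learner gradients triggers zero uploads while the aggregate stays the same, which is the communication saving the paper quantifies.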



Similar Articles

Policy Gradient Methods for Reinforcement Learning with Function Approximation

Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the ...
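The alternative this snippet describes, representing the policy directly by its own parameterized function and updating it along the gradient of performance, is commonly instantiated with a REINFORCE-style update. The linear softmax parameterization and function names below are illustrative assumptions, not the paper's specific construction:

```python
import numpy as np

def softmax_policy(theta, features):
    """Action probabilities from a linear softmax parameterization.

    features: (num_actions, dim) matrix of per-action feature vectors
    theta:    (dim,) policy parameter vector
    """
    logits = features @ theta
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def reinforce_step(theta, features, action, reward, lr=0.1):
    """One REINFORCE-style update: theta += lr * reward * grad log pi(action)."""
    p = softmax_policy(theta, features)
    # For a linear softmax policy: grad log pi(a) = phi(a) - sum_b pi(b) phi(b)
    grad_log_pi = features[action] - p @ features
    return theta + lr * reward * grad_log_pi
```

After a positively rewarded update, the probability of the taken action increases, which is the policy-improvement direction the policy gradient theorem guarantees in expectation.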


Exponentiated Gradient Methods for Reinforcement Learning

This paper introduces and evaluates a natural extension of linear exponentiated gradient methods that makes them applicable to reinforcement learning problems. Just as these methods speed up supervised learning, we find that they can also increase the efficiency of reinforcement learning. Comparisons are made with conventional reinforcement learning methods on two test problems using CMAC function...


Gradient Sparsification for Communication-Efficient Distributed Optimization

Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information, such as stochastic gradients, among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding...
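The paper behind this snippet derives an optimal probabilistic sparsification scheme via a convex formulation; a simpler deterministic variant of the same idea, keeping only the largest-magnitude gradient entries, is sketched below to illustrate how sparsification trades communication for gradient fidelity. The top-k rule shown here is a stand-in, not the paper's method:

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of a gradient vector.

    Only the k surviving (index, value) pairs would need to be
    transmitted, shrinking each worker's upload from len(grad)
    floats to k index/value pairs.
    """
    sparse = np.zeros_like(grad)
    idx = np.argsort(np.abs(grad))[-k:]   # indices of the k largest magnitudes
    sparse[idx] = grad[idx]
    return sparse
```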


Scalable Multitask Policy Gradient Reinforcement Learning

Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach that aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...


Model-based Policy Gradient Reinforcement Learning

Policy gradient methods based on REINFORCE are model-free in the sense that they estimate the gradient using only online experiences from executing the current stochastic policy. This is extremely wasteful of training data as well as being computationally inefficient. This paper presents a new model-based policy gradient algorithm that uses training experiences much more efficiently. Our approach con...



Journal

Journal title: IEEE Transactions on Control of Network Systems

Year: 2022

ISSN: 2325-5870, 2372-2533

DOI: https://doi.org/10.1109/tcns.2021.3078100