Formula-E race strategy development using distributed policy gradient reinforcement learning

Authors

Abstract

Energy and thermal management is a crucial element in Formula-E race strategy development. In this study, race-level strategy development is formulated as a Markov decision process (MDP) problem featuring a hybrid-type action space. Deep Deterministic Policy Gradient (DDPG) reinforcement learning is implemented under the distributed architecture Ape-X and integrated with prioritized experience replay and reward shaping techniques to optimize a set of actions with both continuous and discrete components. Soft boundary violation penalties in reward shaping significantly improve the performance of DDPG and make it capable of generating faster race-finishing solutions. The proposed method has shown superior performance in comparison to Monte Carlo Tree Search (MCTS) with policy gradient reinforcement learning, which solves the problem in a fully discrete action space as presented in the literature. The advantages are a faster race finishing time and better handling of ambient temperature rise.
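The abstract names soft boundary violation penalties as part of reward shaping but gives no formula. The following is a minimal sketch of one common way such a penalty can be written, not the authors' formulation: the base reward is the negative lap time, and a linear penalty is subtracted only when a limit is exceeded. The function names, the energy and temperature limits, and the penalty weights are all hypothetical values chosen for illustration.

```python
def soft_penalty(value, limit, weight):
    """Linear penalty proportional to how far `value` exceeds `limit`.

    Returns 0.0 when the limit is respected, so the constraint is "soft":
    a violation lowers the reward instead of ending the episode.
    """
    return weight * max(0.0, value - limit)


def shaped_reward(lap_time_s, energy_used_kwh, battery_temp_c,
                  energy_budget_kwh=52.0,  # hypothetical energy budget
                  temp_limit_c=60.0,       # hypothetical thermal limit
                  w_energy=10.0,           # hypothetical penalty weight
                  w_temp=5.0):             # hypothetical penalty weight
    """Negative lap time minus soft penalties for energy and thermal violations."""
    reward = -lap_time_s
    reward -= soft_penalty(energy_used_kwh, energy_budget_kwh, w_energy)
    reward -= soft_penalty(battery_temp_c, temp_limit_c, w_temp)
    return reward


# Within both limits: the reward is just the negative lap time.
within_limits = shaped_reward(80.0, 50.0, 55.0)
# Over both limits: each violation reduces the reward in proportion to its size.
over_limits = shaped_reward(80.0, 53.0, 62.0)
```

Because the penalty grows with the size of the violation rather than acting as a hard cutoff, a learning agent still receives a graded signal near the boundary, which is the usual motivation for soft constraints in reward shaping.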

Similar resources

Using policy gradient reinforcement learning on autonomous robot controllers

Robot programmers can often quickly program a robot to approximately execute a task under specific environment conditions. However, achieving robust performance under more general conditions is significantly more difficult. We propose a framework that starts with an existing control system and uses reinforcement feedback from the environment to autonomously improve the controller’s performance....

Online Policy-Gradient Reinforcement Learning using OLGARB for SpaceWar

The goal of this project is to explore the use of reinforcement learning techniques to address a difficult problem domain. The SpaceWar task is a competitive, continuous-valued, partially observable problem domain that provides a fun and interesting challenge for machine learning algorithms. This project focuses on the application of online policy-gradient reinforcement learning techniques to tr...

Scalable Multitask Policy Gradient Reinforcement Learning

Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach that aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...

Model-based Policy Gradient Reinforcement Learning

Policy gradient methods based on REINFORCE are model-free in the sense that they estimate the gradient using only online experiences executing the current stochastic policy. This is extremely wasteful of training data as well as being computationally inefficient. This paper presents a new model-based policy gradient algorithm that uses training experiences much more efficiently. Our approach con...

Inverse Reinforcement Learning through Policy Gradient Minimization

Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert given a set of demonstrations of the expert’s policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that recovers the reward function without the need to solve any “direct” RL pr...

Journal

Journal title: Knowledge-Based Systems

Year: 2021

ISSN: 1872-7409, 0950-7051

DOI: https://doi.org/10.1016/j.knosys.2021.106781