Improve generative adversarial imitation learning with reward variance regularization
Authors
Abstract
Imitation learning aims at recovering expert policies from limited demonstration data. Generative Adversarial Imitation Learning (GAIL) employs the generative adversarial framework for imitation and has shown great potential. GAIL and its variants, however, are found to be highly sensitive to hyperparameters and hard to converge well in practice. One key issue is that the supervised discriminator learns at a much faster speed than the reinforcement-learning generator, leaving the generator with vanishing gradients. Although the problem is formulated as a zero-sum game, the ultimate goal is for the generator to learn, so the discriminator should play a role more like a teacher than a real opponent. We therefore consider how the generator could keep learning. In this paper, we disclose that enhancing the generator's training is equivalent to increasing the variance of the fake reward provided by the discriminator output. We propose an improved version of GAIL, GAIL-VR, in which the generator also learns to avoid gradient vanishing through regularization of the reward variance. Experiments on various tasks, including locomotion tasks and Atari games, indicate that GAIL-VR can improve training stability and scores.
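The abstract does not give GAIL-VR's concrete objective, so the following is only a minimal PyTorch-style sketch of the idea it describes: the usual GAIL discriminator loss is augmented with a term that discourages the fake rewards on generator samples from collapsing to low variance. The reward form, the disc and var_coef names, and the way the regularizer enters the loss are illustrative assumptions, not the paper's definition.

import torch
import torch.nn.functional as F

def discriminator_loss(disc, expert_sa, policy_sa, var_coef=0.1):
    # disc maps (state, action) features to a logit; expert_sa / policy_sa are batches.
    d_expert = disc(expert_sa)
    d_policy = disc(policy_sa)

    # Standard GAIL discriminator objective: expert pairs -> 1, generator pairs -> 0.
    bce = F.binary_cross_entropy_with_logits(d_expert, torch.ones_like(d_expert)) \
        + F.binary_cross_entropy_with_logits(d_policy, torch.zeros_like(d_policy))

    # Fake reward the generator would receive from the current discriminator.
    fake_reward = -torch.log(1.0 - torch.sigmoid(d_policy) + 1e-8)

    # Variance regularization (illustrative): keep the rewards spread out so the
    # policy-gradient signal does not vanish when the discriminator gets too strong.
    return bce - var_coef * fake_reward.var()

The discriminator would minimize this loss while the policy is trained as in standard GAIL, using -log(1 - D(s, a)) as its reward.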
Similar resources
Generative Adversarial Imitation Learning
Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a...
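The last sentence of this abstract is cut off; for context, the saddle-point objective that the GAIL framework of Ho and Ermon arrives at can be written (up to the labeling convention chosen for D) as

\min_{\pi} \max_{D \in (0,1)} \; \mathbb{E}_{\pi}\big[\log D(s,a)\big] + \mathbb{E}_{\pi_E}\big[\log\big(1 - D(s,a)\big)\big] - \lambda H(\pi),

where \pi_E is the expert policy, H(\pi) is the causal entropy of \pi, and the discriminator output serves as the cost signal for the policy update.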
Variance Regularizing Adversarial Learning
We introduce a novel approach for training adversarial models by replacing the discriminator score with a bi-modal Gaussian distribution over the real/fake indicator variables. In order to do this, we train the Gaussian classifier to match the target bi-modal distribution implicitly through meta-adversarial training. We hypothesize that this approach ensures a nonzero gradient to the generator,...
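The exact loss is not given in this abstract; a very rough sketch of one reading, in which scalar discriminator scores for real and fake samples are driven toward two separated Gaussian modes (a bi-modal target over the real/fake indicator) so the generator keeps receiving a non-saturating gradient, could look as follows. mu_real, mu_fake, and sigma are illustrative choices, not values from the paper.

import torch

def bimodal_score_loss(score_real: torch.Tensor, score_fake: torch.Tensor,
                       mu_real: float = 1.0, mu_fake: float = -1.0, sigma: float = 1.0):
    # Gaussian negative log-likelihood (up to a constant) of each score under its target mode.
    nll_real = (score_real - mu_real) ** 2 / (2 * sigma ** 2)
    nll_fake = (score_fake - mu_fake) ** 2 / (2 * sigma ** 2)
    return nll_real.mean() + nll_fake.mean()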
Model-based Adversarial Imitation Learning
Generative adversarial learning is a popular new approach to training generative models which has been proven successful for other related problems as well. The general idea is to maintain an oracle D that discriminates between the expert’s data distribution and that of the generative model G. The generative model is trained to capture the expert’s distribution by maximizing the probability of ...
Multimodal Storytelling via Generative Adversarial Imitation Learning
Deriving event storylines is an effective summarization method to succinctly organize extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of widely recognized definition of storyline metric. Prior studies have developed various approaches based on different assumptions about users’ interests. These works can extract inter...
Multi-agent Generative Adversarial Imitation Learning
We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We introduce a practical multi-agent actor-critic algorithm with good empirical performance. Our method can be used to imitate complex behaviors in high-dimensional environments with multiple cooperative or competitive agents. ...
Journal
Journal title: Machine Learning
Year: 2022
ISSN: 0885-6125, 1573-0565
DOI: https://doi.org/10.1007/s10994-021-06083-7