Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Authors
Will Grathwohl, Dan Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud
Abstract
Gradient-based optimization is the foundation of deep learning and reinforcement learning, but is difficult to apply when the mechanism being optimized is unknown or not differentiable. We introduce a general framework for learning low-variance, unbiased gradient estimators, applicable to black-box functions of discrete or continuous random variables. Our method uses gradients of a surrogate neural network to construct a control variate, which is optimized jointly with the original parameters. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.
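To make the estimator concrete, here is a minimal single-sample sketch of the continuous-variable case: the surrogate c_phi serves as a control variate for the score-function term, and adding back the surrogate's own reparameterized gradient keeps the estimate unbiased for any phi, so phi can be trained jointly to minimize the estimator's variance. The Gaussian sampler, the quadratic black-box objective, and the small surrogate network below are illustrative assumptions, not details taken from the paper.

```python
import torch

# Black-box objective: we only query its value (no backprop through it).
def f(z):
    return ((z - 0.5) ** 2).detach()

torch.manual_seed(0)
theta = torch.zeros(1, requires_grad=True)        # mean of the Gaussian sampler
surrogate = torch.nn.Sequential(                  # c_phi, the learned control variate
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

opt_theta = torch.optim.SGD([theta], lr=1e-2)
opt_phi = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for step in range(2000):
    eps = torch.randn(1)
    z = theta + eps                               # reparameterized sample, z ~ N(theta, 1)
    logp = -0.5 * (z.detach() - theta) ** 2       # log N(z; theta, 1) up to a constant
    c = surrogate(z.unsqueeze(-1)).squeeze(-1)    # c_phi(z), differentiable in theta and phi

    d_logp, = torch.autograd.grad(logp.sum(), theta, create_graph=True)
    d_c, = torch.autograd.grad(c.sum(), theta, create_graph=True)

    # Single-sample gradient estimate: score-function term with the surrogate
    # as control variate, plus the surrogate's reparameterized gradient.
    # Unbiased for grad_theta E[f(z)] for *any* surrogate parameters phi.
    g_hat = (f(z) - c) * d_logp + d_c

    # Train phi to shrink the estimator's variance: the mean of g_hat does not
    # depend on phi, so minimizing E[g_hat^2] minimizes Var(g_hat).
    opt_phi.zero_grad()
    (g_hat ** 2).sum().backward()
    opt_phi.step()

    # Apply the estimated gradient to theta (descending on E[f(z)]).
    theta.grad = g_hat.detach()
    opt_theta.step()
```

Because the estimate stays unbiased no matter what the surrogate outputs, the variance-reduction step cannot bias the updates to theta; a poorly trained surrogate only makes them noisier.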
Similar resources
Robust Black-box Identification with Discrete-time Neural Networks
In general, neural networks cannot match nonlinear systems exactly, so a neural identifier has to include a robust modification in order to guarantee Lyapunov stability. In this paper, an input-to-state stability approach is applied to derive robust training algorithms for discrete-time neural networks. We conclude that the gradient descent law and the backpropagation-like algorithm for the weight adjustment...
Identifying irrelevant input variables in chaotic time series problems: Using the genetic algorithm for training neural networks
Many researchers consider a neural network to be a "black box" that maps the unknown relationships of inputs to corresponding outputs. Viewing neural networks in this manner, researchers often include many more input variables than are necessary for finding good solutions. This causes unneeded computation and impedes the search process by increasing the complexity of the network. The...
Exploring the Space of Black-box Attacks on Deep Neural Networks
Existing black-box attacks on deep neural networks (DNNs) so far have largely focused on transferability, where an adversarial instance generated for a locally trained model can “transfer” to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model’s class probabilities, which do not rely on transfe...
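For illustration only (a sketch under our own assumptions, not code from that paper), a query-based attack of this kind can approximate the loss gradient with two-sided finite differences on the target's output probabilities; the helper name estimate_gradient below is hypothetical:

```python
import numpy as np

def estimate_gradient(loss, x, delta=1e-3):
    """Two-sided finite-difference estimate of d(loss)/dx.

    `loss` is treated as a black box: we only query its value, e.g. a
    cross-entropy computed from the target model's returned class
    probabilities. Each estimate costs 2 * x.size queries.
    """
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = delta
        grad.flat[i] = (loss(x + e) - loss(x - e)) / (2 * delta)
    return grad

# Stand-in black-box loss; a real attack would query the remote model instead.
loss = lambda x: float(np.sum(x ** 2))
x = np.array([1.0, -2.0, 0.5])
print(estimate_gradient(loss, x))  # approx. [2.0, -4.0, 1.0]
```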
Explaining Transition Systems through Program Induction
Explaining and reasoning about processes which underlie observed black-box phenomena enables the discovery of causal mechanisms, derivation of suitable abstract representations and the formulation of more robust predictions. We propose to learn high level functional programs in order to represent abstract models which capture the invariant structure in the observed data. We introduce the π-mach...
Improving Convergence of Iterative Feedback Tuning
Iterative Feedback Tuning constitutes an attractive control loop tuning method for processes in the absence of an accurate process model. It is a purely data-driven approach aimed at optimizing the closed-loop performance. The standard formulation ensures an unbiased estimate of the loop performance cost function gradient with respect to the control parameters. This gradient is important in a ...
Journal: CoRR
Volume: abs/1711.00123
Published: 2017