Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Authors
Will Grathwohl, Dan Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud
Abstract
Gradient-based optimization is the foundation of deep learning and reinforcement learning, but is difficult to apply when the mechanism being optimized is unknown or not differentiable. We introduce a general framework for learning low-variance, unbiased gradient estimators, applicable to black-box functions of discrete or continuous random variables. Our method uses gradients of a surrogate neural network to construct a control variate, which is optimized jointly with the original parameters. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.
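To make the estimator concrete, here is a minimal single-sample sketch of the continuous-variable case: the surrogate c_phi serves as a control variate for the score-function term, and adding back the surrogate's own reparameterized gradient keeps the estimate unbiased for any phi, so phi can be trained jointly to minimize the estimator's variance. The Gaussian sampler, the quadratic black-box objective, and the small surrogate network below are illustrative assumptions, not details taken from the paper.

```python
import torch

# Black-box objective: we only query its value (no backprop through it).
def f(z):
    return ((z - 0.5) ** 2).detach()

torch.manual_seed(0)
theta = torch.zeros(1, requires_grad=True)        # mean of the Gaussian sampler
surrogate = torch.nn.Sequential(                  # c_phi, the learned control variate
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

opt_theta = torch.optim.SGD([theta], lr=1e-2)
opt_phi = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for step in range(2000):
    eps = torch.randn(1)
    z = theta + eps                               # reparameterized sample, z ~ N(theta, 1)
    logp = -0.5 * (z.detach() - theta) ** 2       # log N(z; theta, 1) up to a constant
    c = surrogate(z.unsqueeze(-1)).squeeze(-1)    # c_phi(z), differentiable in theta and phi

    d_logp, = torch.autograd.grad(logp.sum(), theta, create_graph=True)
    d_c, = torch.autograd.grad(c.sum(), theta, create_graph=True)

    # Single-sample gradient estimate: score-function term with the surrogate
    # as control variate, plus the surrogate's reparameterized gradient.
    # Unbiased for grad_theta E[f(z)] for *any* surrogate parameters phi.
    g_hat = (f(z) - c) * d_logp + d_c

    # Train phi to shrink the estimator's variance: the mean of g_hat does not
    # depend on phi, so minimizing E[g_hat^2] minimizes Var(g_hat).
    opt_phi.zero_grad()
    (g_hat ** 2).sum().backward()
    opt_phi.step()

    # Apply the estimated gradient to theta (descending on E[f(z)]).
    theta.grad = g_hat.detach()
    opt_theta.step()
```

Because the estimate stays unbiased no matter what the surrogate outputs, the variance-reduction step cannot bias the updates to theta; a poorly trained surrogate only makes them noisier.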
Similar resources
Robust Black-box Identification with Discrete-time Neural Networks
In general, neural networks cannot match nonlinear systems exactly, so a neural identifier has to include a robust modification in order to guarantee Lyapunov stability. In this paper, an input-to-state stability approach is applied to derive robust training algorithms for discrete-time neural networks. We conclude that the gradient descent law and the backpropagation-like algorithm for the weight adjustment...
Identifying irrelevant input variables in chaotic time series problems: Using the genetic algorithm for training neural networks
Many researchers consider a neural network to be a "black box" that maps the unknown relationships of inputs to corresponding outputs. Viewing neural networks in this manner, researchers often include many more input variables than are necessary for finding good solutions. This causes unneeded computation and impedes the search process by increasing the complexity of the network. The...
Exploring the Space of Black-box Attacks on Deep Neural Networks
Existing black-box attacks on deep neural networks (DNNs) so far have largely focused on transferability, where an adversarial instance generated for a locally trained model can “transfer” to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model’s class probabilities, which do not rely on transfe...
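For illustration only (a sketch under our own assumptions, not code from that paper), a query-based attack of this kind can approximate the loss gradient with two-sided finite differences on the target's output probabilities; the helper name estimate_gradient below is hypothetical:

```python
import numpy as np

def estimate_gradient(loss, x, delta=1e-3):
    """Two-sided finite-difference estimate of d(loss)/dx.

    `loss` is treated as a black box: we only query its value, e.g. a
    cross-entropy computed from the target model's returned class
    probabilities. Each estimate costs 2 * x.size queries.
    """
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = delta
        grad.flat[i] = (loss(x + e) - loss(x - e)) / (2 * delta)
    return grad

# Stand-in black-box loss; a real attack would query the remote model instead.
loss = lambda x: float(np.sum(x ** 2))
x = np.array([1.0, -2.0, 0.5])
print(estimate_gradient(loss, x))  # approx. [2.0, -4.0, 1.0]
```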
Explaining Transition Systems through Program Induction
Explaining and reasoning about processes which underlie observed black-box phenomena enables the discovery of causal mechanisms, derivation of suitable abstract representations and the formulation of more robust predictions. We propose to learn high level functional programs in order to represent abstract models which capture the invariant structure in the observed data. We introduce the π-mach...
Improving Convergence of Iterative Feedback Tuning
Iterative Feedback Tuning constitutes an attractive control loop tuning method for processes in the absence of an accurate process model. It is a purely data-driven approach aimed at optimizing the closed-loop performance. The standard formulation ensures an unbiased estimate of the loop performance cost function gradient with respect to the control parameters. This gradient is important in a ...
Journal: CoRR
Volume: abs/1711.00123
Published: 2017