DiCE: The Infinitely Differentiable Monte-Carlo Estimator
Authors
Abstract
The score function estimator is widely used for estimating gradients of stochastic objectives in Stochastic Computation Graphs (SCG), e.g., in reinforcement learning and meta-learning. While deriving the first order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher order gradients is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order gradient involves increasingly cumbersome graph manipulations. Lastly, to match the first order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for higher order gradient estimators. To address all these shortcomings in a unified way, we introduce DICE, which provides a single objective that can be differentiated repeatedly, generating correct gradient estimators of any order in SCGs. Unlike SL, DICE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DICE both through a proof and through numerical evaluation of the DICE gradient estimates. We also use DICE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://goo.gl/xkkGxN.
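The abstract does not spell out the construction. The sketch below assumes the operator the DiCE paper calls MagicBox, exp(x - stop_gradient(x)), which evaluates to 1 but carries the derivatives of the sampling probability. To stay self-contained it replaces a real automatic differentiation framework with a tiny hand-rolled second-order jet class; `Jet`, `lift`, `expected_objective`, and `exact_derivatives` are hypothetical names for this illustration, not the authors' code. The toy problem is a single Bernoulli variable with P(x=1) = sigmoid(theta) and fixed costs f0, f1, and the expectation is taken exactly over both outcomes rather than by Monte Carlo sampling. The non-DiCE branch, which differentiates log p(x) * f(x) twice, is a simplified stand-in for the surrogate-loss failure mode described above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Jet:
    """Value plus first and second derivative w.r.t. one scalar parameter."""
    v: float
    d1: float = 0.0
    d2: float = 0.0

    def __add__(self, other):
        o = lift(other)
        return Jet(self.v + o.v, self.d1 + o.d1, self.d2 + o.d2)

    __radd__ = __add__

    def __neg__(self):
        return Jet(-self.v, -self.d1, -self.d2)

    def __sub__(self, other):
        return self + (-lift(other))

    def __rsub__(self, other):
        return lift(other) + (-self)

    def __mul__(self, other):
        o = lift(other)
        # Product rule up to second order.
        return Jet(self.v * o.v,
                   self.d1 * o.v + self.v * o.d1,
                   self.d2 * o.v + 2 * self.d1 * o.d1 + self.v * o.d2)

    __rmul__ = __mul__


def lift(x):
    """Treat plain numbers as constants (all derivatives zero)."""
    return x if isinstance(x, Jet) else Jet(float(x))


def exp(x):
    e = math.exp(x.v)
    return Jet(e, e * x.d1, e * (x.d2 + x.d1 ** 2))


def log(x):
    r = x.d1 / x.v
    return Jet(math.log(x.v), r, x.d2 / x.v - r ** 2)


def sigmoid(x):
    s = 1.0 / (1.0 + math.exp(-x.v))
    ds = s * (1 - s)
    return Jet(s, ds * x.d1, ds * (1 - 2 * s) * x.d1 ** 2 + ds * x.d2)


def stop_gradient(x):
    """Keep the value, drop every derivative."""
    return Jet(x.v)


def magic_box(x):
    """Assumed MagicBox operator: value 1, derivatives follow exp(x)."""
    return exp(x - stop_gradient(x))


def expected_objective(theta, f0, f1, use_dice):
    """Exact expectation over x in {0, 1} of the per-sample objective."""
    t = Jet(theta, 1.0, 0.0)           # seed: d(theta)/d(theta) = 1
    p1 = sigmoid(t)
    total = Jet(0.0)
    for prob, cost in ((1 - p1, f0), (p1, f1)):
        logp = log(prob)
        term = magic_box(logp) * cost if use_dice else logp * cost
        # Weight by the plain probability value: the sampling distribution
        # itself is not differentiated, as with drawn samples.
        total = total + prob.v * term
    return total


def exact_derivatives(theta, f0, f1):
    """First and second derivative of E[f] = s*f1 + (1-s)*f0, s = sigmoid(theta)."""
    s = 1.0 / (1.0 + math.exp(-theta))
    d1 = (f1 - f0) * s * (1 - s)
    return d1, d1 * (1 - 2 * s)
```

Calling `expected_objective(0.3, 1.0, 3.0, use_dice=True)` and comparing its `d1` and `d2` fields with `exact_derivatives(0.3, 1.0, 3.0)` should show agreement at both orders, while `use_dice=False` agrees only at first order. With an autodiff library, `stop_gradient` would be the framework's own primitive and the jet class would be unnecessary.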
Similar resources
Positive-Shrinkage and Pretest Estimation in Multiple Regression: A Monte Carlo Study with Applications
Consider a problem of predicting a response variable using a set of covariates in a linear regression model. If it is a priori known or suspected that a subset of the covariates do not significantly contribute to the overall fit of the model, a restricted model that excludes these covariates may be sufficient. If, on the other hand, the subset provides useful information, shrinkage meth...
A New Ridge Estimator in Linear Measurement Error Model with Stochastic Linear Restrictions
In this paper, we propose a new ridge-type estimator called the new mixed ridge estimator (NMRE) by unifying the sample and prior information in linear measurement error model with additional stochastic linear restrictions. The new estimator is a generalization of the mixed estimator (ME) and ridge estimator (RE). The performances of this new estimator and mixed ridge estimator (MRE) against th...
Part pose statistics: estimators and experiments
Many of the most fundamental examples in probability involve the pose statistics of coins and dice as they are dropped on a flat surface. For these parts, the probability assigned to each stable face is justified based on part symmetry, although most gamblers are familiar with the possibility of loaded dice. In industrial part feeding, parts also arrive in random orientations. We consider the f...
A CLT for Infinitely Stratified Estimators, with Applications to Debiased MLMC
This paper develops a general central limit theorem (CLT) for post-stratified Monte Carlo estimators with an associated infinite number of strata. In addition, consistency of the corresponding variance estimator is established in the same setting. With these results in hand, one can then construct asymptotically valid confidence interval procedures for such infinitely stratified estimators. We ...
Nonparametric Estimation of an Additive Quantile Regression Model
This paper is concerned with estimating the additive components of a nonparametric additive quantile regression model. We develop an estimator that is asymptotically normally distributed with a rate of convergence in probability of $n^{-r/(2r+1)}$ when the additive components are r-times continuously differentiable for some r ≥ 2. This result holds regardless of the dimension of the covariates and, ...
Journal title:
- CoRR
Volume: abs/1802.05098
Publication date: 2018