Optimizing the CVaR via Sampling

Authors

  • Aviv Tamar
  • Yonatan Glassner
  • Shie Mannor
Abstract

Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application and learn a risk-sensitive controller for the game of Tetris.
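To make the idea of a sampling-based, likelihood-ratio style CVaR gradient estimator concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not state the formula, so the code assumes the conditional-expectation form grad CVaR_alpha = E[ score(Y) * (Y - VaR_alpha) | Y <= VaR_alpha ] for the lower alpha-tail, where score(Y) is the gradient of the log-density with respect to the parameter. The function names `cvar_gradient_estimate` and `gaussian_demo` and the Gaussian toy model are hypothetical choices made for this example.

```python
import math
import random


def cvar_gradient_estimate(samples, scores, alpha):
    """Sample-based estimate of d/dtheta CVaR_alpha (lower tail).

    Assumed form (not taken from the abstract):
        (1 / (alpha * N)) * sum_i score_i * (y_i - VaR_hat) * 1{y_i <= VaR_hat}
    where VaR_hat is the empirical alpha-quantile of the samples.
    """
    n = len(samples)
    # Empirical alpha-quantile, used as a plug-in estimate of the VaR.
    var_hat = sorted(samples)[math.ceil(alpha * n) - 1]
    total = 0.0
    for y, s in zip(samples, scores):
        if y <= var_hat:  # only tail samples contribute
            total += s * (y - var_hat)
    return total / (alpha * n)


def gaussian_demo(theta=0.5, alpha=0.05, n=200_000, seed=0):
    """Toy check with Y ~ N(theta, 1), where shifting theta shifts the CVaR one-for-one."""
    rng = random.Random(seed)
    ys = [rng.gauss(theta, 1.0) for _ in range(n)]
    # Score function of N(theta, 1): d/dtheta log f(y; theta) = y - theta.
    scores = [y - theta for y in ys]
    return cvar_gradient_estimate(ys, scores, alpha)


if __name__ == "__main__":
    print(gaussian_demo())
```

For the Gaussian toy model the exact gradient is 1, because the lower-tail CVaR of N(theta, 1) is theta plus a constant that depends only on alpha, so the printed estimate should be close to 1 for large sample sizes. Plugging in the empirical quantile makes the estimate biased for finite samples, which is consistent with the abstract's remark that the bias of the estimator is analyzed.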

Similar resources

Model Selection and Adaptive Markov chain Monte Carlo for Bayesian Cointegrated VAR model

In this paper, we develop novel Markov chain Monte Carlo sampling methodology for Bayesian Cointegrated Vector Auto Regression (CVAR) models. Here we focus on two novel extensions to the sampling methodology for the CVAR posterior distribution. The first extension we develop replaces the popular sampling methodology of the griddy Gibbs sampler with an automated alternative which is based on an ...

Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures

In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk mea...

Computation of VaR and CVaR using stochastic approximations and unconstrained importance sampling

Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) are two risk measures which are widely used in the practice of risk management. This paper deals with the problem of computing both VaR and CVaR using stochastic approximation (with decreasing steps): we propose a first Robbins-Monro procedure based on Rockafellar-Uryasev’s identity for the CVaR. The convergence rate of this algorithm to ...

Policy Gradients for CVaR-Constrained MDPs

We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR). We propose two algorithms that obtain a locally risk-optimal policy by employing four tools: stochastic approximation, mini batches, policy gradients and importance sampling. Both the algorithms incorporate a CVaR estimation procedure, along t...

Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling

Value-at-Risk (VaR) and Conditional-Value-at-Risk (CVaR) are two risk measures which are widely used in the practice of risk management. This paper deals with the problem of estimating both VaR and CVaR using stochastic approximation (with decreasing steps): we propose a first Robbins-Monro (RM) procedure based on Rockafellar-Uryasev’s identity for the CVaR. Convergence rate of this algorithm t...

Publication date: 2015