Accelerated Information Gradient Flow

Abstract

We present a framework for Nesterov’s accelerated gradient flows in probability space to design efficient mean-field Markov chain Monte Carlo algorithms for Bayesian inverse problems. Here four examples of information metrics are considered, including the Fisher-Rao metric, the Wasserstein-2 metric, the Kalman-Wasserstein metric and the Stein metric. For both the Fisher-Rao and Wasserstein-2 metrics, we prove convergence properties of the accelerated gradient flows. In implementations, we propose a sampling-efficient discrete-time algorithm for the Wasserstein-2, Kalman-Wasserstein and Stein accelerated gradient flows, with a restart technique. We also formulate a kernel bandwidth selection method, which learns the gradient of the logarithm of the density from Brownian-motion samples. Numerical experiments, including Bayesian logistic regression and a Bayesian neural network, show the strength of the proposed methods compared with state-of-the-art algorithms.

Related resources

Accelerated Gradient Boosting

Gradient tree boosting is a prediction algorithm that sequentially produces a model in the form of linear combinations of decision trees, by solving an infinite-dimensional optimization problem. We combine gradient boosting and Nesterov’s accelerated descent to design a new algorithm, which we call AGB (for Accelerated Gradient Boosting). Substantial numerical evidence is provided on both synth...

Asynchronous Accelerated Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a widely used optimization algorithm in machine learning. In order to accelerate the convergence of SGD, a few advanced techniques have been developed in recent years, including variance reduction, stochastic coordinate sampling, and Nesterov’s acceleration method. Furthermore, in order to improve the training speed and/or leverage larger-scale training data...

Accelerated Gradient Temporal Difference Learning

The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least squares methods. Least squares methods make the best use of available data by directly computing the TD solution and thus do not require tuning a typically highly sensitive learning rate parameter, but require quadratic computation and storage. Recent algorith...

Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method

We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To this day, Nesterov’s Accelerated Gradient Descent (AGD) and variations thereof were the only methods achieving acceleration in this standard blackbox model. In contrast, our algorithm is significantly different from a...

A unified convergence bound for conjugate gradient and accelerated gradient

Nesterov’s accelerated gradient method for minimizing a smooth strongly convex function f is known to reduce f(x_k) − f(x*) by a factor of ε ∈ (0, 1) after k ≥ O(√(L/ℓ) log(1/ε)) iterations, where ℓ, L are the two parameters of smooth strong convexity. Furthermore, it is known that this is the best possible complexity in the function-gradient oracle model of computation. The method of linear conju...

Journal

Journal title: Journal of Scientific Computing

Year: 2021

ISSN: 1573-7691, 0885-7474

DOI: https://doi.org/10.1007/s10915-021-01709-3