Distributed Stochastic Optimization via Matrix Exponential Learning
Authors
Abstract
Similar Resources
Distributed stochastic optimization for deep learning
We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin sche...
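To make the elastic averaging idea concrete, here is a minimal single-process sketch of a synchronous EASGD-style update: each worker takes a noisy local gradient step and is pulled toward a shared center variable, while the center is pulled toward the workers. The toy quadratic loss, the step size eta, and the elastic penalty rho are illustrative assumptions, not settings taken from the paper.

import numpy as np

# Hedged toy sketch of the elastic averaging (EASGD-style) update, not the paper's code.
rng = np.random.default_rng(0)
dim, n_workers, eta, rho, steps = 10, 4, 0.05, 0.5, 200
alpha = eta * rho                      # elastic coupling strength

target = rng.standard_normal(dim)      # minimizer of the toy loss ||x - target||^2 / 2
workers = [rng.standard_normal(dim) for _ in range(n_workers)]
center = np.zeros(dim)                 # shared "center" variable

for _ in range(steps):
    # Center moves toward the workers.
    new_center = center + alpha * sum(x - center for x in workers)
    for i, x in enumerate(workers):
        grad = (x - target) + 0.1 * rng.standard_normal(dim)  # noisy local gradient
        workers[i] = x - eta * grad - alpha * (x - center)    # local step + elastic pull
    center = new_center

print(np.linalg.norm(center - target))  # distance to the optimum; should be close to zero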
Distributed stochastic optimization for deep learning (thesis)
We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin sche...
Precision Matrix Computation via Stochastic Optimization: A Scalable Computation of Regularized Precision Matrices via Stochastic Optimization
We consider the problem of computing a positive definite $p \times p$ inverse covariance matrix, a.k.a. precision matrix, $\theta = (\theta_{ij})$, which optimizes a regularized Gaussian maximum likelihood problem with the elastic-net regularizer $\sum_{i,j=1}^{p} \lambda\left(\alpha|\theta_{ij}| + \tfrac{1}{2}(1-\alpha)\theta_{ij}^{2}\right)$, with regularization parameters $\alpha \in [0, 1]$ and $\lambda > 0$. The associated convex semidefinite optimization problem is notoriously difficult to s...
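Written out, and assuming (as in standard graphical-lasso formulations; the truncated abstract does not state the full objective) that $S$ denotes the sample covariance matrix, the regularized Gaussian maximum-likelihood problem is

$$\min_{\theta \succ 0}\; -\log\det\theta + \operatorname{tr}(S\theta) + \sum_{i,j=1}^{p} \lambda\left(\alpha\,|\theta_{ij}| + \tfrac{1}{2}(1-\alpha)\,\theta_{ij}^{2}\right),$$

so the elastic-net term interpolates between an $\ell_1$ (sparsity-inducing) and an $\ell_2$ (ridge) penalty on the entries of $\theta$.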
Distributed Stochastic Optimization via Adaptive Stochastic Gradient Descent
Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial in many applications, but the most popular algorithm, Stochastic Gradient Descent (SGD), is a serial algorithm that is surprisingly hard to parallelize. In this paper, we propose an efficient distributed stochastic op...
Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization: Supplementary Material
Preparation. First, based on the definitions of $A_t$, $Y_t$, $\tilde{Z}_t$ and $Z_t$, we can write $g_t = G(s_t, r_t, X_t) = p_{s_t}^{-1} p_{r_t}^{-1} (I - X_t X_t^\top)(E_{s_t} \odot A)(E_{\cdot r_t} \odot X) = (I - X_t X_t^\top) A_t Y_t$. Then from (6), we have $X_{t+1} = X_t + \alpha_t g_t W_t - \tfrac{\alpha_t^2}{2} X_t g_t^\top g_t W_t$. Since $W_t = (I + \tfrac{\alpha_t^2}{4} g_t^\top g_t)^{-1} = I - \tfrac{\alpha_t^2}{4} g_t^\top g_t + O(\alpha_t^4)$, we get $X_{t+1} = X_t + \alpha_t A_t Y_t - \alpha_t X_t X_t^\top A$...
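A quick numerical check clarifies the role of the $W_t$ factor: since $g_t = (I - X_t X_t^\top) A_t Y_t$ satisfies $X_t^\top g_t = 0$, the update $X_{t+1} = X_t + \alpha_t g_t W_t - \tfrac{\alpha_t^2}{2} X_t g_t^\top g_t W_t$ preserves the orthonormality of the columns of $X_t$. The NumPy sketch below verifies this under those assumptions; it is an illustration, not code from the supplementary material.

import numpy as np

# Hedged check: the update keeps X^T X = I when X^T g = 0 (g in the tangent space).
rng = np.random.default_rng(0)
n, k, alpha = 50, 5, 0.1

# Random orthonormal X and a tangent-space direction g = (I - X X^T) R.
X, _ = np.linalg.qr(rng.standard_normal((n, k)))
g = (np.eye(n) - X @ X.T) @ rng.standard_normal((n, k))

W = np.linalg.inv(np.eye(k) + (alpha**2 / 4) * g.T @ g)
X_next = X + alpha * g @ W - (alpha**2 / 2) * X @ (g.T @ g) @ W

# Deviation from orthonormality should be at machine precision.
print(np.linalg.norm(X_next.T @ X_next - np.eye(k)))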
Journal
Journal title: IEEE Transactions on Signal Processing
Year: 2017
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/tsp.2017.2656847