Distributed Stochastic Optimization via Matrix Exponential Learning
Authors
Abstract
Similar Resources
Distributed stochastic optimization for deep learning
We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin sche...
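To make the elastic averaging idea concrete, here is a minimal single-process sketch of a synchronous EASGD-style update: each worker takes a noisy local gradient step and is pulled toward a shared center variable, while the center is pulled toward the workers. The toy quadratic loss, the step size eta, and the elastic penalty rho are illustrative assumptions, not settings taken from the paper.

import numpy as np

# Hedged toy sketch of the elastic averaging (EASGD-style) update, not the paper's code.
rng = np.random.default_rng(0)
dim, n_workers, eta, rho, steps = 10, 4, 0.05, 0.5, 200
alpha = eta * rho                      # elastic coupling strength

target = rng.standard_normal(dim)      # minimizer of the toy loss ||x - target||^2 / 2
workers = [rng.standard_normal(dim) for _ in range(n_workers)]
center = np.zeros(dim)                 # shared "center" variable

for _ in range(steps):
    # Center moves toward the workers.
    new_center = center + alpha * sum(x - center for x in workers)
    for i, x in enumerate(workers):
        grad = (x - target) + 0.1 * rng.standard_normal(dim)  # noisy local gradient
        workers[i] = x - eta * grad - alpha * (x - center)    # local step + elastic pull
    center = new_center

print(np.linalg.norm(center - target))  # distance to the optimum; should be close to zero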
Distributed stochastic optimization for deep learning (thesis)
We study the problem of how to distribute the training of large-scale deep learning models in the parallel computing environment. We propose a new distributed stochastic optimization method called Elastic Averaging SGD (EASGD). We analyze the convergence rate of the EASGD method in the synchronous scenario and compare its stability condition with the existing ADMM method in the round-robin sche...
Precision Matrix Computation via Stochastic Optimization: A Scalable Computation of Regularized Precision Matrices via Stochastic Optimization
We consider the problem of computing a positive definite $p \times p$ inverse covariance matrix, a.k.a. precision matrix, $\theta = (\theta_{ij})$, which optimizes a regularized Gaussian maximum likelihood problem with the elastic-net regularizer $\sum_{i,j=1}^{p} \lambda\left(\alpha|\theta_{ij}| + \tfrac{1}{2}(1-\alpha)\theta_{ij}^{2}\right)$, with regularization parameters $\alpha \in [0, 1]$ and $\lambda > 0$. The associated convex semidefinite optimization problem is notoriously difficult to s...
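Written out, and assuming (as in standard graphical-lasso formulations; the truncated abstract does not state the full objective) that $S$ denotes the sample covariance matrix, the regularized Gaussian maximum-likelihood problem is

$$\min_{\theta \succ 0}\; -\log\det\theta + \operatorname{tr}(S\theta) + \sum_{i,j=1}^{p} \lambda\left(\alpha\,|\theta_{ij}| + \tfrac{1}{2}(1-\alpha)\,\theta_{ij}^{2}\right),$$

so the elastic-net term interpolates between an $\ell_1$ (sparsity-inducing) and an $\ell_2$ (ridge) penalty on the entries of $\theta$.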
Distributed Stochastic Optimization via Adaptive Stochastic Gradient Descent
Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial in many applications, but the most popular algorithm, Stochastic Gradient Descent (SGD), is a serial algorithm that is surprisingly hard to parallelize. In this paper, we propose an efficient distributed stochastic op...
Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization: Supplementary Material
Preparation. First, based on the definitions of $A_t$, $Y_t$, $\tilde{Z}_t$ and $Z_t$, we can write $g_t = G(s_t, r_t, X_t) = p_{s_t}^{-1} p_{r_t}^{-1} (I - X_t X_t^\top)(E_{s_t} \odot A)(E_{\cdot r_t} \odot X) = (I - X_t X_t^\top) A_t Y_t$. Then from (6), we have $X_{t+1} = X_t + \alpha_t g_t W_t - \tfrac{\alpha_t^2}{2} X_t g_t^\top g_t W_t$. Since $W_t = (I + \tfrac{\alpha_t^2}{4} g_t^\top g_t)^{-1} = I - \tfrac{\alpha_t^2}{4} g_t^\top g_t + O(\alpha_t^4)$, we get $X_{t+1} = X_t + \alpha_t A_t Y_t - \alpha_t X_t X_t^\top A$...
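A quick numerical check clarifies the role of the $W_t$ factor: since $g_t = (I - X_t X_t^\top) A_t Y_t$ satisfies $X_t^\top g_t = 0$, the update $X_{t+1} = X_t + \alpha_t g_t W_t - \tfrac{\alpha_t^2}{2} X_t g_t^\top g_t W_t$ preserves the orthonormality of the columns of $X_t$. The NumPy sketch below verifies this under those assumptions; it is an illustration, not code from the supplementary material.

import numpy as np

# Hedged check: the update keeps X^T X = I when X^T g = 0 (g in the tangent space).
rng = np.random.default_rng(0)
n, k, alpha = 50, 5, 0.1

# Random orthonormal X and a tangent-space direction g = (I - X X^T) R.
X, _ = np.linalg.qr(rng.standard_normal((n, k)))
g = (np.eye(n) - X @ X.T) @ rng.standard_normal((n, k))

W = np.linalg.inv(np.eye(k) + (alpha**2 / 4) * g.T @ g)
X_next = X + alpha * g @ W - (alpha**2 / 2) * X @ (g.T @ g) @ W

# Deviation from orthonormality should be at machine precision.
print(np.linalg.norm(X_next.T @ X_next - np.eye(k)))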
Journal
Journal title: IEEE Transactions on Signal Processing
Year: 2017
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/tsp.2017.2656847