Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms
Authors: Christopher De Sa, Ce Zhang, Kunle Olukotun, Christopher Ré
Abstract
Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variety of machine learning problems. Researchers and industry have developed several techniques to optimize SGD's runtime performance, including asynchronous execution and reduced precision. Our main result is a martingale-based analysis that enables us to capture the rich noise models that may arise from such techniques. Specifically, we use our new analysis in three ways: (1) we derive convergence rates for the convex case (Hogwild!) with relaxed assumptions on the sparsity of the problem; (2) we analyze asynchronous SGD algorithms for non-convex matrix problems including matrix completion; and (3) we design and analyze an asynchronous SGD algorithm, called Buckwild!, that uses lower-precision arithmetic. We show experimentally that our algorithms run efficiently for a variety of problems on modern hardware.
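To make the low-precision idea concrete, below is a minimal sketch of Buckwild!-style asynchronous SGD on a shared model, assuming a least-squares objective. The fixed-point grid spacing `SCALE`, the `quantize` helper, and the threading setup are illustrative assumptions rather than the paper's implementation; the one load-bearing detail is that the rounding is unbiased, so the quantization noise is zero-mean and fits a martingale-style analysis.

```python
import threading
import numpy as np

SCALE = 2.0 ** -8  # spacing of the low-precision fixed-point grid (assumed)

def quantize(v, scale=SCALE):
    """Round v onto the grid with unbiased stochastic rounding."""
    q = v / scale
    lo = np.floor(q)
    # Round up with probability equal to the fractional part, so that
    # E[quantize(v)] == v and the rounding noise is zero-mean.
    return (lo + (np.random.random_sample(q.shape) < q - lo)) * scale

def worker(w, X, y, steps, lr=0.01):
    """One asynchronous thread: racy, lock-free reads and writes of w."""
    n = X.shape[0]
    for _ in range(steps):
        i = np.random.randint(n)
        grad = (X[i] @ w - y[i]) * X[i]   # least-squares sample gradient
        w -= quantize(lr * grad)          # low-precision, lock-free update

# Usage: several threads update the same parameter vector concurrently.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
y = X @ rng.standard_normal(20)
w = np.zeros(20)
threads = [threading.Thread(target=worker, args=(w, X, y, 5000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("relative residual:", np.linalg.norm(X @ w - y) / np.linalg.norm(y))
```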
Similar papers
A Unified Approach for Design of Lp Polynomial Algorithms
By summarizing Khachiyan's algorithm and Karmarkar's algorithm for linear programming (LP), a unified methodology for the design of polynomial-time algorithms for LP is presented in this paper. A key concept is the so-called extended binary search (EBS) algorithm introduced by the author. It is used as a unified model to analyze the complexities of the existing modern LP algorithms and possibly, help ...
Cyclades: Conflict-free Asynchronous Machine Learning
We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during parallel execution, and offers a black-box analysis for provable speedups across a large fa...
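The conflict-free scheduling alluded to here can be sketched as follows: sample a batch of updates, connect any two updates that touch a common model coordinate, and split the batch into connected components, so updates in different components share no coordinates and threads can process them without locks. The union-find helper and the batch encoding are my own illustrative choices, not CYCLADES' actual data structures.

```python
from collections import defaultdict

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def conflict_free_groups(batch):
    """batch: list of updates, each given as the set of coordinates it touches.
    Returns groups of update indices with no coordinate shared across groups."""
    parent = list(range(len(batch)))
    owner = {}  # coordinate -> first update index seen touching it
    for i, coords in enumerate(batch):
        for c in coords:
            if c in owner:
                # Two updates share coordinate c: merge their components.
                parent[find(parent, owner[c])] = find(parent, i)
            else:
                owner[c] = i
    groups = defaultdict(list)
    for i in range(len(batch)):
        groups[find(parent, i)].append(i)
    return list(groups.values())

# Usage: updates 0 and 2 share coordinate 7, so they land in one group.
batch = [{1, 7}, {2, 3}, {7, 9}, {4}]
print(conflict_free_groups(batch))  # [[0, 2], [1], [3]]
```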
Analyzing Hogwild Parallel Gaussian Gibbs Sampling
Scaling probabilistic inference algorithms to large datasets and parallel computing architectures is a challenge of great importance and considerable current research interest, and great strides have been made in designing parallelizable algorithms. Along with the powerful and sometimes complex new algorithms, a very simple strategy has proven to be surprisingly useful in some situations: runn...
Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
We propose a generalized Gibbs sampler algorithm for obtaining samples approximately distributed from a high-dimensional Gaussian distribution. Similarly to Hogwild methods, our approach does not target the original Gaussian distribution of interest, but an approximation to it. Contrary to Hogwild methods, a single parameter allows us to trade bias for variance. We show empirically that our met...
Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and implementation that SGD can be implem...
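A minimal sketch of the lock-free scheme this snippet describes, assuming a sparse least-squares problem: each thread samples rows and writes only the model coordinates in that row's support, with no synchronization, which is what keeps races rare when the data is sparse. The data setup, step size, and helper structure are assumptions for illustration.

```python
import threading
import numpy as np
import scipy.sparse as sp

def worker(w, X, y, steps, lr=0.1):
    """One lock-free thread: update only the coordinates a sample touches."""
    n = X.shape[0]
    for _ in range(steps):
        i = np.random.randint(n)
        row = X.getrow(i)                 # sparse sample
        idx, vals = row.indices, row.data  # support and values of the sample
        err = vals @ w[idx] - y[i]         # residual computed on the support
        w[idx] -= lr * err * vals          # write only the touched coordinates

rng = np.random.default_rng(1)
X = sp.random(2000, 100, density=0.02, format="csr", random_state=1)
y = X @ rng.standard_normal(100)
w = np.zeros(100)
threads = [threading.Thread(target=worker, args=(w, X, y, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("mean squared loss:", np.linalg.norm(X @ w - y) ** 2 / X.shape[0])
```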
Journal: Advances in Neural Information Processing Systems
Volume: 28
Pages: -
Publication date: 2015