Compressed Gradient Methods With Hessian-Aided Error Compensation
Authors
Abstract
The emergence of big data has caused a dramatic shift in the operating regime for optimization algorithms. The performance bottleneck, which used to be computations, is now often communications. Several gradient compression techniques have been proposed to reduce the communication load at the price of a loss in solution accuracy. Recently, it has been shown how compression errors can be compensated for in the optimization algorithm to improve the solution accuracy. Even though convergence guarantees for error-compensated algorithms have been established, there is very limited theoretical support for quantifying the observed improvements. In this paper, we show that Hessian-aided error compensation, unlike other existing schemes, avoids the accumulation of compression errors on quadratic problems. We also present strong convergence guarantees for Hessian-based error compensation applied to stochastic gradient descent. Our numerical experiments highlight the benefits of Hessian-based error compensation and demonstrate that similar convergence rates are attained when only a diagonal Hessian approximation is used.
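To make the idea concrete, the sketch below runs error-compensated compressed gradient descent on a small quadratic problem, with the compensation memory rescaled by a diagonal Hessian approximation. This is a minimal illustration, not the paper's verbatim algorithm: the top-k compressor, the quadratic test problem, the step size, and the assumed error update e ← (I − γH)(p − Q(p)) are all illustrative choices.

import numpy as np

# Hypothetical quadratic test problem: f(x) = 0.5 * x^T A x - b^T x
rng = np.random.default_rng(0)
d = 20
A = np.diag(np.linspace(1.0, 10.0, d))   # Hessian of the quadratic
b = rng.normal(size=d)
x_star = np.linalg.solve(A, b)

def grad(x):
    return A @ x - b

def top_k(v, k=2):
    """Keep only the k largest-magnitude entries (a common sparsifying compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

gamma = 0.05          # step size
x = np.zeros(d)       # iterate
e = np.zeros(d)       # compensation memory
H_diag = np.diag(A)   # diagonal Hessian approximation (exact diagonal here)

for _ in range(500):
    g = grad(x)
    p = g + e                      # gradient corrected by the stored error
    q = top_k(p, k=2)              # compressed message actually transmitted
    x = x - gamma * q              # model update with the compressed direction
    # Assumed Hessian-aided error update: propagate the compression residual
    # through (I - gamma * H) so that errors are damped rather than accumulated.
    e = (1.0 - gamma * H_diag) * (p - q)

print("distance to optimum:", np.linalg.norm(x - x_star))

Using only the diagonal of the Hessian keeps the compensation step as cheap as ordinary error feedback while still capturing per-coordinate curvature, which is the regime the abstract's final sentence refers to.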
Similar Resources
Hessian Riemannian Gradient Flows in Convex Programming
In view of solving theoretically constrained minimization problems, we investigate the properties of the gradient flows with respect to Hessian Riemannian metrics induced by Legendre functions. The first result characterizes Hessian Riemannian structures on convex sets as metrics that have a specific integration property with respect to variational inequalities, giving a new motivation for the ...
متن کاملthe effects of error correction methods on pronunciation accuracy
The aim of this study was to identify the most effective error-correction method for the accuracy of intonation and word stress in English pronunciation. The study was carried out by applying four methods of providing error correction in four groups, three experimental and one control, all consisting of upper-intermediate students working with the first Passages book. The first group had 15 students, the second 14, the third 15, and the last 16. The course ran for 10 weeks and ...
Asynchronous Stochastic Gradient Descent with Delay Compensation
With the fast development of deep learning, people have started to train very big neural networks using massive data. Asynchronous Stochastic Gradient Descent (ASGD) is widely used to fulfill this task, which, however, is known to suffer from the problem of delayed gradient. That is, when a local worker adds the gradient it calculates to the global model, the global model may have been updated ...
A Comparison of Gradient- and Hessian-Based Optimization Methods for Tetrahedral Mesh Quality Improvement
Discretization methods, such as the finite element method, are commonly used in the solution of partial differential equations (PDEs). The accuracy of the computed solution to the PDE depends on the degree of the approximation scheme, the number of elements in the mesh [1], and the quality of the mesh [2, 3]. More specifically, it is known that as the element dihedral angles become too large, t...
Distributed dual gradient methods and error bound conditions
In this paper we propose distributed dual gradient algorithms for linearly constrained separable convex problems and analyze their rate of convergence under different assumptions. Under the strong convexity assumption on the primal objective function we propose two distributed dual fast gradient schemes for which we prove sublinear rate of convergence for dual suboptimality but also primal subo...
Journal
Journal title: IEEE Transactions on Signal Processing
Year: 2021
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/tsp.2020.3048229