SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization
نویسندگان
چکیده
We propose a new algorithm for minimizing regularized empirical loss: Stochastic Dual Newton Ascent (SDNA). Our method is dual in nature: in each iteration we update a random subset of the dual variables. However, unlike existing methods such as stochastic dual coordinate ascent, SDNA is capable of utilizing all local curvature information contained in the examples, which leads to striking improvements in both theory and practice – sometimes by orders of magnitude. In the special case when an L2-regularizer is used in the primal, the dual problem is a concave quadratic maximization problem plus a separable term. In this regime, SDNA in each step solves a proximal subproblem involving a random principal submatrix of the Hessian of the quadratic function; whence the name of the method.
منابع مشابه
Stochastic Dual Coordinate Ascent with Adaptive Probabilities
This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving the regularized empirical risk minimization problems. Our modification consists in allowing the method adaptively change the probability distribution over the dual variables throughout the iterative process. AdaSDCA achieves provably better complexity bound than SDCA with the best fixed pr...
متن کاملLocal Smoothness in Variance Reduced Optimization
We propose a family of non-uniform sampling strategies to provably speed up a class of stochastic optimization algorithms with linear convergence including Stochastic Variance Reduced Gradient (SVRG) and Stochastic Dual Coordinate Ascent (SDCA). For a large family of penalized empirical risk minimization problems, our methods exploit data dependent local smoothness of the loss functions near th...
متن کاملAn Accelerated Proximal Coordinate Gradient Method
We develop an accelerated randomized proximal coordinate gradient (APCG) method, for solving a broad class of composite convex optimization problems. In particular, our method achieves faster linear convergence rates for minimizing strongly convex functions than existing randomized proximal coordinate gradient methods. We show how to apply the APCG method to solve the dual of the regularized em...
متن کاملTrading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent
We present and study a distributed optimization algorithm by employing a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often have better performances than stochastic gradient descent methods in optimizing regularized loss minimization problems. It still lacks of efforts in studying them in a distributed framework. We ...
متن کاملDoubly Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization with Factorized Data
We proposed a doubly stochastic primal-dual coordinate optimization algorithm for regularized empirical risk minimization that can be formulated as a saddlepoint problem. Different from existing coordinate methods, the proposed method randomly samples both primal and dual coordinates to update solutions, which is a desirable property when applied to data with both a high dimension and a large s...
متن کامل