DUAL-LOCO: Distributing Statistical Estimation Using Random Projections
Abstract
We present Dual-Loco, a communication-efficient algorithm for distributed statistical estimation. Dual-Loco assumes that the data is distributed across workers according to the features rather than the samples. It requires only a single round of communication where low-dimensional random projections are used to approximate the dependencies between features available to different workers. We show that Dual-Loco has bounded approximation error which only depends weakly on the number of workers. We compare Dual-Loco against a state-of-the-art distributed optimization method on a variety of real-world datasets and show that it obtains better speedups while retaining good accuracy. In particular, Dual-Loco allows for fast cross validation as only part of the algorithm depends on the regularization parameter.
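The feature-wise scheme described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes a ridge regression objective solved in the primal (whereas Dual-Loco, as the name suggests, works with a dual formulation), Gaussian projection matrices, and that each worker sums the projections it receives from the others. All function names and parameters (`project`, `feature_distributed_ridge`, `proj_dim`, `lam`) are hypothetical.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ridge(X, y, lam):
    """Ridge regression: solve (X^T X + lam * I) w = X^T y."""
    Xt = [list(col) for col in zip(*X)]
    G = matmul(Xt, X)
    for i in range(len(G)):
        G[i][i] += lam
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(G, Xty)

def project(block, proj_dim, rng):
    """Compress an n x p feature block to n x proj_dim (assumed Gaussian projection)."""
    p = len(block[0])
    scale = proj_dim ** 0.5
    P = [[rng.gauss(0.0, 1.0) / scale for _ in range(proj_dim)] for _ in range(p)]
    return matmul(block, P)

def feature_distributed_ridge(blocks, y, lam, proj_dim, seed=0):
    """blocks[k] holds the feature columns owned by worker k (same rows everywhere)."""
    rng = random.Random(seed)
    # Single communication round: each worker shares only its low-dimensional
    # projection. This step does not depend on lam.
    sketches = [project(b, proj_dim, rng) for b in blocks]
    coefs = []
    for k, block in enumerate(blocks):
        n = len(block)
        # Assumption: projections from the other workers are summed.
        others = [[sum(s[i][j] for m, s in enumerate(sketches) if m != k)
                   for j in range(proj_dim)] for i in range(n)]
        # Local design: own raw features next to the compressed remainder.
        local = [block[i] + others[i] for i in range(n)]
        w = ridge(local, y, lam)
        coefs.append(w[:len(block[0])])  # keep coefficients of own raw features
    return coefs
```

In this sketch the `sketches` list is computed without reference to `lam`, so it could be built once and reused while only the local solves are repeated for each candidate regularization value, which is in the spirit of the cross-validation remark in the abstract.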
Similar resources
LOCO: Distributing Ridge Regression with Random Projections
We propose LOCO, a distributed algorithm which solves large-scale ridge regression. LOCO randomly assigns variables to different processing units which do not communicate. Important dependencies between variables are preserved using random projections which are cheap to compute. We show that LOCO has bounded approximation error compared to the exact ridge regression solution in the fixed design...
Application of Clustering in the Non-Parametric Estimation of Distribution Density
This paper discusses a multimodal density function estimation problem of a random vector. A comparative accuracy analysis of some popular non-parametric estimators is made by using the Monte-Carlo method. The paper demonstrates that the estimation quality increases significantly if the sample is clustered (i.e., the multimodal density function is approximated by a mixture of unimodal ...
Simulation uncertainty of complex economic system behavior
Managing property relations (ownership, disposal, use) over limited resources, which arises naturally in economic systems, faces the problem of uncertain behavior of its active elements. This work offers a model for identifying and forecasting the trajectory of an economic system's states over time, which allows a comprehensive assessment of its behavior from the standpoint of risk (additive distributing)...
Using Stable Random Projections
Many tasks (e.g., clustering) in machine learning only require the l_α distances instead of the original data. For dimension reductions in the l_α norm (0 < α ≤ 2), the method of stable random projections can efficiently compute the l_α distances in massive datasets (e.g., the Web or massive data streams) in one pass of the data. The estimation task for stable random projections has been ...
Improving Random Projections Using Marginal Information
We present an improved version of random projections that takes advantage of marginal norms. Using a maximum likelihood estimator (MLE), margin-constrained random projections can improve estimation accuracy considerably. Theoretical properties of this estimator are analyzed in detail.
Publication date: 2016