Computationally efficient sparse clustering

Abstract

We study statistical and computational limits of clustering when the means of the centres are sparse and their dimension is possibly much larger than the sample size. Our theoretical analysis focuses on the model $X_i = z_i \theta + \varepsilon_i$, $z_i \in \{-1,1\}$, $\varepsilon_i \sim \mathcal{N}(0, I)$, which has two clusters with centres $\theta$ and $-\theta$. We provide a finite sample analysis of a new sparse clustering algorithm based on sparse Principal Component Analysis (PCA) and show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \rightarrow \infty$. Our results require the sparsity to grow slower than the square root of the sample size. Using a recent framework for computational lower bounds (the low-degree likelihood ratio), we give evidence that this condition is necessary for any polynomial-time clustering algorithm to succeed below the Baik-Ben Arous-Péché (BBP) threshold. This complements existing evidence based on reductions and statistical query lower bounds. Compared to these existing results, we cover a wider set of parameter regimes and give a more precise understanding of the runtime required and the misclustering error achievable. Our results imply that a large class of tests based on low-degree polynomials fail to solve even the weak testing task.
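To make the two-cluster model concrete, the following is a minimal sketch that simulates $X_i = z_i \theta + \varepsilon_i$ with a sparse $\theta$ and clusters the samples by selecting coordinates and projecting onto a leading eigenvector. It is not the algorithm analysed in the paper: the coordinate selection step (keeping the coordinates with the largest empirical second moments) is a simple stand-in for sparse PCA, and the function names, the assumption that the sparsity `s` is known, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n, p, s, norm_theta, rng):
    """Draw X_i = z_i * theta + eps_i with an s-sparse theta (illustrative setup)."""
    theta = np.zeros(p)
    theta[:s] = norm_theta / np.sqrt(s)        # equal-magnitude entries, ||theta|| = norm_theta
    z = rng.choice([-1, 1], size=n)            # hidden cluster labels
    X = z[:, None] * theta[None, :] + rng.standard_normal((n, p))
    return X, z, theta

def select_and_project_cluster(X, s):
    """Stand-in for sparse PCA: pick s coordinates, then use the top eigenvector."""
    # E[X_ij^2] = 1 + theta_j^2, so signal coordinates have larger second moments.
    second_moment = (X ** 2).mean(axis=0)
    support = np.argsort(second_moment)[-s:]   # assumes the sparsity s is known
    Xs = X[:, support]
    gram = Xs.T @ Xs / X.shape[0]
    _, eigvecs = np.linalg.eigh(gram)          # eigenvalues returned in ascending order
    v = eigvecs[:, -1]                         # leading eigenvector, roughly theta/||theta||
    labels = np.sign(Xs @ v)
    labels[labels == 0] = 1                    # break exact ties arbitrarily
    return labels.astype(int)

def misclustering_rate(labels, z):
    """Fraction of wrong labels, up to a global sign flip of the clusters."""
    err = np.mean(labels != z)
    return min(err, 1.0 - err)

rng = np.random.default_rng(0)
X, z, theta = simulate(n=200, p=2000, s=10, norm_theta=4.0, rng=rng)
labels = select_and_project_cluster(X, s=10)
rate = misclustering_rate(labels, z)
print(f"misclustering rate: {rate:.3f}")
```

The error is computed up to a global sign flip because the labels $z_i$ are only identifiable up to swapping the two clusters. In this sketch the dimension (2000) is ten times the sample size (200) while the sparsity (10) stays below $\sqrt{200} \approx 14$, which loosely mirrors the regime described in the abstract.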


Similar resources

Computationally Efficient Robust Estimation of Sparse Functionals

Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algor...

Computationally Efficient Robust Sparse Estimation in High Dimensions

Efficient Sparse Representation Classification Using Adaptive Clustering

This paper presents an efficient face recognition algorithm based on sparse representation classification (SRC) using adaptive K-means clustering. In the context of face recognition, SRC is implemented based on the assumption that a face image from a particular subject can be represented as a linear combination of other face images from the same subject. SRC uses a set of e...

Design of computationally efficient density-based clustering algorithms

Article history: Received 1 September 2012; received in revised form 5 May 2014; accepted 24 November 2014; available online 29 November 2014. The basic DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm uses a minimum number of input parameters and is very effective at clustering large spatial databases, but it involves more computational complexity. The present paper proposes a new s...

Computationally Efficient Target Classification

Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and th...

Journal

Journal title: Information and Inference: A Journal of the IMA

Year: 2022

ISSN: 2049-8772, 2049-8764

DOI: https://doi.org/10.1093/imaiai/iaac019