Computationally efficient sparse clustering
Authors
Abstract
We study statistical and computational limits of clustering when the means of the centres are sparse and their dimension is possibly much larger than the sample size. Our theoretical analysis focuses on the model $X_i = z_i \theta + \varepsilon_i$, $z_i \in \{-1,1\}$, $\varepsilon_i \sim \mathcal{N}(0, I)$, which has two clusters with centres $\theta$ and $-\theta$. We provide a finite sample analysis of a new sparse clustering algorithm based on sparse Principal Component Analysis (PCA) and show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \rightarrow \infty$. Our results require the sparsity to grow slower than the square root of the sample size. Using a recent framework for computational lower bounds, the low-degree likelihood ratio, we give evidence that this condition is necessary for any polynomial-time clustering algorithm to succeed below the Baik-Ben Arous-Péché (BBP) threshold. This complements existing evidence based on reductions and statistical query lower bounds. Compared to these existing results, we cover a wider set of parameter regimes and give a more precise understanding of the runtime required and the misclustering error achievable. Our results imply that a large class of tests based on low-degree polynomials fail to solve even the weak testing task.
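The two-cluster model in the abstract can be simulated in a few lines. The sketch below is illustrative only and is not the algorithm analysed in the paper: it swaps in a simple diagonal-thresholding heuristic followed by ordinary PCA on the selected coordinates. The function names, the parameter values (n, p, s, the norm of $\theta$), the equal-magnitude choice of $\theta$ and the assumption that the sparsity level s is known are all hypothetical choices made for the example.

```python
import numpy as np


def sample_model(n, p, s, norm_theta, rng):
    """Draw X_i = z_i * theta + eps_i with an s-sparse theta (assumed shape of theta)."""
    theta = np.zeros(p)
    theta[:s] = norm_theta / np.sqrt(s)        # equal-magnitude support, so ||theta|| = norm_theta
    z = rng.choice([-1, 1], size=n)            # cluster labels z_i in {-1, 1}
    X = z[:, None] * theta[None, :] + rng.standard_normal((n, p))
    return X, z, theta


def cluster_diag_threshold_pca(X, s):
    """Heuristic stand-in for sparse PCA: keep the s coordinates with the
    largest empirical second moment, then label points by the sign of their
    projection onto the leading right singular vector of the reduced data."""
    second_moment = (X ** 2).mean(axis=0)
    support = np.argsort(second_moment)[-s:]   # assumes s is known
    X_sel = X[:, support]
    _, _, vt = np.linalg.svd(X_sel, full_matrices=False)
    scores = X_sel @ vt[0]
    return np.where(scores >= 0, 1, -1)


def misclustering_rate(z_hat, z):
    """Fraction of mislabelled points, minimised over the global sign flip."""
    err = np.mean(z_hat != z)
    return min(err, 1.0 - err)


rng = np.random.default_rng(0)
X, z, theta = sample_model(n=200, p=1000, s=10, norm_theta=4.0, rng=rng)
z_hat = cluster_diag_threshold_pca(X, s=10)
rate = misclustering_rate(z_hat, z)
print(f"misclustering rate: {rate:.3f}")
```

Because the labels are only identifiable up to a global sign, the error is computed as the minimum over both labelings. The dimension p is chosen larger than the sample size n to mirror the high-dimensional setting described above.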
Similar resources
Computationally Efficient Robust Estimation of Sparse Functionals
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algor...
Computationally Efficient Robust Sparse Estimation in High Dimensions
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algor...
Efficient Sparse Representation Classification Using Adaptive Clustering
This paper is presenting a method for an efficient face recognition algorithm based on sparse representation classification (SRC) using an adaptive K-means clustering. In the context of face recognition, SRC is implemented based on the assumption that a face image from a particular subject can be represented as a linear combination of other face images from the same subject. SRC uses a set of e...
Design of computationally efficient density-based clustering algorithms
The basic DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm uses minimum number of input parameters, very effective to cluster large spatial databases but involves more computational complexity. The present paper proposes a new s...
Computationally Efficient Target Classification
Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and th...
Journal
Journal title: Information and Inference: A Journal of the IMA
Year: 2022
ISSN: 2049-8772, 2049-8764
DOI: https://doi.org/10.1093/imaiai/iaac019