mbkmeans: Fast clustering for single cell data using mini-batch k-means
نویسندگان
چکیده
منابع مشابه
Nested Mini-Batch K-Means
A new algorithm is proposed which accelerates the mini-batch k-means algorithm of Sculley (2010) by using the distance bounding approach of Elkan (2003). We argue that, when incorporating distance bounds into a mini-batch algorithm, already used data should preferentially be reused. To this end we propose using nested mini-batches, whereby data in a mini-batch at iteration t is automatically re...
متن کاملTurbocharging Mini-Batch K-Means
A new algorithm is proposed which accelerates the mini-batch k-means algorithm of Sculley (2010) by using the distance bounding approach of Elkan (2003). We argue that, when incorporating distance bounds into a mini-batch algorithm, already used data should preferentially be reused. To this end we propose using nested mini-batches, whereby data in a mini-batch at iteration t is automatically re...
متن کاملK-means vs Mini Batch K-means: A comparison
Mini Batch K-means ([11]) has been proposed as an alternative to the K-means algorithm for clustering massive datasets. The advantage of this algorithm is to reduce the computational cost by not using all the dataset each iteration but a subsample of a fixed size. This strategy reduces the number of distance computations per iteration at the cost of lower cluster quality. The purpose of this pa...
متن کاملFast Support Vector Data Description Using K-Means Clustering
Support Vector Data Description (SVDD) has a limitation for dealing with a large data set in which computational load drastically increases as training data size becomes large. To handle this problem, we propose a new fast SVDDmethod using K-means clustering method. Our method uses divide-and-conquer strategy; trains each decomposed subproblems to get support vectors and retrains with the suppo...
متن کاملPersistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
Identifying clusters or clustering is an important aspect of data analysis. It is the task of grouping a set of objects in such a way those objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis This paper proposed an improved version of K-Means algorithm, namely Persistent K...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: PLOS Computational Biology
سال: 2021
ISSN: 1553-7358
DOI: 10.1371/journal.pcbi.1008625