Search results for: k means method
Number of results: 2217835
Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retr...
Much work has sought to discern the different types of cloud regimes, typically via Euclidean k-means clustering of histograms. However, these methods ignore the underlying similarity structure of cloud types. Wasserstein k-means clustering is a promising candidate for utilizing this structure during clustering, but existing algorithms do not scale well and lack the quality guarantees of the Eu...
We prove in this paper that the expected value of the objective function of the k-means++ algorithm for samples converges to the population expected value. Since k-means++ provides a constant-factor approximation of the k-means objective for samples, such an approximation can be achieved for the population as the sample size increases. This result is of potential practical relevance when one is...
This is the Supplementary Information to the paper "k-variates++: more pluses in the k-means++", appearing in the proceedings of ICML 2016. The notation "main file" indicates a reference to the paper.
We present a new clustering algorithm called k-means-u* which in many cases is able to significantly improve the clusterings found by k-means++, the current de facto standard for clustering in Euclidean spaces. First we introduce the k-means-u algorithm, which starts from a result of k-means++ and attempts to improve it with a sequence of non-local "jumps" alternating with runs of standard k-means....
The k-means++ seeding algorithm is one of the most popular algorithms for finding the initial k centers when using the k-means heuristic. The algorithm is a simple sampling procedure and can be described as follows: Pick the first center randomly from the given points. For i > 1, pick a point to be the i-th center with probability proportional to the square of the Euclidean distance o...
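The sampling procedure in the snippet above can be sketched in a few lines of Python. The snippet is truncated, so this sketch assumes the standard formulation of k-means++ seeding, in which each new center is drawn with probability proportional to the squared Euclidean distance to the nearest center already chosen. The names `squared_distance` and `kmeanspp_seed` are illustrative and do not come from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (assumed standard form)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniformly at random from the given points.
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center so far.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform choice.
            nxt = rng.choice(points)
        else:
            # Draw the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1)]
    print(kmeanspp_seed(data, 2, random.Random(0)))
```

Points that are already centers have weight zero, so on data with distinct points the sketch does not pick the same point twice. The returned centers would normally be passed to a run of the standard k-means heuristic.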
Even though virtualization provides many advantages in cloud computing, it does not provide effective performance isolation between virtual machines. In other words, performance may be affected by interference from co-located virtual machines. Such isolation can be achieved by properly managing resource allocation among the virtual machines running simultaneously. This pa...
Clustering is a data mining technique that takes a number of items and places them into clusters based on their features. Items within each cluster are maximally similar to one another with respect to a particular, predetermined feature, while items in different clusters differ the most from one another in that feature. There are various kinds of clustering, of which k-means is one of the best and simplest. Because this clustering method is the basis of some...
In this work, we study the k-means cost function. The (Euclidean) k-means problem can be described as follows: given a dataset $X \subseteq \mathbb{R}^d$ and a positive integer $k$, find a set of $k$ centers $C \subseteq \mathbb{R}^d$ such that $\Phi(C, X) \stackrel{\text{def}}{=} \sum_{x \in X} \min_{c \in C} \|x - c\|^2$ is minimized. Let $\Delta_k(X) \stackrel{\text{def}}{=} \min_{C \subseteq \mathbb{R}^d} \Phi(C, X)$ denote the cost of the optimal k-means solution. It is simple to observe that for any dataset $X$, $\Delta_k(X)$ decreases as ...
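As a small illustration of the cost function defined in the snippet above, the following Python sketch evaluates the sum, over all points, of the squared Euclidean distance to the nearest center. The function names and the toy dataset are assumptions made for this example only. The sketch evaluates the cost of a given set of centers and does not search for the optimal one.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(centers, points):
    """k-means cost: each point contributes its squared distance to the nearest center."""
    return sum(min(squared_distance(x, c) for c in centers) for x in points)


if __name__ == "__main__":
    # Hypothetical toy dataset in the plane.
    X = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    print(kmeans_cost([(1.0, 0.0)], X))               # one center
    print(kmeans_cost([(1.0, 0.0), (10.0, 0.0)], X))  # two centers
```

On this toy dataset the single center (1, 0) gives a cost of 1 + 1 + 81 = 83, and adding a second center at (10, 0) reduces it to 1 + 1 + 0 = 2.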
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on clustering English documents has been carried out. This raises the question: can those results be extended to other languages? Since the goal of document clustering is grouping of docum...