Search results for: K-means clustering
Number of results: 786274
k-means++ is a seeding technique for the k-means method with an expected approximation ratio of O(log k), where k denotes the number of clusters. Examples are known on which the expected approximation ratio of k-means++ is Ω(log k), showing that the upper bound is asymptotically tight. However, it remained open whether k-means++ yields an O(1)-approximation with probability 1/poly(k) or even wi...
Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar, in some sense or another, to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis. This paper proposes an improved version of the k-means algorithm, namely persistent k...
Due to the progressive growth of the amount of data available in a wide variety of scientific fields, it has become more difficult to manipulate and analyze such information. In spite of its dependency on the initial settings and the large number of distance computations that it can require to converge, the K-means algorithm remains one of the most popular clustering methods for massive data...
The k-means++ algorithm is the state-of-the-art algorithm for solving k-means clustering problems, as the computed clusterings are O(log k)-competitive in expectation. However, its seeding step requires k inherently sequential passes through the full data set, making it hard to scale to massive data sets. The standard remedy is to use the k-means‖ algorithm, which reduces the number of sequential rou...
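To make the "k sequential passes" point concrete, here is a minimal sketch of k-means++ seeding (D² sampling) in plain Python. It is an illustration under stated assumptions, not the implementation from any of the listed papers: the function names `squared_distance` and `kmeans_pp_seed` and the `rng` parameter are hypothetical, and points are assumed to be tuples of numbers.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seed(points, k, rng=None):
    """Pick k initial centers by D^2 sampling (hypothetical helper)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # The first center is drawn uniformly at random.
    centers = [points[rng.randrange(len(points))]]

    # d2[i] holds the squared distance from point i to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Sample a point with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])

        # Each new center needs a full pass over the data to refresh d2,
        # and the next draw depends on that pass: hence the sequential cost.
        d2 = [min(d, squared_distance(p, points[idx]))
              for d, p in zip(d2, points)]

    return centers
```

Because each draw depends on the distances updated by the previous one, the loop cannot be parallelized across centers, which is the scaling bottleneck the abstract describes. For example, `kmeans_pp_seed([(0, 0), (0, 1), (10, 10)], 2)` returns two distinct points from the input.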
Large volumes of data are currently being actively accumulated in various information environments, such as social, corporate, scientific, and others. The intensive use of big data in various fields is stimulating increased interest among researchers in developing methods and tools for processing and analyzing massive data of enormous volume and considerable variety. One of the...
Software projects are usually analyzed by experts based on their previous experience, their intuition and data they gather about the project. In this work, we show an approach for a purely data-driven retrospective project analysis. We plan to build on this work to make predictions about the evolution of software projects.
Abstract. Software technologies are advancing rapidly, and in parallel the number of software projects carried out in both the public and private sectors is increasing. One of the most significant outputs of software automation projects is undoubtedly the data they produce. Processing this high-dimensional, hard-to-understand data and transforming it into more meaningful and actionable data is an important...
In this paper, we describe our work on the subtopic mining subtask of NTCIR-9 for simplified Chinese. To find possible subtopics of a specific query, we select related queries recorded in the query log, titles of search results provided by Google and Baidu, or the catalog of the corresponding entry in Baidu encyclopedia, all of which are lexically similar to the original query; then we apply k-means algorith...
This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected...