Model Selection Strategies for Determining the Optimal Number of Overlapping Clusters in Additive Overlapping Partitional Clustering
Abstract
In various scientific fields, researchers make use of partitioning methods (e.g., K-means) to disclose the structural mechanisms underlying object by variable data. In some instances, however, a grouping of objects into clusters that are allowed to overlap (i.e., assigning objects to multiple clusters) might lead to a better representation of the underlying clustering structure. To obtain an overlapping object clustering from object by variable data, Mirkin’s ADditive PROfile CLUStering (ADPROCLUS) model may be used. A major challenge when performing ADPROCLUS is to determine the optimal number of overlapping clusters underlying the data, which pertains to a model selection problem. Up to now, this problem has not been systematically investigated and almost no guidelines can be found in the literature regarding appropriate model selection strategies for ADPROCLUS. Therefore, in this paper, several existing model selection strategies for K-means (a.o., CHull, the Caliński-Harabasz, Krzanowski-Lai, Average Silhouette Width and Dunn Index, and information-theoretic measures like AIC and BIC) and two cross-validation based strategies are tailored towards an ADPROCLUS context and are compared to each other in an extensive simulation study. The results demonstrate that CHull outperforms all other model selection strategies, especially when the negative log-likelihood, which is associated with a minimal stochastic extension of ADPROCLUS, is used as (mis)fit measure. The analysis of a post hoc AIC-based model selection strategy revealed that better performance may be obtained when a different (more appropriate) definition of model complexity for ADPROCLUS is used.
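To make the additive overlapping structure concrete, the sketch below reconstructs a small data matrix from a binary membership matrix and a matrix of cluster profiles, then computes a least-squares (mis)fit value. It is an illustration under stated assumptions, not the estimation algorithm or the model selection procedure from the paper: the names `A`, `P`, `reconstruct`, and `sse` are hypothetical, and the toy numbers are made up for the example.

```python
import numpy as np


def reconstruct(A, P):
    """Model-implied data: each object's profile is the sum of the
    profiles of all clusters it belongs to (rows of A may contain
    several ones, which is what allows overlap)."""
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    return A @ P


def sse(X, A, P):
    """Sum of squared residuals between the data and the reconstruction
    (a plain least-squares misfit; hypothetical helper name)."""
    X = np.asarray(X, dtype=float)
    return float(np.sum((X - reconstruct(A, P)) ** 2))


# Toy example (made-up values): 3 objects, 2 clusters, 2 variables.
# Object 3 belongs to both clusters.
A = np.array([[1, 0],
              [0, 1],
              [1, 1]])
P = np.array([[1.0, 2.0],
              [10.0, 20.0]])

X_hat = reconstruct(A, P)

# Add a small perturbation to mimic noisy observed data.
X = X_hat + np.array([[0.5, 0.0],
                      [0.0, 0.0],
                      [0.0, -0.5]])
loss = sse(X, A, P)
```

In this toy case the third object's reconstructed profile is the sum of both cluster profiles. A model selection strategy would compare such misfit values (or a likelihood-based counterpart) across solutions with different numbers of clusters, weighed against some measure of model complexity.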
Similar resources
Optimal Fuzzy Clustering in Overlapping Clusters
The fuzzy c-means clustering algorithm has been widely used to obtain fuzzy k-partitions. This algorithm requires the user to specify the number of clusters k. To automatically find the “right” number of clusters, k, for a given data set, many validity indexes have been proposed in the literature. Most of these indexes do not work well for clusters with different overlapping degr...
Lowdimensional Additive Overlapping Clustering
To reveal the structure underlying two-way two-mode object by variable data, Mirkin (1987) has proposed an additive overlapping clustering model. This model implies an overlapping clustering of the objects and a reconstruction of the data, with the reconstructed variable profile of an object being a summation of the variable profiles of the clusters it belongs to. Grasping the additive (overlap...
Coefficient-explicit Condition Number Bounds for Overlapping Additive Schwarz
In this paper we discuss new domain decomposition preconditioners for piecewise linear finite element discretisations of boundary-value problems for the model elliptic problem −∇ · (A∇u) = f (1) in a bounded polygonal or polyhedral domain Ω ⊂ R^d, d = 2 or 3, with suitable boundary data on the boundary ∂Ω. The tensor A(x) is assumed isotropic and symmetric positive definite, but may vary with ma...
Model-based Overlapping Co-Clustering
Co-clustering, or simultaneous clustering of the rows and columns of two-dimensional data matrices, is a data mining technique with various applications such as text clustering and microarray analysis. Most proposed co-clustering algorithms work on the data matrices with special assumptions, and they also assume the existence of a number of mutually exclusive row and column clusters, but it is believ...
The Test for Adverse Selection in Life Insurance Market: The Case of Mellat Insurance Company
Adverse selection is one of the fundamental problems in the insurance industry. It was first discussed and studied in 1960 by Rothschild and Stiglitz. Since then, many researchers have developed various models to analyze the demand for life insurance, a demand that stems entirely from the uncertainty in this industry. The aim is to find the conditions under which accepting or rejecting a policyholder is to the benefit or detriment of the insurance company ...
Journal
Journal title: Journal of Classification
Year: 2022
ISSN: 0176-4268, 1432-1343
DOI: https://doi.org/10.1007/s00357-021-09409-1