Cluster analysis using different correlation coefficients
Abstract
Partitioning objects into closely related groups that have different states helps reveal the underlying structure of a data set. Various similarity measures are commonly combined with clustering algorithms to find an optimal clustering, or one closely akin to the original clustering. Using shrinkage-based and rank-based correlation coefficients, which are known to be robust, the recovery level of six chosen clustering algorithms is evaluated using Rand’s C values. Recovery levels based on the weighted likelihood estimate of the correlation coefficient are also obtained and compared with the results from those correlation coefficients when applying agglomerative clustering algorithms.
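The kind of recovery evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that Rand's C value is the plain Rand index, uses `1 - r` as the dissimilarity, uses average linkage as a single stand-in for the six algorithms, and compares only the Pearson coefficient with a rank-based (Spearman) one. The shrinkage-based and weighted likelihood estimators are not implemented, and the toy data and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(x):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Rank-based correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def average_linkage(objects, corr, k):
    """Agglomerative clustering into k groups using dissimilarity 1 - corr."""
    n = len(objects)
    d = [[1 - corr(objects[i], objects[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(d[i][j] for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def rand_index(u, v):
    """Fraction of object pairs on which two partitions agree."""
    n = len(u)
    agree = sum(
        (u[i] == u[j]) == (v[i] == v[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return agree / (n * (n - 1) / 2)


# Hypothetical toy data: four rising and four falling noisy profiles.
rng = random.Random(0)
rising = [[t + rng.gauss(0, 0.1) for t in range(10)] for _ in range(4)]
falling = [[-t + rng.gauss(0, 0.1) for t in range(10)] for _ in range(4)]
profiles = rising + falling
truth = [0] * 4 + [1] * 4

for name, corr in [("pearson", pearson), ("spearman", spearman)]:
    labels = average_linkage(profiles, corr, k=2)
    print(name, rand_index(truth, labels))
```

Swapping in another correlation estimator only requires passing a different `corr` function, which is how the coefficients named in the abstract could be compared on the same data.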
Similar resources
s-CorrPlot: Encoding and Exploring Correlation
Figure 1: Visualizations of correlation for a dataset containing 22,000 variables. The left two images show the correlation coefficients using a heatmap, clustered with (a) average linkage, and (b) complete linkage. The visible patterns in the heatmap are highly dependent on the clustering algorithm. In (c), our novel s-CorrPlot spatially encodes correlation coefficients, highlighting very diff...
Using matrix of thresholding partial correlation coefficients to infer regulatory network
DNA arrays measure the expression levels for thousands of genes simultaneously under different conditions. These measurements reflect many aspects of the underlying biological processes. A method based on the matrix of thresholding partial correlation coefficients (MTPCC) is proposed for network inference from expression profiles. It includes three main parts: (1) hierarchical cluster analysis,...
Comparison of similarity coefficients used for cluster analysis with dominant markers in maize (Zea mays L)
The objective of this study was to evaluate whether different similarity coefficients used with dominant markers can influence the results of cluster analysis, using eighteen inbred lines of maize from two different populations, BR-105 and BR-106. These were analyzed by AFLP and RAPD markers and eight similarity coefficients were calculated: Jaccard, Sorensen-Dice, Anderberg, Ochiai, Simple-mat...
Study of morphological diversity of black cumin (Nigella sativa L.) accessions using multivariate statistical methods
Nigella sativa L. (black cumin), belonging to the Ranunculaceae family, is one of the most important medicinal plants, and both wild and cultivated forms of this plant are used in Iran. The genetic diversity of 27 accessions of N. sativa L. from different places in Iran was characterized by morphological characteristics, and the data were analyzed using univariate and multivariate analyses. ANOVA revealed high s...
Visualizing Correlation
The well-known fact that Pearson’s product-moment correlation coefficient between two variables is the cosine of the angle between the centered variable profiles suggests a way to visualize correlation. This angular representation of product-moment correlation is automatically displayed in an h-plot. Using ideas from multidimensional scaling, an alternative angular representation of correlation...