Confidence estimation for t-SNE embeddings using random forest
Abstract
Dimensionality reduction algorithms are commonly used for reducing the dimension of multi-dimensional data to visualize them on a standard display. Although many dimensionality reduction algorithms, such as t-distributed Stochastic Neighborhood Embedding, aim to preserve close neighborhoods in low-dimensional space, they might not accomplish that for every sample and eventually produce erroneous representations. In this study, we developed a supervised confidence estimation algorithm for detecting erroneous samples in embeddings. Our algorithm generates a confidence score for each sample in an embedding based on a distance-oriented random forest regressor. We evaluate its performance on both intra- and inter-domain data and compare it with the neighborhood preservation ratio as our baseline. Our results showed that the resulting confidence score provides distinctive information about the correctness of any sample in an embedding compared to the baseline. The source code is available at https://github.com/gsaygili/dimred .
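The baseline named in the abstract, the neighborhood preservation ratio, can be illustrated with a minimal sketch. It assumes the common definition: for each sample, the fraction of its k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding. The paper's exact definition may differ, and the function names below are hypothetical; they are not taken from the authors' repository.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest
    neighbors under Euclidean distance (the point itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # ties are broken by index, so results are deterministic
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_preservation_ratio(high_dim, low_dim, k):
    """Per-sample fraction of high-dimensional k-NN that are also
    k-NN in the low-dimensional embedding (assumed definition)."""
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    return [len(h & l) / k for h, l in zip(high_nn, low_nn)]


# Toy data: two well-separated clusters in 3-D.
high = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0)]
# A faithful 2-D embedding, and one where sample 2 lands in the wrong cluster.
low_good = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
low_bad = [(0, 0), (1, 0), (11.5, 0), (10, 0), (11, 0), (12, 0)]

scores_good = neighborhood_preservation_ratio(high, low_good, k=2)
scores_bad = neighborhood_preservation_ratio(high, low_bad, k=2)
```

A low ratio flags a sample whose neighborhood changed in the embedding. The abstract's proposed method instead uses a supervised, distance-oriented random forest regressor to produce a confidence score, which this sketch does not attempt to reproduce.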
Similar resources
Visualizing Data using t-SNE
We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. ...
Fast Optimization for t-SNE
The paper presents an alternative optimization technique for t-SNE that is orders of magnitude faster than the original optimization technique, and that produces results that are at least as good.
Accelerating t-SNE using tree-based algorithms
The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that ...
Supplemental Material for Visualizing Data using t-SNE
In this supplementary material, we present the results of our experiments that compare the visualizations produced by t-SNE with those produced by seven other dimensionality reduction techniques on five datasets from a variety of domains. Some of these results were already presented in the paper, however, we present the results here in a different form. The five datasets we employed in our expe...
Fuzzy Confidence Intervals for Mean of Fuzzy Random Variables
This article has no abstract.
Journal
Journal title: International Journal of Machine Learning and Cybernetics
Year: 2022
ISSN: 1868-8071, 1868-808X
DOI: https://doi.org/10.1007/s13042-022-01635-2