TopoMap: A 0-dimensional Homology Preserving Projection of High-Dimensional Data
Abstract
Multidimensional projection is a fundamental tool for high-dimensional data analytics and visualization. With very few exceptions, projection techniques are designed to map data from a high-dimensional space to a visual space so as to preserve some dissimilarity (or similarity) measure, such as the Euclidean distance. In fact, although they adopt distinct mathematical formulations designed to favor different aspects of the data, most multidimensional projection methods strive to preserve dissimilarity measures that encapsulate geometric properties such as distances or the proximity relation between data objects. However, geometric relations are not the only interesting property to be preserved in a projection. For instance, the analysis of particular structures such as clusters and outliers could be performed more reliably if the mapping process gave some guarantee as to topological invariants such as connected components and loops. This paper introduces TopoMap, a novel projection technique which provides topological guarantees during the mapping process. In particular, the proposed method performs the mapping from a high-dimensional space to a visual space while preserving the 0-dimensional persistence diagram of the Rips filtration of the high-dimensional data, ensuring that the filtrations generate the same connected components when applied to the original as well as the projected data. The presented case studies show that the topological guarantee provided by TopoMap not only brings confidence to the visual analytic process but can also be used to assist in the assessment of other projection methods.
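The guarantee described in the abstract can be checked numerically on small inputs. The sketch below is not the TopoMap algorithm itself; it relies on the standard fact that, for a Rips filtration, the finite death values of the 0-dimensional persistence diagram are the edge lengths of a Euclidean minimum spanning tree (all births are 0). The function names `zero_dim_deaths` and `same_zero_dim_diagram` and the toy point sets are illustrative assumptions, and the code uses the convention that an edge enters the filtration at a scale equal to its length (some libraries use half that value).

```python
import math
from itertools import combinations


def zero_dim_deaths(points):
    """Finite death values of the 0-dimensional Rips persistence diagram.

    Runs Kruskal's algorithm with a union-find structure: every edge that
    merges two connected components contributes one death value.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths  # sorted ascending, n - 1 values


def same_zero_dim_diagram(high, low, tol=1e-9):
    """True if both point sets have the same finite 0-dimensional deaths."""
    a, b = zero_dim_deaths(high), zero_dim_deaths(low)
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


# Hypothetical toy data: four points in 3D and two candidate 2D layouts.
high = [(0, 0, 0), (1, 0, 0), (1, 2, 0), (1, 2, 3)]
low = [(0, 0), (1, 0), (3, 0), (6, 0)]      # keeps merge scales 1, 2, 3
low_bad = [(0, 0), (1, 0), (2, 0), (3, 0)]  # collapses them to 1, 1, 1

print(zero_dim_deaths(high))
print(same_zero_dim_diagram(high, low))
print(same_zero_dim_diagram(high, low_bad))
```

The brute-force edge list grows quadratically with the number of points, so this check is meant for small examples such as validating the output of a projection, not for large data sets.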
Related articles
Methods for regression analysis in high-dimensional data
With advances in science, knowledge, and technology, new and precise methods for measuring, collecting, and recording information have been developed, which has led to the emergence and growth of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
Ensemble Clustering of High Dimensional Data with FastMap Projection
In this paper, we propose an ensemble clustering method for high dimensional data which uses FastMap projection to generate subspace component data sets. In comparison with popular random sampling and random projection, FastMap projection preserves the clustering structure of the original data in the component data sets so that the performance of ensemble clustering is improved significantly. W...
Similarity preserving compressions of high dimensional sparse data
The rise of the internet has resulted in an explosion of data consisting of millions of articles, images, songs, and videos. Most of this data is high dimensional and sparse. The need to perform an efficient search for similar objects in such high dimensional big datasets is becoming increasingly common. Even with the rapid growth in computing power, the brute-force search for such a task is impract...
Preserving Privacy of Continuous High-dimensional Data with Minimax Filters
Preserving privacy of high-dimensional and continuous data such as images or biometric data is a challenging problem. This paper formulates this problem as a learning game between three parties: 1) data contributors using a filter to sanitize data samples, 2) a cooperative data aggregator learning a target task using the filtered samples, and 3) an adversary learning to identify contributors us...
Privacy-Preserving Distributed Linear Regression on High-Dimensional Data
We propose privacy-preserving protocols for computing linear regression models, in the setting where the training dataset is vertically distributed among several parties. Our main contribution is a hybrid multi-party computation protocol that combines Yao’s garbled circuits with tailored protocols for computing inner products. Like many machine learning tasks, building a linear regression model...
Journal
Journal title: IEEE Transactions on Visualization and Computer Graphics
Year: 2021
ISSN: 1077-2626, 2160-9306, 1941-0506
DOI: https://doi.org/10.1109/tvcg.2020.3030441