Deep Distance Metric Learning with Data Summarization
نویسندگان
چکیده
We present Deep Stochastic Neighbor Compression (DSNC), a framework to compress training data for instance-based methods (such as k-nearest neighbors). We accomplish this by inferring a smaller set of pseudo-inputs in a new feature space learned by a deep neural network. Our framework can equivalently be seen as jointly learning a nonlinear distance metric (induced by the deep feature space) and learning a compressed version of the training data. In particular, compressing the data in a deep feature space makes DSNC robust against label noise and issues such as within-class multi-modal distributions. This leads to DSNC yielding better accuracies and faster predictions at test time, as compared to other competing methods. We conduct comprehensive empirical evaluations, on both quantitative and qualitative tasks, and on several benchmark datasets, to show its effectiveness as compared to several baselines.
منابع مشابه
Deep Metric Learning with Data Summarization
We present Deep Stochastic Neighbor Compression (DSNC), a framework to compress training data for instance-based methods (such as k-nearest neighbors). We accomplish this by inferring a smaller set of pseudo-inputs in a new feature space learned by a deep neural network. Our framework can equivalently be seen as jointly learning a nonlinear distance metric (induced by the deep feature space) an...
متن کاملیادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیکهای یادگیری معیار فاصله
Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
متن کاملComposite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کاملAn Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
متن کاملDECADE: A Deep Metric Learning Model for Multivariate Time Series
Determining similarities (or distance) between multivariate time series sequences is a fundamental problem in time series analysis. The complex temporal dependencies and variable lengths of time series make it an extremely challenging task. Most existing work either rely on heuristics which lacks flexibility and theoretical justifications, or build complex algorithms that are not scalable to bi...
متن کامل