Manifold Matching for High-Dimensional Pattern Recognition
Authors
Abstract
In pattern recognition, the k-nearest neighbor rule (kNN), a classical classifier, has been applied to many real-life problems because it performs well and is algorithmically simple. In kNN, a test sample is classified by a majority vote of its k closest training samples. This approach has the following advantages: (1) The error rate of kNN has been proved to approach the Bayes error as both the number of training samples and the value of k tend to infinity (Duda et al., 2001). (2) kNN performs well even if different classes overlap each other. (3) kNN is easy to implement because its algorithm is simple. However, kNN does not perform well when the dimensionality of the feature vectors is large. As an example, Fig. 1 shows a test sample (belonging to class 5) of the MNIST dataset (LeCun et al., 1998) and its five closest training samples, selected using Euclidean distance. Because three of the five selected training samples belong to class 8, the test sample is misclassified into class 8. kNN often produces such misclassifications in high-dimensional pattern classification tasks such as character and face recognition. Moreover, kNN requires a large number of training samples for high accuracy because it is a memory-based classifier. Consequently, the classification cost and memory requirement of kNN tend to be high.
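The majority-vote rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `knn_classify` and the tiny two-dimensional dataset are hypothetical, and the tie-breaking behavior (the label seen first among the nearest neighbors wins) is an assumption the abstract does not specify.

```python
import math
from collections import Counter


def knn_classify(train_X, train_y, x, k=5):
    """Classify x by a majority vote among its k nearest training samples.

    Hypothetical helper for illustration; distances are Euclidean,
    as in the MNIST example of the abstract.
    """
    # Pair each training label with the Euclidean distance to the query.
    dists = [(math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)]
    # Sort by distance only, so equal distances keep training-set order.
    dists.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest samples and return the most common.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): two samples of class 5 and three of class 8
# near the query, plus one distant sample of class 5.
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 1.5), (10, 10)]
train_y = [5, 5, 8, 8, 8, 5]

# With k=5 the vote is three to two in favor of class 8.
predicted = knn_classify(train_X, train_y, (0.5, 0.5), k=5)
```

The toy query mirrors the situation in the abstract: three of the five nearest neighbors carry label 8, so the vote returns 8 regardless of the query's true class. Every training sample is stored and scanned for each query, which is the memory and classification cost the abstract refers to.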
Similar resources
Efficiency investigation of manifold matching for text document classification
Manifold matching works to identify embeddings of multiple disparate data spaces into the same low-dimensional space, where joint inference can be pursued. It is an enabling methodology for fusion and inference from multiple and massive disparate data sources. In this paper three methods of manifold matching are considered: PoM, which stands for Multidimensional Scaling (MDS) composed with Proc...
Improving the Nonlinear Manifold Separator Model for Face Recognition with One Image per Person
Manifold learning is a dimension reduction method for extracting nonlinear structures of high-dimensional data. Many methods have been introduced for this purpose. Most of these methods usually extract a global manifold for data. However, in many real-world problems, there is not only one global manifold, but also additional information about the objects is shared by a large number of manifolds...
Manifold Matching: Joint Optimization of Fidelity and Commensurability
Fusion and inference from multiple and massive disparate data sources – the requirement for our most challenging data analysis problems and the goal of our most ambitious statistical pattern recognition methodologies – has many and varied aspects which are currently the target of intense research and development. One aspect of the overall challenge is manifold matching – identifying embeddings ...
Neighborhood matrix: A new idea in matching of two dimensional gel images
Automated data analysis and pattern recognition techniques are the requirements of biological and proteomics research studies. The analysis of proteins consists of some stages among which the analysis of two dimensional electrophoresis (2-DE) images is crucial. The aim of image capturing is to generate a Photostat that can be used in future works such as image comparison. The researchers introduce...
Local Derivative Pattern with Smart Thresholding: Local Composition Derivative Pattern for Palmprint Matching
Palmprint recognition is a new biometrics system based on physiological characteristics of the palmprint, which includes rich, stable, and unique features such as lines, points, and texture. Texture is one of the most important features extracted from low resolution images. In this paper, a new local descriptor, Local Composition Derivative Pattern (LCDP) is proposed to extract smartly stronger...
Sparse Modeling for High-Dimensional Multi-Manifold Data Analysis
High-dimensional data are ubiquitous in many areas of science and engineering, such as machine learning, signal and image processing, computer vision, pattern recognition, bioinformatics, etc. Often, high-dimensional data are not distributed uniformly in the ambient space; instead they lie in or close to a union of low-dimensional manifolds. Recovering such low-dimensional structures in the dat...