Similar Handwritten Chinese Character Discrimination by Weakly Supervised Learning
Authors
Abstract
Traditional approaches to handwritten Chinese character recognition struggle to classify similar characters. In this paper, we propose to discriminate similar handwritten Chinese characters by using weakly supervised learning. Our approach learns a discriminative SVM for each similar pair, which simultaneously localizes the discriminative region of the similar characters and performs the classification. For the first time, similar handwritten Chinese character recognition (SHCCR) is formulated as an optimization problem extended from SVM. We also propose a novel feature descriptor, Gradient Context, and apply a bag-of-words model to represent regions with different scales. Our method does not need to select a fixed-size subwindow to differentiate similar characters. This “unconstrained” property makes our method well adapted to the high variance in the size and position of discriminative regions in similar handwritten Chinese characters. We evaluate our proposed approach on the CASIA Chinese character data set, and the results show that our method outperforms the state of the art.
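The idea of scoring regions of several sizes with a linear classifier over bag-of-words histograms can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the optimization problem or the Gradient Context descriptor, so the sketch assumes the local descriptors have already been quantized into integer visual-word ids, and the function names, the exhaustive search over rectangles, and the L1 normalization are all hypothetical choices made for illustration.

```python
def region_histogram(words, region, vocab_size):
    """Bag-of-words histogram of the visual words inside a rectangle.

    words  : list of (x, y, word_id) tuples (assumed, pre-quantized descriptors)
    region : (x0, y0, x1, y1), inclusive lower and exclusive upper bounds
    """
    x0, y0, x1, y1 = region
    hist = [0.0] * vocab_size
    for x, y, word_id in words:
        if x0 <= x < x1 and y0 <= y < y1:
            hist[word_id] += 1.0
    total = sum(hist)
    # L1-normalize so that regions of different sizes are comparable.
    return [h / total for h in hist] if total > 0 else hist


def candidate_regions(width, height, sizes, stride):
    """Enumerate rectangles of several sizes instead of one fixed subwindow."""
    for region_w, region_h in sizes:
        for x0 in range(0, width - region_w + 1, stride):
            for y0 in range(0, height - region_h + 1, stride):
                yield (x0, y0, x0 + region_w, y0 + region_h)


def best_region(words, weights, bias, width, height, sizes, stride=4):
    """Return (score, region) maximizing the linear score w . h(region) + b."""
    best = None
    for region in candidate_regions(width, height, sizes, stride):
        hist = region_histogram(words, region, len(weights))
        score = sum(w * h for w, h in zip(weights, hist)) + bias
        if best is None or score > best[0]:
            best = (score, region)
    return best
```

Under these assumptions, the sign of the best score would decide between the two characters of a similar pair, and the maximizing rectangle would be the localized discriminative region. Learning `weights` and `bias` from character-level labels alone, without region annotations, is the weakly supervised part and is not shown here.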
Similar resources
Techniques for Highly Accurate Optical Recognition of Handwritten Characters and Their Application to Sixth Chinese National Population Census
Highly accurate optical character recognition (OCR) of handwritten characters is still a challenging task, especially for languages like Chinese and Japanese. To improve the accuracy, we developed four techniques for enhanced recognition: character recognition based on modified linear discriminant analysis (MLDA), subspace-based similar-character discrimination, multi-classifier combination, an...
Semi-supervised learning for character recognition in historical archive documents
Training recognizers for handwritten characters is still a very time consuming task involving tremendous amounts of manual annotations by experts. In this paper we present semi-supervised labeling strategies that are able to considerably reduce the human effort. We propose two different methods to label and later recognize characters in collections of historical archive documents. The first one...
Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach
In this work, a novel deep learning technique for the recognition of handwritten Bangla isolated compound character is presented and a new benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported. Greedy layer wise training of Deep Neural Network has helped to make significant strides in various pattern recognition problems. We employ layerwise training to Deep Convolutiona...
Accuracy Improvement of Handwritten Character Recognition by GLVQ
This paper deals with accuracy improvement of handwritten character recognition by GLVQ (generalized learning vector quantization). In the literature, combining FDA (Fisher discriminant analysis) with GLVQ was investigated and found to be effective for handwritten Chinese character recognition employing the minimum Euclidean distance classifier. In this paper, the project...
Generating Handwritten Chinese Characters using CycleGAN
Handwriting of Chinese has long been an important skill in East Asia. However, automatic generation of handwritten Chinese characters poses a great challenge due to the large number of characters. Various machine learning techniques have been used to recognize Chinese characters, but few works have studied the handwritten Chinese character generation problem, especially with unpaired training d...
Journal title:
- CoRR
Volume abs/1509.05844 Issue
Pages -
Publication date 2015