Ambiguity Analysis Model of Word Segmentation Based on Word Group
Similar resources
Chinese Word Segmentation based on Mixing Model
This paper presents our recent work for the Second International Chinese Word Segmentation Bakeoff. We divide word segmentation into several sub-tasks according to their difficulty and solve them with mixed language models, so as to exploit the strengths of each approach on specific problems. Experiments indicate that the system achieved F-measures of 96.7% and 97.2% in PK...
Ambiguity Resolution in Chinese Word Segmentation
This paper proposes a new method for Chinese word segmentation, Conditional F&BMM (Forward and Backward Maximal Matching), which incorporates both bigram statistics (i.e., mutual information and difference of t-test between Chinese characters) and linguistic rules for ambiguity resolution. The key characteristics of this model are the use of: (i) statistics which can be automatically ...
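As a rough illustration of the forward and backward maximal matching named in the abstract above, the sketch below runs plain dictionary-based matching in both directions and compares the results. It is not the Conditional F&BMM method itself: the bigram statistics and linguistic rules are omitted, and the lexicon, the `max_len` parameter, and the function names are assumptions made only for this example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left matching: take the longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


# Toy lexicon (hypothetical, for illustration only).
lexicon = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"

fmm = forward_max_match(sentence, lexicon)
bmm = backward_max_match(sentence, lexicon)
# When the two directions disagree, the span is ambiguous and needs
# further evidence (statistics or rules) to resolve.
is_ambiguous = fmm != bmm
print(fmm, bmm, is_ambiguous)
```

When the two segmentations differ, as they do for this toy sentence, a disambiguation step has to choose between them; that is where statistics and rules of the kind the abstract mentions would come in.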
Language Model Based Arabic Word Segmentation
We approximate Arabic’s rich morphology with a model in which a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme). Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. The algorithm uses ...
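To make the prefix*-stem-suffix* pattern concrete, the following sketch enumerates every analysis of a word that fits the pattern, given small affix lists. The affix lists, the romanized toy word, and the function name are hypothetical; the abstract's language-model scoring and bootstrapping are not shown, so this only illustrates the candidate space such a model would have to rank.

```python
def candidate_segmentations(word, prefixes, suffixes):
    """Return all (prefixes, stem, suffixes) analyses matching prefix*-stem-suffix*."""

    def strip_prefixes(s):
        # Zero prefixes is always a valid option.
        yield [], s
        for p in prefixes:
            # Require a non-empty remainder so a stem can still exist.
            if s.startswith(p) and len(s) > len(p):
                for more, rest in strip_prefixes(s[len(p):]):
                    yield [p] + more, rest

    def strip_suffixes(s):
        # Zero suffixes is always a valid option.
        yield s, []
        for x in suffixes:
            if s.endswith(x) and len(s) > len(x):
                for rest, more in strip_suffixes(s[:-len(x)]):
                    yield rest, more + [x]

    results = []
    for pre, rest in strip_prefixes(word):
        for stem, suf in strip_suffixes(rest):
            results.append((pre, stem, suf))
    return results


# Hypothetical romanized affix lists and word, for illustration only.
analyses = candidate_segmentations("wktAbhm", prefixes=["w", "Al"], suffixes=["hm"])
for pre, stem, suf in analyses:
    print("+".join(pre + [stem] + suf))
```

Even this tiny example yields several competing analyses for one word, which is why a scoring model is needed to pick among them.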
A Chinese Word Segmentation System Based on Cascade Model
This paper introduces a word segmentation (WS) system and analyzes its evaluation results in the Fourth SIGHAN Bakeoff. The system uses a novel method whose main idea is to first classify the main problems of WS and then apply a cascaded model to gradually optimize the system. The core of this WS system is the segmentation of ambiguous words and the internal ...
Chinese Word Segmentation Based On Direct Maximum Entropy Model
Chinese word segmentation is a fundamental and important issue in Chinese information processing. In order to find a unified approach for Chinese word segmentation, the author develops a Chinese lexical analyzer, PCWS, using a direct maximum entropy model. The paper presents the general description of PCWS, as well as the result and analysis of its performance at the Second International Chinese Wor...
Journal
Journal title: Journal of Applied Sciences
Year: 2013
ISSN: 1812-5654
DOI: 10.3923/jas.2013.3153.3160