Chinese Word Segmentation for Agriculture
Authors
Abstract
A new dictionary algorithm based on a hash mechanism is presented; it supports search, update, deletion, and addition operations on the dictionary. Exploiting the characteristics of the GB code for Chinese characters, the method stores the GB code of the first character of each entry, which effectively improves the utilization of storage space. The dictionary builds one-to-many correspondences between dialect words and agricultural keywords, so that dialect words can be translated efficiently into agricultural keywords, which improves segmentation accuracy. In terms of time complexity, the Chinese word segmentation algorithm for agriculture was compared with algorithms based on arrays, linked lists, and AVL trees.
Similar resources
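The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea under stated assumptions: entries are bucketed in a hash table keyed by the GB2312 code of their first character (so that character is stored once per bucket rather than once per entry), dialect words map back to a standard agricultural keyword, and segmentation uses plain forward maximum matching. The names FirstCharHashDictionary, gb_code, normalize, and segment, the max_len parameter, and the sample vocabulary are all hypothetical and are not taken from the paper.

```python
class FirstCharHashDictionary:
    """Hash dictionary bucketed by the GB2312 code of each entry's first character.

    Illustrative only: the layout is an assumption, not the paper's structure.
    """

    def __init__(self):
        self.buckets = {}             # first-character GB code -> set of entry suffixes
        self.dialect_to_keyword = {}  # dialect word -> standard agricultural keyword

    @staticmethod
    def gb_code(ch):
        # GB2312 encodes a Chinese character as two bytes; pack them into one int.
        return int.from_bytes(ch.encode("gb2312"), "big")

    def add(self, keyword, dialects=()):
        # Store the keyword and each dialect variant; one keyword may own many dialects.
        for word in (keyword, *dialects):
            self.buckets.setdefault(self.gb_code(word[0]), set()).add(word[1:])
        for dialect in dialects:
            self.dialect_to_keyword[dialect] = keyword

    def contains(self, word):
        try:
            code = self.gb_code(word[0])
        except UnicodeEncodeError:    # character outside GB2312
            return False
        return word[1:] in self.buckets.get(code, ())

    def delete(self, word):
        if not self.contains(word):
            return False
        code = self.gb_code(word[0])
        self.buckets[code].discard(word[1:])
        if not self.buckets[code]:
            del self.buckets[code]
        self.dialect_to_keyword.pop(word, None)
        return True

    def update(self, old_word, new_word):
        # Replace an entry; dialect links of the old word are not carried over.
        if not self.delete(old_word):
            return False
        self.add(new_word)
        return True

    def normalize(self, word):
        # Translate a dialect word to its agricultural keyword, if one is known.
        return self.dialect_to_keyword.get(word, word)


def segment(text, dictionary, max_len=4):
    """Forward maximum matching; unknown characters fall out as single tokens."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or dictionary.contains(candidate):
                tokens.append(dictionary.normalize(candidate))
                i += size
                break
    return tokens
```

With average-case constant-time bucket lookup, each dictionary operation avoids the linear scan of an array or linked list and the logarithmic descent of an AVL tree, which is the kind of comparison the abstract refers to; the sketch itself makes no claim about the paper's measured results.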
Adaptive Chinese Word Segmentation with Online Passive-Aggressive Algorithm
In this paper, we describe our system for the CIPS-SIGHAN-2010 bake-off task of Chinese word segmentation, which focused on the cross-domain performance of Chinese word segmentation algorithms. We use the online passive-aggressive algorithm with domain-invariant information for cross-domain Chinese word segmentation.
The CIPS-SIGHAN CLP2010 Chinese Word Segmentation Bakeoff
The CIPS-SIGHAN CLP 2010 Chinese Word Segmentation Bakeoff was held in the summer of 2010 to evaluate the current state of the art in word segmentation. It focused on the cross-domain performance of Chinese word segmentation algorithms. Eighteen groups submitted 128 results over two tracks (open training and closed training), four domains (literature, computer science, medicine and finance) and ...
The CIPS-SIGHAN CLP 2014 Chinese Word Segmentation Bake-off
This paper summarizes the SIGHAN 2014 Chinese Word Segmentation bakeoff in several aspects such as dataset, evaluation results. In addition, we analyze errors of segmentation by instance and make a suggestion for improving segmentation systems. 1 Goal of the Chinese word segmentation bake-off Chinese Word Segmentation is the preliminary step for Chinese information processing, which is extremel...
Chinese Word Segmentation Based On Direct Maximum Entropy Model
Chinese word segmentation is a fundamental and important issue in Chinese information processing. In order to find a unified approach for Chinese word segmentation, the author develops a Chinese lexical analyzer, PCWS, using a direct maximum entropy model. The paper presents the general description of PCWS, as well as the result and analysis of its performance at the Second International Chinese Wor...
SYSTRAN's Chinese Word Segmentation
SYSTRAN’s Chinese word segmentation is one important component of its Chinese-English machine translation system. The Chinese word segmentation module uses a rule-based approach, based on a large dictionary and fine-grained linguistic rules. It works on general-purpose texts from different Chinese-speaking regions, with comparable performance. SYSTRAN participated in the four open tracks in the F...
Journal title:
- JSW
Volume 8, issue
Pages -
Publication date 2013