The Third International Chinese Language Processing Bakeoff: Word Segmentation and Named Entity Recognition
Abstract
The Third International Chinese Language Processing Bakeoff was held in Spring 2006 to assess the state of the art in two important tasks: word segmentation and named entity recognition. Twenty-nine groups submitted result sets in the two tasks across two tracks and a total of five corpora. We found strong results in both tasks as well as continuing challenges.
Related resources
The Fourth International Chinese Language Processing Bakeoff: Chinese Word Segmentation, Named Entity Recognition and Chinese POS Tagging
The Fourth International Chinese Language Processing Bakeoff was held in 2007 to assess the state of the art in three important tasks: Chinese word segmentation, named entity recognition and Chinese POS tagging. Twenty-eight groups submitted result sets in the three tasks across two tracks and a total of seven corpora. Strong results have been found in all the tasks as well as continuing challe...
Character Language Models for Chinese Word Segmentation and Named Entity Recognition
We describe the application of the LingPipe toolkit (Alias-i 2006) to Chinese word segmentation and named entity recognition. We provide results for the third SIGHAN Chinese language processing bakeoff (Levow 2006). F1 measures on the best performing corpora were .972 for word segmentation and .855 for person/location/organization named-
A Pragmatic Chinese Word Segmentation System
This paper presents our work for participation in the Third International Chinese Word Segmentation Bakeoff. We apply several processing approaches according to the corresponding sub-tasks, which are exhibited in real natural language. In our system, Trigram model with smoothing algorithm is the core module in word segmentation, and Maximum Entropy model is the basic model in Named Entity Recog...
NetEase Automatic Chinese Word Segmentation
This document analyses the bakeoff results from NetEase Co. in the SIGHAN5 Word Segmentation Task and Named Entity Recognition Task. The NetEase WS system is designed to facilitate research in natural language processing and information retrieval. It supports Chinese and English word segmentation, Chinese named entity recognition, Chinese part of speech tagging and phrase conglutination. Evalua...
An Improved CRF based Chinese Language Processing System for SIGHAN Bakeoff 2007
This paper describes three systems: the Chinese word segmentation (WS) system, the named entity recognition (NER) system and the Part-of-Speech tagging (POS) system, which are submitted to the Fourth International Chinese Language Processing Bakeoff. Here, Conditional Random Fields (CRFs) are employed as the primary models. For the WS and NER tracks, the ngram language model is incorporated in ...