COV Model and its Application in Chinese Part-of-Speech Tagging

Authors

  • Fukun Xing
  • Song Rou
Abstract

This article presents a new sequence labeling model, the Context OVerlapping (COV) model, which expands the observation from a single word to an n-gram unit, with neighboring units sharing an overlapping part. Thanks to the co-occurrence constraint and the transition constraint, the COV model reduces the search space and improves tagging accuracy. The 2-gram COV model is applied to Chinese PoS tagging, and its precision in the open test reaches 96.83%, higher than the 95.73% of the second-order HMM. The result is also comparable to those of discriminative models, but COV takes much less training time. With symbol decoding, COV prunes many nodes before statistical decoding, and its search space is about 10-20% smaller than that of the HMM.
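The overlapping-unit idea in the abstract can be illustrated with a small sketch. The abstract does not give the authors' algorithm, so everything below is an assumption made for illustration: the function names `overlapping_units` and `prune_by_overlap`, the toy sentence, the tag labels, and the hand-written candidate tag pairs are all hypothetical. The sketch shows only how 2-gram units that share a word can rule out tag pairs that disagree on that shared word before any statistical scoring takes place.

```python
from typing import List, Set, Tuple

Unit = Tuple[str, str]      # two adjacent words
TagPair = Tuple[str, str]   # one tag per word in the unit


def overlapping_units(words: List[str]) -> List[Unit]:
    """Split a sentence into 2-gram units; neighboring units share one word."""
    return [(words[i], words[i + 1]) for i in range(len(words) - 1)]


def prune_by_overlap(candidates: List[Set[TagPair]]) -> List[Set[TagPair]]:
    """Remove tag pairs that cannot agree with a neighbor on the shared word.

    candidates[i] holds the tag pairs allowed for unit i. The second tag of
    unit i and the first tag of unit i + 1 label the same word, so they must
    match. Pruning repeats until no candidate set changes.
    """
    pruned = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for i in range(len(pruned) - 1):
            left, right = pruned[i], pruned[i + 1]
            # Tags for the shared word that both units can accept.
            shared = {p[1] for p in left} & {p[0] for p in right}
            new_left = {p for p in left if p[1] in shared}
            new_right = {p for p in right if p[0] in shared}
            if new_left != left or new_right != right:
                pruned[i], pruned[i + 1] = new_left, new_right
                changed = True
    return pruned


# Toy input (hypothetical): three words and hand-written candidate tag pairs.
words = ["他", "喜欢", "书"]
units = overlapping_units(words)
candidates = [
    {("r", "v")},               # candidates for ("他", "喜欢")
    {("v", "n"), ("n", "n")},   # candidates for ("喜欢", "书")
]
pruned = prune_by_overlap(candidates)
print(units)
print(pruned)
```

In this toy case the pair `("n", "n")` is dropped because the first unit only allows `v` for the shared word. A statistical decoder, such as a Viterbi-style search over the surviving pairs, would then choose among what remains; that step is not shown here.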


Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging along with dependency parsing in a pipeline mode. Unfortunately, in pipel...


Hidden Markov Model and Its Application in Natural Language Processing

This paper describes the Hidden Markov Model and its application in natural language processing. It first introduces the basic concept of the Hidden Markov Model, then the three basic problems and the basic algorithms for solving them, and finally demonstrates its application to Chinese part-of-speech tagging and speech recognition.


Persian Part-of-Speech Tagging Using a Fuzzy Network Model

Part-of-speech tagging (POS tagging) is an ongoing research topic in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...


An Enhanced Model for Chinese Word Segmentation and Part-of-Speech Tagging

This paper presents an enhanced probabilistic model for Chinese word segmentation and part-of-speech (POS) tagging. The model introduces Chinese word length as one of its features to reach a more accurate result. In addition, the model integrates segmentation and POS tagging. After presenting the model, this paper will give a brief discussion on ...


A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis

This paper presents a robust method of lexical analysis for Chinese text-to-speech (TTS) synthesis using a pair-based Language Model (LM). The traditional approach to Chinese lexical analysis simply treats word segmentation and part-of-speech (POS) tagging as two separate phases, each with its own algorithms and models. Actually, the POS information is useful for word segmentation,...




Publication date: 2014