Grammatical Tagging of a Persian Corpus

Authors

  • S. Mostafa ASSI
  • M. Haji ABDOLHOSSEINI
Abstract

This article briefly introduces an interactive part-of-speech (POS) tagging system developed as a project at the Institute for Humanities and Cultural Studies in Tehran, Iran. The system is designed as part of the annotation procedure for a Persian corpus called the Farsi Linguistic Database (FLDB) and is the first attempt to tag a Persian corpus. Section 1 introduces the project, section 2 presents an evaluation of it, and section 3 offers some suggestions for future work.

Similar resources

Persian Part-of-Speech Tagging Using a Fuzzy Network Model

Part-of-speech tagging (POS tagging) is an ongoing area of research in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is to determine the grammatical ...

A hidden Markov model for Persian part-of-speech tagging

Part-of-speech tagging is one of the important tasks in language processing. Despite this importance, and although numerous models have been presented for different languages, little work has been done for Persian. In this paper, a part-of-speech tagging system for a Persian corpus using a hidden Markov model is proposed. To achieve this goal, the main aspects of Pers...

A Statistical Part-of-Speech Tagger for Persian

This paper presents the statistical part-of-speech tagger HunPoS trained on a Persian corpus. The experimental results show that HunPoS provides an overall accuracy of 96.9%, which is the best result reported for Persian part-of-speech tagging.

A Persian Part-Of-Speech Tagger Based on Morphological Analysis

This paper describes a method based on morphological analysis of words for a Persian part-of-speech (POS) tagging system. This is a major part of a process for expanding a large Persian corpus called Peykare (or Textual Corpus of Persian Language). Peykare is arranged into two parts: annotated and unannotated. We use the annotated part in order to create an automatic morphological analyze...

Unsupervised Part of Speech Tagging for Persian

In this paper we present a rather novel unsupervised method for part-of-speech (hereafter POS) disambiguation, which has been applied to Persian. This method, known as the Iterative Improved Feedback (IIF) model, is a heuristic one that uses only a raw corpus of Persian, together with all possible tags for every word in that corpus, as input. During the process of tagging, the algorithm passes through sever...

Publication date: 2001