Extracting Arabic Collocations Based on Jape Rules

Authors

  • Soraya Zaidi
  • Ahmed Abdelali
Abstract

The massive amount of digital information available in all disciplines has generated a critical need to organize and structure its content. Some of the existing tools for languages such as English or French can easily be adapted to Arabic. In some cases a simple configuration is sufficient, while in other cases significant modifications must be made to obtain acceptable results. In this paper we present a rule-based method for extracting collocations in Arabic using GATE (General Architecture for Text Engineering). We use the extracted collocations as domain terms to build Arabic text-based ontologies. We validated the approach on the Crescent Quranic Corpus in order to automatically build a Quran ontology.
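
The abstract does not show the JAPE rules themselves. As a rough illustration of the general idea behind rule-based collocation extraction, the following Python sketch matches part-of-speech patterns over tokens that are assumed to be tagged already. The tag names (`N`, `ADJ`, `V`), the two patterns, and the function name `extract_collocations` are hypothetical choices made for this example; they are not taken from the paper or from GATE.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

# Hypothetical tag patterns (assumption): noun + adjective, noun + noun.
# The paper's actual JAPE rules are not given in the abstract.
PATTERNS: List[Tuple[str, ...]] = [("N", "ADJ"), ("N", "N")]


def extract_collocations(
    tokens: Sequence[Token],
    patterns: Sequence[Tuple[str, ...]] = PATTERNS,
) -> List[str]:
    """Return every token span whose tag sequence equals one of the patterns."""
    found: List[str] = []
    for start in range(len(tokens)):
        for pattern in patterns:
            window = tokens[start:start + len(pattern)]
            if len(window) != len(pattern):
                continue  # not enough tokens left for this pattern
            if all(tag == want for (_, tag), want in zip(window, pattern)):
                found.append(" ".join(word for word, _ in window))
    return found


# Toy input with hand-assigned tags (assumption, not corpus output).
sample = [("اهدنا", "V"), ("الصراط", "N"), ("المستقيم", "ADJ")]
print(extract_collocations(sample))
```

In a GATE pipeline the tags would come from upstream annotations and the patterns would be written as JAPE rules; the sketch only mirrors the pattern-matching step.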

Similar resources

TCtract-A Collocation Extraction Approach for Noun Phrases Using Shallow Parsing Rules and Statistic Models

This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all the noun phrase collocations from a shallow parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistic-based association measures (...

A Comparative Analysis of Collocation in Arabic-English Translations of the Glorious Quran

The Qur’an is the only holy book of Muslims all around the world. People of every religion and language are interested in comprehending and accepting the rules and regulations of their own beliefs. A translation of the Qur’an is only an attempt to present its meaning. One of the greatest challenges in translating the Qur’an is collocation. A collocation is a sequence of words or terms that co-o...

The JAPE riddle generator: technical specification

Although the JAPE riddle generator has attracted significant attention and there are published accounts of its performance, there has been no detailed technical statement of its internal workings. This paper remedies that, by providing formal definitions of the program’s data structures, rules and procedures. The most important rules, the schemata, are listed in full in an appendix.

Collocational Translation Memory Extraction Based on Statistical and Linguistic Information

In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus to provide phrasal translation memories. The method integrates statistical and linguistic information to achieve effective extraction of bilingual collocations. The linguistic information includes parts of speech, chunks, and clauses. The method involves first obtaining an extended list of Englis...

Extracting Verb-Noun Collocations from Text

In this paper, we describe a new method for extracting monolingual collocations. The method, which is based on statistical methods, extracts VN collocations from large textual corpora. Being able to extract a large number of collocations is critical to machine translation and many other applications. The method has an element of snowballing in it. Initially, one identifies a pattern that will prod...

Publication date: 2010