A Framework for the Classification and Annotation of Multiword Expressions in Dialectal Arabic

Authors

  • Abdelati Hawwari
  • Mohammed Attia
  • Mona Diab
Abstract

In this paper we describe a framework for classifying and annotating Egyptian Arabic Multiword Expressions (EMWE) in a specialized computational lexical resource. The framework intends to encompass comprehensive linguistic information for each MWE including: a. phonological and orthographic information; b. POS tags; c. structural information for the phrase structure of the expression; d. lexicographic classification; e. semantic classification covering semantic fields and semantic relations; f. degree of idiomaticity where we adopt a three-level rating scale; g. pragmatic information in the form of usage labels; h. Modern Standard Arabic equivalents and English translations, thereby rendering our resource a three-way – Egyptian Arabic, Modern Standard Arabic and English – repository for MWEs.
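The annotation layers listed in the abstract (a through h) can be pictured as fields of a single lexicon record. The following is a minimal sketch under stated assumptions, not the authors' actual schema: the class name `MWEEntry`, all field names, the labels of the three idiomaticity levels, and the one-POS-tag-per-token consistency check are hypothetical and chosen only for illustration. The example values are placeholders, not real lexicon data.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import List


class Idiomaticity(Enum):
    """Three-level rating scale (f). The level names are assumptions."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class MWEEntry:
    """Hypothetical record holding layers a-h for one Egyptian Arabic MWE."""
    orthography: str               # a. written form, tokens separated by spaces
    phonology: str                 # a. phonological transcription
    pos_tags: List[str]            # b. POS tags (assumed: one per token)
    phrase_structure: str          # c. phrase structure of the expression
    lexicographic_class: str       # d. lexicographic classification
    semantic_fields: List[str]     # e. semantic fields
    semantic_relations: List[str]  # e. semantic relations
    idiomaticity: Idiomaticity     # f. degree of idiomaticity
    usage_labels: List[str]        # g. pragmatic usage labels
    msa_equivalent: str            # h. Modern Standard Arabic equivalent
    english_translation: str       # h. English translation


def is_consistent(entry: MWEEntry) -> bool:
    """Check the assumed invariant: a multiword form with one POS tag per token."""
    tokens = entry.orthography.split()
    return len(tokens) >= 2 and len(tokens) == len(entry.pos_tags)


# Placeholder entry; every value below is a stand-in, not real data.
example = MWEEntry(
    orthography="TOKEN1 TOKEN2",
    phonology="token1 token2",
    pos_tags=["NOUN", "ADJ"],
    phrase_structure="NP",
    lexicographic_class="PLACEHOLDER_CLASS",
    semantic_fields=["PLACEHOLDER_FIELD"],
    semantic_relations=[],
    idiomaticity=Idiomaticity.HIGH,
    usage_labels=["PLACEHOLDER_LABEL"],
    msa_equivalent="MSA_PLACEHOLDER",
    english_translation="English placeholder",
)

# A copy with a missing POS tag violates the assumed invariant.
broken = replace(example, pos_tags=["NOUN"])
```

Keeping the Modern Standard Arabic equivalent and the English translation on the same record is one simple way to reflect the three-way nature of the repository; a real resource might instead link entries across separate lexicons.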


Related resources

A CAD System Framework for the Automatic Diagnosis and Annotation of Histological and Bone Marrow Images

Due to the ever-increasing volume of medical image data in the world’s medical centers and recent developments in medical imaging hardware and technology, software analysis of medical data has become necessary. Equipping medical science with intelligent tools for the diagnosis and treatment of illnesses has reduced physicians’ errors as well as physical and financial damages. In this article we pr...


Building an Arabic Multiword Expressions Repository

We introduce a list of Arabic multiword expressions (MWE) collected from various dictionaries. The MWEs are grouped based on their syntactic type. Every constituent word in the expressions is manually annotated with its full context-sensitive morphological analysis. Some of the expressions contain semantic variables as place holders for words that play the same semantic role. In addition, we ha...


Can Recognising Multiword Expressions Improve Shallow Parsing?

There is significant evidence in the literature that integrating knowledge about multiword expressions can improve shallow parsing accuracy. We present an experimental study to quantify this improvement, focusing on compound nominals, proper names and adjective-noun constructions. The evaluation set of multiword expressions is derived from WordNet and the textual data are downloaded from the web...


Annotation of Multiword Expressions in the Prague Dependency Treebank

We describe annotation of multiword expressions in the Prague Dependency Treebank, using several automatic pre-annotation steps. We use subtrees of the tectogrammatical tree structures of the Prague dependency treebank to store representations of the multiword expressions in the dictionary and pre-annotate following occurrences automatically. We also show a way to measure reliability of this ty...


Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Beyond its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...


Publication date: 2014