A Comprehensive NLP System for Modern Standard Arabic and Modern Hebrew

Authors

  • Dror Kamir
  • Naama Soreq
  • Yoni Neeman
Abstract

This paper presents a comprehensive NLP system by Melingo that was recently developed for Arabic and is based on Morfix, a previously developed, operational, and highly successful comprehensive Hebrew NLP system. The system includes modules for morphological analysis, context-sensitive lemmatization, vocalization, text-to-phoneme conversion, and a syntactic-analysis-based prosody (intonation) model. It is employed in applications such as full-text search, information retrieval, text categorization, textual data mining, online contextual dictionaries, filtering, and text-to-speech in the fields of telephony and accessibility, and it could serve as a handy aid for non-fluent Arabic or Hebrew speakers. Modern Hebrew and Modern Standard Arabic share some unique Semitic linguistic characteristics. Yet up to now, the two languages have been handled separately in Natural Language Processing circles, both at the academic level and at the applied level. This paper reviews the major similarities and the minor dissimilarities between Modern Hebrew and Modern Standard Arabic from the NLP standpoint, and emphasizes the benefit of developing and maintaining a unified system for both languages.

Related resources

Smoothing a Lexicon-based POS Tagger for Arabic and Hebrew

We propose an enhanced Part-of-Speech (POS) tagger of Semitic languages that treats Modern Standard Arabic (henceforth Arabic) and Modern Hebrew (henceforth Hebrew) using the same probabilistic model and architectural setting. We start out by porting an existing Hidden Markov Model POS tagger for Hebrew to Arabic by exchanging a morphological analyzer for Hebrew with Buckwalter's (2002) morphol...

A New Method for Extracting Named Entities in Classical Arabic

In Natural Language Processing (NLP) studies, developing resources and tools contributes to the extension and effectiveness of research in each language. In recent years, Arabic Named Entity Recognition (ANER) has received attention from NLP researchers because of its significant impact on improving other NLP tasks such as machine translation, information retrieval, question answering, query result...

Grapheme to phoneme conversion: an Arabic dialect case

We aim to develop a Speech-to-Speech translation system between Modern Standard Arabic and Algiers dialect. Such a system must include a Text-to-Speech module, which in turn must include a Grapheme-to-Phoneme converter. Algiers dialect is an Arabic dialect affected by most of the problems of Modern Standard Arabic in the NLP area. Furthermore, it could be considered as an under-resourced language becau...

Machine Translation between Hebrew and Arabic: Needs, Challenges and Preliminary Solutions

Modern Hebrew and Modern Standard Arabic, both Semitic languages, share many orthographic, lexical, morphological, syntactic and semantic similarities, but they are still not mutually comprehensible. Most native Hebrew speakers in Israel do not speak Arabic, and the vast majority of Arabs (outside Israel) do not speak Hebrew. Machine translation (MT) between these two languages has the potential...

Is Modern Hebrew Standard Average European? The View from European

In contrast with previous work emphasizing European influences on Modern Hebrew as compared to the Biblical Hebrew model adopted by the Hebrew revival movement, this article sets out to examine relevant typological features of Modern Hebrew in its own right. Taking the typological literature on Standard Average European as a starting point, it is argued that Modern Hebrew is in fact quite far f...

Publication date: 2002