Event Extraction from Classical Arabic Texts

Authors

  • Razieh Baradaran
  • Behrouz Minaei-Bidgoli
Abstract

Event extraction is one of the most useful and challenging Information Extraction (IE) tasks, with applications across natural language processing and in particular in semantic search systems. Most existing systems extract events from English texts; many other languages, Arabic in particular, still need research in this area. In this paper, we develop a system for extracting person-related events and their participants from classical Arabic texts with complex linguistic structure. The first and most important step in event extraction is correctly detecting event mentions and identifying the sentences that describe events. Implementing several methods and comparing their performance can help researchers choose an appropriate event extraction method for their conditions and limitations. In this research, we implement three methods to automatically classify sentences as on-event or off-event: a knowledge-oriented method (based on a set of keywords and rules), a data-oriented method (based on a Support Vector Machine (SVM)), and a semantic-oriented method (based on lexical chains). The results indicate that the knowledge-oriented and machine learning methods achieve high precision and recall in the event extraction process. The semantic-oriented method, with acceptable precision, minimizes the linguistic knowledge requirements of the knowledge-oriented method and the preprocessing requirements of the data-oriented method, and also improves automatic event extraction from raw text. The next step is developing a modular rule-based approach that extracts event arguments such as time, place, and other participants in independent subtasks.
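To make the sentence classification step concrete, the following is a minimal sketch of a keyword-based on-event/off-event classifier, together with the precision and recall measures the abstract refers to. It is not the authors' system: the trigger list `TRIGGERS`, the function names, and the whitespace-level tokenization are all assumptions made for illustration, and the paper's actual keywords and rules are not given in the abstract.

```python
import re
import unicodedata

# Hypothetical trigger keywords for person-related events (born, died,
# traveled, married). These are illustrative and NOT taken from the paper.
TRIGGERS = {"ولد", "توفي", "سافر", "تزوج"}


def strip_diacritics(text):
    """Remove combining marks, which include Arabic short-vowel diacritics."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


def tokenize(sentence):
    """Split a sentence into word tokens after removing diacritics."""
    return re.findall(r"\w+", strip_diacritics(sentence))


def classify_sentence(sentence, triggers=TRIGGERS):
    """Label a sentence on-event if any token exactly matches a trigger."""
    tokens = tokenize(sentence)
    return "on-event" if any(tok in triggers for tok in tokens) else "off-event"


def precision_recall(predicted, gold, positive="on-event"):
    """Compute precision and recall for the positive class."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p == positive and g == positive)
    fp = sum(1 for p, g in pairs if p == positive and g != positive)
    fn = sum(1 for p, g in pairs if p != positive and g == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Exact token matching is a deliberate simplification: it misses triggers with attached clitics or inflected forms, which a real system for classical Arabic would need morphological analysis or additional rules to handle.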


Similar resources

Mani’s Living Gospel: A New Approach to the Arabic and Classical New Persian Testimonia

In order to reconstruct the contents of the most famous work of Mani, Living Gospel (written originally in Syriac), we have to use the Arabic and Classical New Persian texts containing accounts and even indirect quotations of this book. One of the most remarkable points in these accounts is that they clearly show that an important part of the Living Gospel contains the Manicha...


Tashkeela: Novel corpus of Arabic vocalized texts, data for auto-diacritization systems

Arabic diacritics are often omitted in Arabic script. This omission is a handicap for new learners reading Arabic, for text-to-speech conversion systems, and for the reading and semantic analysis of Arabic texts. Automatic diacritization systems are the best solution to this issue, but such automation needs resources such as diacritized texts to train and evaluate these systems. In this paper, we describe ...


Person Name Recognition Using Injection of Candidate Name Words into Conditional Random Fields for Arabic

Named Entity Recognition and Extraction are very important tasks for discovering proper names including persons, locations, date, and time, inside electronic textual resources. Accurate named entity recognition system is an essential utility to resolve fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...


Classifying and Segmenting Classical and Modern Standard Arabic using Minimum Cross-Entropy

Text classification is the process of assigning a text or a document to various predefined classes or categories to reflect their contents. With the rapid growth of Arabic text on the Web, studies that address the problems of classification and segmentation of the Arabic language are limited compared to other languages, most of which implement word-based and feature extraction algorithms. This ...


Using text mining to identify crime patterns from Arabic crime news report corpus

Most text mining techniques have been proposed only for English text, and even here, most research has been conducted on specific texts related to special contexts within the English language, such as politics, medicine and crime. In contrast, although Arabic is a widely spoken language, few mining tools have been developed to process Arabic text, and some Arabic domains have not been studied a...



Journal title:
  • Int. Arab J. Inf. Technol.

Volume 12  Issue 

Pages  -

Publication date 2015