Arabic Book Retrieval using Class and Book Index Based Term Weighting

Authors

  • M. Ali Fauzi
  • Agus Zainal Arifin
  • Anny Yuniarti
Abstract

Received Apr 27, 2017; revised Sep 8, 2017; accepted Sep 27, 2017.

One of the most common problems in information retrieval is document ranking. A document ranking system takes search terms from the user and retrieves documents in order of relevance. The vector space model with TF.IDF term weighting is the most common method for this task. In this study, we address automatic retrieval from a collection of Islamic Fiqh (law) books. The collection contains many books, each with tens to hundreds of pages. Each page of a book is treated as a document to be ranked against the user query. We developed a class-based indexing method called inverse class frequency (ICF) and a book-based indexing method called inverse book frequency (IBF) for this Arabic information retrieval task. These methods were then combined with the existing weighting to form TF.IDF.ICF.IBF. The term weighting method is also used for feature selection because of the high dimensionality of the feature space. The proposed method was tested on a dataset of 13 Arabic Fiqh e-books. The experimental results showed that the proposed method achieved higher precision, recall, and F-measure than the other three methods across the feature selection variations. The best performance was obtained with the top 1000 features: precision of 76%, recall of 74%, and F-measure of 75%.
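The abstract names the combined weight TF.IDF.ICF.IBF but does not give its formulas, so the following is only a minimal sketch of how such a weight might be computed. It assumes each factor is a smoothed logarithmic inverse frequency of the form 1 + log(N / n), and that every page carries a class label and a book identifier. The function name `tf_idf_icf_ibf`, the input layout, and the toy transliterated tokens are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf_icf_ibf(pages):
    """Weight terms per page with TF * IDF * ICF * IBF.

    pages: list of (tokens, class_label, book_id) tuples, one per page.
    Returns a list of {term: weight} dicts, one per page.
    Assumption: each inverse factor is 1 + log(total / count); the paper
    may define these factors differently.
    """
    n_pages = len(pages)
    n_classes = len({cls for _, cls, _ in pages})
    n_books = len({book for _, _, book in pages})

    page_freq = Counter()   # number of pages containing each term
    term_classes = {}       # term -> set of classes in which it occurs
    term_books = {}         # term -> set of books in which it occurs
    for tokens, cls, book in pages:
        for term in set(tokens):
            page_freq[term] += 1
            term_classes.setdefault(term, set()).add(cls)
            term_books.setdefault(term, set()).add(book)

    weights = []
    for tokens, _, _ in pages:
        page_weights = {}
        for term, tf in Counter(tokens).items():
            idf = 1 + math.log(n_pages / page_freq[term])
            icf = 1 + math.log(n_classes / len(term_classes[term]))
            ibf = 1 + math.log(n_books / len(term_books[term]))
            page_weights[term] = tf * idf * icf * ibf
        weights.append(page_weights)
    return weights


# Toy collection: three pages, two classes, three books (illustrative only).
pages = [
    (["salat", "wudu", "salat"], "worship", "book1"),
    (["salat", "zakat"], "worship", "book2"),
    (["salat", "bay"], "trade", "book3"),
]
weights = tf_idf_icf_ibf(pages)
```

Under these assumptions, a term that occurs on every page, in every class, and in every book keeps only its raw term frequency, while a term confined to one page, one class, and one book is boosted by all three inverse factors.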

Similar resources

On the Effectiveness of Relevance Profiling

Relevance profiling is a general process for within-document retrieval. Given a query, a profile of retrieval status values is computed by sliding a fixed-size window across a document. In this paper, we report a series of bench experiments on relevance profiling, using an existing electronic book, and its associated book index. The book index is the source of queries and relevance judgements f...

Stemming Methodologies Over Individual Query Words for an Arabic Information Retrieval System

Stemming is one of the most important factors that affect the performance of information retrieval systems. This article investigates how to improve the performance of an Arabic Information Retrieval System (Arabic-IRS) by imposing the retrieval method over individual words of a query depending on the importance of the WORD, the STEM, or the ROOT of the query terms in the database. This method,...

Modelling Anchor Text Retrieval in Book Search based on Back-of-Book Index

This paper proposes a probabilistic logic abstraction for modelling tf-boosting approaches to anchor text retrieval, adapted for the task of page-search in books. The underlying idea is to view the back-of-book index (BoBI) as a list of anchors pointing to pages in the book. First, we model the direct application of hypertext-based tf-boosting to books and show that this naive method of propaga...

Book Reviews: Arabic Computational Morphology: Knowledge-Based and Empirical Methods by Abdelhadi Soudi, Antal van den Bosch, and Neumann, Günter (editors)

The past few decades have witnessed an increased interest in Arabic natural language processing, and in particular computational morphology. In the early 1990s one had to contend with a number of papers that proposed methodologies to handle the various complexities of Arabic morphology, most of which had little implementation associated with them, with the sole notable exception of the works of...

Critique of Arabic Texts/ Manifestations of Westernization in Egyptian society: Social, Religious, and Cultural Pathology in the Book of Hadith Isa Ibn Hisham, Nafiseh Hajirjabi

Al Muwailihi was a genial and committed reformist and intellectual who devoted all his efforts to reform the ethics of society. A student of Seyed Jamal and Mohammed Abdeh, he was among the advocates of liberty in his era. Their reformist ideas influenced him, and since patriotic sentiment among the reformist thinkers was a subset of religious sentiment, he felt the need to write a book to desc...

Publication date: 2017