Ad Hoc Information Retrieval for Persian
Authors
Abstract
This paper introduces the Persian language and its morphology and describes the resources available for Persian text processing. We then propose and evaluate an information retrieval model, a variation of the vector space model that uses the relations between query terms. Our experiments on the Hamshahri collection show that the proposed model achieves higher precision for top-ranked documents than some popular IR models.
Similar resources
German, French, English and Persian Retrieval Experiments at CLEF 2009
We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2009. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...
German, French, English and Persian Retrieval Experiments at CLEF 2008
We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2008. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...
Fusion of Retrieval Models at CLEF 2008 Ad-Hoc Persian Track
Metasearch engines submit the user query to several underlying search engines and then merge their retrieved results to generate a single list that is more effective to the users’ information needs. According to the idea behind metasearch engines, it seems that merging the results retrieved from different retrieval models will improve the search coverage and precision. In this study, we have in...
Ad Hoc Retrieval with the Persian Language
This paper describes our participation in the Persian ad hoc search task during the CLEF 2009 evaluation campaign. In this task, we suggest using a light suffix-stripping algorithm for the Farsi (or Persian) language. The evaluations based on different probabilistic models demonstrated that our stemming approach performs better than a stemmer removing only the plural suffixes, or statistically bette...
JHU Experiments in Monolingual Farsi Document Retrieval at CLEF 2009
At CLEF 2009 JHU submitted runs in the ad hoc track for the monolingual Persian evaluation. Variants of character n-gram tokenization provided a 10% relative gain over unnormalized words. A run based on skip n-grams, which allow internal skipped letters, achieved a mean average precision of 0.4938. Using traditional 5-grams resulted in a score of 0.4868 while plain words had a score of 0.4463.
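The two tokenizations mentioned above can be illustrated with a minimal Python sketch. The function names `char_ngrams` and `skip_ngrams` are hypothetical, and the skip variant shown here (dropping one internal character from each window of n + 1 characters) is only an assumed reading of "internal skipped letters"; the abstract does not give the exact definition used in the JHU runs.

```python
def char_ngrams(text, n=5):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def skip_ngrams(text, n=5):
    """Return n-grams built by dropping one internal character
    from each window of n + 1 characters (assumed definition)."""
    grams = []
    for i in range(len(text) - n):
        window = text[i:i + n + 1]
        # Drop each internal position in turn; the first and last
        # characters of the window are always kept.
        for j in range(1, n):
            grams.append(window[:j] + window[j + 1:])
    return grams


print(char_ngrams("retrieval", 5))
# ['retri', 'etrie', 'triev', 'rieva', 'ieval']
print(skip_ngrams("abcd", 3))
# ['acd', 'abd']
```

Both functions operate on Unicode strings, so they apply unchanged to Persian text; any normalization of the input is left out of this sketch.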