Information Retrieval Methods for Literary Texts
Author: Norbert Fuhr
Abstract
Information retrieval focuses on content-based searching in text documents. For this purpose, the text content must first be represented, either by using a representation language (such as thesauri or classification schemes) or by supporting free-text search. The latter approach uses either string-based or computational-linguistic methods (stemming, dictionary lookup, syntax analysis). For retrieval, weighting and ranking methods give better results than Boolean retrieval, and some of them also allow for relevance feedback. Retrieval of XML documents requires new methods that support weighting and ranking, specificity-oriented search, data types with vague predicates, and vague structural conditions.
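The contrast between Boolean retrieval and weighting-and-ranking can be illustrated with a minimal sketch. It is not taken from the paper: the three-document corpus, the function names, and the plain tf-idf weighting are assumptions chosen only to make the difference visible.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for this illustration.
DOCS = {
    "d1": "the whale hunts the sea",
    "d2": "a whale song about the sea and the whale",
    "d3": "the garden in spring",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or dictionary lookup)."""
    return text.lower().split()


def boolean_and(query, docs):
    """Boolean retrieval: the unordered set of documents containing every query term."""
    terms = set(tokenize(query))
    return {doc_id for doc_id, text in docs.items() if terms <= set(tokenize(text))}


def tfidf_rank(query, docs):
    """Ranked retrieval: order documents by the summed tf-idf weight of the query terms."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in tokenize(query)
            if term in df
        )
        if score > 0:
            scores[doc_id] = score
    # Highest score first.
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(boolean_and("whale garden", DOCS))  # no document contains both terms
    print(tfidf_rank("whale garden", DOCS))   # partial matches are still ranked
```

In this sketch the Boolean query returns nothing as soon as one term is missing, whereas the ranked query still orders the partially matching documents, which is one reason ranking tends to be more forgiving for users.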
Similar resources
XML Information Retrieval and Information Extraction
We present a new query language for information retrieval in XML documents and discuss its combination with information extraction methods. XIRQL is an XML query language which implements IR-related features such as weighting and ranking, relevance-oriented search, datatypes with vague predicates, and structural relativism. For information extracted from texts, XIRQL can rank records based on u...
Information Retrieval Course Material of the Course held in the Summer Term 1993 - Chapter 9: Fact Retrieval
Generating Search Term Variants for Text Collections with Historic Spellings
In this paper, we describe a new approach for retrieval in texts with non-standard spelling, which is important for historic texts in English or German. For this purpose, we present a new algorithm for generating search term variants in ancient orthography. By applying a spell checker on a corpus of historic texts, we generate a list of candidate terms for which the contemporary spellings have ...