DegExt: a language-independent keyphrase extractor
Similar resources
DegExt: a language-independent keyphrase extractor
In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in (Litvak & Last, 2008). We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx (Turney, 2000) and TextRank (Mihalcea & Tarau, 2004). We evaluated DegExt on collections of benchmark summaries in two different languages: E...
DegExt - A Language-Independent Graph-Based Keyphrase Extractor
In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in [6]. We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx [11] and TextRank [8]. Our experiments on a collection of benchmark summaries show that DegExt outperforms TextRank and GenEx in terms of precision and area un...
Language Independent Feature Extractor
We propose a new customizable tool, Language Independent Feature Extractor (LIFE), which models the inherent patterns of any language and extracts relevant features of the language. There are two contributions of this work: (1) no labeled data is necessary to train LIFE (it works when a sufficient number of unlabeled documents are given), and (2) LIFE is designed to be applicable to any languag...
Phraserate: an HTML Keyphrase Extractor
A standard feature in cataloging documents is the list of keywords. When the source documents are web pages, we can attempt to aid the cataloger by analyzing the page and presenting relevant support material. Since the keywords that occur in a document generally occur in keyphrases, and keyphrases provide contextual material for reviewing candidate keywords, they are a natural aggregate to extr...
Likey: Unsupervised Language-Independent Keyphrase Extraction
Likey is an unsupervised statistical approach for keyphrase extraction. The method is language-independent and the only language-dependent component is the reference corpus with which the documents to be analyzed are compared. In this study, we have also used another language-dependent component: an English-specific Porter stemmer as a preprocessing step. In our experiments of keyphrase extract...
Journal
Journal title: Journal of Ambient Intelligence and Humanized Computing
Year: 2012
ISSN: 1868-5137, 1868-5145
DOI: 10.1007/s12652-012-0109-z