Extraction of Key Words from News Stories
Authors
Abstract
In this work, we consider the task of extracting key words such as key-players, key-locations, key-nouns and key-verbs from news stories. We cast the task as a classification problem in which each word in a news story is assigned an appropriate label. We considered three statistical models (naïve Bayes, hidden Markov and maximum entropy) and experimented with various features. Our results indicate that a maximum entropy model that ignores contextual features and uses only word-based features, combined with stopping and stemming, yields the best performance. We also found that extracting key-verbs and key-nouns is much harder than extracting key-players and key-locations.
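The word-level classification setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the training words, the stop-word list, the crude suffix-stripping `stem` function, the `capitalized` feature and all function names are assumptions made up for illustration. It trains a small maximum entropy (multinomial logistic) model by batch gradient ascent, using word-based features only, after stop-word removal and stemming.

```python
import math
import re
from collections import defaultdict

# Hypothetical label set and toy training data, invented for illustration only.
LABELS = ["key-player", "key-location", "key-noun", "key-verb"]
TRAIN = [
    ("Obama", "key-player"), ("Clinton", "key-player"),
    ("Paris", "key-location"), ("London", "key-location"),
    ("election", "key-noun"), ("treaty", "key-noun"),
    ("signed", "key-verb"), ("visited", "key-verb"),
]
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to"}  # tiny assumed list


def stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def features(word):
    """Word-based features only: neighbouring words are never consulted."""
    feats = ["stem=" + stem(word)]
    if word[0].isupper():
        feats.append("capitalized")
    return feats


def probabilities(weights, feats):
    """Softmax over the per-label sums of active feature weights."""
    scores = [sum(weights[(label, f)] for f in feats) for label in LABELS]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(data, epochs=50, rate=0.5):
    """Batch gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        gradient = defaultdict(float)
        for word, gold in data:
            feats = features(word)
            probs = probabilities(weights, feats)
            for label, p in zip(LABELS, probs):
                for f in feats:
                    # observed feature count minus expected feature count
                    gradient[(label, f)] += (label == gold) - p
        for key, g in gradient.items():
            weights[key] += rate * g
    return weights


def extract_key_words(weights, text):
    """Drop stop words, then label every remaining word independently."""
    labelled = []
    for word in re.findall(r"[A-Za-z]+", text):
        if word.lower() in STOP_WORDS:
            continue
        probs = probabilities(weights, features(word))
        labelled.append((word, LABELS[probs.index(max(probs))]))
    return labelled


if __name__ == "__main__":
    model = train(TRAIN)
    print(extract_key_words(model, "Obama visited Paris and signed the treaty."))
```

Two simplifications are worth noting. The sketch has no label for words that are not key words, so every non-stop word receives one of the four labels; a realistic label set would presumably include such a class. It also omits regularization, which a model trained on real data would normally need.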
Similar resources
Implications of News Segments and Movies for Enhancing Listening Comprehension of Language Learners
Abstract Armed with technological development, the present study aimed at gauging the effectiveness of exposure to news and movies as two types of audiovisual programs in improving language learners’ listening comprehension at the intermediate level. To this end, a listening comprehension test was administered to 108 language learners and finally 60 language learners were selected as intermedia...
Key Phrase Extraction of Lightly Filtered Broadcast News
This paper explores the impact of light filtering on automatic key phrase extraction (AKE) applied to Broadcast News (BN). Key phrases are words and expressions that best characterize the content of a document. Key phrases are often used to index the document or as features in further processing. This makes improvements in AKE accuracy particularly important. We hypothesized that filtering out ...
Automatic title generation for Chinese spoken documents using an adaptive k nearest-neighbor approach
The purpose of automatic title generation is to understand a document and to summarize it with only several but readable words or phrases. It is important for browsing and retrieving spoken documents, which may be automatically transcribed, but it will be much more helpful if given the titles indicating the content subjects of the documents. For title generation for Chinese language, additional...
Multiple developing news stories identified and tracked by social insects and visualized using the new galactic streams and concurrent streams metaphors
We have developed an approach to identification and tracking of currently unfolding news stories extracted from the news articles published on the Web. Our approach employs a set of agents to retrieve those articles from the Web that might refer to some developing news story. The set of agents is inspired by social insects, in particular by a bee colony. Bees identify popular terms, referred to...