Unsupervised Approaches for Automatic Keyword Extraction Using Meeting Transcripts

نویسندگان

  • Feifan Liu
  • Deana Pennell
  • Fei Liu
  • Yang Liu
چکیده

This paper explores several unsupervised approaches to automatic keyword extraction using meeting transcripts. In the TFIDF (term frequency, inverse document frequency) weighting framework, we incorporated partof-speech (POS) information, word clustering, and sentence salience score. We also evaluated a graph-based approach that measures the importance of a word based on its connection with other sentences or words. The system performance is evaluated in different ways, including comparison to human annotated keywords using F-measure and a weighted score relative to the oracle system performance, as well as a novel alternative human evaluation. Our results have shown that the simple unsupervised TFIDF approach performs reasonably well, and the additional information from POS and sentence score helps keyword extraction. However, the graph method is less effective for this domain. Experiments were also performed using speech recognition output and we observed degradation and different patterns compared to human transcripts.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Adaptive Approach to Named Entity Extraction for Meeting Applications

Named entity extraction has been intensively investigated in the past several years. Both statistical approaches and rule-based approaches have achieved satisfactory performance for regular written/spoken language. However when applied to highly informal or ungrammatical languages, e.g., meeting languages, because of the many mismatches in language genre, the performance of existing methods dec...

متن کامل

Improved Keyword and Keyphrase Extraction from Meeting Transcripts

Keywords play a vital role in extracting the correct information as per user requirements. Keywords are like index terms that contain the most important information about the content of the document. Keyword Extraction is the task of identifying a keyword or keyphrase from a document that can help users easily to understand the documents. Meeting transcripts is significantly different from docu...

متن کامل

A Fuzzy Logic Based Improved Keyword Extraction From Meeting Transcripts

Keyword Extraction is the process of assigning keywords to a document where the important words are selected by the system automatically. This proposed frame work is used to extract the keywords using Fuzzy logic method from Meeting Transcripts. At first, the given input is preprocessed. Subsequently, the preprocessed data will be sent to the features extraction method. In this method three fea...

متن کامل

Keyword extraction: a review of methods and approaches

Paper presents a survey of methods and approaches for keyword extraction task. In addition to the systematization of methods, the paper gathers a comprehensive review of existing research. Related work on keyword extraction is elaborated for supervised and unsupervised methods, with special emphasis on graphbased methods as well as Croatian keyword extraction. Selectivity-based keyword extracti...

متن کامل

Graph-Based Keyword Extraction for Single-Document Summarization

In this paper, we introduce and compare between two novel approaches, supervised and unsupervised, for identifying the keywords to be used in extractive summarization of text documents. Both our approaches are based on the graph-based syntactic representation of text and web documents, which enhances the traditional vector-space model by taking into account some structural document features. In...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009