Class lecture summarization taking into account consecutiveness of important sentences

Authors

  • Yasuhisa Fujii
  • Kazumasa Yamamoto
  • Norihide Kitaoka
  • Seiichi Nakagawa
Abstract

This paper presents a novel sentence extraction framework that uses a Support Vector Machine (SVM) and takes into account the consecutiveness of important sentences. Most extractive summarizers do not take context information into account, although they do account for redundancy across the entire summary. However, the extracted sentences must be related to one another, and we can observe these relationships as consecutiveness among the sentences. We handle this consecutiveness by using dynamic and difference features to decide whether a sentence should be extracted. Since important sentences tend to be extracted consecutively, we simply use the decision made for the previous sentence as the dynamic feature. For the difference feature, we use the differences between the current and previous feature values, since adjacent sentences in a block of important ones should have similar feature values, whereas the difference in feature values between an important sentence and an unimportant one should be larger. We also present a way to ensure that the summary is not redundant. Experimental results on a Corpus of Japanese classroom Lecture
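The dynamic and difference features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `augment`, `extract`, and `toy_score`, the single toy feature per sentence, the weights, and the choice to treat the first sentence as its own predecessor are all assumptions made for illustration. A simple linear scorer stands in for the decision function of a trained SVM.

```python
from typing import Callable, List, Sequence


def augment(curr: Sequence[float], prev: Sequence[float],
            prev_decision: int) -> List[float]:
    """Build the classifier input for one sentence (hypothetical layout).

    Layout: static features, then difference features (current minus
    previous), then the dynamic feature (decision for the previous sentence).
    """
    diff = [c - p for c, p in zip(curr, prev)]
    return list(curr) + diff + [float(prev_decision)]


def extract(features: Sequence[Sequence[float]],
            score: Callable[[Sequence[float]], float]) -> List[int]:
    """Label sentences left to right; 1 means 'extract', 0 means 'skip'."""
    decisions: List[int] = []
    # Assumption: the first sentence has no predecessor, so it is compared
    # with itself (zero difference) and the previous decision is 0.
    prev = features[0] if features else []
    prev_decision = 0
    for curr in features:
        x = augment(curr, prev, prev_decision)
        decision = 1 if score(x) > 0 else 0
        decisions.append(decision)
        prev, prev_decision = curr, decision
    return decisions


def toy_score(x: Sequence[float]) -> float:
    """Stand-in for a trained SVM decision function (made-up weights)."""
    weights = [1.0, 0.2, 0.3]  # static, difference, dynamic
    bias = -0.6
    return sum(w * v for w, v in zip(weights, x)) + bias


if __name__ == "__main__":
    # One toy importance score per sentence.
    sentence_features = [[0.2], [0.7], [0.5], [0.1]]
    print(extract(sentence_features, toy_score))  # [0, 1, 1, 0]
```

In this toy run the third sentence (score 0.5) would fall below the threshold on its static feature alone, but it is extracted because the previous sentence was extracted. This is the kind of consecutiveness the dynamic feature is meant to capture.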


Similar resources

Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization

We automatically extract the summaries of spoken class lectures. This paper presents a novel method for sentence extraction-based automatic speech summarization. We propose a technique that extracts “cue phrases for important sentences (CPs)” that often appear in important sentences. We formulate CP extraction as a labeling problem of word sequences and use Conditional Random Fields (CRF) [1] f...


Corpus and Evaluation Measures for Multiple Document Summarization with Multiple Sources

In this paper, we introduce a large-scale test collection for multiple document summarization, the Text Summarization Challenge 3 (TSC3) corpus. We detail the corpus construction and evaluation measures. The significant feature of the corpus is that it annotates not only the important sentences in a document set, but also those among them that have the same content. Moreover, we define new eval...


Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are facing large-volume documents, the extr...


Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms

This paper proposes an improved approach for spoken lecture summarization, in which random walk is performed on a graph constructed with automatically extracted key terms and probabilistic latent semantic analysis (PLSA). Each sentence of the document is represented as a node of the graph and the edge between two nodes is weighted by the topical similarity between the two sentences. The basic i...


Applying two-level reinforcement ranking in query-oriented multidocument summarization

Sentence ranking is the issue of most concern in document summarization today. While traditional feature-based approaches evaluate sentence significance and rank the sentences relying on features that are specifically designed to characterize different aspects of the individual sentences, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively ...



Publication date: 2008