Automatic Text Summarization using Pathfinder Network Scaling

Authors

  • Kaustubh Raosaheb Patil
  • Pavel Brazdil
Abstract

This thesis describes an automatic text summarization system based on graph theory. Summaries are useful indicators of a document's content. Traditionally, summaries are created by humans, who read the text and identify its important points. So-called information overload makes such manual work very difficult, if not impossible. Many attempts have been made to automate the summarization task, which is very difficult and involves natural language understanding. We propose an extractive text summarization system. Extractive summarization works by selecting a subset of sentences from the original text, so the system needs to identify the most important sentences in the text. We use graph theory to identify the relative importance of a sentence. The proposed method, SumGraph, models a text as a fully connected network with sentences as nodes and the lexical dissimilarity between pairs of sentences as the link weights. This fully connected network is then scaled using the Pathfinder Network Scaling technique. The scaled network reveals the conceptual organization of the sentences. Sentences are then selected for inclusion in the summary according to their relative importance in the conceptual network. The performance of the method is illustrated on a collection of newspaper articles from the Document Understanding Conferences. The ROUGE measure is used to evaluate the quality of a summary. Comparison with other methods indicates that SumGraph produces better summaries.
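The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which lexical dissimilarity, which Pathfinder parameters, or which importance measure SumGraph uses, so the sketch assumes Jaccard dissimilarity over word sets, the common PFNET(r = ∞, q = n − 1) variant, and node degree in the scaled network as the importance score. All function names are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercased set of word tokens (assumed lexical representation)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard_dissimilarity(a, b):
    """1 - |a & b| / |a | b|; an assumed stand-in for lexical dissimilarity."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def pathfinder_edges(weights):
    """Edges kept by PFNET(r=inf, q=n-1) for a symmetric weight matrix.

    With r = infinity the length of a path is its largest link weight.
    A direct link survives only if no alternative path is shorter.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    # Floyd-Warshall variant: minimise the maximum link weight along a path.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(dist[i][k], dist[k][j])
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return {
        (i, j)
        for i, j in combinations(range(n), 2)
        if weights[i][j] <= dist[i][j]
    }


def summarize(sentences, k):
    """Return k sentences with the highest degree in the scaled network."""
    bags = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [0.0 if i == j else jaccard_dissimilarity(bags[i], bags[j])
         for j in range(n)]
        for i in range(n)
    ]
    degree = [0] * n
    for i, j in pathfinder_edges(weights):
        degree[i] += 1
        degree[j] += 1
    # Rank by degree (ties broken by position), then restore text order.
    ranked = sorted(range(n), key=lambda i: (-degree[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(list_of_sentences, 3)` would return three sentences in their original order. Degree is only one plausible centrality measure; any other node-importance score computed on the scaled network could be substituted in the ranking step.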

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...


Systematic literature review of fuzzy logic based text summarization

“Information Overload” is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting users’ informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...


Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...


Automatic Text Summarization Using Reinforcement Learning with Embedding Features

An automatic text summarization system can automatically generate a short and brief summary that contains the main concept of an original document. In this work, we explore the advantages of simple embedding features in a reinforcement learning approach to automatic text summarization tasks. In addition, we propose a novel deep learning network for estimating Q-values used in reinforcement learning. ...


Extractive Text Summarization using Neural Networks

Text summarization has been an extensively studied problem. Traditional approaches to text summarization rely heavily on feature engineering. In contrast, we propose a fully data-driven approach using feedforward neural networks for single-document summarization. We train and evaluate the model on the standard DUC 2002 dataset, which shows results comparable to state-of-the-art models. T...


Publication date: 2007