Automatic Text Summarization using Pathfinder Network Scaling
Abstract
This thesis describes an automatic text summarization system based on graph theory. Summaries are useful indicators of a document's content. Traditionally, humans create summaries by reading the text and identifying its important points. So-called information overload makes such manual work very difficult, if not impossible. Many attempts have been made to automate the summarization task, which is very difficult and involves natural language understanding. We propose an extractive text summarization system. Extractive summarization works by selecting a subset of sentences from the original text, so the system needs to identify the most important sentences in the text. We use graph theory to identify the relative importance of a sentence. The proposed method, SumGraph, models a text as a fully connected network with sentences as nodes and the lexical dissimilarity between pairs of sentences as the link weights. This fully connected network is then scaled using the Pathfinder Network Scaling technique. The scaled network reveals the conceptual organization of the sentences. Sentences are then selected for inclusion in the summary depending on their relative importance in the conceptual network. The performance of the method is illustrated on a collection of newspaper articles taken from the Document Understanding Conferences. The ROUGE measure is used to evaluate the quality of a summary. Comparison with other methods indicates that SumGraph produces better summaries.
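The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the dissimilarity measure (1 minus Jaccard word overlap), the Pathfinder parameters (r = ∞, q = n − 1), and the use of node degree as the importance score are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def dissimilarity_matrix(sentences):
    """Pairwise lexical dissimilarity; here 1 - Jaccard overlap (an assumption)."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    w = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        union = tokens[i] | tokens[j]
        overlap = len(tokens[i] & tokens[j]) / len(union) if union else 0.0
        w[i][j] = w[j][i] = 1.0 - overlap
    return w


def pathfinder(w):
    """Pathfinder scaling with r = infinity and q = n - 1 (assumed parameters).

    A path's weight is its largest link weight. A direct link (i, j) is kept
    only if no indirect path between i and j has a smaller weight.
    """
    n = len(w)
    d = [row[:] for row in w]  # minimax path distances, Floyd-Warshall style
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    return {(i, j) for i, j in combinations(range(n), 2) if w[i][j] <= d[i][j]}


def summarize(sentences, k=2):
    """Pick the k sentences with the highest degree in the scaled network.

    Degree as the importance score is an illustrative assumption.
    """
    edges = pathfinder(dissimilarity_matrix(sentences))
    degree = [0] * len(sentences)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    ranked = sorted(range(len(sentences)), key=lambda i: (-degree[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]  # keep document order
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns three of them in their original order. With these parameters the scaled network keeps only the links that lie on some minimum spanning tree of the dissimilarity graph, so well-connected sentences are those that are lexically close to many others.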
Similar resources
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Systematic literature review of fuzzy logic based text summarization
'Information Overload' is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting users' informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...
Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization
Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...
Automatic Text Summarization Using Reinforcement Learning with Embedding Features
An automatic text summarization system can automatically generate a short and brief summary that contains a main concept of an original document. In this work, we explore the advantages of simple embedding features in a Reinforcement learning approach to automatic text summarization tasks. In addition, we propose a novel deep learning network for estimating Q-values used in Reinforcement learning. ...
Extractive Text Summarization using Neural Networks
Text Summarization has been an extensively studied problem. Traditional approaches to text summarization rely heavily on feature engineering. In contrast to this, we propose a fully data-driven approach using feedforward neural networks for single document summarization. We train and evaluate the model on the standard DUC 2002 dataset, which shows results comparable to the state-of-the-art models. T...