Key Sentence Extraction from Single Document based on Triangle Analysis in Dependency Graph
Abstract
Document summarization is a technique for automatically extracting the main ideas from electronic documents. In this paper, we propose a novel algorithm, called TriangleSum, for key sentence extraction from a single document based on graph theory. The algorithm builds a dependency graph for the underlying document based on co-occurrence relations as well as syntactic dependency relations. The nodes represent high-frequency words or phrases, and the edges represent dependency or co-occurrence relations between them. The clustering coefficient is computed for each node to measure the strength of the connection between the node and its neighboring nodes in the graph. By identifying triangles of nodes in the graph, a part of the dependency graph can be extracted as markers of key sentences. Finally, a set of key sentences that represent the main information of the document can be extracted.
Keywords: document summarization; key sentence; dependency structure analysis; clustering coefficient; triangle finding
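The steps named in the abstract (graph construction, per-node clustering coefficient, triangle finding) can be illustrated with a minimal Python sketch. It is not the TriangleSum implementation. It assumes whitespace tokenization, uses sentence-level co-occurrence only (no syntactic dependency parsing), and the function names, the `min_freq` threshold, and the triangle-count sentence score are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(sentences, min_freq=2):
    """Undirected graph as {word: set of neighbors}.

    Assumption: a word is a node if it appears in at least `min_freq`
    sentences, and two nodes are linked if they share a sentence.
    """
    tokenized = [set(s.lower().split()) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)
    keep = {w for w, count in freq.items() if count >= min_freq}
    graph = {w: set() for w in keep}
    for words in tokenized:
        for a, b in combinations(sorted(words & keep), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of neighbor pairs of `node` that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def find_triangles(graph):
    """Set of sorted 3-tuples of mutually linked nodes."""
    triangles = set()
    for a in graph:
        for b, c in combinations(sorted(graph[a]), 2):
            if c in graph[b]:
                triangles.add(tuple(sorted((a, b, c))))
    return triangles


def rank_sentences(sentences, triangles):
    """Hypothetical scoring: count triangles fully contained in a sentence.

    Returns sentence indices, highest score first, ties by position.
    """
    scored = []
    for index, sentence in enumerate(sentences):
        words = set(sentence.lower().split())
        score = sum(1 for t in triangles if words.issuperset(t))
        scored.append((score, index))
    return [index for score, index in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With pure sentence-level co-occurrence, any sentence containing three retained words forms a triangle on its own, so a fuller version would presumably rely on the syntactic dependency edges described in the abstract to make triangles more selective.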
Similar resources
Single document Summarization based on Clustering Coefficient and Transitivity Analysis
Document summarization is a technique aimed to automatically extract the main ideas from electronic documents. With the fast increase of electronic documents available on the network, techniques for making efficient use of such documents become increasingly important. In this paper, we propose a novel algorithm, called TriangleSum for single document summarization based on graph theory. The alg...
Feature extraction in opinion mining through Persian reviews
Opinion mining deals with an analysis of user reviews for extracting their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in such area. In general, opinion mining extracts user reviews at three levels of document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...
Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency
Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to difficulty of the task, there is no noteworthy improvement in t...
Dense Semantic Graph and its Application in Single Document Summarisation
Semantic graph representation of text is an important part of natural language processing applications such as text summarisation. We have studied two ways of constructing the semantic graph of a document from dependency parsing of its sentences. The first graph is derived from the subject-object-verb representation of sentence, and the second graph is derived from considering more dependency r...
Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization
We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel depe...