Text-to-text Similarity of Sentences Text-to-text Similarity of Sentences

نویسندگان

  • Vasile Rus
  • Mihai Lintean
  • Arthur C. Graesser
  • Danielle S. McNamara
چکیده

Assessing the semantic similarity between two texts is a central task in many applications, including summarization, intelligent tutoring systems, and software testing. Similarity of texts is typically explored at the level of word, sentence, paragraph, and document. The similarity can be defined quantitatively (e.g. in the form of a normalized value between 0 and 1) and qualitatively in the form of semantic relations such as elaboration, entailment, or paraphrase. In this chapter, we focus first on measuring quantitatively and then on detecting qualitatively sentence-level text-to-text semantic relations. A generic approach that relies on word-to-word similarity measures is presented as well as experiments and results obtained with various instantiations of the approach. In addition, we provide results of a study on the role of weighting in Latent Semantic Analysis, a statistical technique to assess similarity of texts. The results were obtained on two data sets: a standard data set on sentence-level paraphrase detection and a data set from an intelligent tutoring system. DOI: 10.4018/978-1-60960-741-8.ch007

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatictext summarization has become an essential tool for web users. In this paperwe present a novel approach for creating text summaries. Using fuzzy logicand word-net, our model extracts the most relevant sentences from an originaldocument. The approach utilizes fuzzy measures and inference on theextracted textual information from the docu...

متن کامل

Reading and Assessing the City / Neighborhood FabricAs a Text. Case Study: Sar-Tapulah Historical Neighbourhood inSanandaj

From a linguistic point of view, the city can be seen as a text, consisting of different components and structures being related to each other beyond a sentence. Looking at the city from this point of view, what establishes a syntactic relationship and cohesion and coherence of the components of the city as a common language is called the syntax of the city. Linguistic study of the text of the ...

متن کامل

Iranian EFL Learners’ Lexical Inferencing Strategies at Both Text and Sentence levels

Lexical inferencing is one of the most important strategies in vocabulary learning and it plays an important role in dealing with unknown words in a text. In this regard, the aim of this study was to determine the lexical inferencing strategies used by Iranian EFL learners when they encounter unknown words at both text and sentence levels. To this end, forty lower intermediate students were div...

متن کامل

An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches

Text coherence evaluation becomes a vital and lovely task in Natural Language Processing subfields, such as text summarization, question answering, text generation and machine translation. Existing methods like entity-based and graph-based models are engaging with nouns and noun phrases change role in sequential sentences within short part of a text. They even have limitations in global coheren...

متن کامل

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

متن کامل

Calculating Statistical Similarity between Sentences

Sentence similarity plays an important role in text-related research and applications. It is closely related to word similarity and document similarity. The statistical similarity measures between sentences, based on symbolic characteristics and structural information, could measure the similarity between sentences without any prior knowledge but only on the statistical information of sentences...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016