Search results for: text similarity

Number of results: 268086

2005
Courtney D. Corley Rada Mihalcea

This paper presents a knowledge-based method for measuring the semantic similarity of texts. While there is a large body of previous work focused on finding the semantic similarity of concepts and words, the application of these word-oriented methods to text similarity has not yet been explored. In this paper, we introduce a method that combines word-to-word similarity metrics into a text-to-text m...
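The idea of combining word-to-word scores into a text-to-text score can be illustrated with a minimal sketch. It is not the authors' metric, whose details are cut off in the abstract above. It assumes a simple scheme: each word is matched to its most similar word in the other text, the best-match scores are averaged, and the two directions are combined symmetrically. The names `exact_match`, `directed_score`, and `text_similarity` are hypothetical, and the toy `exact_match` function stands in for a real knowledge-based word similarity measure.

```python
from typing import Callable, Sequence

WordSim = Callable[[str, str], float]


def exact_match(a: str, b: str) -> float:
    """Toy word-to-word similarity (assumption): 1.0 if identical, else 0.0."""
    return 1.0 if a == b else 0.0


def directed_score(src: Sequence[str], tgt: Sequence[str], word_sim: WordSim) -> float:
    """Average, over the words in src, of the best similarity to any word in tgt."""
    if not src or not tgt:
        return 0.0
    return sum(max(word_sim(w, t) for t in tgt) for w in src) / len(src)


def text_similarity(text_a: str, text_b: str, word_sim: WordSim = exact_match) -> float:
    """Symmetric text-to-text score built from word-to-word scores."""
    a = text_a.lower().split()
    b = text_b.lower().split()
    # Average both directions so the score does not depend on argument order.
    return 0.5 * (directed_score(a, b, word_sim) + directed_score(b, a, word_sim))


if __name__ == "__main__":
    print(text_similarity("the cat sat", "the cat slept"))
```

Swapping `exact_match` for a graded word similarity function is what would let such a score reward related but non-identical words.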

Plagiarism, which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories: direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism, which is sometimes referred to as cross-li...

2016
Pan Huang Amna Basharat Khaled Rasheed

Text similarity measures have been widely studied and used in machine learning and information retrieval for many years. However, few applications of text similarity have dealt with multi-lingual translations of a specific document. Additionally, the growing number of texts with more translations being generated increases the challenge of distinguishing or identifying the similarity and differe...

2005
Peter Andras Olusola Idowu

Correct and efficient text classification is a major challenge in today’s world of rapidly increasing amounts of accessible electronic text data. Kohonen networks have been applied to document classification with comparable success to other document clustering methods. An important challenge is to devise text similarity metrics that can improve the performance of text classification Kohonen netw...

Journal: LLC 2014
Richard S. Forsyth Serge Sharoff

Quantifying the similarity or dissimilarity between documents is an important task in authorship attribution, information retrieval, plagiarism detection, text mining, and many other areas of linguistic computing. Numerous similarity indices have been devised and used, but relatively little attention has been paid to calibrating such indices against externally imposed standards, mainly because ...

2013
Daniel Bär Torsten Zesch Iryna Gurevych

We present DKPro Similarity, an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. DKPro Similarity comprises a wide variety of measures ranging from ones based on simple n-grams and common subsequences to high-dimensional vector comparisons and structural, stylistic, and p...
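As an illustration of the simplest family of measures mentioned above, those based on n-grams, here is a minimal Python sketch. It is not DKPro Similarity's API or implementation. It assumes character trigrams compared with the Jaccard coefficient, and the function names `char_ngrams` and `ngram_jaccard` are hypothetical.

```python
from typing import Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_jaccard(text_a: str, text_b: str, n: int = 3) -> float:
    """Jaccard coefficient of the two n-gram sets: |A & B| / |A | B|."""
    grams_a = char_ngrams(text_a, n)
    grams_b = char_ngrams(text_b, n)
    if not grams_a and not grams_b:
        # Both texts are shorter than n characters; treat them as identical (assumption).
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    print(ngram_jaccard("text similarity", "textual similarity"))
```

A set-based score like this ignores how often an n-gram occurs, which keeps the sketch short but is only one of several reasonable design choices.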

2013
Emily Jamison Iryna Gurevych

Thread disentanglement is the task of separating out conversations whose thread structure is implicit, distorted, or lost. In this paper, we perform email thread disentanglement through pairwise classification, using text similarity measures on non-quoted texts in emails. We show that i) content text similarity metrics outperform style and structure text similarity metrics in both a class-balan...

Journal: J. UCS 2010
Hermann Stern Rene Kaiser Philip Hofmair Peter Kraker Stefanie N. Lindstaedt

One of the success factors of Work Integrated Learning (WIL) is to provide the appropriate content to the users, both suitable for the topics they are currently working on, and their experience level in these topics. Our main contributions in this paper are (i) overcoming the problem of sparse content annotation by using a network based recommendation approach called Associative Network, which ...

2013
F. San Roman S. R. D. de Pinho Rosane Minghim Maria Cristina Ferreira de Oliveira

Text Analytics is essential for a large number of applications and good approaches to obtain visual mappings of text are paramount. Many visualization techniques, such as similarity based point placement layouts, have proved useful to support visual analysis of documents. However, they are sensitive to data quality, which, in turn, relies on a critical preprocessing step that involves text ‘cle...

2017
Amal Htait Sébastien Fournier Patrice Bellot

To explore opinions in tweets, this article presents a sentiment classification of tweet content. First, we present a method to identify new sentiment similarity seed words. These seed words are used for predicting the sentiment intensity of other words and short phrases in co-occurrence. Then, for testing sentiment similarity, we use: Similarity Measures methods between words and...
