text similarity

Search results for: text similarity

Number of results: 268086

Random Walks for Text Semantic Similarity

2009

Daniel Ramage Anna N. Rafferty Christopher D. Manning

Many tasks in NLP stand to benefit from robust measures of semantic similarity for units above the level of individual words. Rich semantic resources such as WordNet provide local semantic information at the lexical level. However, effectively combining this information to compute scores for phrases or sentences is an open problem. Our algorithm aggregates local relatedness information via a ra...

Term-Frequency Surrogates in Text Similarity Computations

2008

Stefan Pohl Alistair Moffat

Inverted indexes on external storage perform best when accesses are ordered and data is read sequentially, so that seek times are minimized. As a consequence, the various items required to compute Boolean, ranked and phrase queries are often interleaved in the inverted lists. While suitable for query types in which all items are required, this arrangement has the drawback that other query types...

Comparison of Ontology-Based Semantic-Similarity Measures in the Biomedical Text

Journal: Journal of Computer and Communications 2017

Text similarity, boilerplates and their determinants in key audit matters disclosure

Journal: Corporate Ownership and Control 2023

Like the European Commission, many regulators and standard setters worldwide have substantially revised requirements for auditor’s reports on statutory audits of public interest entities. Their objective was to improve the report’s information content and, hence, the transparency of the audit. A significant change was the introduction of a key audit matters (KAM) disclosure, which increased the scope, meaningfulness, individ...

Similarity-Based Text Clustering: A Comparative Study

2006

Joydeep Ghosh Alexander Strehl

Clustering of text documents enables unsupervised categorization and facilitates browsing and search. Any clustering method has to embed the objects to be clustered in a suitable representational space that provides a measure of (dis)similarity between any pair of objects. While several clustering methods and the associated similarity measures have been proposed in the past for text clustering,...

A Survey of Text Similarity Approaches

2013

Wael H. Gomaa Aly A. Fahmy

Measuring the similarity between words, sentences, paragraphs and documents is an important component in various tasks such as information retrieval, document clustering, word-sense disambiguation, automatic essay scoring, short answer grading, machine translation and text summarization. This survey discusses the existing works on text similarity through partitioning them into three approaches;...

Econo-ESA in semantic text similarity

2014

Faisal Rahutomo Masayoshi Aritsugi

Explicit semantic analysis (ESA) utilizes an immense Wikipedia index matrix in its interpreter part. This part of the analysis multiplies a large matrix by a term vector to produce a high-dimensional concept vector. A similarity measurement between two texts is performed between two concept vectors with numerous dimensions. The cost is expensive in both interpretation and similarity measurement...

Text similarity in academic conference papers

2006

Jun-Peng Bao James A. Malcolm

If we are to use electronic plagiarism detectors on student work, it would be interesting to know how much similarity should be expected in independently written documents on a similar topic. If our measure is coarse, the answer should be zero, but a finer grained analysis (such as would be needed to detect inadequate paraphrasing) is likely to detect some background noise. How much background ...

Text Segmentation Based on Similarity between Words

1993

Hideki Kozima

This paper proposes a new indicator of text structure, called the lexical cohesion profile (LCP), which locates segment boundaries in a text. A text segment is a coherent scene; the words in a segment are linked together via lexical cohesion relations. LCP records mutual similarity of words in a sequence of text. The similarity of words, which represents their cohesiveness, is computed using a s...

Document Retrieval Using SIFT Image Features

Journal: J. UCS 2011

Dan J. Smith Richard Harvey

This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We have conducted a series of experiments using cosine distance as our similarity measure, selecting varying numbers of local interest points per page, and varying numbers of nearest neighbour points in the similarity calculations. We have found th...
