Search results for: text similarity
Number of results: 268086
Previous work on paraphrase extraction and application has relied on either parallel datasets, or on distributional similarity metrics over large text corpora. Our approach combines these two orthogonal sources of information and directly integrates them into our paraphrasing system’s log-linear model. We compare different distributional similarity feature-sets and show significant improvements...
In this paper, we explore unsupervised techniques for the task of automatic short answer grading. We compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. Overall...
Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN-verbs used to describe movements of indices, stocks and shares. In Study 1 people identified antonyms of these verb sets in a free-generation task and a match-the-opposite task and the most commonly identified antonyms were compiled. In Study 2, we ...
The text similarity join operator joins two relations if their join attributes are textually similar to each other, and it has a variety of application domains including integration and querying of data from heterogeneous resources, cleansing of data, and mining of data. Although the text similarity join operator is widely used, its processing is expensive due to the huge number of similarity comp...
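To make the cost mentioned in the abstract above concrete, here is a minimal sketch of a naive text similarity join. It assumes Jaccard similarity over whitespace tokens and a fixed threshold; neither choice comes from the abstract, and the names `tokens`, `jaccard`, and `similarity_join` are hypothetical.

```python
from itertools import product


def tokens(text):
    """Lowercase the text and split on whitespace into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_join(left, right, threshold=0.5):
    """Naive nested-loop join (assumed baseline, not the paper's method).

    Every left value is compared with every right value, so the number
    of similarity computations is len(left) * len(right).
    """
    left_tok = [(row, tokens(row)) for row in left]
    right_tok = [(row, tokens(row)) for row in right]
    return [
        (l, r)
        for (l, lt), (r, rt) in product(left_tok, right_tok)
        if jaccard(lt, rt) >= threshold
    ]


pairs = similarity_join(
    ["Acme Corp New York", "Globex Inc"],
    ["acme corp new york city", "Initech LLC"],
)
print(pairs)
```

With two relations of n and m rows, this baseline performs n × m comparisons, which is the expense the abstract refers to; practical systems usually prune candidate pairs before computing similarity.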
Estimating the similarity between two legal case documents is an important and challenging problem, having various downstream applications such as prior-case retrieval and citation recommendation. There are two broad approaches for the task: network-based and text-based. Prior network-based approaches consider citations only to prior cases (also called precedents) (PCNet). This approach misses signals inherent in Statutes (written la...
Patent application is one of the important ways to protect innovation achievements that have great commercial value for enterprises; it is the initial step for enterprises to set their business development track, as well as a powerful means to protect their core competitiveness. The emergence of a large amount of patent data makes effective detection difficult, and infringement cases occur frequently. Manual measurement is slow, costly...
With the development of deep learning, the demand for similarity matching between texts in text classification is becoming increasingly high. How to match quickly under the premise of keeping private information secure has become a research hotspot. However, most existing protocols currently have full set limitations, and the applicability of these methods is limited when the data size is large and scattered. Therefore, this ...
Detecting text reuse is a fundamental requirement for a variety of tasks and applications, ranging from journalistic text reuse to plagiarism detection. Text reuse is traditionally detected by computing similarity between a source text and a possibly reused text. However, existing text similarity measures exhibit a major limitation: They compute similarity only on features which can be derived ...
We propose a method for joint unsupervised discovery of multiword expressions (MWEs) and their translations from parallel corpora. First, we apply independent monolingual MWE extraction in source and target languages simultaneously. Then, we calculate translation probability, association score and distributional similarity of co-occurring pairs. Finally, we rank all translations of a given MWE ...