cosine similarity measure

A Template Based Hybrid Model for Chinese Personal Name Disambiguation

2012

Hao Zong Derek F. Wong Lidia S. Chao

This paper proposes a template based hybrid model for Chinese Personal Name Disambiguation (CPND). The template makes use of the features of personal role such as discriminating personal name (nickname, stage name), together with the specific context of most frequent words, personal name nearest words named entities, date and time that are effective for this disambiguation task, as well as surr...


Multiple topic identification in telephone conversations

2013

Xavier Bost Marc El-Bèze Renato De Mori

This paper deals with the automatic analysis of conversations between a customer and an agent in a call centre of a customer care service. The purpose of the analysis is to hypothesize themes about problems and complaints discussed in the conversation. Themes are defined by the application documentation topics. A conversation may contain mentions that are irrelevant for the application purpose ...


Offering A Product Recommendation System in E-commerce

Journal: CoRR, 2011

Ruma Dutta Debajyoti Mukhopadhyay

This paper proposes a number of explicit and implicit ratings in a product recommendation system for Business-to-customer e-commerce purposes. The system recommends the products to a new user. It depends on the purchase pattern of previous users whose purchase pattern is close to that of a user who asks for a recommendation. The system is based on a weighted cosine similarity measure to find out the...
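As a rough illustration of the kind of measure this abstract mentions, the sketch below computes a weighted cosine similarity between two rating vectors and picks the closest previous user. The formula shown (per-feature weights applied inside both the dot product and the norms) is one common definition and is an assumption here, since the abstract is truncated before it gives details; the names `weighted_cosine`, `most_similar_user`, and the sample data are hypothetical.

```python
import math


def weighted_cosine(u, v, w):
    """Weighted cosine similarity between vectors u and v.

    Assumed definition (not taken from the paper):
        sum(w*u*v) / (sqrt(sum(w*u*u)) * sqrt(sum(w*v*v)))
    """
    if not (len(u) == len(v) == len(w)):
        raise ValueError("u, v and w must have the same length")
    dot = sum(wi * ui * vi for ui, vi, wi in zip(u, v, w))
    norm_u = math.sqrt(sum(wi * ui * ui for ui, wi in zip(u, w)))
    norm_v = math.sqrt(sum(wi * vi * vi for vi, wi in zip(v, w)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # no usable ratings: treat as "not similar"
    return dot / (norm_u * norm_v)


def most_similar_user(target, others, w):
    """Return the key in `others` whose vector is closest to `target`."""
    return max(others, key=lambda name: weighted_cosine(target, others[name], w))


# Hypothetical ratings over three product features, with feature weights.
weights = [2.0, 1.0, 0.5]
previous_users = {"alice": [5, 1, 0], "bob": [0, 4, 5]}
print(most_similar_user([4, 0, 1], previous_users, weights))
```

A recommender along these lines would then suggest products bought by the most similar previous users; how the paper combines explicit and implicit ratings is not recoverable from the truncated abstract.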


The University of Amsterdam at WebCLEF 2007: Using Centrality to Rank Web Snippets

2007

Valentin Jijkoun Maarten de Rijke

We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the similarity functions used for centrality computations (word overlap and cosine similarity). We found ...
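To make the idea of similarity-based centrality concrete, here is a minimal sketch that scores each snippet by the sum of its cosine similarities to every other snippet and ranks snippets by that score. This degree-style centrality over raw term counts is an assumption chosen for simplicity; the abstract does not specify the exact centrality computation, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_centrality(snippets):
    """Return snippet indices ordered from most to least central.

    Centrality of a snippet = sum of its cosine similarity to all others
    (an illustrative choice, not necessarily the authors' formula).
    """
    vectors = [Counter(s.lower().split()) for s in snippets]
    scores = [
        sum(cosine(vi, vj) for j, vj in enumerate(vectors) if j != i)
        for i, vi in enumerate(vectors)
    ]
    return sorted(range(len(snippets)), key=lambda i: scores[i], reverse=True)


snippets = [
    "cosine similarity ranks snippets",
    "cosine similarity ranks web snippets",
    "bananas are yellow",
]
print(rank_by_centrality(snippets))
```

The unrelated third snippet shares no terms with the others, so its centrality is zero and it is ranked last.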


A dictionary learning and source recovery based approach to classify diverse audio sources

Journal: CoRR, 2015

K. V. Vijay Girish T. V. Ananthapadmanabha A. G. Ramakrishnan

A dictionary learning based audio source classification algorithm is proposed to classify a sample audio signal as one amongst a finite set of different audio sources. Cosine similarity measure is used to select the atoms during dictionary learning. Based on three objective measures proposed, namely, signal to distortion ratio (SDR), the number of non-zero weights and the sum of weights, a fram...


Blind Assessment of Wavelet-Compressed Images Based On Subband Statistics of Natural Scenes

Journal: IJAPUC, 2014

Ying-Chun Guo Gang Yan Cui-Hong Xue Yang Yu

This paper presents a no-reference image quality assessment metric that makes use of the wavelet subband statistics to evaluate the levels of distortions of wavelet-compressed images. The work is based on the fact that for distorted images the correlation coefficients of the adjacent scale subbands change proportionally with respect to the distortion of a compressed image. Subband similarity is...


A supervised learning approach to the ensemble clustering of genes

Journal: International Journal of Data Mining and Bioinformatics, 2014

Andrew K. Rider Geoffrey Siwo Scott J. Emrich Michael T. Ferdig Nitesh V. Chawla

High-throughput techniques have become a primary approach to gathering biological data. These data can be used to explore relationships between genes and guide development of drugs and other research. However, the deluge of data contains an overwhelming amount of unknown information about the organism under study. Therefore, clustering is a common first step in the exploratory analysis of high-...


An Efficient Similarity Join Algorithm with Cosine Similarity Predicate

2010

Dongjoo Lee Jaehui Park Junho Shim Sang-goo Lee

Given a large collection of objects, finding all pairs of similar objects, namely similarity join, is widely used to solve various problems in many application domains. The computation time of a similarity join is a critical issue, since a similarity join requires computing similarity values for all possible pairs of objects. Several existing algorithms adopt prefix filtering to avoid unnecessary similari...
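The cost described here is easiest to see in the naive baseline, sketched below under stated assumptions: every pair of objects is compared, and pairs whose cosine similarity meets a threshold are kept. This is the brute-force all-pairs join, not the prefix-filtering algorithm the paper studies; the sparse-dictionary representation and the names `normalize` and `cosine_join` are hypothetical.

```python
import math
from itertools import combinations


def normalize(vec):
    """Scale a sparse vector (dict of feature -> value) to unit length."""
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {k: x / norm for k, x in vec.items()} if norm else {}


def cosine_join(objects, threshold):
    """Brute-force similarity join: all index pairs (i, j) with i < j whose
    cosine similarity is at least `threshold`. Compares O(n^2) pairs."""
    unit = [normalize(o) for o in objects]
    result = []
    for i, j in combinations(range(len(unit)), 2):
        a, b = unit[i], unit[j]
        if len(a) > len(b):
            a, b = b, a  # iterate over the shorter vector
        # For unit vectors, the dot product equals the cosine similarity.
        sim = sum(x * b.get(k, 0.0) for k, x in a.items())
        if sim >= threshold:
            result.append((i, j, sim))
    return result


objects = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"c": 1}]
print(cosine_join(objects, 0.9))
```

Because the number of pairs grows quadratically with the collection size, filtering techniques aim to skip pairs that cannot reach the threshold before computing their similarity.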


Ranking Abstracts to Identify Relevant Evidence for Systematic Reviews: The University of Sheffield's Approach to CLEF eHealth 2017 Task 2

2017

Amal Alharbi Mark Stevenson

This paper describes Sheffield University’s submission to CLEF 2017 eHealth Task 2: Technologically Assisted Reviews in Empirical Medicine. This task focusses on the identification of relevant evidence for systematic reviews in the medical domain. Participants are provided with systematic review topics (including title, Boolean query and set of PubMed abstracts returned) and asked to identify t...


IRIT at INEX 2013: Tweet Contextualization Track

2013

Liana Ermakova Josiane Mothe

The paper presents IRIT’s approach used at INEX Tweet Contextualization Track 2013. Systems had to provide a context to a tweet. This year we further modified our approach presented at INEX 2011 and 2012 underlain by the product of scores based on hashtag processing, TF-IDF cosine similarity measure enriched by smoothing from local context and document beginning, named entity recognition and pa...
