similarity score

Similarity Measurement Method between Two Songs by Using the Conditional Euclidean Distance

2014

Min Woo Park Eui Chul Lee

As the number of newly released songs keeps growing, clustering songs by genre has become more important for guiding the audience’s choice. Allegations of plagiarism are also raised continually. For these reasons, measuring the similarity between two songs is important. In previous works, although similarity measurement has been actively researched in the field of query by...

On the Structure and Efficient Computation of IsoRank Node Similarities

2016

Ehsan Kazemi Matthias Grossglauser

The alignment of protein-protein interaction (PPI) networks has many applications, such as the detection of conserved biological network motifs, the prediction of protein interactions, and the reconstruction of phylogenetic trees [1, 2, 3]. IsoRank is one of the first global network alignment algorithms [4, 5, 6], where the goal is to match all (or most) of the nodes of two PPI networks. The Is...

Comparison of Algorithmic and Human Assessments of Sentence Similarity

2013

John G. Mersch R. Raymond Lang

This paper describes a new method, based on information theory, for measuring sentence similarity. The method first computes the information content (IC) of dependency triples using corpus statistics generated by processing the Open American National Corpus (OANC) with the Stanford Parser. We define the similarity of two sentences as a function of (1) the similarity of their constituent depende...
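The abstract above is truncated, so the authors' exact formulation is not shown here. As a hedged illustration only, the sketch below uses the standard information-theoretic definition of information content, IC(x) = -log2 p(x), with p estimated by relative frequency. The function name `information_content` and the triple counts are hypothetical; they are not drawn from the OANC or from the paper.

```python
import math
from collections import Counter


def information_content(counts: Counter, item) -> float:
    """Return IC(item) = -log2 p(item), with p estimated by relative frequency."""
    total = sum(counts.values())
    if counts[item] == 0:
        raise ValueError(f"item {item!r} was never observed, so its IC is undefined")
    return -math.log2(counts[item] / total)


# Hypothetical counts of dependency triples (relation, head, dependent).
# These numbers are made up for illustration.
triple_counts = Counter({
    ("nsubj", "barks", "dog"): 2,
    ("det", "dog", "the"): 6,
})

common = information_content(triple_counts, ("det", "dog", "the"))
rare = information_content(triple_counts, ("nsubj", "barks", "dog"))
print(f"frequent triple IC = {common:.3f} bits, rare triple IC = {rare:.3f} bits")
```

Under this definition a rarer triple carries more information than a frequent one, which is why a similarity function built on IC gives more weight to matches on uncommon triples.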

Culture, Identity Consistency, and Subjective Well-Being

2002

Eunkook M. Suh

All individuals have multiple views of themselves. Whereas the consistency among the different aspects of identity is emphasized in Western cultures, the “multiple selves” are often viewed as coexisting realities in East Asian cultures. This research revisits the classic thesis in psychology that identity consistency is a prerequisite condition of psychological well-being. Between individuals (...

A Data-Driven Approach for Social Event Detection

2013

Dimitrios Rafailidis Theodoros Semertzidis Michalis Lazaridis Michael G. Strintzis Petros Daras

In this paper, we present a data-driven approach for challenge 1 of the MediaEval 2013 Social Event Detection Task. Our proposed approach consists of the following steps: (a) initialization based on the images’ spatio-temporal information; (b) computation of clusters’ intercorrelations; and (c) the final clusters’ generation. In the initialization step, the images that have both geolocation and...

Classifying Reddit comments by subreddit

2017

Jee Ian Tam

Reddit.com is a website that is primarily organized by communities called subreddits, where users can post comments. As subreddits can have very different cultures, we aim to classify comments by subreddit as a means of sentiment analysis. We use a publicly available Reddit comment dataset covering the year 2016 and perform a classification on a selection of 20 subreddits among the top 50 by ...

DTSim at SemEval-2016 Task 2: Interpreting Similarity of Texts Based on Automated Chunking, Chunk Alignment and Semantic Relation Prediction

2016

Rajendra Banjade Nabin Maharjan Nobal B. Niraula Vasile Rus

In this paper we describe our system (DTSim) submitted to SemEval-2016 Task 2: Interpretable Semantic Textual Similarity (iSTS). We participated in both the gold chunks category (texts chunked by human experts and provided by the task organizers) and the system chunks category (participants had to automatically chunk the input texts). We developed a Conditional Random Fields based chunker and applied r...

A Keyword-based Monolingual Sentence Aligner in Text Simplification

2014

Chung-Chi Huang

We introduce a method for learning to align sentences in monolingual parallel articles for text simplification. In our approach, word keyness is integrated to prefer aligning essential words in sentences. The method involves estimating word keyness based on TF*IDF and semantic PageRank, and word nodes’ parts-of-speech and degrees of reference. At run-time, the keyword analyses are used as word ...
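The abstract names TF*IDF as one ingredient of word keyness. The sketch below shows only a common textbook form of TF*IDF (relative term frequency times log(N / document frequency)); it is an assumption that this variant resembles the one used, and the semantic PageRank, part-of-speech and degree-of-reference components are not modeled. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Score each word in each tokenized document with tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return scores


# Toy corpus, made up for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
scores = tf_idf(docs)
for word, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

A word that occurs in every document, such as "the" here, scores zero, while a word confined to one document scores highest. That is the property that makes the measure usable as a rough signal of which words are essential.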

Grounded Discovery of Coordinate Term Relationships between Software Entities

Journal: CoRR 2015

Dana Movshovitz-Attias William W. Cohen

We present an approach for the detection of coordinate-term relationships between entities from the software domain that refer to Java classes. Usually, relations are found by examining corpus statistics associated with text entities. In some technical domains, however, we have access to additional information about the real-world objects named by the entities, suggesting that coupling informat...

Filter and Match Approach to Pair-wise Web URI Linking

2016

S. Shivashankar Yitong Li Afshin Rahimi

This paper describes the method and results of our approach, submitted as team ‘NLPCruise’ at the ALTA shared task 2016. The goal of the shared task is to predict whether two given web Uniform Resource Identifiers (URIs) correspond to the same entity or not. Retrieving the URI content in addition to the dataset provided, we built a two-stage filter-and-match technique that utilises search engine sc...
