Search results for: employing jaccard

Number of results: 69332

2016
Debajyoti Bera Rameshwar Pratap

The Apriori algorithm is a classical algorithm for the frequent itemset mining problem. A significant bottleneck in Apriori is the number of I/O operations involved and the number of candidates it generates. We investigate the role of LSH techniques to overcome these problems, without adding much computational overhead. We propose randomized variations of Apriori that are based on asymmetric LS...

Journal: Journal of Nematology 1986
V R Ferris J M Ferris L L Murdock J Faghihi

Protein patterns obtained by two-dimensional polyacrylamide gel electrophoresis for three isolates of Heterodera glycines from southern Indiana appear qualitatively similar and have higher pairwise Jaccard similarity coefficients with each other than with isolates from northern Indiana. Three isolates from three northern counties share proteins not present in the southern isolates, but as a gro...

Journal: Inf. Process. Manage. 2006
Leo Egghe Ronald Rousseau

Classical information retrieval and overlap measures such as the Jaccard index, the Dice coefficient and Salton’s cosine measure can be characterized by Lorenz curves. This result demonstrates the existence of a formal link between information retrieval and the information sciences on the one hand, and concentration and diversity theory, as used, e.g., in social economics and ecology on the oth...

2013
Tom De Nies Wesley De Neve Erik Mannens Rik Van de Walle

In this paper, we describe our approach to the Search and Hyperlinking task at the MediaEval 2013 benchmark. This task focuses on video retrieval and linking in the context of a large and rich dataset provided by the BBC. Our approach makes use of one of three types of audio transcripts, enriched with Named Entities. To compute similarity, we adapt the Jaccard metric to use Named Entities. This...

Journal: Inf. Process. Manage. 2003
Leo Egghe Christine Michel

Ordered sets of documents are encountered more and more in information distribution systems, such as information retrieval systems (IRS). Classical similarity measures for ordinary sets of documents hence need to be extended to these ordered sets. This is done in this paper using fuzzy set techniques. First a general similarity measure is developed which contains the classical strong similarity...

2012
Gerard de Melo Collin F. Baker Nancy Ide Rebecca J. Passonneau Christiane Fellbaum

We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical r...

2004
Ralph Mac Nally Erica Fleishman Lesley P. Bulluck Christopher J. Betrus

Methods: Data on species composition for both taxonomic groups were collected using standard inventory methods for birds and butterflies in temperate regions. Data were compiled at three sampling grains: sites (average 12 ha), canyons (average 74 ha) and mountain ranges. For each sampling grain in turn, we calculated similarity of species composition using the Jaccard index. First, we investiga...

2012
Pavla Drázdilová Alisa Babskova Jan Martinovic Katerina Slaninová Stepan Minks

Finding and recommending suitable persons based on their characteristics in social or collaboration networks is still a big challenge. The purpose of this paper is to discover and recommend suitable persons or a whole community within a developers’ network. The experiments were carried out on a data collection from Codeplex.com, a specialized web portal used for collaboration among developers. Users ...

2007
Axel Mosig Peter Menzel Peter F. Stadler

We present a combinatorial method for discovering cis-regulatory modules in promoter sequences. Our approach combines “sliding window” approaches with a scoring function based on the so-called Tanimoto score. This allows us to identify sets of binding sites that tend to occur preferentially in the vicinity of each other in a given set of promoter sequences belonging to co-expressed or orthologous ...

2017
Austin J. Parker Kelly B. Yancey Matthew P. Yancey

This paper addresses the problem of determining the distance between two regular languages. It will show how to expand Jaccard distance, which works on finite sets, to potentially-infinite regular languages. The entropy of a regular language plays a large role in the extension. Much of the paper is spent investigating the entropy of a regular language. This includes addressing issues that have ...
