Search results for: employing jaccard
Number of results: 69332
Of late, social tagging has become a popular trend in information organisation. In the context of digital resources, the tags assigned by users also play a vital role in information retrieval. For information discovery, the ‘terms’ used to retrieve the results also depend upon the ‘relevancy’ or ‘weightage’ of the keywords. This study investigates ‘relevancy ranking’ of terms used in the full text of the...
This project explores several Machine Learning methods to predict movie genres based on plot summaries. Naive Bayes, Word2Vec+XGBoost and Recurrent Neural Networks are used for text classification, while K-binary transformation, rank method and probabilistic classification with learned probability threshold are employed for the multi-label problem involved in the genre tagging task. Experiments...
This paper presents a study of employing Ranking SVM and Convolutional Neural Network for two missions: legal information retrieval and question answering in the Competition on Legal Information Extraction/Entailment. For the first task, our proposed model used a triple of features (LSI, Manhattan, Jaccard), and is based on paragraph level instead of article level as in previous studies. In fac...
In this extended abstract, we describe and analyse a streaming probabilistic sketch, HYPERMINHASH, to estimate the Jaccard index (or Jaccard similarity coefficient) over two sets A and B. HyperMinHash can be thought of as a compression of standard log n-space MinHash by building off of a HyperLogLog count-distinct sketch. For a multiplicative approximation error 1+ε on a Jaccard index t, given a ...
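As a point of reference for the snippet above, the following is a minimal sketch of standard MinHash, the baseline that HyperMinHash is described as compressing. It is not HyperMinHash itself. The function names, the use of seeded BLAKE2b as the hash family, and the default of 128 hash functions are all assumptions made for illustration.

```python
import hashlib
from typing import Hashable, Iterable, List


def _seeded_hash(seed: int, item: Hashable) -> int:
    """Hypothetical helper: a deterministic 64-bit hash of (seed, item)."""
    data = f"{seed}:{item!r}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items: Iterable[Hashable], num_hashes: int = 128) -> List[int]:
    """For each of num_hashes seeded hash functions, keep the minimum hash value."""
    members = list(items)
    if not members:
        raise ValueError("MinHash signature of an empty set is undefined here")
    return [min(_seeded_hash(seed, m) for m in members) for seed in range(num_hashes)]


def estimate_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """The fraction of positions where the two signatures agree estimates J(A, B)."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

The estimate works because, for a well-mixed hash function, the element with the smallest hash in A ∪ B lies in A ∩ B with probability |A ∩ B| / |A ∪ B|. More hash functions lower the variance at the cost of a longer signature.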
Similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various similarity measures are categorized in both syntactic and semantic relationships. In this paper we present a novel similarity, Unilateral Jaccard Similarity Coefficient (uJaccard), which doesn’t only take into consideration the space among two points b...
Statistical association measures have been widely applied in information retrieval research, usually employing a clustering of documents or terms on the basis of their relationships. Applications of the association measures for term clustering include automatic thesaurus construction and query expansion. This research evaluates the similarity of six association measures by comparing the relatio...
In this study, Jaccard distance was computed by measuring the asymmetric information on binary variables and comparing vector components. It compares two objects and reports their degree of similarity. After thorough preprocessing of the hand-sketched object, such as translation, rotation, scale invariance, and noise resistance, Jaccard distance still ...
We initiate the study of finding the Jaccard center of a given collection N of sets. For two sets X, Y, the Jaccard index is defined as |X ∩ Y|/|X ∪ Y| and the corresponding distance is 1 − |X ∩ Y|/|X ∪ Y|. The Jaccard center is a set C minimizing the maximum distance to any set of N. We show that the problem is NP-hard to solve exactly, and that it admits a PTAS while no FPTAS can exist unle...
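The definitions quoted in the snippet above can be sketched in a few lines of Python. The function names are illustrative, and treating two empty sets as identical (index 1) is a convention assumed here, not something the abstract states. `max_distance` only evaluates the objective that a Jaccard center minimizes; it does not search for the center.

```python
from typing import Hashable, Iterable, Sequence, Set


def jaccard_index(x: Iterable[Hashable], y: Iterable[Hashable]) -> float:
    """|X ∩ Y| / |X ∪ Y| for two finite sets."""
    sx, sy = set(x), set(y)
    union = sx | sy
    if not union:
        # Assumed convention: two empty sets are considered identical.
        return 1.0
    return len(sx & sy) / len(union)


def jaccard_distance(x: Iterable[Hashable], y: Iterable[Hashable]) -> float:
    """1 - |X ∩ Y| / |X ∪ Y|."""
    return 1.0 - jaccard_index(x, y)


def max_distance(candidate: Set[Hashable], collection: Sequence[Set[Hashable]]) -> float:
    """Objective of the Jaccard center problem: the largest distance from
    the candidate set C to any set in the collection N."""
    return max(jaccard_distance(candidate, s) for s in collection)
```

For example, `jaccard_index({1, 2, 3}, {2, 3, 4})` has an intersection of size 2 and a union of size 4, so it evaluates to 0.5.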
We propose an approach for approximating the Jaccard similarity of two streams, J(A,B) = |A∩B| / |A∪B|, for domains where this similarity is known to be high. Our method is based on a reduction from Jaccard similarity to F2 norm estimation, for which there exists a sketch that is efficient in terms of both size and compute time, which we augment by a sampling technique. Our approach offers an im...