Search results for: three heuristics named cluster

Number of results: 1529023

2017
Michal Marcinczuk

In the paper we present a tool for lemmatization of multi-word common noun phrases and named entities for Polish called PoLem. The tool is based on a set of manually crafted rules and heuristics utilizing a set of dictionaries (including morphological, named entities and inflection patterns). The accuracy of lemmatization obtained by the tool reached 97.99% on a dataset with multi-word common ...

1994
Kyungshik Lim

Given a hexagonal mesh of base stations in cellular systems, we consider the problem of finding a cover of disjoint clusters of base stations which generate multiple types of traffic among themselves. The problem differs from general graph partitioning problems in that it considers not only communication costs but also the underlying topology among base stations, such that base stations in a cluster ...

2017
Siddhesh Khandelwal Amit Awekar

K-means is a widely used iterative clustering algorithm. There has been considerable work on improving k-means in terms of both mean squared error (MSE) and speed. However, most k-means variants tend to compute the distance from each data point to each cluster centroid in every iteration. We propose two heuristics to overcome this bottleneck and speed up k-means. Our first heuristic predicts...

2012
Xuan Jiang Hongyan Liu Jun He Rui Zhu Xiaoyong Du

Many studies show that named entities are closely related to users' search behaviors, which has recently brought increasing interest in studying named entities in search logs. This paper addresses the problem of forming fine-grained semantic clusters of named entities within a broad domain such as “company”, and generating keywords for each cluster, which help users to interpret the embedded semanti...

Journal: :Procesamiento del Lenguaje Natural 2012
Soto Montalvo Víctor Fresno-Fernández Raquel Martínez-Unanue

Measuring the similarity between documents is an essential task in Document Clustering. This paper presents a new metric that is based on the number and the category of the Named Entities shared between news documents. Three different feature-weighting functions and two standard similarity measures were used to evaluate the quality of the proposed measure in multilingual news clustering. The re...

باقی زاده, امین, حامدی, مسعود, رحیمی, مهدی, علوی, نجمه السادات, ملکی, محمود

Twenty-seven populations of Triticum boeoticum were collected from the west and northwest of Iran and grouped using morphological traits. All populations were assessed in the field in a completely randomized design with three replications in 1393. The measured traits include stem length with spike, spike length with and without awn, awn length, flag leaf length, leaf woolliness, peduncle length...

Journal: :CoRR 2016
Pedro Mota Maxine Eskénazi Luísa Coheur

In this paper we propose a graph-community detection approach to identify cross-document relationships at the topic segment level. Given a set of related documents, we automatically find these relationships by clustering segments with similar content (topics). In this context, we study how different weighting mechanisms influence the discovery of word communities that relate to the different to...

2016
Rodrigo C. Camargos Maria do Carmo Nicoletti

Finding a clustering in a data set is a challenging task, since algorithms usually depend on the adopted inter-cluster distance as well as the employed definition of cluster diameter. The work described in this paper approaches a well-known agglomerative clustering algorithm named AGNES (Agglomerative Nesting), with regard to its performance on three case studies, namely datasets formed by c...

2014
Ryo Sakai Raf Winand Toni Verbeiren Andrew Vande Moere Jan Aerts Jan Oosting Eamonn Maguire Rodrigo Santamaria

Dendrograms are graphical representations of binary tree structures resulting from agglomerative hierarchical clustering. In Life Science, a cluster heat map is a widely accepted visualization technique that utilizes the leaf order of a dendrogram to reorder the rows and columns of the data table. The derived linear order is more meaningful than a random order, because it groups similar items t...
