A Maximum Profit Coverage Algorithm with Application to Small Molecules Cluster Identification
Abstract
In this article we model and analyze the cluster identification of molecules (CIM), which is a clustering problem in a finite metric space. CIM has the following characteristics, which separate it from other clustering models:
1. In most models outliers are a small portion of the data set, whereas in CIM they may be the vast majority of the objects (see Figure 1).
2. The clusters identified by CIM are compact and their diameter is bounded.
3. There is a lower bound on the number of objects in a cluster.
4. Clusters may be very close to one another, as a result of the bound on the diameter. What may be considered as one cluster in other clustering models is considered as several clusters in CIM (see Figure 2).
5. The number of clusters is not known a priori.
In this paper we present CIM and model it as a maximum profit coverage problem (MPCP). The model is a measure to be optimized, rather than a heuristic. Consider a finite set $S$ in a metric space $M$ with a distance function $d$. A ball with center $t$ and radius $r$ is the subset $B(t, r) = \{x \in M \mid d(t, x) \le r\}$. We say that the ball covers the points of $S$ that it contains. Given a set of balls $B$ of radius $r$, a coverage $P = \{S'_1, \ldots, S'_l\}$ is a set of clusters such that each of them consists of points covered by a single ball of $B$. Let $S'_P = \bigcup_{i=1}^{l} S'_i$, and define the profit of $P$ as $\sum$ ...
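The ball-coverage definition can be illustrated with a minimal Python sketch. It assumes points in the Euclidean plane (the abstract only requires some metric d), and the names `ball_cover` and `meets_size_bound`, the sample points, and the thresholds are hypothetical choices made for illustration, not part of the paper. The profit function is not implemented because its definition is cut off in the abstract.

```python
from math import dist  # Euclidean distance, Python 3.8+


def ball_cover(points, center, radius):
    """Return the points of S that lie in the closed ball B(center, radius)."""
    return [p for p in points if dist(center, p) <= radius]


def meets_size_bound(cluster, min_size):
    """Check the lower bound on the number of objects in a cluster."""
    return len(cluster) >= min_size


# Hypothetical data set S: three nearby points and two far-away outliers.
S = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (3.0, 3.0), (10.0, 10.0)]

covered = ball_cover(S, center=(0.0, 0.0), radius=1.0)
print(covered)                       # points covered by B((0, 0), 1)
print(meets_size_bound(covered, 3))  # does the candidate cluster have at least 3 points?
```

Because every covered point is within distance r of the center, the triangle inequality bounds the diameter of any such cluster by 2r, which matches the bounded-diameter characteristic listed in the abstract.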
Similar resources
Application of A Route Expansion Algorithm for Transit Routes Design in Grid Networks
Establishing a network of transit routes with satisfactory demand coverage is one of the main goals of transit agencies in moving towards a sustainable urban development. A primary concern in obtaining such a network is reducing operational costs. This paper deals with the problem of minimizing construction costs in a grid transportation network while satisfying a certain level o...
Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms
Due to resource constraints and dynamic parameters, reducing energy consumption has become the most important issue in wireless sensor network (WSN) topology design. All proposed hierarchy methods cluster a WSN into different cluster layers in one step of evolutionary algorithm usage, with complicated parameters, which may reduce efficiency and performance. In fact, in WSNs topology, increasin...
Test Power Reduction by Simultaneous Don’t Care Filling and Ordering of Test Patterns Considering Pattern Dependency
Estimating and minimizing the maximum power dissipation during testing is an important task in VLSI circuit realization, since the power value affects the reliability of the circuits. Therefore, a methodology should be adopted during testing to minimize power consumption. Test patterns generated with the –D 1 option of ATALANTA contain don’t care bits (x bits). By suitable filling of don’t cares can...
The Ground-Set-Cost Budgeted Maximum Coverage Problem
We study the following natural variant of the budgeted maximum coverage problem: We are given a budget B and a hypergraph G = (V,E), where each vertex has a non-negative cost and a non-negative profit. The goal is to select a set of hyperedges T ⊆ E such that the total cost of the vertices covered by T is at most B and the total profit of all covered vertices is maximized. Besides being a natur...
Dynamic Coverage and Clustering: A Maximum Entropy Approach
We present a computational framework we have recently developed for solving a large class of dynamic coverage and clustering problems, ranging from those that arise in the deployment of mobile sensor networks to the identification of ensemble spike trains in computational neuroscience applications. This framework provides for the identification of natural clusters in an underlying dataset, whil...