Similarity Retrieval and Cluster Analysis Using R* Trees

Authors

  • Jiaxiong Pi
  • Yong Shi
  • Zhengxin Chen
Abstract

Data mining aims to extract interesting (i.e., nontrivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data. To be manageable, data mining has to be database centered. Yet data mining goes beyond the traditional realm of database techniques; in particular, reasoning methods developed in machine learning and other fields of artificial intelligence (AI) have made important contributions to data mining. Data mining thus offers an excellent opportunity to explore a fundamental issue: the relationship between data and knowledge retrieval on the one hand and inference and reasoning on the other. Decades ago, researchers observed that, since knowledge retrieval must respect the semantics of the representation language, knowledge retrieval is a limited form of inference operating on the stored facts (Frisch & Allen, 1982). The converse of this statement has also been explored, viewing inference as an extension of retrieval. For example, Chen (1996) described a computer model that generates suggestions through document structure mapping, based on the notion of reasoning as extended knowledge retrieval; the model was implemented using a relational approach. However, although the foundations of data mining have attracted much attention among data mining researchers (ICDM, 2004), little work has been done on the important relationship between retrieval and inference (or mining). A possible reason for the lack of such research is the difficulty of identifying an appropriate common ground that can be used to examine both data retrieval and data mining. On the other hand, from the database perspective, an effective way to achieve efficient...

Chapter LVIII: Similarity Retrieval and Cluster Analysis Using R* Trees

Similar resources

Organizing image databases as visual-content search trees

An unsupervised algorithm for arranging an image database as a visual-content binary search tree is described. Tree nodes are associated with image subsets, maintaining the property that the similarity among the images associated with the children of a node is higher than the similarity among the images associated with the parent node. Visual-content search trees can be used to automate image r...

Indexing Images by Trees of Visual Content

Haim Schweitzer ([email protected]), The University of Texas at Dallas, P.O. Box 830688, Richardson, Texas 75083. Abstract: An unsupervised algorithm for arranging an image database as a binary tree is described. Tree nodes are associated with image subsets, maintaining the property that the similarity among the images associated with the children of a node is higher than the similarity among the im...

Grouping and Indexing Color Features for Efficient Image Retrieval

Content-based image retrieval (CBIR) aims at searching image databases for specific images that are similar to a given query image based on matching of features derived from the image content. This paper focuses on a low-dimensional color based indexing technique for achieving efficient and effective retrieval performance. In our approach, the color features are extracted using the mean shift a...

Towards Clustering of Web-based Document Structures

Methods for organizing web data into groups, in order to analyze web-based hypertext data and facilitate data availability, are very important given the number of documents available online. The task of clustering web-based document structures therefore has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving web use...

Indexing Shapes in Image Databases Using the Centroid-Radii Model

In content-based image retrieval systems, the content of an image, such as color, shape, and texture, is used to retrieve images that are similar to a query image. Most of the existing work focuses on the retrieval effectiveness of using content for retrieval, i.e., it studies the accuracy (in terms of recall and precision) of using different representations of content. In this paper, we address the iss...

Publication date: 2015