Frequent query discovery: a unifying ILP approach to association rule mining

Authors

  • Luc Dehaspe
  • Hannu Toivonen
Abstract

Discovery of frequent patterns has been studied in a variety of data mining (DM) settings. In its simplest form, known from association rule mining, the task is to find all frequent itemsets, i.e., to list all combinations of items that are found in a sufficient number of examples. A similar task in spirit, but at the opposite end of the complexity scale, is the Inductive Logic Programming (ILP) approach where the goal is to discover queries in first-order logic that succeed with respect to a sufficient number of examples. We discuss the relationship of ILP to frequent pattern discovery. On one hand, our goal is to relate data mining problems to ILP. On another hand, we want to demonstrate how ILP can be used to solve both existing and new data mining problems. The fundamental task of association rule and frequent set discovery has been extended in various directions, allowing more useful patterns to be discovered. From an ILP viewpoint, however, it can be argued that these settings are all well-controlled subtasks of the full ILP counterpart of the problem. We try to restore the blurred picture by describing the existing approaches using a unified database representation. With the representation, we relate also the DM settings to each other and propose some interesting new areas to be explored. We analyse some aspects of the gradual change in the trade-off between expressivity and efficiency, as one moves from the frequent set problem towards ILP.
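
The simplest task mentioned in the abstract, listing every combination of items that occurs in a sufficient number of examples, can be illustrated with a small levelwise search. The following is only a minimal sketch in the style of Apriori, not the ILP algorithm the paper proposes; the function name `frequent_itemsets`, the parameter `min_support`, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(examples, min_support):
    """Return {itemset: support} for every itemset that is contained
    in at least `min_support` examples (hypothetical helper)."""
    examples = [frozenset(e) for e in examples]
    items = sorted({item for e in examples for item in e})

    frequent = {}
    # Level 1 candidates: every single item.
    candidates = [frozenset([item]) for item in items]
    while candidates:
        # Count how many examples contain each candidate itemset.
        level = {}
        for cand in candidates:
            support = sum(1 for e in examples if cand <= e)
            if support >= min_support:
                level[cand] = support
        frequent.update(level)

        # Build candidates one item larger by joining frequent sets,
        # keeping only those whose every smaller subset is frequent.
        k = len(candidates[0]) + 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Toy data: each example is a set of items (assumed, not from the paper).
examples = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(examples, min_support=2)
```

In the ILP setting the abstract describes, the containment test `cand <= e` would be replaced by checking whether a first-order query succeeds on an example, which is where the additional expressivity and the additional cost come from.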

Similar resources

33 Data Mining Query Languages

Many data mining algorithms can extract different types of patterns from data (e.g., local patterns like itemsets and association rules, models like classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. Following this...

Efficient Frequent Query Discovery in FARMER

The upgrade of frequent item set mining to a setup with multiple relations —frequent query mining— poses many efficiency problems. Taking Object Identity as starting point, we present several optimization techniques for frequent query mining algorithms. The resulting algorithm has a better performance than a previous ILP algorithm and competes with more specialized graph mining algorithms in pe...

Constraint-Based Discovery and Inductive Queries: Application to Association Rule Mining

Recently, inductive databases (IDBs) have been proposed to address the problem of knowledge discovery from huge databases. Querying these databases requires primitives to: (1) select, manipulate and query data, (2) select, manipulate and query “interesting” patterns (i.e., those patterns that satisfy certain constraints), and (3) cross over patterns and data (e.g., selecting the data in which so...

Using Constraints During Set Mining: Should We Prune or not?

Knowledge discovery in databases (KDD) is an interactive process that can be considered from a querying perspective. Within the inductive database framework, an inductive query on a database is a query that might return generalizations about the data e.g., frequent itemsets, association rules, data dependencies. To study evaluation schemes of such queries, we focus on the simple case of (freque...

SPADA: A Spatial Association Discovery System

This paper presents a spatial association discovery system, named SPADA, which has been developed according to the theoretical framework of inductive databases. Our approach considers inductive databases as deductive databases with an integrated inductive component and relies on techniques borrowed from the field of Inductive Logic Programming (ILP). In SPADA, an ILP module supports the process...

Publication date: 2010