Efficient Constraint-Based Exploratory Mining on Large Data Cubes
Abstract
Analysts often explore data cubes to identify anomalous regions that may represent problem areas or new opportunities. Discovery-driven exploration (proposed by S. Sarawagi et al. [5]) automatically detects and marks exceptions for the user, reducing the reliance on manual discovery. However, when the data is large, materializing the whole cube is impractical because of both space and time limits, so exploratory mining over all cube cells must construct the data cube dynamically, which takes a very long time. In this paper, we investigate optimization methods that push several constraints into the mining process. By enforcing user-defined constraints, we first restrict the multidimensional space to a small constrained cube and then mine exceptions on it. Two constrained-cube construction algorithms, the NAÏVE algorithm and the AGOA algorithm, are proposed. Experimental results indicate that this constraint-based exploratory mining method is efficient and scalable.
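The two-step idea in the abstract (restrict the space with a constraint, then look for exceptions only inside the restricted cube) can be illustrated with a small sketch. This is not the NAÏVE or AGOA algorithm, whose details the abstract does not give, and the deviation rule below is a simple stand-in rather than the exception model of [5]. The function names `constrained_cube` and `find_exceptions`, the sample rows, and the threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def constrained_cube(rows, dims, measure, predicate):
    """Sum `measure` over every subset of `dims`, using only rows that pass `predicate`.

    Hypothetical helper: each cell is keyed by a tuple of (dimension, value) pairs.
    """
    cube = defaultdict(float)
    for row in rows:
        if not predicate(row):  # the user-defined constraint prunes rows up front
            continue
        for k in range(len(dims) + 1):
            for group in combinations(dims, k):
                cell = tuple((d, row[d]) for d in group)
                cube[cell] += row[measure]
    return cube


def find_exceptions(cube, threshold=1.5):
    """Flag cells far from the mean of their own cuboid (assumed stand-in rule)."""
    by_cuboid = defaultdict(list)
    for cell, total in cube.items():
        group = tuple(d for d, _ in cell)
        by_cuboid[group].append((cell, total))
    flagged = []
    for cells in by_cuboid.values():
        values = [total for _, total in cells]
        if len(values) < 2:
            continue
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue
        flagged.extend(c for c, total in cells if abs(total - mu) > threshold * sigma)
    return flagged


# Illustrative data only; not taken from the paper.
rows = [
    {"region": "east", "product": "a", "year": 2001, "sales": 10},
    {"region": "east", "product": "b", "year": 2001, "sales": 12},
    {"region": "west", "product": "a", "year": 2001, "sales": 11},
    {"region": "west", "product": "b", "year": 2001, "sales": 90},
    {"region": "west", "product": "b", "year": 2000, "sales": 500},
]
cube = constrained_cube(rows, ("region", "product"), "sales", lambda r: r["year"] == 2001)
exceptions = find_exceptions(cube)
```

Because the constraint is applied before aggregation, the row from 2000 never contributes to any cell, and only the cuboids over the remaining rows are built and scanned.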
Similar resources
Message from Demo Chairs
Starting with the core data engineering demonstrations, Jaber and Voronkov present UNIDOOR, a deductive object-oriented database system (DOOD). Its distinctive features include a scalable persistent store with crash recovery, and database integrity and transaction control facilities in a multi-user environment. Cabibbo, Panella and Torlone introduce DaWaII (Data Warehouse IntegratIon), a tool f...
OMARS: The Framework of an Online Multi-Dimensional Association Rules Mining System
Recently, the integration of data warehouses and data mining has been recognized as the primary platform for facilitating knowledge discovery. Effective data mining from data warehouses, however, needs exploratory data analysis. The users often need to investigate the warehousing data from various perspectives and analyze them at different levels of abstraction. To this end, comprehensive infor...
Constraint-Based, Multidimensional Data Mining
Integrating both constraint-based and multidimensional mining into one framework provides an interactive, exploratory environment for effective and efficient data analysis and mining. Although there have been many data-mining methodologies and systems developed in recent years, we contend that by and large, present mining models lack human involvement, particularly in the form of guidance and user control. We believe that data mining is most effective when the computer does what it does best, like searching large databases or counting, and users do what they do best, like s...
Constraint-Based Querying for Bayesian Network Exploration
Understanding the knowledge that resides in a Bayesian network can be hard, certainly when a large network is to be used for the first time, or when the network is complex or has just been updated. Tools to assist users in the analysis of Bayesian networks can help. In this paper, we introduce a novel general framework and tool for answering exploratory queries over Bayesian networks. The frame...
A constraint-based querying system for exploratory pattern discovery
In this article we present ConQueSt, a constraint based querying system able to support the intrinsically exploratory (i.e., human-guided, interactive, iterative) nature of pattern discovery. Following the inductive database vision, our framework provides users with an expressive constraint based query language, which allows the discovery process to be effectively driven toward potentially inte...