Automatic Discovery of Functional Dependencies and Conditional Functional Dependencies: A Comparative Study
Abstract
Over the last twenty years, several algorithms have been proposed for automatic rule/constraint discovery from data, for the purpose of data cleaning. These algorithms look for constraints such as functional dependencies (FDs), conditional FDs (CFDs), inclusion dependencies (INDs), conditional INDs (CINDs), association rules, integrity constraints (ICs) and denial constraints (DCs), among others. While some of these techniques are direct generalizations and extensions of others, many differ greatly from the rest in approach, characteristics and general flavour. Many of these algorithms have not been tested and compared against each other, so their core differences and relative strengths and weaknesses are hard to comprehend.
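As a rough illustration of the two constraint types named in the title, the sketch below checks whether an FD holds on a small table, and whether the same dependency holds as a CFD once it is restricted to rows matching a constant pattern. It only validates a given rule; it is not one of the discovery algorithms the study compares. The function names `fd_holds` and `cfd_holds` and the toy address rows are assumptions made up for this example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows.

    An FD holds when any two rows that agree on all lhs attributes
    also agree on all rhs attributes.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value;
        # any later disagreement violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


def cfd_holds(rows, lhs, rhs, condition):
    """Check lhs -> rhs only on rows matching the constant pattern `condition`.

    `condition` maps attribute names to required constant values
    (a simplified, hypothetical encoding of a CFD pattern).
    """
    matching = [r for r in rows if all(r[a] == v for a, v in condition.items())]
    return fd_holds(matching, lhs, rhs)


# Hypothetical toy data: zip determines street in the UK rows only.
rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "US", "zip": "07974", "street": "Mtn Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]

print(fd_holds(rows, ["zip"], ["street"]))                      # False
print(cfd_holds(rows, ["zip"], ["street"], {"country": "UK"}))  # True
```

On this toy table the plain FD zip -> street fails because of the two US rows, while the conditional version restricted to country = "UK" holds.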
Similar resources
Discover Dependencies from Data - A Review
Functional and inclusion dependency discovery is important to knowledge discovery, database semantics analysis, database design, and data quality assessment. Motivated by the importance of dependency discovery, this paper reviews the methods for functional dependency, conditional functional dependency, approximate functional dependency and inclusion dependency discovery in relational databases ...
Approximation Measures for Conditional Functional Dependencies Using Stripped Conditional Partitions
Received Apr 11, 2017; revised May 5, 2017; accepted May 24, 2017. Conditional functional dependencies (CFDs) have been used to improve the quality of data, including detecting and repairing data inconsistencies. Approximation measures have significant importance for data dependencies in data mining. To adapt to exceptions in real data, the measures are used to relax the strictness of CFDs for mor...
Discovering Denial Constraints
Integrity constraints (ICs) provide a valuable tool for enforcing correct application semantics. However, designing ICs requires experts and time. Proposals for automatic discovery have been made for some formalisms, such as functional dependencies and their extension conditional functional dependencies. Unfortunately, these dependencies cannot express many common business rules. For example, a...
Effective Pruning for the Discovery of Conditional Functional Dependencies
Conditional Functional Dependencies (CFDs) have been proposed as a new type of semantic rules extended from traditional functional dependencies. They have shown great potential for detecting and repairing inconsistent data. Constant CFDs are 100% confidence association rules. The theoretical search space for the minimal set of CFDs is the set of minimal generators and their closures in data. Th...
Extending Matching Rules with Conditions
Matching dependencies (MDs) have recently been proposed [10] in order to make dependencies tolerant to various information representations, and proved [13] useful in data quality applications such as record matching. Instead of the strict identification function in traditional dependency syntax (e.g., functional dependencies), MDs specify dependencies based on similarity matching quality. However, ...