tree attribute

MIDCA --- A Discretization Model for Data Preprocessing in Data Mining

2006

Sam Chao, Fai Wong, Yiping Li

Decision trees are among the most widely used and practical methods in data mining and machine learning. However, many discretization algorithms developed in this field are univariate only, which is inadequate for the critical problems particular to the medical domain. In this paper, we propose a new multivariate discretization method called Multivariate Interdependent Di...


Direct Mining of Closed Tree Patterns With Subtree Constraint

2009

Viet Anh NGUYEN, Koichiro DOI, Akihiro YAMAMOTO

Two critical bottlenecks in mining frequent tree patterns from tree databases are the exponential number of mined patterns and the lack of user focus in the mining process. In this paper, we propose an algorithm that solves these problems for unordered attribute trees by mining only the compact representation of tree patterns, i.e. closed tree patterns, and allows users to mine only trees of th...


Arbogodaï, a New Approach for Decision Trees

2003

Djamel A. Zighed, Gilbert Ritschard, Walid Erray, Vasile-Marian Scuturici

Decision tree methods generally assume that the number of categories of the attribute to be predicted is fixed. Breiman et al., with their Twoing criterion in CART, considered gathering the categories of the predicted attribute into two superclasses. In this paper, we propose an extension of this method: we try to merge the categories into an optimal, unspecified number of superclasses. Our metho...


Attribute Selection with a Multi-objective Genetic Algorithm

2002

Gisele L. Pappa, Alex Alves Freitas, Celso A. A. Kaestner

In this paper we address the problem of multiobjective attribute selection in data mining. We propose a multiobjective genetic algorithm (GA) based on the wrapper approach to discover the best subset of attributes for a given classification algorithm, namely C4.5, a well-known decision-tree algorithm. The two objectives to be minimized are the error rate and the size of the tree produced by C4....


A Counter Example to the Stronger Version of the Binary Tree Hypothesis

1995

Igor Kononenko

The paper describes a counter example to the hypothesis that a greedy decision tree generation algorithm that constructs binary decision trees and branches on a single attribute-value pair, rather than on all values of the selected attribute, will always lead to a tree with fewer leaves for any given training set. We also show that RELIEFF is less myopic than other impurity functions...


Indexing Techniques for Power Management in Multi-Attribute Data

2007

Qinglong Hu, Wang-Chien Lee, Dik Lun Lee

In this paper, we discuss power-conserving indexing techniques for managing multi-attribute data broadcast on wireless channels. These indexing techniques, namely index tree, signature, and hybrid, aim to improve the battery power consumption of mobile clients. By taking into account broadcast management factors such as clustering and scheduling, these three indexing schemes may sig...


Containment for Tree Patterns with Attribute Value Comparisons

2013

Evgeny Sherkhonov, Maarten Marx

Tree patterns (TP) are a simple and widely used fragment of XPath. The containment problem in TP has been studied extensively. It was shown that its complexity ranges from PTime to PSpace depending on the available constructs. In this paper we study the complexity of the containment problem for tree patterns with attribute value comparisons. We show that the complexity ran...


On the Use of Taxonomies for Representing Case Features and Local Similarity Measures

1998

Ralph Bergmann

Taxonomies are often used to define attribute types in the case representation. The symbolic values at any node of the taxonomy tree are used as attribute values in a case or a query. A taxonomy type represents a relationship between the symbols through their position within the taxonomy tree, which expresses knowledge about the similarity between the symbols. This paper analyz...


Efficient Model-based Fuzz Testing Using Higher-order Attribute Grammars

Journal: JSW, 2013

Fan Pan, Ying Hou, Zheng Hong, Lifa Wu, Haiguang Lai

Format specifications of input data are critical to model-based fuzz testing. Existing methods cannot describe the format accurately, which leads to high redundancy in testing practice. To improve testing efficiency, we propose a grammar-driven approach to fuzz testing. First, we build a formal model of the data format using higher-order attribute grammars, and construct a syntax tree on t...


SQL-AG: Querying structured documents using attribute grammars

2003

Jan Van den Bussche, Stijn Vansummeren, Dieter Vrancken

Structured documents, such as program source texts, technical documentation, or XML data, comprise an important class of data in many applications. Structured documents are distinguished from flat text by their tree structure. In a program source text, this structure is the abstract syntax tree of the program. In a technical document, this structure is the division into chapters, sections, paragr...
