Pattern Theoretic Learning
Authors
Abstract
The goal of learning from sample data is to extract a concept that captures the underlying pattern while still representing it in a way that is useful to the investigator. A new approach, based on function decomposition in the Pattern Theory framework, is presented here. The objective of this extended abstract is three-fold. First, we provide an overview of our new approach to learning; specifically, we wish to show its applicability to discovery. Second, we demonstrate the correlation between decomposed function cardinality (DFC) and the informal notion of a "patterned" function. Finally, we demonstrate the robustness of this approach with experimental results on binary functions compared against C4.5. This new approach to discovery and learning is a powerful method for finding patterns in a robust manner.

*Email: goldmanj@aa.wpafb.af.mil

1 The Pattern Theory Approach

Pattern Theory is a discipline that arose out of machine learning [2] [6] and switching theory [5]. The original goal was to develop formal methods of algorithm design from specifications. The approach is based on a technique called function decomposition and a measure called decomposed function cardinality (DFC). Since Pattern Theory is able to extrapolate available information based on the inherent structure in the data, it is directly related to scientific discovery.

Decomposing a function involves breaking it up into smaller subfunctions. These smaller functions are broken down further until no subfunction decomposes any more. For a given function, the number of ways to choose two sets of variables (the partition space) is exponential. The decomposition space is even larger, since there are several ways the subfunctions can be combined and several levels of subfunctions are possible.

[Figure 1: Lookup table for F(X, Y, Z, W)]
[Figure 2: Decomposition]

The complexity measure that we use to determine the relative predictive power of different function decompositions is called DFC. DFC is calculated by adding the cardinalities of each of the subfunctions in the decomposition; the cardinality of an n-variable binary function is 2^n. We illustrate the measure in the figures above. In Figure 1, we have a function on four variables with cardinality 2^4 = 16. In Figure 2, we show the same function after it has been decomposed. The DFC of this representation of the original function is 2^2 + 2^2 + 2^2 = 12. (A short sketch of this computation appears at the end of this section.)

The DFC measures the relative complexity of a function. When we search through the possible decompositions of a function, we choose one with the smallest DFC; this decomposition is our learned concept. (A sketch of a basic decomposability test also appears at the end of this section.) The decomposed representation of the function is one that exhibits more information than the alternative: Figure 1 is essentially a lookup table of inputs and outputs, while Figure 2 is a function that is not simply a table. The decomposition could, for example, be two simple functions combined together. Throughout the paper, when we refer to a minimal function decomposition, we use "minimal" to mean a decomposition whose DFC is the smallest over the entire set of decompositions. Note that a given minimal decomposition is not unique. For a more rigorous explanation of the inner workings of ...
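To make the DFC arithmetic above concrete, here is a minimal sketch of the measure. The function names (cardinality, dfc) and the representation of a decomposition as a list of subfunction arities are illustrative assumptions, not the paper's implementation.

```python
def cardinality(n_vars: int) -> int:
    """Cardinality of an n-variable binary function: 2^n table entries."""
    return 2 ** n_vars


def dfc(subfunction_arities) -> int:
    """DFC = sum of the cardinalities of the subfunctions in a decomposition."""
    return sum(cardinality(n) for n in subfunction_arities)


# Figure 1: F(X, Y, Z, W) as a flat lookup table -> cardinality 2^4 = 16.
print(dfc([4]))        # 16
# Figure 2: the same F decomposed into three 2-variable subfunctions,
# giving DFC = 2^2 + 2^2 + 2^2 = 12, i.e. a simpler representation.
print(dfc([2, 2, 2]))  # 12
```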
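The section only states that the partition space is exponential; it does not spell out how a single decomposition step is tested. The sketch below is an assumption for illustration, using the standard Ashenhurst-style criterion for the simplest disjoint case: F can be written as H(G(bound vars), free vars) with a single binary intermediate output G exactly when the truth table, arranged with the bound variables indexing columns, has at most two distinct columns. The names column_multiplicity and decomposable_partitions, and the example function F, are hypothetical.

```python
from itertools import combinations, product


def column_multiplicity(f, n, bound):
    """Number of distinct columns when f's truth table is arranged with
    the bound variables indexing columns and the remaining (free)
    variables indexing rows."""
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b in product([0, 1], repeat=len(bound)):
        col = []
        for r in product([0, 1], repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b):
                x[i] = v
            for i, v in zip(free, r):
                x[i] = v
            col.append(f(tuple(x)))
        columns.add(tuple(col))
    return len(columns)


def decomposable_partitions(f, n):
    """Brute-force scan of the (exponential) partition space: return all
    bound sets (2 <= |bound| < n) admitting F = H(G(bound), free)."""
    found = []
    for k in range(2, n):
        for bound in combinations(range(n), k):
            if column_multiplicity(f, n, bound) <= 2:
                found.append(bound)
    return found


# Example: F(X, Y, Z, W) = (X and Y) or (Z and W) decomposes on
# the bound sets {X, Y} and {Z, W}.
def F(x):
    return (x[0] & x[1]) | (x[2] & x[3])


print(decomposable_partitions(F, 4))  # [(0, 1), (2, 3)]
```

Enumerating bound sets like this illustrates why the partition space is exponential and why a search guided by the smallest DFC, as the section describes, is needed rather than exhaustive enumeration for functions of many variables.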