Extended Similarity Trees

Author

  • JAMES E. CORTER
Abstract

Trees are commonly used to represent proximity relations that emerge, for instance, from studies of classification, similarity, and identification. Trees are employed to describe the data, explore their structure, and model their generating process. They offer a convenient graphical display that is readily interpretable in terms of a hierarchy of clusters (Sokal & Sneath, 1963) or in terms of common and distinctive features (Tversky, 1977). The simplest tree structure is the hierarchical clustering model (Jardine & Sibson, 1971; Johnson, 1967) based on the ultrametric inequality, which states that for any triple of points the two larger distances are equal. That is, any three points can be labeled x, y, z such that d(x, z) = d(y, z) ≥ d(x, y). This assumption gives rise to a tree in which all the endpoints (leaves) are equally distant from the root. The ultrametric tree is highly restrictive because any two elements of one cluster must be equally similar to any other element outside the cluster. This restriction is relaxed in the additive tree (e.g., Cunningham, 1978; Sattath & Tversky, 1977), where the leaves are not necessarily equidistant from the root. The additive tree provides greater flexibility than the ultrametric tree, but it too cannot accommodate (nonnested) overlapping clusters because any two clusters in a tree are either nested or disjoint. Throughout the paper we use the standard abbreviations (e.g., HICLUS, ADDTREE, ADCLUS) for scaling algorithms, and the unabbreviated forms (e.g., hierarchical clustering, additive tree, additive clustering) for the respective models. This article describes a new representation of proximity relations, called an extended tree, which accommodates nonnested feature structures while maintaining the basic property of a tree that every pair of points is joined by a unique path. To motivate and...
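
The ultrametric inequality described in the abstract, and the weaker condition that characterizes additive trees, can each be checked directly on a table of pairwise distances. The following is a minimal sketch, not the author's code: the function names, the dict-of-dicts layout, and the toy distances are assumptions made for illustration. The additive check uses the standard four-point condition for tree metrics, which this excerpt does not state explicitly.

```python
from itertools import combinations


def symmetric(pairs):
    """Build a symmetric dict-of-dicts distance table from {(x, y): value}."""
    d = {}
    for (x, y), value in pairs.items():
        d.setdefault(x, {})[y] = value
        d.setdefault(y, {})[x] = value
    return d


def is_ultrametric(d, points, tol=1e-9):
    """Ultrametric inequality: in every triple the two larger distances are equal."""
    for x, y, z in combinations(points, 3):
        dists = sorted([d[x][y], d[x][z], d[y][z]])
        if abs(dists[2] - dists[1]) > tol:
            return False
    return True


def is_additive(d, points, tol=1e-9):
    """Four-point condition: of the three pairwise sums, the two larger are equal."""
    for w, x, y, z in combinations(points, 4):
        sums = sorted([
            d[w][x] + d[y][z],
            d[w][y] + d[x][z],
            d[w][z] + d[x][y],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


points = ["a", "b", "c", "d"]

# Hypothetical ultrametric data: clusters {a, b} and {c, d}, all leaves
# equally distant from the root.
ultra = symmetric({
    ("a", "b"): 2, ("c", "d"): 2,
    ("a", "c"): 4, ("a", "d"): 4, ("b", "c"): 4, ("b", "d"): 4,
})

# Hypothetical additive data: path lengths in a tree with leaf edges
# a=1, b=3, c=2, d=1 and an internal edge of length 2 joining (a, b) to (c, d).
addi = symmetric({
    ("a", "b"): 4, ("c", "d"): 3,
    ("a", "c"): 5, ("a", "d"): 4, ("b", "c"): 7, ("b", "d"): 6,
})

print(is_ultrametric(ultra, points), is_additive(ultra, points))
print(is_ultrametric(addi, points), is_additive(addi, points))
```

In this toy data the first table satisfies both conditions, while the second satisfies only the four-point condition, which illustrates why the additive tree is the more flexible of the two models.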


Similar resources

On pseudosimilarity in trees

Two vertices u and v in a graph G are said to be removal-similar if G\u ≅ G\v. Vertices which are removal-similar but not similar are said to be pseudosimilar. A characterization theorem is presented for trees (later extended to forests and block graphs) with pseudosimilar vertices. It follows from this characterization that it is not possible to have three or more mutually pseudosimilar vertic...

Computer simulation of tree development with random variations and probabilistic growth of branches

This paper presents the use of self-similarity with random variations and probabilistic growth of branches as the basis for generating natural forms of trees. Bifurcation branching geometry has been considered as the fundamental element in the formation of tree structures and Honda’s model was used to generate the recursive branching patterns. In order to simulate the dynamic variations of natur...

Pólya Urn Models and Connections to Random Trees: A Review

This paper reviews Pólya urn models and their connection to random trees. Basic results are presented, together with proofs that underlie the historical evolution of the accompanying thought process. Extensions and generalizations are given according to chronology:

  • Pólya-Eggenberger’s urn
  • Bernard Friedman’s urn
  • Generalized Pólya urns
  • Extended urn schemes
  • Invertible urn schemes
  ...

Log-Poisson Statistics and Extended Self-Similarity in Driven Dissipative Systems

The Bak-Chen-Tang forest fire model (1) was proposed as a toy model of turbulent systems, where energy (in the form of trees) is injected uniformly and globally, but is dissipated (burns) locally. We review our previous results on the model (2; 3) and present our new results on the statistics of the higher-order moments for the spatial distribution of fires. We show numerically that the spatial...

A general framework for estimating similarity of datasets and decision trees: exploring semantic similarity of decision trees

Decision trees are among the most popular pattern types in data mining due to their intuitive representation. However, little attention has been given to the definition of measures of semantic similarity between decision trees. In this work, we present a general framework for similarity estimation that includes as special cases the estimation of semantic similarity between decision trees, as we...

Publication date: 2005