Graph Model Selection using the Minimum Description Length Principle

Abstract

In recent years, there has been a proliferation of theoretical graph models, such as preferential attachment, motivated by real-world graphs such as the Web or the Internet topology. Typically these models are designed to mimic particular properties observed in the graphs, such as a power-law degree distribution or the small-world phenomenon. The mainstream approach to comparing models for these graphs has been somewhat subjective and highly application-dependent: comparisons are often based on ad hoc graph properties. We use the Minimum Description Length principle to compare graph models: models are scored by the degree of compression that they achieve on real data. This principle is popular across fields for various types of model selection because it is objective and not application-specific. Unfortunately, computing this metric is usually a daunting algorithmic task, especially for existing models that were not designed with this metric in mind. To illustrate the feasibility of our approach, we design and implement sophisticated algorithms for computing the description length for four natural models: a power-law random graph model, a preferential attachment model, a small-world model, and a uniform random graph model. Based on experiments on three snapshots of the Internet topology graph, we find that the preferential attachment model ranks highest, while the uniform random graph model performs worst. We hope that this metric will enable more objective model comparison and the development of improved models.
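To make the idea of scoring a model by compression concrete, here is a minimal sketch for the simplest of the four models, the uniform random graph. It is not the paper's algorithm: it assumes a two-part code in which the model part states the edge count m and the data part names one graph among all graphs on n labeled nodes with exactly m edges. The function names are hypothetical and chosen for illustration only.

```python
import math


def log2_binomial(n: int, k: int) -> float:
    """Return log2 of C(n, k), using lgamma to avoid huge integers."""
    if k < 0 or k > n:
        raise ValueError("k must satisfy 0 <= k <= n")
    log_e = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_e / math.log(2)


def uniform_model_bits(num_nodes: int, num_edges: int) -> float:
    """Two-part description length (in bits) under an assumed uniform code.

    Model part: which edge count m in 0..N was used, log2(N + 1) bits.
    Data part:  which of the C(N, m) graphs with m edges, log2 C(N, m) bits.
    N is the number of unordered node pairs in a simple undirected graph.
    """
    pairs = num_nodes * (num_nodes - 1) // 2
    model_bits = math.log2(pairs + 1)
    data_bits = log2_binomial(pairs, num_edges)
    return model_bits + data_bits


def adjacency_bits(num_nodes: int) -> int:
    """Baseline: one bit per unordered node pair, with no model at all."""
    return num_nodes * (num_nodes - 1) // 2


if __name__ == "__main__":
    n, m = 1000, 3000  # a sparse graph, sizes chosen only for illustration
    print(f"uniform model:    {uniform_model_bits(n, m):.0f} bits")
    print(f"adjacency matrix: {adjacency_bits(n)} bits")
```

Under this assumed code, a sparse graph costs far fewer bits than its raw adjacency matrix, and a model that captures more of the graph's structure would be expected to shorten the description further. That difference in bits is the quantity used to rank models.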

Similar resources

Uncertainty Measures of Rough Set Prediction

The main statistic used in rough set data analysis, the approximation quality, is of limited value when there is a choice of competing models for predicting a decision variable. In keeping with the rough set philosophy of non-invasive data analysis, we present three model selection criteria, using information theoretic entropy in the spirit of the minimum description length principle. Our ma...

Model Selection using Information Theory and the MDL Principle

Information theory offers a coherent, intuitive view of model selection. This perspective arises from thinking of a statistical model as a code, an algorithm for compressing data into a sequence of bits. The description length is the length of this code for the data plus the length of a description of the model itself. The length of the code for the data measures the fit of the model to the dat...

Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity

The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles minimum description length (MDL) and minimum message length (MML), abstracted as the ideal MDL principle and defined from Bayes’s rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be app...

Algorithmic Complexity and Structural Models of Social Networks

This article explores how the algorithmic complexity approach can be used to address the problem of identifying group structures in social networks. A specific implementation of the algorithmic complexity approach based on the principle of minimum description length (MDL) is compared to other model selection criteria, and compared and contrasted with a Bayesian approach to model selection. The ...

The Minimum Description Length Principle in Coding and Modeling

We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon’s basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to ...

Publication date: 2004
