Stochastic blockmodels with growing number of classes

Authors

  • David S. Choi
  • Patrick J. Wolfe
  • Edoardo M. Airoldi
Abstract

Latent variable models are frequently used to identify structure in dichotomous network data, in part because they give rise to a Bernoulli product likelihood that is both well understood and consistent with the notion of exchangeable random graphs. In this article we propose conservative confidence sets that hold with respect to these underlying Bernoulli parameters as a function of any given partition of network nodes, enabling us to assess estimates of residual network structure, that is, structure that cannot be explained by known covariates and thus cannot be easily verified by manual inspection. We demonstrate the proposed methodology by analyzing student friendship networks from the National Longitudinal Survey of Adolescent Health that include race, gender, and school year as covariates. We employ a stochastic expectation-maximization algorithm to fit a logistic regression model that includes these explanatory variables as well as a latent stochastic blockmodel component and additional node-specific effects. Although maximum-likelihood estimates do not appear consistent in this context, we are able to evaluate confidence sets as a function of different blockmodel partitions, which enables us to qualitatively assess the significance of estimated residual network structure relative to a baseline, which models covariates but lacks block structure.
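As a rough illustration of the Bernoulli product likelihood mentioned in the abstract, the following sketch samples a small undirected network from a plain stochastic blockmodel and, for a fixed partition of the nodes, computes the per-block edge densities and the resulting log-likelihood. It is not the authors' method: it omits covariates, node-specific effects, the stochastic expectation-maximization fit, and the confidence sets. The function names (`sample_sbm`, `block_mle`, `log_likelihood`) and the example parameters are hypothetical choices made for this sketch.

```python
import math
import random


def sample_sbm(labels, block_probs, seed=0):
    """Sample a symmetric 0/1 adjacency matrix without self-loops.

    Each pair i < j is an independent Bernoulli draw whose success
    probability depends only on the classes of i and j.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = block_probs[labels[i]][labels[j]]
            adj[i][j] = adj[j][i] = 1 if rng.random() < p else 0
    return adj


def block_mle(adj, labels, k):
    """Edge density for each pair of classes (filled for a <= b only).

    For a fixed partition this is the maximizer of the Bernoulli
    product likelihood over the block parameters.
    """
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sorted((labels[i], labels[j]))
            edges[a][b] += adj[i][j]
            pairs[a][b] += 1
    return [[edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0
             for b in range(k)] for a in range(k)]


def log_likelihood(adj, labels, theta, eps=1e-12):
    """Bernoulli product log-likelihood of adj given a partition and theta."""
    total = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sorted((labels[i], labels[j]))
            p = min(max(theta[a][b], eps), 1.0 - eps)  # avoid log(0)
            total += math.log(p) if adj[i][j] else math.log(1.0 - p)
    return total


# Example: two classes of ten nodes each (illustrative values only).
labels = [0] * 10 + [1] * 10
true_probs = [[0.8, 0.1], [0.1, 0.7]]
adj = sample_sbm(labels, true_probs, seed=1)
theta_hat = block_mle(adj, labels, k=2)
ll_hat = log_likelihood(adj, labels, theta_hat)
ll_true = log_likelihood(adj, labels, true_probs)
```

Because the likelihood factorizes over blocks once the partition is fixed, the estimated densities can never give a lower log-likelihood than the generating probabilities on the same data, which is a convenient sanity check when comparing different partitions.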

Similar resources

Stochastic blockmodels with a growing number of classes.

We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the root of the network size and the average network degree grows at least poly-logarithmically in this si...

Variable selection for (realistic) stochastic blockmodels

Stochastic blockmodels provide a convenient representation of relations between communities of nodes in a network. However, they imply a notion of stochastic equivalence that is often unrealistic for real networks, and they comprise a large number of parameters that can make them hard to interpret. We discuss two extensions of stochastic blockmodels, and a recently proposed variable selection ...

Building stochastic blockmodels

The literature devoted to the construction of stochastic blockmodels is relatively rare compared to that of the deterministic variety. In this paper, a general definition of a stochastic blockmodel is given and a number of techniques for building such blockmodels are presented. In the statistical approach, the likelihood ratio statistic provides a natural index to evaluate the fit of the model ...

Network Construction Methods for the Simulation of Stochastic Blockmodels DRAFT

This working paper summarizes a number of methods to construct networks represented as stochastic blockmodels. The different methods are developed to obtain networks that vary in relevant network measures, e.g., density, outdegrees, indegrees, degree variances.

Clustering via Content-Augmented Stochastic Blockmodels

Much of the data being created on the web contains interactions between users and items. Stochastic blockmodels, and other methods for community detection and clustering of bipartite graphs, can infer latent user communities and latent item clusters from this interaction data. These methods, however, typically ignore the items’ contents and the information they provide about item clusters, desp...

Scalable MCMC for Mixed Membership Stochastic Blockmodels

We propose a stochastic gradient Markov chain Monte Carlo (SG-MCMC) algorithm for scalable inference in mixed-membership stochastic blockmodels (MMSB). Our algorithm is based on the stochastic gradient Riemannian Langevin sampler and achieves both faster speed and higher accuracy at every iteration than the current state-of-the-art algorithm based on stochastic variational inference. In additio...


Journal:
  • CoRR

Volume: abs/1011.4644

Publication date: 2010