Motif estimation via subgraph sampling: The fourth-moment phenomenon

Abstract

Network sampling is an indispensable tool for understanding features of large complex networks where it is practically impossible to search over the entire graph. In this paper, we develop a framework for statistical inference for counting network motifs, such as edges, triangles, and wedges, in the widely used subgraph sampling model, where each vertex is sampled independently and the subgraph induced by the sampled vertices is observed. We derive necessary and sufficient conditions for the consistency and the asymptotic normality of the natural Horvitz–Thompson (HT) estimator, which can be used for constructing confidence intervals and hypothesis testing for the motif counts based on the sampled graph. In particular, we show that the asymptotic normality of the HT estimator exhibits an interesting fourth-moment phenomenon, which asserts that the HT estimator (appropriately centered and rescaled) converges in distribution to the standard normal whenever its fourth moment converges to 3 (the fourth moment of the standard normal distribution). As a consequence, we derive the exact thresholds for consistency and asymptotic normality of the HT estimator in various natural graph ensembles, such as sparse graphs with bounded degree, Erdős–Rényi random graphs, random regular graphs, and dense graphons.
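
The sampling model and estimator described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the usual inverse-probability form of the HT estimator for this model: the number of copies of a motif seen in the sampled induced subgraph, divided by p raised to the number of vertices in the motif, where p is the per-vertex sampling probability. All function names (complete_graph, count_edges, count_triangles, ht_estimate, standardized_fourth_moment) and the adjacency-dictionary representation are hypothetical choices made for illustration.

```python
import itertools
import random


def complete_graph(n):
    """Adjacency dict of the complete graph on vertices 0..n-1 (toy input)."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


def count_edges(adj, vertices):
    """Count edges of the subgraph induced by `vertices`."""
    vs = set(vertices)
    return sum(1 for u in vs for v in adj[u] if v in vs and u < v)


def count_triangles(adj, vertices):
    """Count triangles of the subgraph induced by `vertices` (brute force)."""
    return sum(
        1
        for a, b, c in itertools.combinations(sorted(vertices), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def ht_estimate(adj, p, count_fn, motif_size, rng):
    """One draw of the assumed HT estimator under subgraph sampling.

    Each vertex is kept independently with probability p, the motif is
    counted in the induced subgraph, and the count is divided by
    p ** motif_size (the probability that all vertices of one copy survive).
    """
    sampled = {v for v in adj if rng.random() < p}
    return count_fn(adj, sampled) / p ** motif_size


def standardized_fourth_moment(samples):
    """Empirical fourth moment after centering and rescaling to unit variance.

    Assumes the samples are not all equal (nonzero variance).
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return sum((x - mean) ** 4 for x in samples) / (n * var ** 2)
```

As a usage note, repeated calls such as ht_estimate(adj, 0.5, count_triangles, 3, random.Random(0)) give independent draws of the estimator, and passing a list of such draws to standardized_fourth_moment gives an empirical value that can be compared with 3, the fourth moment of the standard normal distribution. The brute-force triangle count is only practical for small graphs.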

Related resources

Entropy and the fourth moment phenomenon

We develop a new method for bounding the relative entropy of a random vector in terms of its Stein factors. Our approach is based on a novel representation for the score function of smoothly perturbed random variables, as well as on de Bruijn's identity from information theory. When applied to sequences of functionals of a general Gaussian field, our results can be combined with the Carbery-W...

Graph animals, subgraph sampling, and motif search in large networks.

We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for "graph animals," i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of...

Distributed Bayesian Posterior Sampling via Moment Sharing

We propose a distributed Markov chain Monte Carlo (MCMC) inference algorithm for large scale Bayesian posterior simulation. We assume that the dataset is partitioned and stored across nodes of a cluster. Our procedure involves an independent MCMC posterior sampler at each node based on its local partition of the data. Moment statistics of the local posteriors are collected from each sampler and...

The optimal fourth moment theorem

We compute the exact rates of convergence in total variation associated with the ‘fourth moment theorem’ by Nualart and Peccati (2005), stating that a sequence of random variables living in a fixed Wiener chaos verifies a central limit theorem (CLT) if and only if the sequence of the corresponding fourth cumulants converges to zero. We also provide an explicit illustration based on the Breuer-M...

General Consistent Moment Estimation via Negligibility

The asymptotic normality of the sample mean of iid rv's is equivalent to the well known conditions of Levy and Feller. More recently, additional equivalences have been developed in terms of the quantile function (qf). And other useful probabilistic equivalences could be cited. But the asymptotic normality cited above is also equivalent to appropriately phrased consistency of the sample second m...

Journal

Journal title: Annals of Statistics

Year: 2022

ISSN: 0090-5364, 2168-8966

DOI: https://doi.org/10.1214/21-aos2134