The Sum-Over-Paths Covariance: A novel covariance measure between nodes of a graph
Authors
Abstract
This work introduces a link-based covariance measure between the nodes of a weighted, directed graph in which a cost is associated with each arc. To this end, a probability distribution on the (usually infinite) set of paths through the network is defined by minimizing the sum of the expected costs between all pairs of nodes while fixing the total relative entropy spread in the network. Under the resulting distribution, long paths (with a high cost) occur with low probability while short paths (with a low cost) occur with high probability. The sum-over-paths (SoP) covariance measure is then computed according to this probability distribution: two nodes are highly correlated if they often co-occur on the same, preferably short, paths. The resulting covariance matrix between the nodes (n in total) is a Gram matrix and therefore defines a valid kernel matrix on the graph; it is obtained by inverting an n × n matrix. The proposed model could be used for various graph mining tasks such as computing betweenness centrality, semi-supervised classification, and visualization.
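The idea in the abstract can be illustrated with a brute-force sketch on a toy graph. This is not the closed-form computation the abstract refers to (which inverts an n × n matrix). It makes three simplifying assumptions: walks are truncated at `max_len` arcs, the reference distribution over paths is uniform, and the entropy-constrained minimization is taken to yield a Boltzmann-type distribution with probability proportional to exp(-theta × cost). The graph `COSTS`, the parameter `theta`, and all function names are hypothetical. Covariance is taken here between the numbers of times two nodes appear on a randomly drawn path.

```python
import math

# Hypothetical toy graph (not from the paper): arc -> cost.
COSTS = {
    (0, 1): 1.0, (1, 0): 1.0,
    (1, 2): 1.0, (2, 1): 1.0,
    (0, 2): 3.0, (2, 0): 3.0,
    (2, 3): 1.0, (3, 2): 1.0,
}
N_NODES = 4


def enumerate_paths(costs, n_nodes, max_len):
    """Return all walks with 1..max_len arcs as (node_tuple, total_cost)."""
    successors = {}
    for (i, j), c in costs.items():
        successors.setdefault(i, []).append((j, c))
    paths = []
    frontier = [((i,), 0.0) for i in range(n_nodes)]
    for _ in range(max_len):
        extended = []
        for nodes, cost in frontier:
            for j, c in successors.get(nodes[-1], []):
                extended.append((nodes + (j,), cost + c))
        paths.extend(extended)
        frontier = extended
    return paths


def path_distribution(paths, theta):
    """Assumed Boltzmann form: P(path) proportional to exp(-theta * cost)."""
    weights = [math.exp(-theta * cost) for _, cost in paths]
    z = sum(weights)
    return [w / z for w in weights]


def sop_covariance(paths, probs, n_nodes):
    """Covariance of node visit counts under the path distribution."""
    mean = [0.0] * n_nodes
    second = [[0.0] * n_nodes for _ in range(n_nodes)]
    for (nodes, _), p in zip(paths, probs):
        counts = [0] * n_nodes
        for v in nodes:
            counts[v] += 1
        for j in range(n_nodes):
            mean[j] += p * counts[j]
            for k in range(n_nodes):
                second[j][k] += p * counts[j] * counts[k]
    return [[second[j][k] - mean[j] * mean[k] for k in range(n_nodes)]
            for j in range(n_nodes)]


if __name__ == "__main__":
    paths = enumerate_paths(COSTS, N_NODES, max_len=4)
    probs = path_distribution(paths, theta=1.0)
    for row in sop_covariance(paths, probs, N_NODES):
        print(["%.4f" % x for x in row])
```

A larger `theta` concentrates the probability mass on low-cost paths, while `theta` close to zero approaches the uniform reference distribution assumed here. Enumeration grows exponentially with `max_len`, so this sketch is only practical for very small graphs.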
Similar resources
Novel Measures on Directed Graphs and Applications to Large-Scale Within-Network Classification
In recent years, networks have become a major data source in various fields ranging from social sciences to mathematical and physical sciences. Moreover, the size of available networks has grown substantially as well. This has brought with it a number of new challenges, like the need for precise and intuitive measures to characterize and analyze large-scale networks in a reasonable time. The fir...
Covariance decomposition in multivariate analysis
The covariance between two variables in a multivariate Gaussian distribution is decomposed into a sum of path weights for all paths connecting the two variables in an undirected graph. These weights are useful in determining which variables are important in mediating correlation between the two path endpoints. The decomposition arises in undirected Gaussian graphical models and does not require...
Covariance decomposition in undirected Gaussian graphical models
The covariance between two variables in a multivariate Gaussian distribution is decomposed into a sum of path weights for all paths connecting the two variables in an undirected independence graph. These weights are useful in determining which variables are important in mediating correlation between the two path endpoints. The decomposition arises in undirected Gaussian graphical models and doe...
Fourth order and fourth sum connectivity indices of tetrathiafulvalene dendrimers
The m-order connectivity index of a graph G is $^{m}\chi(G) = \sum_{v_{i_1} v_{i_2} \ldots v_{i_{m+1}}} \frac{1}{\sqrt{d_{i_1} d_{i_2} \cdots d_{i_{m+1}}}}$, where $v_{i_1} v_{i_2} \ldots v_{i_{m+1}}$ runs over all paths of length m in G and $d_i$ denotes the degree of vertex $v_i$. Also, $^{m}X(G) = \sum_{v_{i_1} v_{i_2} \ldots v_{i_{m+1}}} \frac{1}{\sqrt{d_{i_1} + d_{i_2} + \cdots + d_{i_{m+1}}}}$ is its m-sum connectivity index. A dendrimer is an artificially manufactured or synth...
A Novel Image Structural Similarity Index Considering Image Content Detectability Using Maximally Stable Extremal Region Descriptor
Image content detectability and image structure preservation are closely related concepts with an undeniable role in image quality assessment. However, most image quality studies have concentrated on image structure evaluation, and few have focused on image content detectability. Examination of image structure was first introduced and assessed in the Structural SIMilarity (SSIM) measu...