Fast Inbound Top-K Query for Random Walk with Restart

Authors

  • Chao Zhang
  • Shan Jiang
  • Yucheng Chen
  • Yidan Sun
  • Jiawei Han
Abstract

Random walk with restart (RWR) is widely recognized as one of the most important node proximity measures for graphs, as it captures the holistic graph structure and is robust to noise in the graph. In this paper, we study a novel query based on the RWR measure, called the inbound top-k (Ink) query. Given a query node q and a number k, the Ink query aims at retrieving the k nodes in the graph that have the largest weighted RWR scores to q. Ink queries can be highly useful for various applications such as traffic scheduling, disease treatment, and targeted advertising. Nevertheless, none of the existing RWR computation techniques can accurately and efficiently process the Ink query in large graphs. We propose two algorithms, namely Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner. To identify the top-k nodes, Squeeze iteratively performs matrix-vector multiplication and estimates the lower and upper bounds for all the nodes in the graph. Ripple employs a more aggressive strategy by only estimating the RWR scores for the nodes falling in the vicinity of q; the nodes outside the vicinity do not need to be evaluated, because their RWR scores are propagated from the boundary of the vicinity and are thus upper bounded. Ripple incrementally expands the vicinity until the top-k result set can be obtained. Our extensive experiments on real-life graph data sets show that Ink queries can retrieve interesting results, and that the proposed algorithms are orders of magnitude faster than state-of-the-art methods.
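To make the idea of an inbound query with lower and upper bounds concrete, the following is a minimal Python sketch. It is not the paper's Squeeze or Ripple algorithm. It assumes an unweighted variant of the score (the node weighting mentioned in the abstract is omitted), a hypothetical adjacency-list input `adj` in which every node has at least one out-neighbor, and a hypothetical restart probability of 0.15. The vector of RWR scores from every node u to q satisfies x = (1 - c) P x + c e_q, where P is the row-stochastic transition matrix, so repeated matrix-vector products starting from zero give lower bounds, and the unassigned mass (1 - c)^t gives a generic upper bound on what any node can still gain.

```python
def inbound_top_k(adj, q, k, restart=0.15, max_iter=1000):
    """Illustrative inbound top-k by bounded power iteration.

    adj     : dict mapping each node to a non-empty list of out-neighbors
              (hypothetical input format, assumed for this sketch)
    q       : query node
    k       : number of nodes to return
    restart : restart probability c (0.15 is an assumed default)

    Returns (top_k_nodes, lower_bounds, slack), where the true score of
    each node u lies in [lower_bounds[u], lower_bounds[u] + slack].
    """
    nodes = list(adj)
    lower = {u: 0.0 for u in nodes}   # lower bounds on the scores to q
    slack = 1.0                       # (1 - c)^t, remaining mass after t steps
    ranked = nodes
    for _ in range(max_iter):
        # One matrix-vector product: x <- (1 - c) * P x + c * e_q
        lower = {
            u: (1.0 - restart) * sum(lower[v] for v in adj[u]) / len(adj[u])
               + (restart if u == q else 0.0)
            for u in nodes
        }
        slack *= 1.0 - restart
        ranked = sorted(nodes, key=lambda u: lower[u], reverse=True)
        if k >= len(ranked):
            break
        # Stop once the k-th lower bound dominates the (k+1)-th upper bound.
        if lower[ranked[k - 1]] >= lower[ranked[k]] + slack:
            break
    return ranked[:k], lower, slack


# Tiny usage example on a three-node directed graph.
example_graph = {0: [1], 1: [2], 2: [0, 1]}
top_nodes, bounds, remaining = inbound_top_k(example_graph, q=2, k=2)
print(top_nodes, remaining)
```

The query node itself is ranked along with the others in this sketch. Because the slack is the same for every node, the stopping test reduces to comparing the gap between the k-th and (k+1)-th lower bounds against the remaining mass; tighter, node-specific bounds would allow earlier termination.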


Similar resources

Fast and Exact Top-k Search for Random Walk with Restart

Graphs are fundamental data structures and have been employed for centuries to model real-world systems and phenomena. Random walk with restart (RWR) provides a good proximity score between two nodes in a graph, and it has been successfully used in many applications such as automatic image captioning, recommender systems, and link prediction. The goal of this work is t...


Reverse Top-k Search using Random Walk with Restart

With the increasing popularity of social networks, large volumes of graph data are becoming available. Large graphs are also derived by structure extraction from relational, text, or scientific data (e.g., relational tuple networks, citation graphs, ontology networks, protein-protein interaction graphs). Node-to-node proximity is the key building block for many graph-based applications that sea...


Random and Directed Walk-Based Top-k Queries in Wireless Sensor Networks

In wireless sensor networks, filter-based top-k query approaches are the state-of-the-art solutions and have been extensively researched in the literature; however, they are very sensitive to the network parameters, including the size of the network, the dynamics of the sensors' readings, and declines in the overall range of all the readings. In this work, a random walk-based top-k query approach ca...


Supervised and Extended Restart in Random Walks for Ranking and Link Prediction in Networks

Given a real-world graph, how can we measure relevance scores for ranking and link prediction? Random walk with restart (RWR) provides an excellent measure for this and has been applied to various applications such as friend recommendation, community detection, anomaly detection, etc. However, RWR suffers from two problems: 1) using the same restart probability for all the nodes limits the expr...


IRWRLDA: improved random walk with restart for lncRNA-disease association prediction

In recent years, accumulating evidence has shown that dysregulations of lncRNAs are associated with a wide range of human diseases. It is necessary and feasible to analyze known lncRNA-disease associations, predict potential lncRNA-disease associations, and provide the most probable lncRNA-disease pairs for experimental validation. Considering the limitations of traditional Random Walk wi...



Journal title:
  • Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD

Volume 9285

Publication date 2015