Bipartite Graph Sampling Methods for Sampling Recommendation Data

Author

Zan Huang

Abstract

Sampling is common practice in academic and industry efforts to evaluate and select recommendation algorithms. Experimental analysis often uses a subset of the entire user-item interaction data available in the operational recommender system, typically derived by including all transactions associated with a uniformly random subset of users. Our paper formally studies the sampling problem for recommendation to understand the extent to which population-based algorithm evaluation results correspond with sample-based results under different sampling methods. We use a bipartite graph to represent the key input data of user-item interaction for recommendation algorithms and build on the literature on unipartite graph sampling to develop sampling methods for our context of bipartite graph sampling. We also develop several metrics for assessing the quality of a given sample, including performance recovery and ranking recovery measures for assessing both single-sample and multiple-sample recovery performance. Based on the empirical results from two real-world datasets, we provide some general recommendations on sampling for recommendation algorithm evaluation.
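
The user-based sampling mentioned in the abstract (keep all transactions of a uniformly random subset of users) can be sketched as follows. This is only a minimal illustration, not the paper's implementation: the function name `sample_by_users`, the `(user, item)` pair representation of the bipartite graph, and the `fraction` and `seed` parameters are assumptions introduced here.

```python
import random
from typing import Hashable, Iterable, List, Optional, Tuple

Edge = Tuple[Hashable, Hashable]  # one (user, item) edge of the bipartite graph


def sample_by_users(edges: Iterable[Edge], fraction: float,
                    seed: Optional[int] = None) -> List[Edge]:
    """Keep every interaction of a uniformly random subset of users.

    Hypothetical helper for illustration; `fraction` is the share of
    users to retain, not the share of edges.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    edges = list(edges)
    # Sort the distinct users so a fixed seed gives a reproducible sample.
    users = sorted({u for u, _ in edges}, key=repr)
    if not users:
        return []
    k = max(1, round(fraction * len(users)))
    chosen = set(random.Random(seed).sample(users, k))
    # Retain all edges incident to the chosen users.
    return [(u, i) for u, i in edges if u in chosen]


interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"),
                ("u3", "c"), ("u4", "b"), ("u4", "c")]
sample = sample_by_users(interactions, fraction=0.5, seed=0)
```

In such a sample every retained user keeps their full interaction history, while items can only lose interactions, so the item-side degree distribution of the sample may differ from that of the population.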

Similar resources

Sampling Online Social Networks by Random Walk with Indirect Jumps

Random walk-based sampling methods are gaining popularity and importance in characterizing large networks. While powerful, they suffer from the slow mixing problem when the graph is loosely connected, which results in poor estimation accuracy. Random walk with jumps (RWwJ) can address the slow mixing problem but it is inapplicable if the graph does not support uniform vertex sampling (UNI). In ...

Estimating Node Similarity by Sampling Streaming Bipartite Graphs

Bipartite graph data increasingly occurs as a stream of edges that represent transactions, e.g., purchases by retail customers. Applications such as recommender systems employ neighborhood-based measures of node similarity, such as the pairwise number of common neighbors (CN) and related metrics. While the number of node pairs that share neighbors is potentially enormous, in real-world graphs on...

Efficient Sampling for Bipartite Matching Problems

Bipartite matching problems characterize many situations, ranging from ranking in information retrieval to correspondence in vision. Exact inference in real-world applications of these problems is intractable, making efficient approximation methods essential for learning and inference. In this paper we propose a novel sequential matching sampler based on a generalization of the Plackett-Luce mode...

Vertex-Context Sampling for Weighted Network Embedding

Network embedding methods have garnered increasing attention because of their effectiveness in various information retrieval tasks. The goal is to learn low-dimensional representations of vertexes in an information network and simultaneously capture and preserve the network structure. Critical to the performance of a network embedding method is how the edges/vertexes of the network are sampled for ...

Perfect Matchings in Õ(n) Time in Regular Bipartite Graphs

We consider the well-studied problem of finding a perfect matching in d-regular bipartite graphs with 2n vertices and m = nd edges. While the best-known algorithm for general bipartite graphs (due to Hopcroft and Karp) takes O(m√n) time, in regular bipartite graphs, a perfect matching is known to be computable in O(m) time. Very recently, the O(m) bound was improved to O(min{m, (n^2.5 ln n)/d})...

Publication date: 2009
