Low-Rank Linear Cold-Start Recommendation from Social Data

Authors

  • Suvash Sedhain
  • Aditya Krishna Menon
  • Scott Sanner
  • Lexing Xie
  • Darius Braziunas
Abstract

The cold-start problem involves recommendation of content to new users of a system, for whom there is no historical preference information available. This proves a challenge for collaborative filtering algorithms that inherently rely on such information. Recent work has shown that social metadata, such as users’ friend groups and page likes, can strongly mitigate the problem. However, such approaches either lack an interpretation as optimising some principled objective, involve iterative non-convex optimisation with limited scalability, or require tuning several hyperparameters. In this paper, we first show how three popular cold-start models are special cases of a linear content-based model, with implicit constraints on the weights. Leveraging this insight, we propose LoCo, a new model for cold-start recommendation based on three ingredients: (a) linear regression to learn an optimal weighting of social signals for preferences, (b) a low-rank parametrisation of the weights to overcome the high dimensionality common in social data, and (c) scalable learning of such low-rank weights using randomised SVD. Experiments on four real-world datasets show that LoCo yields significant improvements over state-of-the-art cold-start recommenders that exploit high-dimensional social network metadata.

Introduction

Collaborative filtering has emerged as the gold-standard approach to personalised recommendation of content to users (Leavitt 2013). The central idea of collaborative filtering is to infer a user’s preferences based on their past interactions with a system, as well as the preferences of other like-minded users (Goldberg et al. 1992; Resnick et al. 1994). While this approach has seen considerable success, it has an obvious failure mode: how do we recommend content to new users without any historical preference information? This is known as the user cold-start problem (Schein et al. 2002), and is pervasive in real-world recommendation applications.

Cold-start problems may be addressed by exploiting exogenous side-information about users, such as demographic attributes. This can be done using content-based rather than collaborative filtering, where the central idea of the former is to infer a user’s preferences by explaining their past interactions based on the side-information (Pazzani and Billsus 2007). The cold-start problem does not plague such approaches, and has thus been addressed both by vanilla content-based filtering recommenders (Billsus and Pazzani 1999; Mooney and Roy 2000) as well as hybrid collaborative and content-based filtering recommenders (Schein et al. 2002; Gunawardana and Meek 2009).

The precise form of side-information available has an impact on the accuracy of cold-start predictions (Gantner et al. 2010; Sedhain et al. 2014). Recent work has shown that social information, such as users’ friend groups and page likes, is sufficiently rich to strongly mitigate the cold-start problem. Various means of incorporating this information into neighbourhood (Zhang et al. 2010; Sahebi and Cohen 2011; Sedhain et al. 2014; Rosli et al. 2014; Rohani et al. 2014) and matrix factorisation (or latent feature) approaches (Ma et al. 2008; Cao, Liu, and Yang 2010; Jamali and Ester 2010; Noel et al. 2012; Krohn-Grimberghe et al. 2012) have been studied, with encouraging results.

However, both strands of work have limitations. Neighbourhood methods lack an interpretation as minimising some principled objective, potentially resulting in sub-optimal solutions. Matrix factorisation methods, on the other hand, involve time-consuming iterative optimisation of a non-convex objective and require tuning of a potentially large number of hyperparameters. This paper proposes an efficient, accurate, learning-based approach for the cold-start problem that leverages social data.

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Our first contribution is to show how three popular cold-start models (Sedhain et al. 2014; Gantner et al. 2010; Krohn-Grimberghe et al. 2012) can be seen as special cases of a linear content-based model, which explains some of their drawbacks. Leveraging this insight, our second contribution is a new model, LoCo, that overcomes these limitations by employing three ingredients: (a) multivariate linear regression to learn an optimal weighting of social signals for preferences, (b) a low-rank parametrisation of the regression weights to address the high dimensionality common in social data, and (c) highly scalable learning of such low-rank weights via randomised SVD (Halko, Martinsson, and Tropp 2011). While each of these ideas is simple, using them in conjunction is powerful: experiments on four real-world datasets demonstrate that LoCo yields substantial improvements over state-of-the-art cold-start recommenders leveraging high-dimensional side-information from a social network.

Background and notation

Suppose we have a database of $U$ users and $I$ items. Let $\mathbf{R} \in \{0, 1\}^{U \times I}$ denote a purchase matrix, where $\mathbf{R}[u, i] = 1$ means that user $u$ purchased item $i$. (More generally, purchases can be substituted by any positive interactions between users and items.) Let $\mathbf{R}[:, i] \in \{0, 1\}^{U}$ denote the vector of purchases of item $i$. In many applications, we additionally have a side-information (or metadata) matrix $\mathbf{X} \in \mathbb{R}^{U \times P}$. We will think of $X_{up}$ as indicating whether or not user $u$ “likes” a webpage $p$, though $\mathbf{X}$ could equally reflect e.g. users’ group memberships, friend circles, et cetera.

Collaborative and content-based filtering

The three high-level approaches to personalised recommendation may be summarised as follows:

  (1) Content-based filtering: exploit correlations between user side-information $\mathbf{X}$ and item preferences $\mathbf{R}$, e.g. by deriving metadata-based user similarities (Billsus and Pazzani 1999), or learning metadata-to-preference classifiers (Mooney and Roy 2000);
  (2) Collaborative filtering: exploit correlations amongst the preferences $\mathbf{R}$ of all users, e.g. by k-nearest neighbour recommendation (Herlocker et al. 1999) or matrix factorisation (Koren, Bell, and Volinsky 2009);
  (3) Hybrid filtering: exploit both forms of correlation (Basu, Hirsh, and Cohen 1998; Melville, Mooney, and Nagarajan 2002; Basilico and Hofmann 2004).

As k-nearest neighbour and matrix factorisation methods feature heavily in the sequel, we briefly summarise them here. In (user) k-nearest neighbour approaches, one models

\[ \mathbf{R} \approx \mathbf{S} \mathbf{R}_{\mathrm{tr}} \tag{1} \]

for some pre-defined similarity matrix $\mathbf{S}$, typically based on the cosine similarity of $\mathbf{R}$ with itself. In matrix factorisation approaches, one models

\[ \mathbf{R} \approx \mathbf{U} \mathbf{V} \tag{2} \]

for some latent representations $\mathbf{U} \in \mathbb{R}^{U \times K}$, $\mathbf{V} \in \mathbb{R}^{K \times I}$ with latent dimensionality $K \ll \min(U, I)$. The parameters $\mathbf{U}, \mathbf{V}$ are typically estimated by solving

\[ \min_{\mathbf{U}, \mathbf{V}} \; \|\mathbf{R} - \mathbf{U}\mathbf{V}\|_F^2 + \frac{\lambda_U}{2} \|\mathbf{U}\|_F^2 + \frac{\lambda_V}{2} \|\mathbf{V}\|_F^2. \tag{3} \]

The user cold-start problem

The recommendation problem we consider is the cold-start scenario, where a user has no prior purchases. We split the set of users into the $U_{\mathrm{tr}}$ warm-start (training) users with at least one purchase, and the remaining $U_{\mathrm{te}}$ cold-start (test) users. We denote the corresponding slices of the purchase matrix by $\mathbf{R}_{\mathrm{tr}} \in \mathbb{R}^{U_{\mathrm{tr}} \times I}$ and $\mathbf{R}_{\mathrm{te}} \in \mathbb{R}^{U_{\mathrm{te}} \times I}$, where by definition $\mathbf{R}_{\mathrm{te}} = \mathbf{0}$. Our interest will be in producing $\hat{\mathbf{R}}_{\mathrm{te}} \in \mathbb{R}^{U_{\mathrm{te}} \times I}$, a recommendation matrix for the cold-start users.

Personalised recommendation for cold-start users is intuitively impossible from $\mathbf{R}$ alone. But suppose we have a metadata matrix $\mathbf{X}$, with $\mathbf{X}_{\mathrm{tr}}, \mathbf{X}_{\mathrm{te}}$ being the metadata for the warm- and cold-start users respectively. Then, we might hope to leverage correlations between $\mathbf{X}_{\mathrm{tr}}$ and $\mathbf{R}_{\mathrm{tr}}$ to make meaningful predictions for the cold-start users.

Approaches to (social) cold-start recommendation

We summarise the various approaches to exploiting (social) side-information to ameliorate the cold-start problem.
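
To make the setting concrete before turning to the individual approaches, the following is a minimal sketch of the warm/cold split and of why a preference-based neighbourhood model (Equation 1) has nothing to offer cold-start users. The toy matrices, their sizes, and the helper name `cosine_similarity` are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

# Hypothetical toy data: 6 users, 5 items, 4 metadata features (e.g. page likes).
R = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],   # cold-start user: no purchases
    [0, 0, 0, 0, 0],   # cold-start user: no purchases
], dtype=float)
X = np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

# Warm-start (training) users have at least one purchase; the rest are cold-start.
warm = R.sum(axis=1) > 0
R_tr, R_te = R[warm], R[~warm]
X_tr, X_te = X[warm], X[~warm]


def cosine_similarity(A, B, eps=1e-12):
    """Row-wise cosine similarity between A and B; all-zero rows yield zeros."""
    a_norm = np.linalg.norm(A, axis=1, keepdims=True)
    b_norm = np.linalg.norm(B, axis=1, keepdims=True)
    return (A @ B.T) / np.maximum(a_norm * b_norm.T, eps)


# Similarity computed from preferences alone is identically zero for cold-start
# users, so the neighbourhood scores S @ R_tr carry no personalised signal.
S_pref = cosine_similarity(R_te, R_tr)
scores_pref = S_pref @ R_tr
```

Because `R_te` is all zeros, every entry of `scores_pref` is zero, whereas `X_te` still distinguishes the two cold-start users; this is the gap that the metadata-based approaches aim to fill.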

Neighbourhood + metadata similarity

In cold-start scenarios, one cannot use a neighbourhood model (Equation 1) with a similarity $\mathbf{S}$ computed from $\mathbf{R}_{\mathrm{te}}$ since, by definition, $\mathbf{R}_{\mathrm{te}} = \mathbf{0}$. One can however compute new similarity metrics based on metadata (Billsus and Pazzani 1999). Recently, several works have designed $\mathbf{S}$ based on social information (Zhang et al. 2010; Sahebi and Cohen 2011; Sedhain et al. 2014; Rosli et al. 2014; Rohani et al. 2014). For example, Sedhain et al. (2014) proposed

\[ \hat{\mathbf{R}}_{\mathrm{te}} = \mathbf{X}_{\mathrm{te}} \odot \mathbf{X}_{\mathrm{tr}}^{\top} \star \mathbf{R}_{\mathrm{tr}} \tag{4} \]

where $\odot, \star$ refer to generalised matrix operations. When $\star$ is the standard inner product, this is a neighbourhood method with metadata-derived similarity $\mathbf{S} = \mathbf{X}_{\mathrm{te}} \odot \mathbf{X}_{\mathrm{tr}}^{\top}$.

Matrix factorisation with regularisation

In cold-start scenarios, one cannot use a matrix factorisation model (Equation 2) with $\mathbf{U}$ estimated as per Equation 3, since the optimal solution there is $\mathbf{U}_{\mathrm{te}} = \mathbf{0}$. One can however regularise the latent features based on metadata similarity (Ma et al. 2008; Agarwal and Chen 2009; Cao, Liu, and Yang 2010; Jamali and Ester 2010; Yang et al. 2011; Noel et al. 2012; Krohn-Grimberghe et al. 2012). For example, Krohn-Grimberghe et al. (2012) proposed an objective based on collective matrix factorisation (CMF) (Singh and Gordon 2008):

\[ \min_{\mathbf{U}, \mathbf{V}, \mathbf{Z}} \; \|\mathbf{R} - \mathbf{U}\mathbf{V}\|_F^2 + \mu \|\mathbf{X} - \mathbf{U}\mathbf{Z}\|_F^2 + \Omega(\mathbf{U}, \mathbf{V}, \mathbf{Z}), \qquad \Omega(\mathbf{U}, \mathbf{V}, \mathbf{Z}) = \frac{\lambda_U}{2} \|\mathbf{U}\|_F^2 + \frac{\lambda_V}{2} \|\mathbf{V}\|_F^2 + \frac{\lambda_Z}{2} \|\mathbf{Z}\|_F^2, \tag{5} \]

where $\mathbf{U} \in \mathbb{R}^{U \times K}$, $\mathbf{V} \in \mathbb{R}^{K \times I}$, and $\mathbf{Z} \in \mathbb{R}^{K \times P}$ for some latent dimensionality $K \ll \min(U, I)$. Intuitively, we find a latent subspace $\mathbf{U}$ for users that is jointly predictive of both their preferences and social characteristics. We then predict

\[ \hat{\mathbf{R}}_{\mathrm{te}} = \mathbf{U}_{\mathrm{te}} \mathbf{V}. \tag{6} \]

While this prediction has the same form as Equation 2, the estimation of $\mathbf{U}_{\mathrm{te}}$ here will be non-trivial owing to the additional regularisation derived from requiring it to model $\mathbf{X}_{\mathrm{te}}$. An alternate but less generic approach is the fLDA model (Agarwal and Chen 2010), which combines matrix factorisation and LDA when the metadata comprises textual features.
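
As an illustration of the neighbourhood predictor in Equation 4, the sketch below takes the simplest instantiation, in which both generalised operations are assumed to be ordinary matrix products, so that the similarity is the count of shared metadata features. The function name `neighbourhood_cold_start` and the toy data are hypothetical; other choices of the generalised operations are not covered here.

```python
import numpy as np


def neighbourhood_cold_start(X_te, X_tr, R_tr):
    """Score items for cold-start users via metadata similarity to warm users.

    Assumes both generalised operations in Equation 4 are standard matrix
    products: S = X_te @ X_tr.T, then R_hat_te = S @ R_tr.
    """
    S = X_te @ X_tr.T          # (U_te x U_tr) metadata-derived similarity
    return S @ R_tr            # (U_te x I) similarity-weighted purchase counts


# Hypothetical toy data: 4 warm users, 2 cold users, 5 items, 4 page-like features.
X_tr = np.array([[1, 0, 1, 0],
                 [1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]], dtype=float)
R_tr = np.array([[1, 0, 1, 0, 0],
                 [0, 1, 1, 0, 0],
                 [1, 1, 0, 0, 1],
                 [0, 0, 0, 1, 1]], dtype=float)
X_te = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 1]], dtype=float)

R_hat_te = neighbourhood_cold_start(X_te, X_tr, R_tr)
# R_hat_te is [[3, 2, 3, 1, 2],
#              [1, 2, 1, 1, 2]]

# Items ranked from highest to lowest score for each cold-start user.
ranking = np.argsort(-R_hat_te, axis=1, kind="stable")
```

The first cold-start user shares both of their likes with the first warm user, so that user's purchases receive the largest weight. No parameters are learned, which is consistent with the remark that neighbourhood methods do not optimise an explicit objective.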
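
Matrix factorisation with feature mapping

Matrix factorisation approaches may also be adapted to the cold-start regime via a two-step model. Here, the first step is to model the warm-start users by $\mathbf{R}_{\mathrm{tr}} \approx \hat{\mathbf{R}}_{\mathrm{tr}} = \mathbf{U}_{\mathrm{tr}} \mathbf{V}$, with latent features $\mathbf{U}_{\mathrm{tr}}, \mathbf{V}$ as before. The second step is to learn a mapping between the side-information $\mathbf{X}_{\mathrm{tr}}$ and latent features $\mathbf{U}_{\mathrm{tr}}$. This mapping is used to estimate $\mathbf{U}_{\mathrm{te}}$ from $\mathbf{X}_{\mathrm{te}}$, with predictions then made as per Equation 6. A canonical example of this approach is BPR-LinMap (Gantner et al. 2010), where in the second step the mapping may be done via linear regression, so that one estimates

\[ \mathbf{U}_{\mathrm{te}} = \mathbf{X}_{\mathrm{te}} \mathbf{T} \tag{7} \]

where $\mathbf{T} \in \mathbb{R}^{P \times K}$ is chosen so that $\mathbf{U}_{\mathrm{tr}} \approx \mathbf{X}_{\mathrm{tr}} \mathbf{T}$ via

\[ \min_{\mathbf{T}} \; \|\mathbf{U}_{\mathrm{tr}} - \mathbf{X}_{\mathrm{tr}} \mathbf{T}\|_F^2 + \frac{\lambda_T}{2} \|\mathbf{T}\|_F^2. \tag{8} \]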
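
The two-step procedure can be sketched as follows. This is not BPR-LinMap itself: for the first step a truncated SVD stands in for the BPR-trained factorisation, purely to keep the example short. The second step solves Equation 8 in closed form, assuming squared Frobenius norms, in which case setting the gradient to zero gives $(\mathbf{X}_{\mathrm{tr}}^{\top}\mathbf{X}_{\mathrm{tr}} + \tfrac{\lambda_T}{2}\mathbf{I})\,\mathbf{T} = \mathbf{X}_{\mathrm{tr}}^{\top}\mathbf{U}_{\mathrm{tr}}$. The function names, the toy data, and the value of `lam` are hypothetical.

```python
import numpy as np


def factorise_svd(R_tr, K):
    """Step 1 (stand-in): rank-K factorisation R_tr ~= U_tr @ V via truncated SVD."""
    left, sing, right_t = np.linalg.svd(R_tr, full_matrices=False)
    U_tr = left[:, :K] * sing[:K]   # (U_tr x K) latent user features
    V = right_t[:K]                 # (K x I) latent item features
    return U_tr, V


def fit_linear_map(X_tr, U_tr, lam):
    """Step 2: ridge solution of Equation 8, T = (X'X + (lam/2) I)^{-1} X'U."""
    P = X_tr.shape[1]
    gram = X_tr.T @ X_tr + 0.5 * lam * np.eye(P)
    return np.linalg.solve(gram, X_tr.T @ U_tr)


# Hypothetical toy data: 4 warm users, 2 cold users, 5 items, 4 metadata features.
X_tr = np.array([[1, 0, 1, 0],
                 [1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]], dtype=float)
R_tr = np.array([[1, 0, 1, 0, 0],
                 [0, 1, 1, 0, 0],
                 [1, 1, 0, 0, 1],
                 [0, 0, 0, 1, 1]], dtype=float)
X_te = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 1]], dtype=float)

K, lam = 2, 0.1
U_tr, V = factorise_svd(R_tr, K)
T = fit_linear_map(X_tr, U_tr, lam)

U_te = X_te @ T          # Equation 7: map cold-start metadata to latent features
R_hat_te = U_te @ V      # Equation 6: score items for cold-start users
```

Note that the composite prediction is `X_te @ (T @ V)`, that is, a linear function of the metadata whose weight matrix has rank at most `K`.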

Similar resources

Personalized Recommendation of Research Papers by Fusing Recommendations from Explicit and Implicit Social Network

Combining social network information with collaborative filtering recommendation algorithms has helped to alleviate some drawbacks of collaborative filtering, for example, the cold start problem, and has increased the accuracy of recommendations. However, the user coverage of recommendation for social-based recommendation is low as there is often insufficient data about explicit social relation...

Trust-based Service Recommendation in Social Network

With the number of Web services increasing constantly on the Internet, how to recommend personalized Web services for users has become more and more important. At present, there emerged some service recommendation systems utilizing influence ranking and collaborative filtering algorithms in service recommendation. However, they neither considered trust relationships among users, nor deal with t...

“Like Attracts Like!”– A Social Recommendation Framework Through Label Propagation

Recently label propagation recommendation receives much attention from both industrial and academic fields due to its low requirement of labeled training data and effective prediction. Previous methods propagate preferences on a user or item similarity graph for making recommendation. However, they still suffer some major problems, including data sparsity, lack of trustworthiness, cold-start pr...

Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback

Recommender systems in TV applications mostly focus on the recommendation of video-on-demand (VOD) content, although the major part of users’ content consumption is realized on linear channel programs (live or recorded), termed EPG programs. The accurate collaborative filtering algorithms suitable for VOD recommendation cannot be directly carried over for EPG program recommendation. First, EPG ...

Improving the performance of recommender systems in the face of the cold start problem by analyzing user behavior on social network

The goal of recommender system is to provide desired items for users. One of the main challenges affecting the performance of recommendation systems is the cold-start problem that is occurred as a result of lack of information about a user/item. In this article, first we will present an approach, uses social streams such as Twitter to create a behavioral profile, then user profiles are clusteri...

Publication date: 2017