Online Clustering of Bandits
Authors
Abstract
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation (“bandit”) strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and show its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant increase in prediction performance over state-of-the-art methods for bandit problems.
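To make "adaptive clustering of bandit strategies" concrete, here is a minimal Python sketch of the general idea, not the paper's exact algorithm: per-user linear-bandit statistics are pooled within the connected components of a user graph, arms are chosen optimistically, and an edge is deleted once the estimates at its two endpoints drift apart. The helper names, the confidence-radius formula, and the deletion test are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_estimate(M, b):
    """Least-squares weight vector for accumulated statistics (M, b)."""
    return np.linalg.solve(M, b)

def confidence_radius(t, d):
    """Shrinking confidence radius; an illustrative form, not the paper's."""
    return np.sqrt(d * (1.0 + np.log(1.0 + t)) / (1.0 + t))

def cluster_of(adj, u):
    """Connected component of user u; graph components act as clusters."""
    seen, stack = {u}, [u]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def serve_round(users, adj, uid, arms, true_w, alpha=1.0, noise=0.1):
    """One round: pick an arm with a cluster-level UCB, update the served
    user's statistics, then drop edges whose endpoints drifted apart."""
    d = arms.shape[1]
    comp = cluster_of(adj, uid)
    # Pool statistics across the cluster (subtract the duplicated
    # identity regularizers so only one remains).
    M = sum(users[v]["M"] for v in comp) - (len(comp) - 1) * np.eye(d)
    b = sum(users[v]["b"] for v in comp)
    w, Minv = ridge_estimate(M, b), np.linalg.inv(M)
    widths = np.sqrt(np.einsum("ij,jk,ik->i", arms, Minv, arms))
    x = arms[int(np.argmax(arms @ w + alpha * widths))]  # optimistic pick
    r = float(x @ true_w[uid] + noise * rng.standard_normal())
    u = users[uid]
    u["M"] += np.outer(x, x)
    u["b"] += r * x
    u["t"] += 1
    for v in list(adj[uid]):  # adaptive cluster refinement
        gap = np.linalg.norm(ridge_estimate(u["M"], u["b"])
                             - ridge_estimate(users[v]["M"], users[v]["b"]))
        if gap > confidence_radius(u["t"], d) + confidence_radius(users[v]["t"], d):
            adj[uid].discard(v)
            adj[v].discard(uid)

if __name__ == "__main__":
    d, n = 5, 4
    w_a, w_b = rng.standard_normal(d), rng.standard_normal(d)
    true_w = [w / np.linalg.norm(w) for w in (w_a, w_a, w_b, w_b)]  # 2 latent groups
    users = [{"M": np.eye(d), "b": np.zeros(d), "t": 0} for _ in range(n)]
    adj = {u: {v for v in range(n) if v != u} for u in range(n)}  # start as one cluster
    for t in range(2000):
        arms = rng.standard_normal((20, d))
        arms /= np.linalg.norm(arms, axis=1, keepdims=True)
        serve_round(users, adj, t % n, arms, true_w)
    print("learned clusters:",
          sorted({tuple(sorted(cluster_of(adj, u))) for u in range(n)}))
```

On this toy instance the initially fully connected graph typically splits into the two latent user groups after a few hundred rounds, which is the kind of adaptive refinement the abstract refers to.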
Similar resources
Online Clustering of Contextual Cascading Bandits
We consider a new setting of online clustering of contextual cascading bandits, an online learning problem where the underlying cluster structure over users is unknown and needs to be learned from random prefix feedback. More precisely, a learning agent recommends an ordered list of items to a user, who checks the list and stops at the first satisfactory item, if any. We propose an algorithm ...
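The prefix-feedback protocol in this snippet is easy to state in code. Below is a tiny simulation of one interaction, assuming a hypothetical per-item attraction probability; the names `prefix_feedback` and `attract` are illustrative, not from the paper.

```python
import random

random.seed(0)

def prefix_feedback(ranked_items, is_satisfactory):
    """Simulate the cascading interaction: the user scans the list in
    order and stops at the first satisfactory item, so the learner only
    observes the examined prefix (skips, then at most one click)."""
    feedback = []
    for item in ranked_items:
        clicked = is_satisfactory(item)
        feedback.append((item, clicked))
        if clicked:
            break          # items after the click are never examined
    return feedback

# Hypothetical per-item attraction probabilities for the demo.
attract = {"a": 0.1, "b": 0.7, "c": 0.4}
print(prefix_feedback(["a", "b", "c"],
                      lambda item: random.random() < attract[item]))
# e.g. [('a', False), ('b', True)] -- item 'c' yields no feedback
```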
Online Context-Dependent Clustering in Recommendations based on Exploration-Exploitation Algorithms
We investigate two context-dependent clustering techniques for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. Our algorithms dynamically group users based on the items under consideration and, possibly, group items based on the similarity of the clusterings induced over the users. The resulting algorithm thus takes advantage of pref...
Context-Aware Bandits
We propose an efficient Context-Aware clustering of Bandits (CAB) algorithm, which can capture collaborative effects. CAB can be easily deployed in a real-world recommendation system, where multi-armed bandits have been shown to perform well, in particular with respect to the cold-start problem. CAB utilizes a context-aware clustering augmented by exploration-exploitation strategies...
Beat the Mean Bandit
The Dueling Bandits Problem is an online learning framework in which actions are restricted to noisy comparisons between pairs of strategies (also called bandits). It models settings where absolute rewards are difficult to elicit but pairwise preferences are readily available. In this paper, we extend the Dueling Bandits Problem to a relaxed setting where preference magnitudes can violate transitivity ...
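As a concrete picture of the dueling-bandits feedback model (not of the Beat-the-Mean algorithm itself), the sketch below assumes a toy preference matrix `P` and shows that the learner only ever observes win/lose outcomes of noisy pairwise comparisons, from which empirical preference estimates can be rebuilt.

```python
import random

random.seed(0)

# Toy preference matrix (an assumption for illustration): P[i][j] is the
# probability that arm i beats arm j in one noisy comparison. The paper
# studies a relaxation where such preferences need not be transitive.
P = [[0.5, 0.6, 0.8],
     [0.4, 0.5, 0.7],
     [0.2, 0.3, 0.5]]

def duel(i, j):
    """One noisy pairwise comparison; only the winner is observed."""
    return i if random.random() < P[i][j] else j

wins = [[0] * 3 for _ in range(3)]    # wins[i][j]: times i beat j
plays = [[0] * 3 for _ in range(3)]   # plays[i][j]: times i met j
for _ in range(3000):
    i, j = random.sample(range(3), 2)
    w = duel(i, j)
    wins[w][i + j - w] += 1           # loser is the other index of the pair
    plays[i][j] += 1
    plays[j][i] += 1

# Empirical preferences recovered from comparisons alone.
est = [[wins[i][j] / plays[i][j] if plays[i][j] else 0.5 for j in range(3)]
       for i in range(3)]
print(est)   # approximately matches P off the diagonal
```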
Graph Clustering Bandits for Recommendation
We investigate an efficient context-dependent clustering technique for recommender systems based on exploration-exploitation strategies through multi-armed bandits over multiple users. Our algorithm dynamically groups users based on their observed behavioral similarity during a sequence of logged activities. In doing so, the algorithm reacts to the currently served user by shaping clusters arou...