Combinatorial Causal Bandits
Authors
Abstract
In combinatorial causal bandits (CCB), the learning agent chooses at most K variables to intervene on in each round and collects feedback from the observed variables, with the goal of minimizing expected regret on the target variable Y. We study CCB in the context of binary generalized linear models (BGLMs), which give a succinct parametric representation of the causal models. We present the algorithm BGLM-OFU for Markovian BGLMs (i.e., models with no hidden variables), based on the maximum likelihood estimation method, and give a regret analysis for it. For the special case of linear models with hidden variables, we apply causal inference techniques such as the do-calculus to convert the original model into a Markovian model, and then show that both our BGLM-OFU algorithm and another algorithm based on linear regression solve such linear models with hidden variables. Our novelty includes (a) considering the combinatorial intervention action space and general causal graph structures, including ones with hidden variables, (b) integrating and adapting techniques from diverse studies such as generalized linear bandits and online influence maximization, and (c) avoiding unrealistic assumptions (such as knowing the joint distribution of the parents of Y under all interventions) and regret factors exponential in the causal graph size that appear in prior studies.
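To make the setting in the abstract concrete, the following is a minimal sketch of a Markovian binary generalized linear model and of the maximum likelihood estimation step for the target variable Y. It is not the BGLM-OFU algorithm: the optimistic (confidence-region) action selection and the regret analysis are omitted. The three-node graph, the parameter values in `THETA`, the always-on bias node `X0`, and every function name are assumptions made for illustration, and plain gradient ascent stands in for whatever solver one would actually use.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical causal graph (an assumption, not taken from the paper):
# X0 is a constant bias node, X1 -> X2, and both X1 and X2 are parents of Y.
PARENTS = {"X1": ["X0"], "X2": ["X0", "X1"], "Y": ["X0", "X1", "X2"]}
ORDER = ["X1", "X2", "Y"]  # topological order of the non-bias nodes
THETA = {"X1": [0.0], "X2": [-0.5, 1.0], "Y": [-1.0, 1.5, 1.0]}  # illustrative


def sample(do, rng):
    """Draw one round of feedback under the intervention `do` (node -> 0/1)."""
    values = {"X0": 1}
    for node in ORDER:
        if node in do:
            # An intervened node ignores its parents.
            values[node] = do[node]
        else:
            z = sum(w * values[p] for w, p in zip(THETA[node], PARENTS[node]))
            values[node] = 1 if rng.random() < sigmoid(z) else 0
    return values


def fit_mle(rows, labels, steps=300, lr=0.5):
    """Gradient ascent on the average logistic log-likelihood."""
    dim = len(rows[0])
    theta = [0.0] * dim
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in zip(rows, labels):
            err = y - sigmoid(sum(t * xi for t, xi in zip(theta, x)))
            for j in range(dim):
                grad[j] += err * x[j]
        theta = [t + lr * g / n for t, g in zip(theta, grad)]
    return theta


def estimate_y_params(rounds=2000, seed=0):
    """Intervene on one random variable per round (K = 1) and fit Y's weights."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(rounds):
        node = rng.choice(["X1", "X2"])
        values = sample({node: rng.randint(0, 1)}, rng)
        # Y is never intervened on here, so every round is usable for its MLE.
        rows.append([values[p] for p in PARENTS["Y"]])
        labels.append(values["Y"])
    return fit_mle(rows, labels)


if __name__ == "__main__":
    print(estimate_y_params())
```

In this sketch the interventions are chosen uniformly at random only to generate data; a bandit algorithm would instead pick the intervention set using its current estimates, which is the part the paper's analysis addresses.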
Similar resources
Combinatorial Bandits
We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set S ⊆ R^d. At the same time, the opponent chooses a “loss” vector in R^d and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the ...
Matroid Bandits: Practical Large-Scale Combinatorial Bandits
A matroid is a notion of independence that is closely related to computational efficiency in combinatorial optimization. In this work, we bring together the ideas of matroids and multiarmed bandits, and propose a new class of stochastic combinatorial bandits, matroid bandits. A key characteristic of this class is that matroid bandits can be solved both computationally and sample efficiently. We...
Combinatorial Bandits Revisited
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret....
Combinatorial Cascading Bandits
We propose combinatorial cascading bandits, a class of partial monitoring problems where at each step a learning agent chooses a tuple of ground items subject to constraints and receives a reward if and only if the weights of all chosen items are one. The weights of the items are binary, stochastic, and drawn independently of each other. The agent observes the index of the first chosen item who...
Contextual Combinatorial Cascading Bandits
We propose the contextual combinatorial cascading bandits, a combinatorial online learning game, where at each time step a learning agent is given a set of contextual information, then selects a list of items, and observes stochastic outcomes of a prefix in the selected items by some stopping criterion. In online recommendation, the stopping criterion might be the first item a user selects; in ...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i6.25917