Combinatorial Causal Bandits

Authors

Abstract

In combinatorial causal bandits (CCB), the learning agent chooses at most K variables in each round to intervene on and collects feedback from the observed variables, with the goal of minimizing expected regret on the target variable Y. We study CCB in the context of binary generalized linear models (BGLMs), which give a succinct parametric representation of the causal models. We present the algorithm BGLM-OFU for Markovian BGLMs (i.e., models with no hidden variables), based on the maximum likelihood estimation method, and give a regret analysis for it. For the special case of linear models with hidden variables, we apply causal inference techniques such as the do-calculus to convert the original model into a Markovian model, and then show that our BGLM-OFU algorithm and another algorithm based on linear regression both solve such linear models with hidden variables. Our novelty includes (a) considering the combinatorial intervention action space and general causal graph structures, including ones with hidden variables, (b) integrating and adapting techniques from diverse studies such as generalized linear bandits and online influence maximization, and (c) avoiding unrealistic assumptions (such as knowing the joint distribution of the parents of Y under all interventions) and regret factors exponential in the causal graph size that appear in prior studies.
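The setting described in the abstract can be illustrated with a small simulation. The sketch below is not BGLM-OFU and is not taken from the paper: it only shows a toy Markovian binary generalized linear model in which each node is 1 with probability given by a link function of a weighted sum of its parents, and it searches by brute force for the best intervention on at most K variables. The graph, the weights, the biases, the logistic link, the restriction to interventions that set variables to 1, and all function names are assumptions made for illustration.

```python
import itertools
import math
import random

# Hypothetical toy graph (assumption): X1 -> X2 -> Y and X1 -> Y.
# Nodes are listed in topological order so parents are sampled first.
PARENTS = {"X1": [], "X2": ["X1"], "Y": ["X1", "X2"]}
WEIGHTS = {"X1": [], "X2": [2.0], "Y": [1.0, 1.5]}   # illustrative values
BIAS = {"X1": 0.0, "X2": -1.0, "Y": -2.0}            # illustrative values


def sigmoid(z):
    """Logistic link function (an assumed choice of link)."""
    return 1.0 / (1.0 + math.exp(-z))


def sample(intervention, rng):
    """Draw one joint sample; intervened nodes are clamped, do(X = value)."""
    values = {}
    for node in PARENTS:
        if node in intervention:
            values[node] = intervention[node]
            continue
        z = BIAS[node] + sum(
            w * values[p] for w, p in zip(WEIGHTS[node], PARENTS[node])
        )
        values[node] = 1 if rng.random() < sigmoid(z) else 0
    return values


def expected_y(intervention, rng, n=20000):
    """Monte Carlo estimate of E[Y | do(intervention)]."""
    return sum(sample(intervention, rng)["Y"] for _ in range(n)) / n


def best_intervention(k, rng):
    """Brute-force search over sets of at most k variables, each set to 1."""
    candidates = [node for node in PARENTS if node != "Y"]
    best = ({}, expected_y({}, rng))
    for size in range(1, k + 1):
        for subset in itertools.combinations(candidates, size):
            do = {node: 1 for node in subset}
            value = expected_y(do, rng)
            if value > best[1]:
                best = (do, value)
    return best


if __name__ == "__main__":
    do, value = best_intervention(2, random.Random(0))
    print(do, round(value, 3))
```

Here the model parameters are known, so the best action can be found by simulation. In the bandit problem the parameters are unknown and must be estimated from feedback across rounds, which is where an optimism-based algorithm with maximum likelihood estimation, as named in the abstract, comes in.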


Similar resources

Combinatorial Bandits

We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set S ⊆ R^d. At the same time, the opponent chooses a “loss” vector in R^d and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the ...


Matroid Bandits: Practical Large-Scale Combinatorial Bandits

A matroid is a notion of independence that is closely related to computational efficiency in combinatorial optimization. In this work, we bring together the ideas of matroids and multiarmed bandits, and propose a new class of stochastic combinatorial bandits, matroid bandits. A key characteristic of this class is that matroid bandits can be solved both computationally and sample efficiently. We...


Combinatorial Bandits Revisited

This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret....


Combinatorial Cascading Bandits

We propose combinatorial cascading bandits, a class of partial monitoring problems where at each step a learning agent chooses a tuple of ground items subject to constraints and receives a reward if and only if the weights of all chosen items are one. The weights of the items are binary, stochastic, and drawn independently of each other. The agent observes the index of the first chosen item who...


Contextual Combinatorial Cascading Bandits

We propose the contextual combinatorial cascading bandits, a combinatorial online learning game, where at each time step a learning agent is given a set of contextual information, then selects a list of items, and observes stochastic outcomes of a prefix in the selected items by some stopping criterion. In online recommendation, the stopping criterion might be the first item a user selects; in ...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i6.25917