Playing Repeated Network Interdiction Games with Semi-Bandit Feedback

Authors

  • Qingyu Guo
  • Bo An
  • Long Tran-Thanh
Abstract

We study repeated network interdiction games in which the defender has no prior knowledge of the adversary or the environment, a setting that can model many real-world network security domains. Existing work often requires the defender to have plentiful information and neglects the frequent interactions between the two players; both assumptions are unrealistic, so these approaches do not suit our setting. We therefore provide the first defender strategy with both theoretical and practical performance guarantees, obtained by applying the adversarial online learning approach. In particular, we model the repeated network interdiction game with no prior knowledge as an online linear optimization problem and propose SBGA, a novel and efficient online learning algorithm that exploits the semi-bandit feedback unique to network security domains. We prove that SBGA achieves sublinear regret against an adaptive adversary, compared with both the best fixed strategy in hindsight and a near-optimal adaptive strategy. Extensive experiments also show that SBGA significantly outperforms existing approaches and converges quickly.
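To make the semi-bandit setting concrete, the following is a minimal sketch and not the SBGA algorithm itself, whose details are not given in this abstract. It swaps in a generic exponential-weights learner with importance-weighted reward estimates. The assumptions are all illustrative: the defender guards exactly `budget` edges per round, the adversary's route is a set of edge indices, the reward is the number of guarded edges on that route (a linear reward), and the defender observes only whether each guarded edge was used. The names `action_distribution` and `play_semi_bandit` are hypothetical.

```python
import itertools
import math
import random


def action_distribution(log_weights):
    """Turn log-weights into probabilities (numerically stable softmax)."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def play_semi_bandit(num_edges, budget, adversary_paths, eta=0.1, seed=0):
    """Exponential weights over all edge sets of size `budget`.

    Illustrative assumption: the action set is enumerated explicitly,
    which is only feasible for very small graphs.
    """
    rng = random.Random(seed)
    actions = list(itertools.combinations(range(num_edges), budget))
    log_weights = [0.0] * len(actions)
    total_reward = 0.0

    for path in adversary_paths:
        probs = action_distribution(log_weights)

        # Probability that each edge is guarded under the current strategy.
        marginal = [0.0] * num_edges
        for action, p in zip(actions, probs):
            for edge in action:
                marginal[edge] += p

        chosen = rng.choices(actions, weights=probs, k=1)[0]

        # Semi-bandit feedback: only guarded edges reveal whether the
        # adversary used them; unguarded edges stay unobserved (estimate 0).
        estimate = [0.0] * num_edges
        for edge in chosen:
            hit = 1.0 if edge in path else 0.0
            total_reward += hit
            estimate[edge] = hit / marginal[edge]  # importance weighting

        # Each action's estimated reward is the sum over its edges.
        for i, action in enumerate(actions):
            log_weights[i] += eta * sum(estimate[edge] for edge in action)

    return total_reward, actions, action_distribution(log_weights)
```

Calling `play_semi_bandit(4, 2, [{0, 1}] * 200)` simulates a stationary adversary that always traverses edges 0 and 1; in that case the learner's weight on the edge set `(0, 1)` can never fall below that of any other set. An adaptive adversary, as considered in the paper, would instead choose each route in response to the defender's past play.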

Similar resources

Playing Congestion Games with Bandit Feedbacks

Almost all convergence results from each player adopting specific “no-regret” learning algorithms such as multiplicative updates or the more general mirror-descent algorithms in repeated games are only known in the more generous information model, in which each player is assumed to have access to the costs of all possible choices, even the unchosen ones, at each time step. This assumption in ge...

Multi-agent Learning Experiments on Repeated Matrix Games

This paper experimentally evaluates multiagent learning algorithms playing repeated matrix games to maximize their cumulative return. Previous works assessed that Q-learning surpassed Nash-based multiagent learning algorithms. Based on all-against-all repeated matrix game tournaments, this paper updates the state of the art of multiagent learning experiments. In a first stage, it shows that M-Qu...

Minimax Policies for Combinatorial Prediction Games

We address the online linear optimization problem when the actions of the forecaster are represented by binary vectors. Our goal is to understand the magnitude of the minimax regret for the worst possible set of actions. We study the problem under three different assumptions for the feedback: full information, and the partial information models of the so-called “semi-bandit”, and “bandit” probl...

Subjective Games and Equilibria

Applying the concepts of Nash, Bayesian, and correlated equilibria to the analysis of strategic interaction requires that players possess objective knowledge of the game and opponents' strategies. Such knowledge is often not available. The proposed notions of subjective games and of subjective Nash and correlated equilibria replace essential but unavailable objective knowledge by subjective ass...

Learning and Transfer of Learning with No Feedback: An Experimental Test Across Games

This paper explores the extent to which people learn in repeated games without feedback, and the extent to which this learning transfers to new games. Current theories of learning model learning as adjustment in behavior in response to feedback about outcomes and payoffs and largely ignore the possibility that learning may take place in the absence of such feedback. An earlier paper (Weber, in ...

Publication date: 2017