Social reward shaping in the prisoner's dilemma

Authors

  • Monica Babes-Vroman
  • Enrique Munoz de Cote
  • Michael L. Littman
Abstract

Reward shaping is a well-known technique for helping reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward shaping, which is reward shaping applied in the multiagent-learning framework. We present preliminary experiments in the iterated Prisoner's Dilemma setting showing that agents that use social reward shaping appropriately can behave more effectively than other classical learning and nonlearning strategies. In particular, we show that these agents can both lead (encourage adaptive opponents to stably cooperate) and follow (adopt a best-response strategy when paired with a fixed opponent), whereas better-known approaches achieve only one of these objectives.
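
The abstract does not spell out how the shaping signal is constructed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the conventional Prisoner's Dilemma payoffs (5, 3, 1, 0), a tit-for-tat opponent as the fixed strategy, and a hypothetical potential function that favors mutual cooperation; the shaping term uses the standard potential-based form r + gamma * phi(s') - phi(s). All names (PAYOFF, potential, shaped_reward, play) are illustrative.

```python
# Assumed payoffs for (my_move, opponent_move): T=5, R=3, P=1, S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(history):
    """Fixed opponent: cooperate first, then copy the agent's previous move."""
    return "C" if not history else history[-1][0]


def potential(state):
    """Hypothetical potential: 1 after mutual cooperation, 0 otherwise."""
    return 1.0 if state == ("C", "C") else 0.0


def shaped_reward(reward, state, next_state, gamma=0.9):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


def play(policy, rounds=10, gamma=0.9):
    """Play a fixed policy against tit-for-tat; return (raw, shaped) totals."""
    history = []   # list of (agent_move, opponent_move) pairs
    state = None   # previous joint action; None before the first round
    raw_total = 0.0
    shaped_total = 0.0
    for _ in range(rounds):
        a = policy(history)
        b = tit_for_tat(history)
        r = PAYOFF[(a, b)]
        next_state = (a, b)
        raw_total += r
        shaped_total += shaped_reward(r, state, next_state, gamma)
        history.append((a, b))
        state = next_state
    return raw_total, shaped_total


def always_cooperate(history):
    return "C"


def always_defect(history):
    return "D"


if __name__ == "__main__":
    print("always cooperate:", play(always_cooperate))
    print("always defect:   ", play(always_defect))
```

Under these assumptions, the shaping bonus pays out when the agent first reaches mutual cooperation and is mostly offset on later rounds, which is the sense in which shaping is meant to speed up learning without changing which long-run behavior is preferred. A learning agent would replace the fixed policies here with one that updates its action values from the shaped signal.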

Similar resources

Individual differences in game motivation as moderators of preprogrammed strategy effects in prisoner's dilemma.

The impact of three programmed strategies (tit-for-tat, 100% cooperation, and 100% defection) on cooperation level in the Prisoner's Dilemma game is examined as a function of the subject's motivational orientation (cooperative, competitive, or individualistic). Motivational orientation was assessed on the basis of each subject's choices across four classes of decomposed games. Following this as...

The shared reward dilemma.

One of the most direct human mechanisms of promoting cooperation is rewarding it. We study the effect of sharing a reward among cooperators in the most stringent form of social dilemma, namely the prisoner's dilemma (PD). Specifically, for a group of players that collect payoffs by playing a pairwise PD game with their partners, we consider an external entity that distributes a fixed reward equ...

Carrots without Bite: On the Ineffectiveness of ‘Rewards’ in sustaining Cooperation in Social Dilemmas

Rewards are identified as an instrument that promotes cooperation in social dilemmas. All previous experimental studies on this topic use a design that allows for reciprocity with respect to play in the social dilemma, but not for reciprocity to rewards received (for example because interaction is one-shot, or because subject identity labels are randomly reassigned between periods). We introduc...

A Trust Model of E-commerce Based on Iterated Prisoner's Dilemma Game

Along with fierce e-commerce market competitions, some sellers may be worried about losing their customers so they bribe advisors by material means. The behavior of the advisor not only depends on their intrinsic properties, but also depends on their motivation that they may provide untruthful information to obtain additional material reward. The balance between profit and information truth con...

Publication date: 2008