Choice between reliable and unreliable reinforcement alternatives revisited: Preference for unreliable reinforcement.

Authors

T W Belke

M L Spetch

Abstract

Pigeons' choices between a reliable alternative that always provided food after a delay (i.e., 100% reinforcement) and an unreliable one that provided food or blackout equally often after a delay (i.e., 50% reinforcement) were studied using a discrete-trials concurrent-chains procedure modified to prevent choice between alternatives following a blackout outcome. Initial links were fixed-ratio 1 schedules, and terminal links were fixed-time schedules. Stimuli presented during the terminal-link delays were correlated with the food and blackout outcomes. In Experiment 1, terminal-link durations were varied. With short terminal links (i.e., 10 s), 6 of 8 subjects showed strong preference for the 50% side. As terminal-link duration increased to 30 s, preference, regardless of direction, became less extreme. In Experiment 2, the side-key location of the 50% and 100% alternatives was reversed for 3 subjects. Preference for the 50% alternative recurred following the key reversal. When a 5-s separation was subsequently interposed between the initial and terminal links for both alternatives, all birds reversed to a preference for the 100% side. In general, the strong preference for the 50% side was qualitatively consistent with the expectation that the procedure enhanced the conditioned-reinforcement effectiveness of the food-associated terminal-link stimulus on the 50% side. Implications of the results for various accounts of choice of the 50% alternative are discussed.

Similar resources

Choice between reliable and unreliable outcomes: mixed percentage-reinforcement in concurrent chains.

Pigeons' choices between alternatives that provided different percentages of reinforcement in mixed schedules were studied using the concurrent-chains procedure. In Experiment 1, the alternatives were terminal-link schedules that were equal in delay and magnitude of reinforcement, but that provided different percentages of reinforcement, with one schedule providing reinforcement twice as relia...

Suboptimal choice in a percentage-reinforcement procedure: effects of signal condition and terminal-link length.

Pigeons' choice between reliable (100%) and unreliable (50%) reinforcement was studied using a concurrent-chains procedure. Initial links were fixed-ratio 1 schedules, and terminal links were equal fixed-time schedules. The duration of the terminal links was varied across conditions. The terminal link on the reliable side always ended in food; the terminal link on the unreliable side ended with...

Suboptimal Choice in Pigeons: Stimulus Value Predicts Choice over Frequencies

Pigeons have shown suboptimal gambling-like behavior when preferring a stimulus that infrequently signals reliable reinforcement over alternatives that provide greater reinforcement overall. As a mechanism for this behavior, recent research proposed that the stimulus value of alternatives with more reliable signals for reinforcement will be preferred relatively independently of their frequencie...

Value of knowing when reinforcement is due.

In 4 experiments concerning preference for knowing when reinforcement is to be delivered—although that information has no apparent instrumental value—pigeons chose between informative and noninformative stimulus sequences. Following an informative choice, the stimulus was correlated with the prevailing interval before reinforcement; following a noninformative choice, it was not. Strong preferen...

Interactive Q-Learning with Ordinal Rewards and Unreliable Tutor

Conventional reinforcement learning (RL) requires the specification of a numeric reward function, which is often a difficult task. In this paper, we extend the Q-learning approach toward the handling of ordinal rewards. The method we propose is interactive in the sense of allowing the agent to query a tutor for comparing sequences of ordinal rewards. More specifically, this method can be seen a...

Journal:

Journal of the Experimental Analysis of Behavior

Volume 62, Issue 3

Publication date: 1994
