A Nonlinear Reinforcement Scheme for Stochastic Learning Automata

Authors

FLORIN STOICA

EMIL M. POPA

Abstract

A stochastic automaton can perform a finite number of actions in a random environment. When a specific action is performed, the environment responds with an output that is stochastically related to the action. This response may be favorable or unfavorable. The aim is to design an automaton that can determine the best action, guided by past actions and responses. The reinforcement scheme presented is shown to satisfy all necessary and sufficient conditions for absolute expediency in a stationary environment. An automaton using this scheme is guaranteed to "do better" at every time step than at the previous one: the expected value of the average penalty at each iteration step is less than that of the previous step. Simulation results are presented which show that our algorithm converges to a solution faster than the one given in [7].

Key-Words: Stochastic Learning Automata, Reinforcement Learning
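The interaction loop described in the abstract can be sketched as follows. The abstract does not give the paper's nonlinear update rule, so this sketch substitutes the classical linear reward-inaction scheme (L_R-I), a textbook scheme that is also absolutely expedient in stationary environments. The function names, the step size `a`, and the penalty probabilities are illustrative assumptions, not taken from the paper.

```python
import random


def choose_action(probs, rng):
    """Sample an action index according to the action probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reward_inaction_update(probs, action, penalized, a=0.05):
    """Linear reward-inaction (L_R-I) update; NOT the scheme from the paper.

    On a favorable response, probability mass is shifted toward the chosen
    action. On an unfavorable response, the vector is left unchanged.
    """
    if penalized:
        return list(probs)
    return [
        p + a * (1.0 - p) if i == action else (1.0 - a) * p
        for i, p in enumerate(probs)
    ]


def expected_penalty(probs, penalty_probs):
    """Average penalty M(n) = sum_i c_i * p_i(n) for the current vector."""
    return sum(c * p for c, p in zip(penalty_probs, probs))


def run(penalty_probs, steps=5000, a=0.05, seed=0):
    """Simulate the automaton in a stationary random environment.

    penalty_probs[i] is the (assumed, fixed) probability that action i
    receives an unfavorable response.
    """
    rng = random.Random(seed)
    r = len(penalty_probs)
    probs = [1.0 / r] * r  # start with no preference among actions
    for _ in range(steps):
        action = choose_action(probs, rng)
        penalized = rng.random() < penalty_probs[action]
        probs = reward_inaction_update(probs, action, penalized, a)
    return probs


if __name__ == "__main__":
    c = [0.7, 0.2, 0.5]  # hypothetical penalty probabilities
    final = run(c)
    print("action probabilities:", final)
    print("expected penalty:", expected_penalty(final, c))
```

Absolute expediency means that the quantity computed by `expected_penalty` decreases in expectation at every step. A single simulated run is only one sample path, so it may not decrease monotonically.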

Similar resources

A New Nonlinear Reinforcement Scheme for Stochastic Learning Automata

Reinforcement schemes represent the basis of the learning process for stochastic learning automata, generating their learning behavior. An automaton using a reinforcement scheme can decide the best action, based on past actions and environment responses. The aim of this paper is to introduce a new reinforcement scheme for stochastic learning automata. We test our schema and compare with other n...


A new Evolutionary Reinforcement Scheme for Stochastic Learning Automata

A stochastic automaton can perform a finite number of actions in a random environment. When a specific action is performed, the environment responds by producing an environment output that is stochastically related to the action. The aim is to design an automaton, using an evolutionary reinforcement scheme (the basis of the learning process), that can determine the best action guided by past ac...


Automatic control based on Wasp Behavioral Model and Stochastic Learning Automata

A stochastic automaton can perform a finite number of actions in a random environment. When a specific action is performed, the environment responds by producing an environment output that is stochastically related to the action. The aim is to design an automaton, using a reinforcement scheme based on the computational model of wasp behaviour that can determine the best action guided by past ac...


Gaussian Process Based Optimistic Knapsack Sampling with Applications to Stochastic Resource Allocation

The stochastic non-linear fractional knapsack problem is a challenging optimization problem with numerous applications, including resource allocation. The goal is to find the most valuable mix of materials that fits within a knapsack of fixed capacity. When the value functions of the involved materials are fully known and differentiable, the most valuable mixture can be found by direct applicat...
