Learning ε-Pareto Efficient Solutions With Minimal Knowledge Requirements Using Satisficing
Abstract
Many problems in multiagent learning involve repeated play. As such, naive application of Nash equilibrium concepts is often inappropriate. A recent algorithm in the literature (Stimpson & Goodrich 2003) uses a Nash bargaining perspective instead of a Nash equilibrium perspective, and learns to cooperate in self-play in a social dilemma without exposing itself to exploitation by selfish agents. It does so without knowledge of the game or the actions of other agents. In this paper, we show that this algorithm likely converges to near Pareto efficient solutions in self-play in most nonzero-sum n-agent, m-action matrix games, provided that parameters are set appropriately. Furthermore, we present a tremble-based extension of this algorithm and show that it is guaranteed to play near Pareto efficient solutions arbitrarily high percentages of the time in self-play for the same large class of matrix games, while allowing adaptation to changing environments.
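To make the idea concrete, here is a minimal sketch of an aspiration-based satisficing learner in self-play. It is not taken from the paper: the prisoner's dilemma payoffs, the function names `update_aspiration` and `self_play`, and the parameter values (`lam`, `init_aspiration`) are all assumptions chosen for illustration, and the tremble-based extension is omitted. Each agent observes only its own payoff, keeps its action when the payoff meets its aspiration, resamples an action otherwise, and slowly relaxes its aspiration toward the payoffs it receives.

```python
import random

# Hypothetical 2-player prisoner's dilemma (0 = cooperate, 1 = defect).
PD = {
    (0, 0): (3.0, 3.0),
    (0, 1): (0.0, 5.0),
    (1, 0): (5.0, 0.0),
    (1, 1): (1.0, 1.0),
}


def update_aspiration(aspiration, payoff, lam):
    """Move the aspiration toward the latest payoff (lam near 1 = slow)."""
    return lam * aspiration + (1.0 - lam) * payoff


def self_play(game, n_actions, rounds=2000, lam=0.99,
              init_aspiration=10.0, seed=0):
    """Run identical satisficing agents against each other.

    `game` maps a joint-action tuple to a tuple of per-agent payoffs.
    Returns the joint-action history and the final aspirations.
    """
    rng = random.Random(seed)
    n_agents = len(next(iter(game)))
    actions = [rng.randrange(n_actions) for _ in range(n_agents)]
    # Assumed: aspirations start above the highest payoff in the game.
    aspirations = [init_aspiration] * n_agents
    history = []
    for _ in range(rounds):
        joint = tuple(actions)
        payoffs = game[joint]
        history.append(joint)
        for i in range(n_agents):
            # Agent i uses only its own payoff, never the payoff table
            # or the other agents' actions.
            if payoffs[i] < aspirations[i]:
                actions[i] = rng.randrange(n_actions)  # not satisfied: resample
            aspirations[i] = update_aspiration(aspirations[i], payoffs[i], lam)
    return history, aspirations


if __name__ == "__main__":
    history, aspirations = self_play(PD, n_actions=2)
    print("last joint action:", history[-1])
    print("final aspirations:", aspirations)
```

Which joint action the agents settle on depends on the seed, the learning rate, and the initial aspirations, which is consistent with the abstract's caveat that parameters must be set appropriately.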
Related resources
Satisficing in Multiagent Learning
Learning in the presence of multiple, possibly antagonistic adaptive agents presents special challenges to algorithm designers, especially in environments with limited information. Desirable properties of such algorithms include (a) security, which requires agents to avoid exploitation by antagonistic agents, and (b) efficiency, which requires agents to find nearly Pareto efficient solutions wh...
Learning To Cooperate in a Social Dilemma: A Satisficing Approach to Bargaining
Learning in many multi-agent settings is inherently repeated play. This calls into question the naive application of single-play Nash equilibria in multi-agent learning and suggests, instead, the application of give-and-take principles of bargaining. We modify and analyze a satisficing algorithm based on (Karandikar et al., 1998) that is compatible with the bargaining perspective. This algorithm...
Solving a bi-objective mathematical model for location-routing problem with time windows in multi-echelon reverse logistics using metaheuristic procedure
During the last decade, the stringent pressures from environmental and social requirements have spurred an interest in designing a reverse logistics (RL) network. The success of a logistics system may depend on the decisions of the facilities locations and vehicle routings. The location-routing problem (LRP) simultaneously locates the facilities and designs the travel routes for vehicles among ...
Optical Design with Epsilon-Dominated Multi-objective Evolutionary Algorithm
Significant improvement over a patented lens design is achieved using multi-objective evolutionary optimization. A comparison of the results obtained from NSGA2 and ε-MOEA is done. In our current study, ε-MOEA converged to essentially the same Pareto-optimal solutions as the one with NSGA2, but ε-MOEA proved to be better in providing reasonably good solutions, comparable to the patented design,...
An interactive fuzzy satisficing method for random fuzzy multiobjective linear programming problems through fractile criteria optimization with possibility
This paper considers multiobjective linear programming problems where each coefficient of the objective functions is expressed by a random fuzzy variable. A new decision making model is proposed by incorporating the concept of fractile criteria optimization into a possibilistic programming model. An interactive fuzzy satisficing method is presented for deriving a satisficing solution for a deci...