Value-Decomposition Multi-Agent Actor-Critics
Authors
Abstract
The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance on the StarCraft II micromanagement testbed, a common MARL benchmark. However, our experiments demonstrate that, in some cases, QMIX performs sub-optimally with the A2C framework, a training paradigm that promotes algorithm efficiency. To obtain a reasonable trade-off between efficiency and performance, we extend value-decomposition to actor-critic methods that are compatible with A2C and propose a novel framework, value-decomposition actor-critic (VDAC). We evaluate VDAC on the StarCraft II micromanagement task and show that the proposed framework improves median performance over other actor-critic methods. Furthermore, we use a set of ablation experiments to identify the key factors that contribute to the performance of VDAC.
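As a rough illustration of what "non-negative function approximator" means in the abstract, the sketch below mixes per-agent values into one joint value using weights forced to be non-negative. It is not the paper's implementation: the function names, the fixed example weights, and the use of `abs()` to enforce non-negativity are all assumptions made for brevity.

```python
# Illustrative sketch only: a monotonic (non-negative-weight) mixer over
# per-agent values. All names here are hypothetical, not from the paper.

def mix_monotonic(local_values, raw_weights, bias=0.0):
    """Combine per-agent values into one joint value.

    Taking abs() of the raw weights keeps every mixing weight non-negative,
    so the joint value never decreases when a single local value increases.
    """
    if len(local_values) != len(raw_weights):
        raise ValueError("need one weight per agent")
    return sum(abs(w) * v for w, v in zip(raw_weights, local_values)) + bias


def mix_additive(local_values):
    """Special case with every weight fixed to 1: a plain sum of local values."""
    return mix_monotonic(local_values, [1.0] * len(local_values))


local = [0.5, -1.0, 2.0]
weights = [0.3, -0.7, 1.2]  # raw weights may be negative; abs() is applied inside

joint = mix_monotonic(local, weights)
raised = mix_monotonic([0.5, -0.5, 2.0], weights)  # second agent's value goes up
```

Because every weight is non-negative, `raised` can never be smaller than `joint`. This monotonicity is the property that lets each agent act greedily on its own value while remaining consistent with the joint value.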
Similar resources
Revisiting Natural Actor-Critics with Value Function Approximation
Actor-critics architectures have become popular during the last decade in the field of reinforcement learning because of the introduction of the policy gradient with function approximation theorem. It allows combining rationally actor-critic architectures with value function approximation and therefore addressing large-scale problems. Recent researches led to the replacement of policy gradient b...
Value-Decomposition Networks For Cooperative Multi-Agent Learning
We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the “lazy agent” problem, which arises due to partial obser...
Task Coordination and Decomposition in Multi-Actor Planning Systems
We discuss a framework for coordinating self-interested agents that can be used to decompose a multi-agent task based planning problem into independent subproblems. This problem decomposition can be achieved by a simple protocol and allows the agents to solve their part of the problem without the need to interact with other agents and in such a way that the resulting plans can be seamlessly int...
Modeling Emotions in Learning Multi-Agent Systems
This thesis examines the positive or negative role of emotions in the performance of learning agents in a multi-agent environment. To this end, a model for learning agents that have emotions is introduced. To examine the role of emotions, a hypothetical multi-agent environment is simulated and various cases are considered within it. In the first case, the performance of agents that have no emotions and are only capable of learning is examined. In the second case...
Exploring multi-actor value creation in IT service processes
Organizational information technology (IT) needs are served through increasingly complex configurations of people, technologies, organizations, and shared information. Ideally, an organizational IT service is valuable for both the providers and users of systems and solutions. However, mutually beneficial outcomes may be difficult to achieve within the configurations through which IT services ar...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: ['2159-5399', '2374-3468']
DOI: https://doi.org/10.1609/aaai.v35i13.17353