The static nature of cyber defense systems gives attackers sufficient time to explore and further exploit the vulnerabilities of information technology systems. In this paper, we investigate a problem in which multiagent systems sensing and acting in an environment contribute to adaptive defense. We present a learning strategy that enables multiple agents to learn optimal policies using multiagent reinforcement learning (MARL). Our ...