Policy regularization for legible behavior

Abstract

In this paper we propose a method to augment a Reinforcement Learning agent with legibility. The method is inspired by the literature in Explainable Planning and allows the agent’s policy to be regularized after training, without requiring modification of its learning algorithm. This is achieved by evaluating how the agent’s optimal policy may produce observations that would make an observer model infer a wrong policy. In our formulation, the decision boundary introduced by legibility impacts the states in which the agent’s policy returns an action that is non-legible because it also has high likelihood under other policies. In these cases, a trade-off is made between such an action and a legible but sub-optimal action. We tested our method in a grid-world environment, highlighting how legibility impacts the agent’s optimal policy, and gathered both quantitative and qualitative results. In addition, we discuss how the proposed regularization generalizes over methods functioning with goal-driven policies, because it is applicable to general policies, of which goal-driven policies are a special case.
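The idea in the abstract can be illustrated with a small sketch. It is not the paper's formulation: the Bayesian observer, the log-linear trade-off, and the names `observer_posterior`, `legible_policy`, and `weight` are all assumptions made for illustration. An observer infers which policy is being executed from a single state-action pair, and the trained policy is re-weighted afterwards so that actions that are also likely under other policies are penalized.

```python
import numpy as np

EPS = 1e-12  # numerical guard for logs and divisions


def observer_posterior(policies, prior):
    """Hypothetical observer model: P(policy | state, action) by Bayes' rule.

    policies: array (n_policies, n_states, n_actions) of action probabilities.
    prior:    array (n_policies,) of prior beliefs over policies.
    """
    joint = policies * prior[:, None, None]
    evidence = joint.sum(axis=0, keepdims=True)
    return joint / np.clip(evidence, EPS, None)


def legible_policy(policies, target, prior, weight=1.0):
    """Post-training regularization of policies[target] (illustrative only).

    Each action is scored by its log-probability under the target policy
    plus `weight` times the log-posterior the observer assigns to the
    target policy after seeing that action. weight=0 recovers the
    original policy (up to numerical error).
    """
    posterior = observer_posterior(policies, prior)[target]
    score = np.log(policies[target] + EPS) + weight * np.log(posterior + EPS)
    score -= score.max(axis=1, keepdims=True)  # stabilize the softmax
    probs = np.exp(score)
    return probs / probs.sum(axis=1, keepdims=True)


# Toy case: one state, three actions, two candidate policies.
# Action 0 is the most likely action under policy 0, but it is equally
# likely under policy 1, so an observer cannot tell the policies apart.
policies = np.array([
    [[0.50, 0.40, 0.10]],  # policy 0 (the agent's own policy)
    [[0.50, 0.05, 0.45]],  # policy 1 (an alternative the observer considers)
])
prior = np.array([0.5, 0.5])

plain = legible_policy(policies, target=0, prior=prior, weight=0.0)
legible = legible_policy(policies, target=0, prior=prior, weight=1.0)
```

In this toy case the unregularized policy prefers the ambiguous action 0, while the regularized one shifts probability toward action 1, which is slightly less likely under the agent's own policy but much more diagnostic of it. The `weight` parameter plays the role of the trade-off between optimality and legibility.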

Similar resources

Generating Legible Motion

Legible motion — motion that communicates its intent to a human observer — is crucial for enabling seamless human-robot collaboration. In this paper, we propose a functional gradient optimization technique for autonomously generating legible motion. Our algorithm optimizes a legibility metric inspired by the psychology of action interpretation in humans, resulting in motion trajectories that pu...

Developing Legible Visualizations for Online Social Spaces

Although constructed for researchers to share news and information, Usenet quickly developed into a social environment with varied styles of interactions. Unfortunately, the browsers developed to view the shared messages fail to effectively convey the rich social features of a newsgroup, let alone all of Usenet. The goal of our research is to use the salient features of social interaction to bu...

An Experimental Study for Identifying Features of Legible Manipulator Paths

This work performs an experimental study on the legibility of paths executed by a manipulation arm available on a Baxter robot. In this context, legibility is defined as the ability of people to effectively predict the target of the arm’s motion. Paths that are legible can improve the collaboration of robots with humans since they allow people to intuitively understand the robot’s intentions. E...

Lipschitz Behavior of the Robust Regularization

To minimize or upper-bound the value of a function “robustly”, we might instead minimize or upper-bound the “ε-robust regularization”, defined as the map from a point to the maximum value of the function within an ε-radius. This regularization may be easy to compute: convex quadratics lead to semidefinite-representable regularizations, for example, and the spectral radius of a matrix leads to ps...

Regularization behavior in a non-linguistic domain

Language learners tend to regularize unpredictable variation, and some claim that this is due to a language-specific regularization bias. We investigate the role of task difficulty on regularization behavior in a non-linguistic frequency learning task and show that adults regularize variable input when tracking multiple frequencies concurrently, but reliably reproduce the variation they have observed...

Journal

Journal title: Neural Computing and Applications

Year: 2022

ISSN: 0941-0643, 1433-3058

DOI: https://doi.org/10.1007/s00521-022-07942-7