Approximate MaxEnt Inverse Optimal Control
Abstract
Maximum entropy inverse optimal control (MaxEnt IOC) is an effective means of discovering the underlying cost function of a demonstrated agent's activity. To enable inference in large state spaces, we introduce an approximate MaxEnt IOC procedure that addresses the fundamental computational bottleneck of calculating the partition function via dynamic programming. Approximate MaxEnt IOC is based on two components: approximate dynamic programming and Monte Carlo sampling. The approach comes with a finite-sample upper bound on its excess loss. We validate the proposed method in the context of analyzing dual-agent interactions from video, where we use approximate MaxEnt IOC to simulate mental images of a single agent's body pose sequence (a high-dimensional image space). We experiment with image sequences taken from RGB data and show that it is possible to learn cost functions that lead to accurate predictions in high-dimensional problems that were previously intractable.
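The abstract does not spell out the algorithm, so the following is only a minimal sketch of the exact dynamic program whose cost motivates the approximation: a finite-horizon soft (log-sum-exp) backward pass that yields the log partition function and the MaxEnt policy. The toy chain environment, the per-state `costs`, and all function names are assumptions made for illustration; this is not the authors' approximate procedure. A brute-force enumeration over action sequences is included to show why the exact computation does not scale.

```python
import math
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical moves on a 1-D chain: left, stay, right


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def step(state, action, n_states):
    """Deterministic transition, clipped to the ends of the chain."""
    return min(max(state + action, 0), n_states - 1)


def soft_value_iteration(costs, horizon):
    """Exact backward pass: V_t(s) = log sum_a exp(-c(s) + V_{t+1}(s')).

    Returns V_0 (the log partition function for each start state) and
    policies[t][s][a] = exp(Q_t(s, a) - V_t(s)).
    """
    n = len(costs)
    values = [0.0] * n  # terminal condition V_T(s) = 0
    policies = []
    for _ in range(horizon):
        new_values, policy = [], []
        for s in range(n):
            q = [-costs[s] + values[step(s, a, n)] for a in ACTIONS]
            v = logsumexp(q)
            new_values.append(v)
            policy.append([math.exp(qa - v) for qa in q])
        values = new_values
        policies.append(policy)
    policies.reverse()  # index 0 now corresponds to the first time step
    return values, policies


def brute_force_log_partition(costs, horizon, start):
    """Enumerate every action sequence; cost grows as len(ACTIONS) ** horizon."""
    n = len(costs)
    totals = []
    for seq in product(ACTIONS, repeat=horizon):
        s, total = start, 0.0
        for a in seq:
            total += costs[s]
            s = step(s, a, n)
        totals.append(-total)
    return logsumexp(totals)
```

The dynamic program touches every state at every time step, which is what becomes infeasible when the state is a high-dimensional image; the abstract's combination of approximate dynamic programming and Monte Carlo sampling targets that step.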
Similar resources
Approximate MaxEnt Inverse Optimal Control and its Application for Mental Simulation of Human Interactions (Extended Version with Proofs)
Maximum entropy inverse optimal control (MaxEnt IOC) is an effective means of discovering the underlying cost function of demonstrated human activity and can be used to predict human behavior over low-dimensional state spaces (i.e., forecasting of 2D trajectories). To enable inference in very large state spaces, we introduce an approximate MaxEnt IOC procedure to address the fundamental computa...
Approximate MaxEnt Inverse Optimal Control and Its Application for Mental Simulation of Human Interactions
Maximum entropy inverse optimal control (MaxEnt IOC) is an effective means of discovering the underlying cost function of demonstrated human activity and can be used to predict human behavior over low-dimensional state spaces (i.e., forecasting of 2D trajectories). To enable inference in very large state spaces, we introduce an approximate MaxEnt IOC procedure to address the fundamental computa...
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Reinforcement learning can acquire complex behaviors from high-level specifications. However, defining a cost function that can be optimized effectively and encodes the correct task is challenging in practice. We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems. Our method addre...
Approximate Pareto Optimal Solutions of Multi objective Optimal Control Problems by Evolutionary Algorithms
In this paper, an approach based on evolutionary algorithms is introduced for finding a Pareto optimal pair of state and control for multi-objective optimal control problems (MOOCPs). In this approach, a discretized form of the time-control space is first considered, and then a piecewise linear control and a piecewise linear trajectory are obtained from the discretized time-control space using ...
Complete pivoting strategy for the $IUL$ preconditioner obtained from Backward Factored APproximate INVerse process
In this paper, we use a complete pivoting strategy to compute the IUL preconditioner obtained as the by-product of the Backward Factored APproximate INVerse process. This pivoting is based on the complete pivoting strategy of the Backward IJK version of the Gaussian elimination process. There is a parameter $\alpha$ to control the complete pivoting process. We have studied the effect of dif...