Knowledge distillation between machine learning models has opened many new avenues for reducing parameter counts, improving performance, and amortizing training time when the teacher and student networks use different architectures. In reinforcement learning, this technique has also been applied to distill policies from teachers into students. Until now, however, policy distillation has required access to a simulator or to real-world trajectories.
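As a point of reference, a common way to instantiate policy distillation is to minimize the KL divergence between the teacher's and the student's action distributions on a batch of states. The sketch below is illustrative only, assuming PyTorch; the function name, the temperature parameter, and the direction of the KL term are generic choices, not details taken from this work.

```python
# Illustrative sketch of a policy-distillation loss (hypothetical names).
# The student is trained to match the teacher's action distribution on a
# batch of states, using the KL divergence as the distillation objective.
import torch
import torch.nn.functional as F

def distillation_loss(teacher_logits: torch.Tensor,
                      student_logits: torch.Tensor,
                      temperature: float = 1.0) -> torch.Tensor:
    """KL(teacher || student) over action distributions, averaged over the batch."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
```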