Object detection and tracking in videos represent essential and computationally demanding building blocks for current and future visual perception systems. In order to reduce the efficiency gap between available methods and the computational requirements of real-world applications, we propose to re-think one of the most successful methods for image object detection, Faster R-CNN, and extend it to the video domain. Specifically, the framework lear...