Motion and appearance based Multi-Task Learning network for autonomous driving

Not recorded

Abstract

Autonomous driving has various visual perception tasks such as object detection, motion detection, depth estimation and flow estimation. Multi-task learning (MTL) has been successfully used for jointly estimating some of these tasks. Previous work focused on utilizing appearance cues. In this paper, we address the gap of incorporating motion cues in a multi-task learning system. We propose a novel two-stream architecture for joint learning of object detection, road segmentation and motion segmentation. We designed three different versions of our network to establish a systematic comparison. We show that joint training of the tasks significantly improves accuracy compared to training them independently, even with a relatively smaller amount of annotated samples for motion segmentation. To enable joint training, we extended the KITTI object detection dataset to include moving/static annotations of the vehicles. An extension of this new dataset named KITTI MOD is made publicly available via the official KITTI benchmark website. Our baseline network outperforms MPNet, which is the state of the art for single-stream CNN-based motion detection. The proposed two-stream architecture improves the mAP score by 21.5% in KITTI MOD. We also evaluated our algorithm on the non-automotive DAVIS dataset and obtained accuracy close to the state-of-the-art performance. The proposed network runs at 8 fps on a Titan X GPU using a two-stream VGG16 encoder. Demonstration of the work is provided in.
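The abstract states that the three tasks are trained jointly even though fewer samples carry motion segmentation annotations. One common way to handle that situation, shown here only as a minimal sketch, is to sum per-task losses and let the motion term be driven only by the samples that have motion labels. The abstract does not say how the authors combine their losses, so the names `TaskLosses` and `joint_loss`, the equal default weights, and the per-task averaging are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class TaskLosses:
    """Hypothetical per-sample losses for the three tasks named in the abstract.

    motion_seg is None when the sample has no moving/static annotation.
    """
    detection: float
    road_seg: float
    motion_seg: Optional[float] = None


def joint_loss(batch: Sequence[TaskLosses],
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of per-task mean losses over a batch.

    Assumption: equal default weights and per-task averaging are illustrative
    choices, not taken from the paper.
    """
    if not batch:
        raise ValueError("batch must contain at least one sample")
    w_det, w_road, w_motion = weights
    det = sum(s.detection for s in batch) / len(batch)
    road = sum(s.road_seg for s in batch) / len(batch)
    # Only samples that carry motion labels contribute to the motion term.
    motion_vals = [s.motion_seg for s in batch if s.motion_seg is not None]
    motion = sum(motion_vals) / len(motion_vals) if motion_vals else 0.0
    return w_det * det + w_road * road + w_motion * motion


# Usage: the second sample has no motion annotation.
batch = [TaskLosses(0.5, 0.25, 1.0), TaskLosses(1.5, 0.75)]
print(joint_loss(batch))
```

Averaging the motion term over labelled samples only keeps its scale independent of how many samples in a batch happen to be annotated, which is one reasonable design choice when annotation coverage differs across tasks.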


Similar resources

Multi-Modal Multi-Task Deep Learning for Autonomous Driving

Several deep learning approaches have been applied to the autonomous driving task, many employing end-to-end deep neural networks. Autonomous driving is complex, utilizing multiple behavioral modalities ranging from lane changing to turning and stopping. However, most existing approaches do not factor in the different behavioral modalities of the driving task into the training strategy. This pap...

MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

For autonomous driving, moving objects like vehicles and pedestrians are of critical importance as they primarily influence the maneuvering and braking of the car. Typically, they are detected by motion segmentation of dense optical flow augmented by a CNN based object detector for capturing semantics. In this paper, our aim is to jointly model motion and appearance cues in a single convolution...

Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation

As the demand for enabling high-level autonomous driving has increased in recent years and visual perception is one of the critical features to enable fully autonomous driving, in this paper, we introduce an efficient approach for simultaneous object detection, depth estimation and pixel-level semantic segmentation using a shared convolutional architecture. The proposed network model, which we ...

End-to-end Multi-Modal Multi-Task Vehicle Control for Self-Driving Cars with Visual Perception

Convolutional Neural Networks (CNN) have been successfully applied to autonomous driving tasks, many in an end-to-end manner. Previous end-to-end steering control methods take an image or an image sequence as the input and directly predict the steering angle with CNN. Although single task learning on steering angles has reported good performances, the steering angle alone is not sufficient for v...

Publication date: 2017
