From Pixels to Layers: Joint Motion Estimation and Segmentation
Author: Deqing Sun
Abstract of “From Pixels to Layers: Joint Motion Estimation and Segmentation” by Deqing Sun, Brown University, May 2013
Estimating image motion, or optical flow, in scenes with multiple moving objects and segmenting the individual moving objects are two fundamental problems in computer vision, with applications in many fields, including medical imaging, image processing, graphics, and robotics. Motion estimation and scene segmentation are particularly challenging because of lighting changes, motion boundaries, occlusions, and indiscriminative appearances. Despite decades of extensive research effort, current methods still tend to produce large optical flow errors near motion boundaries and in occlusion regions, and to falsely merge foreground objects with the background. A key feature of optical flow methods is an energy term, or prior, that prefers spatially smooth flow fields. In this dissertation, we show that image-dependent and non-local prior models can better preserve motion boundaries than the widely used pairwise Markov Random Field (MRF) models. We also demonstrate that joint motion estimation and segmentation can achieve more accurate results than treating each problem separately.
First, we formulate fully learnable low-level models of optical flow and learn the models from training data. Our results show that image-dependent, steerable models outperform standard MRF models, especially in recovering motion boundaries.
Second, to understand what makes optical flow accurate, we perform a quantitative analysis of recent practices in optical flow estimation. Median filtering of the flow field is one of the key features of the most accurate methods, and we formalize it as a non-local smoothness term that integrates information over a large spatial neighborhood. We further define a weighted non-local smoothness term that uses both image and motion cues to preserve motion boundaries.
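As a rough illustration of the median filtering step mentioned above, the following sketch applies a plain, unweighted median filter to each component of a dense flow field. It is not the dissertation's implementation: the function name `median_filter_flow`, the 5x5 window, and the edge-replicating border handling are all assumptions made for this example, and the weighted non-local term is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_flow(flow, size=5):
    """Median-filter each component of an (H, W, 2) flow field.

    Hypothetical helper for illustration only. `size` must be odd.
    """
    if size % 2 != 1:
        raise ValueError("size must be odd")
    pad = size // 2
    # Replicate border values so the output keeps the input shape.
    padded = np.pad(flow, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # windows has shape (H, W, 2, size, size): one window per pixel and component.
    windows = sliding_window_view(padded, (size, size), axis=(0, 1))
    # Take the median over the two window axes.
    return np.median(windows, axis=(-2, -1))


# Toy flow field: left half static, right half moving one pixel to the right.
clean = np.zeros((10, 10, 2))
clean[:, 5:, 0] = 1.0

# Corrupt a single pixel with a large outlier vector.
noisy = clean.copy()
noisy[5, 2] = (9.0, -9.0)

filtered = median_filter_flow(noisy, size=5)
```

In this toy case the outlier is removed while the step between the two motions stays sharp, which is the behavior that makes median filtering attractive near motion boundaries.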
Third, inspired by recent successes in static image segmentation, we develop a layered model that segments moving objects (layers) using image-dependent, continuous support functions. The method orders the layers in depth and explicitly models the occlusions between layers and the temporal consistency of layers. To avoid being trapped in poor local optima, we define a discrete formulation of our objective function and extend graph cuts optimization methods to obtain good initial values for the continuous formulation. The mixed continuous-discrete optimizer can automatically infer the number of layers and their depth ordering for a given scene. Experimental results on benchmark datasets demonstrate the benefits of joint motion estimation and segmentation: compared with solving each problem separately, the layered approach achieves more accurate motion estimates in motion boundary and occlusion regions and better segments the foreground from the background.
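To make the idea of depth-ordered layers with support functions more concrete, here is a minimal sketch of how per-layer flows could be composited once the layers are ordered front to back. It uses generic soft compositing and is an assumption for illustration, not the model defined in the dissertation: the name `compose_layers`, the soft supports in [0, 1], and the choice to let the last layer absorb all remaining pixels are hypothetical.

```python
import numpy as np


def compose_layers(supports, layer_flows):
    """Composite per-layer flows using depth-ordered support maps.

    supports:    (K, H, W) values in [0, 1], layer 0 is front-most.
    layer_flows: (K, H, W, 2) flow field of each layer.
    Hypothetical helper: the last layer is treated as background and
    takes whatever visibility the layers in front of it leave over.
    """
    num_layers = supports.shape[0]
    remaining = np.ones(supports.shape[1:], dtype=float)
    visibility = np.zeros(supports.shape, dtype=float)
    flow = np.zeros(layer_flows.shape[1:], dtype=float)
    for k in range(num_layers):
        is_background = k == num_layers - 1
        s = np.ones_like(remaining) if is_background else supports[k]
        # A layer is visible where it has support and nothing in front hides it.
        visibility[k] = remaining * s
        flow += visibility[k][..., None] * layer_flows[k]
        remaining = remaining * (1.0 - s)
    return flow, visibility


# Two layers: a square foreground moving right over a static background.
H, W = 8, 8
supports = np.zeros((2, H, W))
supports[0, 2:6, 2:6] = 1.0
layer_flows = np.zeros((2, H, W, 2))
layer_flows[0, ..., 0] = 1.0

flow, visibility = compose_layers(supports, layer_flows)
```

Because each pixel's visibilities sum to one, the composite flow is a convex combination of the layer flows, and the foreground occludes the background wherever its support is one.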
Related resources
Motion-Based Segmentation of Transparent Layers in Video Sequences
We present a method for segmenting moving transparent layers in video sequences. We assume that the images can be divided into areas containing at most two moving transparent layers. We call this configuration (which is the most commonly encountered one) bi-distributed transparency. The proposed method involves three steps: initial block matching for two-layer transparent motion estimation, motion clus...
Salt and Pepper Noise Removal using Pixon-based Segmentation and Adaptive Median Filter
Removing salt and pepper noise is an active research area in image processing. In this paper, a two-phase method is proposed for removing salt and pepper noise while preserving edges and fine details. In the first phase, candidate pixels that are likely to be contaminated by noise are detected. In the second phase, only these candidate pixels are restored, using an adaptive median filter. In...
What Went Where
We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of p...
Pseudo Zernike Moment-based Multi-frame Super Resolution
The goal of multi-frame Super Resolution (SR) is to fuse multiple Low Resolution (LR) images to produce one High Resolution (HR) image. The major challenge of classic SR approaches is accurate motion estimation between the frames. To handle this challenge, a fuzzy motion estimation method has been proposed that replaces the value of each pixel with a weighted average of all its neighboring pixels i...
Spatiotemporal segmentation using genetic algorithms
Segmentation is the process of identifying uniform regions based on certain conditions. Segmentation has been used for a long time in image analysis and computer vision for a variety of applications. In particular, there has been a growing interest in video sequence segmentation mainly due to the development of MPEG-4, which enables the content-based manipulation of multimedia data [1,2]. For t...
Publication date: 2012