Real-time human-centric segmentation for complex video scenes

نویسندگان

چکیده

Most existing video tasks related to “human” focus on the segmentation of salient humans, ignoring unspecified others in video. Few studies have focused segmenting and tracking all humans a complex video, including pedestrians other states (e.g., seated, riding, or occluded). In this paper, we propose novel framework, abbreviated as HVISNet, that segments tracks presented people given videos based one-stage detector. To better evaluate scenes, offer new benchmark called HVIS (Human Video Instance Segmentation), which comprises 1447 human instance masks 805 high-resolution diverse scenes. Extensive experiments show our proposed HVISNet outperforms state-of-the-art methods terms accuracy at real-time inference speed (30 FPS), especially We also notice using center bounding box distinguish different individuals severely deteriorates accuracy, heavily occluded conditions. This common phenomenon is referred ambiguous positive samples problem. alleviate problem, mechanism named Inner Center Sampling improve segmentation. Such plug-and-play inner sampling can be incorporated any model detector performance. particular, it gains 4.1 mAP improvement method case humans. Code data are available https://github.com/IIGROUP/HVISNet.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Real-Time Video Object Segmentation for MPEG Encoded Video Sequences

We propose a real-time object segmentation method for MPEG encoded video. Computational superiority is the main advantage of compressed domain processing. We exploit the macro-block structure of the encoded video to decrease the spatial resolution of the processed data, which exponentially reduces the computational load. Further reduction is achieved by temporal grouping of the intra-coded and ...

متن کامل

A Real-Time Sound Rendering Algorithm for Complex Scenes

We present a novel output-sensitive algorithm for sound rendering of complex scenes, i.e. scenes that contain a large amount of sound sources. It allows walkthroughs of complex sound environments (such as a football stadium) in real-time and can be used for observer-dependent auralizations of global sound propagation. The algorithm is based on a stochastic sampling strategy similar to point bas...

متن کامل

Real-time Compressed-domain Spatiotemporal Video Segmentation

In this paper, a novel algorithm for the real-time, unsupervised segmentation of image sequences in the compressed domain is proposed. The algorithm utilizes the motion information present in the compressed stream in the form of P-frame forward motion vectors, as well as basic color information in the form of DC coefficients present in I-frames. An iterative rejection scheme based on the biline...

متن کامل

Real-Time Video-Based Modeling and Rendering of 3D Scenes

virtual reality, developing techniques for synthesizing arbitrary views has become an important technical issue. Given an object’s structural model (such as a polygon or volume model), it’s relatively easy to synthesize arbitrary views. Generating a structural model of an object, however, isn’t necessarily easy. For this reason, research has been progressing on a technique called image-based mo...

متن کامل

Waterfall Segmentation of Complex Scenes

We present an image segmentation technique using the morphological Waterfall algorithm. Improvements in the segmentation are brought about by using improved gradients. These are based on the detection of object boundaries learnt from human segmentations introduced by Martin et al. (2004). We avoid the usual pitfall found when applying Watershed algorithms to these boundaries, namely that the bo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Image and Vision Computing

سال: 2022

ISSN: ['0262-8856', '1872-8138']

DOI: https://doi.org/10.1016/j.imavis.2022.104552