Real-time human-centric segmentation for complex video scenes
Abstract
Most existing video tasks related to “human” focus on the segmentation of salient humans, ignoring the unspecified others in the video. Few studies have focused on segmenting and tracking all humans in a complex video, including pedestrians and humans in other states (e.g., seated, riding, or occluded). In this paper, we propose a novel framework, abbreviated as HVISNet, that segments and tracks all presented people in given videos based on a one-stage detector. To better evaluate complex scenes, we offer a new benchmark called HVIS (Human Video Instance Segmentation), which comprises 1447 human instance masks in 805 high-resolution videos of diverse scenes. Extensive experiments show that our proposed HVISNet outperforms the state-of-the-art methods in terms of accuracy at a real-time inference speed (30 FPS), especially on complex video scenes. We also notice that using the center of the bounding box to distinguish different individuals severely deteriorates the segmentation accuracy, especially in heavily occluded conditions. This common phenomenon is referred to as the ambiguous positive samples problem. To alleviate this problem, we propose a mechanism named Inner Center Sampling to improve the accuracy of instance segmentation. Such a plug-and-play inner center sampling mechanism can be incorporated into any instance segmentation model based on a one-stage detector to improve its performance. In particular, it gains 4.1 mAP improvement on the state-of-the-art method in the case of occluded humans. Code and data are available at https://github.com/IIGROUP/HVISNet.
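The ambiguous positive samples problem can be illustrated with a small sketch. This is not the authors' Inner Center Sampling implementation, which the abstract does not detail; it only shows, under simple assumptions, why a bounding-box center can fall outside a person's mask while a center chosen from the mask pixels cannot. The function names `bbox_center` and `inner_center`, and the choice of "mask pixel nearest the mask centroid", are hypothetical and used here purely for illustration.

```python
# Illustrative sketch only: the names and the "nearest mask pixel to the
# centroid" rule are assumptions, not the method described in the paper.

def mask_pixels(mask):
    """Return (row, col) coordinates of all foreground pixels."""
    return [(y, x) for y, row in enumerate(mask)
            for x, v in enumerate(row) if v]

def bbox_center(mask):
    """Center of the tight bounding box around the mask."""
    pts = mask_pixels(mask)
    ys = [p[0] for p in pts]
    xs = [p[1] for p in pts]
    return ((min(ys) + max(ys)) / 2, (min(xs) + max(xs)) / 2)

def inner_center(mask):
    """Foreground pixel closest to the mask centroid (always inside the mask)."""
    pts = mask_pixels(mask)
    cy = sum(y for y, _ in pts) / len(pts)
    cx = sum(x for _, x in pts) / len(pts)
    return min(pts, key=lambda p: (p[0] - cy) ** 2 + (p[1] - cx) ** 2)

# An L-shaped mask, standing in for a partially occluded person.
mask = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]

box_c = bbox_center(mask)   # (2.0, 2.0), which is a background pixel here
in_c = inner_center(mask)   # guaranteed to be a foreground pixel
```

In this toy case the bounding-box center lands on background, so a detector that assigns positive samples at that location could attach them to whatever other instance occupies it, whereas a center drawn from the mask itself stays on the intended person.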
Similar resources
Real-Time Video Object Segmentation for MPEG Encoded Video Sequences
We propose a real-time object segmentation method for MPEG encoded video. Computational superiority is the main advantage of compressed domain processing. We exploit the macro-block structure of the encoded video to decrease the spatial resolution of the processed data, which exponentially reduces the computational load. Further reduction is achieved by temporal grouping of the intra-coded and ...
A Real-Time Sound Rendering Algorithm for Complex Scenes
We present a novel output-sensitive algorithm for sound rendering of complex scenes, i.e. scenes that contain a large amount of sound sources. It allows walkthroughs of complex sound environments (such as a football stadium) in real-time and can be used for observer-dependent auralizations of global sound propagation. The algorithm is based on a stochastic sampling strategy similar to point bas...
Real-time Compressed-domain Spatiotemporal Video Segmentation
In this paper, a novel algorithm for the real-time, unsupervised segmentation of image sequences in the compressed domain is proposed. The algorithm utilizes the motion information present in the compressed stream in the form of P-frame forward motion vectors, as well as basic color information in the form of DC coefficients present in I-frames. An iterative rejection scheme based on the biline...
Real-Time Video-Based Modeling and Rendering of 3D Scenes
virtual reality, developing techniques for synthesizing arbitrary views has become an important technical issue. Given an object’s structural model (such as a polygon or volume model), it’s relatively easy to synthesize arbitrary views. Generating a structural model of an object, however, isn’t necessarily easy. For this reason, research has been progressing on a technique called image-based mo...
Waterfall Segmentation of Complex Scenes
We present an image segmentation technique using the morphological Waterfall algorithm. Improvements in the segmentation are brought about by using improved gradients. These are based on the detection of object boundaries learnt from human segmentations introduced by Martin et al. (2004). We avoid the usual pitfall found when applying Watershed algorithms to these boundaries, namely that the bo...
Journal
Journal title: Image and Vision Computing
Year: 2022
ISSN: 0262-8856, 1872-8138
DOI: https://doi.org/10.1016/j.imavis.2022.104552