Object Permanence Through Audio-Visual Representations

Abstract

As robots perform manipulation tasks and interact with objects, it is probable that they will accidentally drop objects (e.g., due to an inadequate grasp of an unfamiliar object) that subsequently bounce out of their visual fields. To enable robots to recover from such errors, we draw upon the concept of object permanence: objects remain in existence even when they are not being sensed (e.g., seen) directly. In particular, we developed a multimodal neural network model that uses a partial, observed trajectory and the audio resulting from the impact as its inputs to predict the full trajectory and end location of the dropped object. We empirically show that: 1) our method predicted end locations in close proximity (i.e., within the visual field of the robot's wrist camera) to the actual locations, and 2) the robot was able to retrieve dropped objects by applying minimal vision-based pick-up adjustments. Additionally, our method outperformed five comparison baselines in retrieving dropped objects. Our results contribute to enabling object permanence for robots and to error recovery from object drops.
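The abstract gives no architectural detail, so the following is only a rough sketch of how two modalities might be fused to regress an end location. It assumes a simple late-fusion design with one dense encoder per modality; the layer sizes, feature shapes, and function names (`predict_end_location`, `init_linear`, `dense`) are hypothetical and not taken from the paper. The weights are random and untrained, and the sketch predicts only an (x, y) end location rather than the full trajectory.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_linear(n_in, n_out):
    """Return randomly initialised (weights, bias) for a dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def dense(x, params, activate=True):
    """Apply a dense layer, optionally followed by a ReLU."""
    w, b = params
    y = x @ w + b
    return np.maximum(y, 0.0) if activate else y

# Hypothetical sizes, chosen only for illustration.
N_OBSERVED = 5   # trajectory points seen before the object leaves view
N_AUDIO = 32     # length of a feature vector for the impact sound
HIDDEN = 16

traj_encoder = init_linear(N_OBSERVED * 3, HIDDEN)   # (x, y, z) per point
audio_encoder = init_linear(N_AUDIO, HIDDEN)
head = init_linear(2 * HIDDEN, 2)                    # regress (x, y)

def predict_end_location(partial_traj, audio_feat):
    """Late fusion: encode each modality, concatenate, regress (x, y)."""
    t = dense(partial_traj.reshape(-1), traj_encoder)
    a = dense(audio_feat, audio_encoder)
    fused = np.concatenate([t, a])
    return dense(fused, head, activate=False)

# Placeholder inputs standing in for real sensor data.
partial_traj = rng.normal(size=(N_OBSERVED, 3))
audio_feat = rng.normal(size=N_AUDIO)
end_xy = predict_end_location(partial_traj, audio_feat)
```

In a setting like the one described, the predicted location would only need to fall within the wrist camera's visual field, since the remaining error is corrected by vision-based pick-up adjustments.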

Similar resources

Cortical Plasticity of Audio–Visual Object Representations

Several regions in human temporal and frontal cortex are known to integrate visual and auditory object features. The processing of audio-visual (AV) associations in these regions has been found to be modulated by object familiarity. The aim of the present study was to explore training-induced plasticity in human cortical AV integration. We used functional magnetic resonance imaging to analyze t...

Object Category Detection Using Audio-Visual Cues

Categorization is one of the fundamental building blocks of cognitive systems. Object categorization has traditionally been addressed in the vision domain, even though cognitive agents are intrinsically multimodal. Indeed, biological systems combine several modalities in order to achieve robust categorization. In this paper we propose a multimodal approach to object category detection, using au...

Audio-Visual Object Extraction using Graph Cuts

We propose a novel method to automatically extract the audio-visual objects that are present in a scene. First, the synchrony between related events in audio and video channels is exploited to identify the possible locations of the sound sources. Video regions presenting a high coherence with the soundtrack are automatically labelled as being part of the audio-visual object. Next, a graph cut s...

Scene Understanding through Audio-Visual Fusion

Scene understanding involves the integration of a wide variety of information to produce a thorough description of the robot's environment. By integrating spatial, visual and audio cues, we could provide a greater amount of understanding than can be obtained using one of the modalities alone. In this paper, we describe our current work on using audition to enhance existing object detection and t...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3115082