A hierarchical active binocular robot vision architecture for scene exploration and object appearance learning
Abstract
This thesis investigates a computational model of hierarchical visual behaviours within an active binocular robot vision architecture. The robot vision system can localise multiple instances of the same object class while simultaneously maintaining vergence and directing its gaze to attend to and recognise objects within cluttered, complex scenes. This is achieved by implementing all image analysis in an egocentric symbolic space, without creating explicit pixel-space maps and without the need for calibration or other knowledge of the camera geometry. An important aspect of the active binocular vision paradigm is that visual features in both camera eyes must be bound together in order to drive visual search, that is, to saccade to, locate and recognise putative objects or salient locations in the robot's field of view. The system structure is based on the “attentional spotlight” metaphor of biological systems and on a collection of abstract and reactive visual behaviours arranged in a hierarchy. Several studies have shown that the human brain represents and learns objects for recognition from snapshots of 2-dimensional views of the imaged scene containing the object of interest, acquired during active interaction with (exploration of) the environment. Likewise, psychophysical findings indicate that the primate visual cortex represents common everyday objects by a hierarchical structure of their parts or sub-features and, consequently, recognises them by simple but imperfect 2D view approximations of object parts. This thesis incorporates these observations into an active visual learning behaviour within the hierarchical active binocular robot vision architecture. By actively exploring the object viewing sphere (as higher mammals do), the robot vision system automatically synthesises its own part-based object representation from multiple observations, while a human teacher indicates the object and supplies a classification name.
It is proposed to adopt the computational concepts of a visual learning exploration mechanism that controls the accumulation of visual evidence and directs attention towards the spatially salient object parts. The behavioural structure of the binocular robot vision architecture is loosely modelled on the WHAT and WHERE visual streams. The WHERE stream maintains and binds spatial attention on the object part coordinates that egocentrically characterise the location of the object of interest, and extracts spatio-temporal properties of feature coordinates and descriptors. The WHAT stream either determines the identity of an object or triggers a learning behaviour that stores view-invariant feature descriptions of the object part. The robot vision system is therefore capable of performing a collection of specific visual tasks such as vergence, detection, discrimination, recognition, localisation and identification of multiple instances of the same object class. This set of tasks enables the robot vision system to execute and fulfil specified high-level tasks, e.g. autonomous scene exploration and active object appearance learning.
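The WHAT stream's recognise-or-learn decision can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the class name `WhatStream`, the use of cosine similarity between plain descriptor vectors, and the threshold value are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class WhatStream:
    """Hypothetical WHAT stream: recognise a part descriptor or learn it."""
    threshold: float = 0.9                      # assumed similarity cut-off
    memory: dict = field(default_factory=dict)  # label -> stored descriptors

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity between two equal-length descriptor vectors.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def process(self, descriptor, teacher_label=None):
        # Find the best-matching stored descriptor across all known labels.
        best_label, best_score = None, 0.0
        for label, stored in self.memory.items():
            for known in stored:
                score = self._cosine(descriptor, known)
                if score > best_score:
                    best_label, best_score = label, score

        # Confident match: report the identity.
        if best_label is not None and best_score >= self.threshold:
            return ("recognised", best_label)

        # No match, but a teacher supplied a name: store the new view.
        if teacher_label is not None:
            self.memory.setdefault(teacher_label, []).append(list(descriptor))
            return ("learned", teacher_label)

        return ("unknown", None)
```

In this sketch, a descriptor that matches nothing in memory is stored only when a teacher label accompanies it, which mirrors the role of the human teacher described above; a real system would use view-invariant image features rather than toy vectors.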
Similar resources
A Portable Active Binocular Robot Vision Architecture for Scene Exploration
We present a portable active binocular robot vision architecture that integrates a number of visual behaviours. This vision architecture inherits the abilities of vergence, localisation, recognition and simultaneous identification of multiple target object instances. To demonstrate the portability of our vision architecture, we carry out qualitative and comparative analysis under two different ...
Towards Binocular Active Vision in a Robot Head
This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly-cluttered scene without the need for calibration or other knowledge of the ...
Detecting Object Surfaces by Using Occlusion Information from Active Binocular Stereo
We propose a new method to detect polyhedral object surfaces by using “occlusion information” from active binocular stereo. Occlusion information indicates whether each feature point in the scene is visible or not from each viewpoint from which an image has been taken; this is a useful geometric cue for inferring the existence of object surfaces. It is very difficult for conventional stereo methods ...
Effective Mechatronic Models and Methods for Implementation an Autonomous Soccer Robot
Omnidirectional mobile robots have been widely employed in several applications, especially as soccer player robots in RoboCup competitions. However, the omnidirectional navigation system, omni-vision system and solenoid kicking mechanism in such mobile robots have never been combined. This situation brings the idea of a robot with no head direction into existence, a comprehensi...
Active/Dynamic Stereo Vision
Visual navigation is a challenging issue in automated robot control. In many robot applications, like object manipulation in hazardous environments or autonomous locomotion, it is necessary to automatically detect and avoid obstacles while planning a safe trajectory. In this context the detection of corridors of free space along the robot trajectory is a very important capability which requires...