Shift invariance and the neocognitron

Authors

  • Etienne Barnard
  • David Casasent
Abstract

We investigate the ability of the neocognitron to perform shift-invariant pattern recognition. Both an intuitive analysis and a more formal investigation show that the performance of the neocognitron is not intrinsically shift invariant, and that certain model parameters must be chosen appropriately to obtain approximate shift invariance. It is shown how these parameters should be chosen to reach a compromise between invariance and classification sensitivity.

Keywords: Shift invariance, Neural classifiers, Pattern recognition, Neocognitron.

Acknowledgment: Funding for this research was provided by a contract from the Defense Advanced Research Projects Agency monitored by the U.S. Army Missile Command (Contract DAAH01-89-C-04180). Requests for reprints should be sent to Dr. D. Casasent, Center for Excellence in Optical Data Processing, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213.

I. INTRODUCTION

One of the main problems of visual pattern recognition is that the objects to be recognized are generally subjected to various forms of transformation. Thus, a successful system needs to be able to recognize objects despite such transformations. Numerous methods to obtain scale, shift, and rotation invariance have been investigated for this purpose. Conventional methods have used two approaches. One set of methods uses feature spaces which are automatically invariant to some transformations (e.g., moments (Hu, 1962), polar-coordinate Fourier transform (Casasent & Psaltis, 1976)). Another approach (Chin & Dyer, 1986) uses object models and tries to match the observed and stored models by determination of the transformation parameters. Unfortunately, these techniques have only been successful in a limited range of applications; they have not come close to duplicating the human ability to perform transformation-invariant pattern recognition. Thus, invariant pattern recognition has been an ideal target for neural nets, since neural nets will hopefully be able to perform functions similar to those which underlie biological pattern recognition.

The neocognitron (Fukushima, 1980) is a neural-network model for visual pattern recognition that has captured considerable attention (Fukushima & Miyake, 1982; Miyake & Fukushima, 1984; Fukushima, 1987; Johnson, Daniell, & Burman, 1988; Menon & Heinemann, 1988). It is loosely based on known properties of the mammalian visual system, performs unsupervised learning, and is claimed to be shift invariant. (A supervised version of the neocognitron also exists; however, our interest will be limited to the more popular self-organizing architecture.) Good performance on a task involving the recognition of one of the ten Roman numerals with small shifts within a 16 x 16 image has been demonstrated (Fukushima, 1980). However, in a recent study, Menon and Heinemann (1988) found that the neocognitron did not perform satisfactorily when it had to discriminate between three somewhat larger objects with larger shifts in a 128 x 128 image. It was found that shift invariance could only be obtained by creating a model which simply responds to the total energy in the image. This is not acceptable for most applications of pattern recognition, and can certainly be implemented more straightforwardly than with a neocognitron.

We show that the reason for the problems encountered by Menon and Heinemann is the lack of intrinsic shift invariance of the neocognitron. In section II a 12 x 12 model-neocognitron is used to show intuitively why the neocognitron fails to be an intrinsically shift-invariant pattern recognizer. The purpose of section II is to obtain an intuitive understanding of why the neocognitron might fail. Therefore, the model parameters used in section II are not necessarily realistic. The intuition gained in studying this artificial model is used in section III, where we show that these problems persist under more general conditions. How-
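The remark that a model responding only to the total energy in the image is shift invariant but unacceptable for recognition can be illustrated with a toy example. The sketch below is not the neocognitron and not the model studied by Menon and Heinemann; the helper names `shift` and `total_energy`, the 4 x 4 patterns, and the use of circular shifts are all assumptions made for illustration.

```python
# Illustrative sketch only: a hypothetical "total energy" feature, not the
# neocognitron and not any model from the cited studies.

def shift(image, dy, dx):
    """Circularly shift a 2-D list-of-lists image by (dy, dx) pixels."""
    rows, cols = len(image), len(image[0])
    return [[image[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]

def total_energy(image):
    """Sum of squared pixel values; pixel positions play no role."""
    return sum(v * v for row in image for v in row)

# Two different 4 x 4 binary patterns with the same number of "on" pixels.
vertical_bar = [[0, 1, 0, 0],
                [0, 1, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 0]]
horizontal_bar = [[0, 0, 0, 0],
                  [1, 1, 1, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]]

shifted_bar = shift(vertical_bar, 1, 2)

# The energy is unchanged by the shift (invariance) ...
energy_invariant = total_energy(shifted_bar) == total_energy(vertical_bar)
# ... but it is also identical for two different shapes (no discrimination).
energy_confuses = total_energy(vertical_bar) == total_energy(horizontal_bar)
```

Both flags come out true in this toy case: the feature ignores where the pixels are, which is exactly why it tolerates shifts and also why it cannot tell the two bars apart.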

Similar resources

How Does Our Visual System Achieve Shift and Size Invariance?

The question of shift and size invariance in the primate visual system is discussed. After a short review of the relevant neurobiology and psychophysics, a more detailed analysis of computational models is given. The two main types of networks considered are the dynamic routing circuit model and invariant feature networks, such as the neocognitron. Some specific open questions in context of the...

A Comparison of Transfer Functions for Feature Extracting Layers in the Neocognitron

Many kinds of artificial neural networks have been applied to the recognition of handwritten characters. Fukushima's neocognitron is one of few networks that demonstrates invariance to input translation and tolerance of a significant degree of input distortion and deformation. This paper shows that the behaviour of the neocognitron is dependent upon the form of non-linearity used by the feature e...

Neocognitron for rotated pattern recognition

Ideally computer pattern recognition systems should be insensitive to scaling, translation, distortion and rotation. Many neural network models have been proposed to address this purpose. The Neocognitron is a multi-layered neural network model for pattern recognition introduced by Fukushima in the early 1980s. It was considered effective and, after supervised learning, it can recognise input p...

Neocognitron's Parameter Tuning by Genetic Algorithms

The further study on the sensitivity analysis of Neocognitron is discussed in this paper. Fukushima's Neocognitron is capable of recognizing distorted patterns as well as tolerating positional shift. Supervised learning of the Neocognitron is fulfilled by training patterns layer by layer. However, many parameters, such as selectivity and receptive fields are set manually. Furthermore, in Fukush...

Evaluation of Two Neocognitron-type Models for Recognition of Rotated Patterns

We examine the number of cells and execution time taken to correctly recognize rotated patterns in two models: a rotation-invariant neocognitron (RNeocognitron) and a neocognitron-type model (TDR-Neocognitron) which recognizes rotated patterns by use of an associative recalled pattern. In numerical simulations handwritten patterns in CEDER database are used for training and evaluation of recogn...

Journal:
  • Neural Networks

Volume 3

Publication date: 1990