Information Theory of a Multilayer Neural Network with Discrete Weights
Author
Abstract
Statistical mechanics is applied to estimate the maximal capacity per weight ($\alpha_c$) of a two-layer feed-forward network with discrete weights of depth $l$, functioning as a parity machine of the $K$ hidden units. For each $K$ and $l \le l_0(K)$, the maximal theoretical capacity $\alpha_c = \log_2(2l)$ is achieved, the capacity per bit is 1, the average overlap between different solutions is zero, and $l_0(K)$ grows as $\log K$ for large $K$. At finite temperature, a one-step replica symmetry-breaking solution is found to be exact for $l < l_0(K)$.
In the recent past, statistical-mechanical methods have been applied to investigate the properties of neural networks. Among these systems, the class of multilayer networks plays an important role [1]. The prototype of this class of architecture is the one-layer perceptron [1], consisting of one input layer of $N$ binary units and one binary output unit. Various statistical-mechanical properties of the one-layer perceptron, such as the maximal capacity and the generalization ability of the network, have recently been investigated using the pioneering work of Elizabeth Gardner [2,3]. However, the computational capability of a one-layer perceptron is limited, since it cannot solve nonseparable problems.
Furthermore, multilayer networks may play an important role in the functioning of biological systems. In particular, it is important to examine the advantages of multilayer networks over the perceptron with respect to quantities such as the storage capacity per weight and the effect of the synaptic depth (the resolution of the synaptic strength). Moreover, the applicability of neural networks to biology and to the construction of real devices requires an understanding of the interplay between the synaptic depth and the properties of the network. The effect of the synaptic depth on the system is crucial, since implementing a deeper synaptic depth is much more difficult and expensive in real neural network circuits [4].
Hence, an optimal depth should be defined and predicted for each task of the network.
In this letter a two-layer feed-forward network is studied using a statistical-mechanical approach. The architecture of the network consists of $N$ binary input units, one hidden layer with $K$ continuous or discrete units, and a single binary output unit. The input units are divided into $K$ equal disjoint sets, each consisting of $N/K$ units. The $j$-th hidden unit is connected only to the inputs $i$ with $N(j-1)/K < i \le Nj/K$, each via a weight of depth $l$, $J_i = \pm 1, \pm 2, \ldots, \pm l$. The configuration of the input is denoted by $\{s_i\}$, $i = 1, \ldots, N$, with $s_i = \pm 1$.
[Europhysics Letters, p. 182]
The state of the $j$-th hidden unit is equal to its induced local field
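The architecture described above can be sketched in a few lines of Python. This is a minimal illustration, not the letter's own formulation: the function names are hypothetical, the local field is taken as the unnormalized sum of weight times input over each block (the source text breaks off before giving its definition), and the output rule (product of the signs of the hidden fields, with a zero field counted as +1) is the usual parity-machine convention assumed here. The helper `capacity_bound` only evaluates the counting bound $\log_2(2l)$ quoted in the abstract: a weight with $2l$ allowed values carries at most $\log_2(2l)$ bits.

```python
import math
import random


def capacity_bound(l):
    """Bits carried by one weight with 2*l allowed values: log2(2l)."""
    return math.log2(2 * l)


def random_weights(n, l, rng):
    """Draw n weights uniformly from {+-1, ..., +-l}; zero is excluded."""
    return [rng.choice((-1, 1)) * rng.randint(1, l) for _ in range(n)]


def hidden_fields(weights, inputs, k):
    """Local field of each hidden unit over its own disjoint block of N/K inputs.

    Assumption: the field is the plain (unnormalized) sum of J_i * s_i.
    """
    n = len(inputs)
    if n % k != 0 or len(weights) != n:
        raise ValueError("need len(weights) == len(inputs) and N divisible by K")
    block = n // k
    return [
        sum(weights[i] * inputs[i] for i in range(j * block, (j + 1) * block))
        for j in range(k)
    ]


def parity_output(fields):
    """Assumed parity rule: product of the signs of the hidden fields.

    A field of exactly zero is counted as +1 (a tie-breaking assumption).
    """
    out = 1
    for h in fields:
        out *= 1 if h >= 0 else -1
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, l = 12, 3, 2
    weights = random_weights(n, l, rng)
    inputs = [rng.choice((-1, 1)) for _ in range(n)]
    fields = hidden_fields(weights, inputs, k)
    print("bound on capacity per weight:", capacity_bound(l))
    print("hidden fields:", fields)
    print("output:", parity_output(fields))
```

Dividing the bound by the number of bits per weight, $\log_2(2l)$, gives 1, which matches the capacity per bit stated in the abstract.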
Similar resources
Comparison of Artificial Neural Network and Multiple Regression Analysis for Prediction of Fat Tail Weight of Sheep
A comparative study of artificial neural network (ANN) and multiple regression is made to predict the fat tail weight of Balouchi sheep from birth, weaning and finishing weights. A multilayer feed forward network with back propagation of error learning mechanism was used to predict the sheep body weight. The data (69 records) were randomly divided into two subsets. The first subset is the train...
Discrete All-positive Multilayer Perceptrons for Optical Implementation
All-optical multilayer perceptrons differ in various ways from the ideal neural network model. Examples are the use of non-ideal activation functions which are truncated, asymmetric, and have a non-standard gain, restriction of the network parameters to non-negative values, and the limited accuracy of the weights. In this paper, a backpropagation-based learning rule is presented that compensates...
Design of a Neural Network Based SVC Controller
A controller to control the output of a Static Var Compensator (SVC) to damp power system oscillations is developed in this paper. The proposed SVC controller is based on the discrete time filtered direct control theory by which a multilayer neural network with the hyperbolic tangent activation function is derived. Advanced weight tuning algorithm based on a modified delta rule and projection a...
Evaluation of monitoring network density using discrete entropy theory
The regional evaluation of monitoring stations for water resources can be of great importance due to its role in finding appropriate locations for stations, the maximum gathering of useful information and preventing the accumulation of unnecessary information and ultimately reducing the cost of data collection. Based on the theory of discrete entropy, this study analyzes the density of rain gag...
Neural Network Modeling for Small Datasets
Neural network modeling for small datasets can be justified from a theoretical point of view according to some of Bartlett’s results showing that the generalization performance of a multilayer perceptron (MLP) depends more on the L1 norm ‖c‖1 of the weights between the hidden layer and the output layer rather than on the total number of weights. In this article we investigate some geometrical p...
Cystoscopic Image Classification Based on Combining MLP and GA
In the past three decades, the use of smart methods in medical diagnostic systems has attracted the attention of many researchers. However, no smart activity has been provided in the field of medical image processing for diagnosis of bladder cancer through cystoscopy images despite the high prevalence in the world. In this paper, a multilayer neural network was applied to clas...
Publication date: 2006