Neural fuzzy motion estimation and compensation

Authors

  • Hyun Mun Kim
  • Bart Kosko
Abstract

Neural fuzzy systems can improve motion estimation and compensation for video compression. Motion estimation and compensation are key parts of video compression because they help remove temporal redundancies in images. But most motion estimation algorithms neglect the strong temporal correlations of the motion field. The search windows stay the same through the image sequences and the estimation needs heavy computation. A neural vector quantizer system can use the temporal correlation of the motion field to estimate the motion vectors. First- and second-order statistics of the motion vectors give ellipsoidal search windows. This algorithm reduced the search area and entropy and gave clustered motion fields. Motion-compensated video coding further assumes that each block of pixels moves with uniform translational motion. This often does not hold and can produce block artifacts. We use a neural fuzzy system to compensate for the overlapped block motion. This fuzzy system uses the motion vectors of neighboring blocks to map the prior frame’s pixel values to the current pixel value. The neural fuzzy system used 196 rules that came from the prior decoded frame. The fuzzy system learns and updates its rules as it decodes the image. The fuzzy system also improved the compensation accuracy. The Appendix derives both the fuzzy system and the neural learning laws that tune its parameters.

I. MPEG STANDARDS FOR VIDEO COMPRESSION

This paper presents new schemes for motion estimation and compensation based on neural fuzzy systems. Motion estimation and compensation help compress video images because they can remove temporal redundancies in the image data. Motion estimation schemes often neglect the strong temporal correlations of the motion field. The search windows remain the same through the image sequences and the estimation may need heavy computation.
We designed an unsupervised neural system that uses the temporal correlation of the motion field to estimate the motion vectors and to reduce the entropy of source coding. Motion-compensated video coding uses the motion of objects in the scene to relate the intensity of each pixel in the current frame to the intensity of some pixel in a prior frame. It predicts the value of the entire current block of pixels as the value of a displaced block from the prior frame. It also assumes that each block of pixels moves with uniform translational motion. This assumption often does not hold and can produce block artifacts. We designed a neural fuzzy system that uses motion vectors of neighboring blocks to improve the compensation accuracy.

Manuscript received February 7, 1996; revised April 2, 1997. The associate editor coordinating the review of this paper and approving it for publication was Dr. Michael Zervakis. The authors are with the Department of Electrical Engineering-Systems, Signal, and Image Processing Institute, University of Southern California, Los Angeles, CA 90089-2564 USA. Publisher Item Identifier S 1053-587X(97)07359-5.

Fig. 1 shows the typical structure of the Moving Picture Experts Group (MPEG) encoder. The MPEG standard depends on two basic algorithms. Motion-compensated coding uses block-based motion vector estimation and compensation to remove temporal redundancies. Block discrete cosine transforms reduce spatial redundancy. The MPEG standard defines and forms the bit-stream syntax to achieve interoperability among different blocks. Standards improve interoperability among video systems and help speed the development of high-volume low-cost hardware and software solutions [7]. Most current research in video compression seeks new algorithms or designs high-performance encoders that work with existing standards. These standards give a bit-stream syntax and a decoder and thus allow some flexibility in how one designs a compatible encoder.
The MPEG standards do not give a motion estimation algorithm or a rate-control mechanism. This leaves manufacturers free to use the flexibility of the syntax. Our neural quantizer system uses the first- and second-order statistics of the motion vectors to give ellipsoidal search windows. This method reduces the search area and gives clustered motion fields. It reduces the computation for motion estimation and lowers the entropy of the motion vectors that the system must entropy-code and transmit. We also propose a neural fuzzy overlapped block motion compensation (FOBMC) scheme for motion compensation. Fuzzy systems use a set of if-then rules to map inputs to outputs. Neural fuzzy systems learn the rules from data and tune the rules with new data. The FOBMC estimates each pixel intensity using the block-based motion vectors available to the decoder. The fuzzy system uses the motion vectors of neighboring blocks to map the prior frame’s pixel values to the current pixel value. The 196 rules come from the prior decoded frame. The neural fuzzy system tunes its rules as it decodes the image. The fuzzy system defined a nonlinear “black box” function approximator that improved the compensation accuracy. The Appendix derives the supervised learning laws that tune the parameters of the fuzzy system.

Fig. 1. Block diagram of the Moving Picture Experts Group (MPEG) encoder.

IEEE Transactions on Signal Processing, vol. 45, no. 10, October 1997. 1053-587X/97$10.00 © 1997 IEEE.

II. MOTION ESTIMATION AND COMPENSATION

This section briefly reviews the standard techniques of motion estimation and compensation.

A. Motion Estimation

Motion estimation occurs in many areas of image processing. Video coding schemes often exploit the high temporal redundancy between successive frames in a sequence by predicting the current frame from the prior frame based on an estimated motion field. Then, the schemes code and transmit the prediction error image.
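The ellipsoidal search window described in Section I can be illustrated with a short sketch. It assumes that the first- and second-order statistics are the sample mean and covariance of recent motion vectors and that the window is the set of integer displacements within a fixed Mahalanobis radius of the mean. The function names, the `radius` and `floor` parameters, and the variance floor itself are hypothetical choices for illustration and do not come from the paper.

```python
def mean_and_covariance(vectors):
    """Sample mean and 2x2 covariance of a list of (dx, dy) motion vectors."""
    n = len(vectors)
    mx = sum(v[0] for v in vectors) / n
    my = sum(v[1] for v in vectors) / n
    sxx = sum((v[0] - mx) ** 2 for v in vectors) / n
    syy = sum((v[1] - my) ** 2 for v in vectors) / n
    sxy = sum((v[0] - mx) * (v[1] - my) for v in vectors) / n
    return (mx, my), (sxx, sxy, syy)


def ellipsoidal_window(vectors, max_disp=7, radius=2.0, floor=0.25):
    """Integer displacements inside the ellipse set by the mean and covariance.

    Assumption: the ellipse is {d : (d - m)^T K^{-1} (d - m) <= radius^2}.
    """
    (mx, my), (sxx, sxy, syy) = mean_and_covariance(vectors)
    # Variance floor (an assumption) keeps the covariance matrix invertible.
    sxx, syy = sxx + floor, syy + floor
    det = sxx * syy - sxy * sxy
    window = []
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            ex, ey = dx - mx, dy - my
            # Quadratic form e^T K^{-1} e written out for a 2x2 matrix K.
            q = (syy * ex * ex - 2.0 * sxy * ex * ey + sxx * ey * ey) / det
            if q <= radius * radius:
                window.append((dx, dy))
    return window
```

A block matcher would then test only the displacements in `window` instead of the full square of side `2 * max_disp + 1`, which is how such a window could cut the search cost.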
The schemes may also need to transmit the motion field if the motion estimation algorithm uses information that the receiver does not have. The prediction error often contains much less information than the original frame does if the motion estimates are accurate.

The MPEG standard uses three types of pictures that depend on the mode of motion prediction. The intra (I) picture serves as the reference picture for prediction. Block discrete cosine transforms (DCTs) code the intra pictures, and the absence of motion estimation prevents long-range error propagation. Coding the predicted (P) pictures uses forward prediction of motion. We divide each image into macroblocks of size $16 \times 16$ pixels and search for blocks of the same size in the prior reference I frame or P frame. The third type of picture is the bidirectional interpolated (B) picture. We perform both forward and backward motion prediction with respect to the prior or future reference I or P frames. Averaging these two predictions gives the interpolation. Bidirectional interpolation can handle just covered or uncovered areas: the system cannot predict a just-uncovered area from the past reference, but it can still predict that area from the future reference frame. Among the forward, backward, and interpolated predictions, the one with the smallest mean-square error gives the best motion prediction.

The encoding and decoding orders of video sequences can differ from that of the original sequences due to the three types of frames. In the coded order, the B frames follow the future reference frame that they depend on, as in Fig. 2. The decoder needs to reorder the frames to display them.

The two main types of motion estimation use pel-recursive algorithms or block matching algorithms [23]. Pel-recursive algorithms predict the motion field at the decoder based on how neighboring pixels decoded in the current frame relate to pixels in the prior frame.
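The choice among forward, backward, and interpolated prediction for a B-picture block can be sketched as follows. This is a minimal illustration that assumes the forward and backward predictions are already available as equally sized blocks (lists of pixel rows). The function names are hypothetical.

```python
def mse(block_a, block_b):
    """Mean-square error between two equally sized blocks (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def interpolate(forward, backward):
    """Average the forward and backward predictions pixel by pixel."""
    return [[(f + b) / 2.0 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)]


def choose_b_prediction(current, forward, backward):
    """Return (mode, prediction, error) for the smallest mean-square error."""
    candidates = {
        "forward": forward,
        "backward": backward,
        "interpolated": interpolate(forward, backward),
    }
    mode = min(candidates, key=lambda name: mse(current, candidates[name]))
    return mode, candidates[mode], mse(current, candidates[mode])
```

If the current block lies midway between the two references, the interpolated mode wins. If an area was just uncovered and only the future frame contains it, the backward mode tends to win.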
Block-based motion estimation derives from the need for relatively accurate motion fields while keeping low the side information one needs to represent the motion vectors. Image sequence coding often uses full-search block matching among the block-based motion estimation techniques. This scheme is simple and easy to implement in hardware. Exhaustive search within a maximum displacement range leads to the absolute minimum for the energy of the prediction error and is optimal in this sense. This acts as a type of codeword search in vector quantization (VQ) [10]. VQ finds a codeword from the codebook that minimizes some criterion such as mean-squared error (MSE). It locates the minimum for the energy of the prediction error and tends to have a heavy computational load.

Fig. 2. Ordering of video sequences in MPEG.

Accurate modeling of the motion field becomes more important under the constraint of a very low bit rate [21]. Here, however, the full-search block technique tends to produce noisy motion fields that do not correspond to the true 2-D motion in the scene. Noise in real video images can also affect the locations of the smallest distortion. Noise gives rise to a blocky effect in motion-compensated prediction images, and the motion vectors it induces have no physical meaning. These artificial discontinuities increase the side information needed to transmit the entropy-coded motion vectors. Reducing this side information while keeping the same accuracy for the motion fields helps low-bit-rate applications [5]. Therefore, we propose a new adaptive scheme to estimate motion vectors that have spatial consistency. The scheme uses the temporal correlation of the motion field to reduce the computation and to give a clustered motion field.

B. Motion Compensation

Motion-compensated video coding relates the intensity of each pixel in the current frame to the intensity of some pixel in a prior frame. It links these pixels by predicting the motion of objects in the scene. However, the transmission overhead needed to inform the decoder of the true motion at every pixel in the image may far outweigh the gains of motion compensation. Motion compensation assigns only one motion vector to each square block of pixels in the frame. The encoder selects this motion vector to minimize the mean-squared prediction error. It predicts the value of the entire current block of pixels by the value of a displaced block from the prior frame. Therefore, it assumes that each block of pixels moves with uniform translational motion. This assumption often does not hold and can produce block artifacts.

Orchard and Sullivan [22] proposed overlapped block motion compensation (OBMC) to overcome this problem. This linear scheme estimates each pixel intensity using the block-based motion vectors available to the decoder. It predicts the current frame of a sequence by repositioning overlapping blocks of pixels from the prior frame. Then, it computes the coefficients of the linear estimator by solving the normal equations of least squares. This scheme has at least two problems. The coefficients computed from the training sequences may not work well for the test sequences. The coefficient calculation is also computationally heavy, and the decoder must store the coefficients.

We propose a fuzzy overlapped block motion compensation (FOBMC). A fuzzy rule-based system estimates pixel intensities using the block-based motion vectors available to the decoder. Fuzzy systems compute a model-free conditional mean [16], [20], [24] and thus compute a least-mean-square nonlinear estimate of the random variable $Y$ based on our knowledge of the random vector $X$. The FOBMC system uses the conditional mean to predict each pixel intensity.
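The full-search block matching reviewed in Section II-A and the single-vector block prediction described in Section II-B can be sketched together. This is a minimal illustration of the standard technique and not the adaptive scheme the paper proposes. It assumes frames are lists of pixel rows, that a motion vector `(dx, dy)` points from the current block to its match in the prior frame, and that candidates outside the frame are skipped. All function names are hypothetical.

```python
def block_mse(cur, ref, bx, by, dx, dy, size):
    """MSE between the current block at (bx, by) and the displaced prior block."""
    total = 0.0
    for y in range(size):
        for x in range(size):
            diff = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += diff * diff
    return total / (size * size)


def full_search(cur, ref, bx, by, size, max_disp):
    """Exhaustive search for the (dx, dy) that minimizes the block MSE."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # Skip displaced blocks that fall outside the prior frame.
            if bx + dx < 0 or bx + dx + size > width:
                continue
            if by + dy < 0 or by + dy + size > height:
                continue
            err = block_mse(cur, ref, bx, by, dx, dy, size)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best[1], best[2], best[0]


def compensate_block(ref, bx, by, dx, dy, size):
    """Predict the whole block as the displaced block from the prior frame."""
    return [row[bx + dx:bx + dx + size]
            for row in ref[by + dy:by + dy + size]]
```

The search visits up to `(2 * max_disp + 1) ** 2` candidates per block, which is the heavy computational load the text mentions. `compensate_block` copies one displaced block, so every pixel in the block inherits the same translation.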
The system uses the motion vectors of neighboring blocks to map the prior frame’s pixel values to the current pixel value. This has at least two advantages. The rules come from the prior decoded frame, and the neural fuzzy system tunes its rules as it decodes the image. Simulation results showed that the FOBMC improved the compensation accuracy. This method also shows how to insert expert knowledge into the compensation process.

III. ADDITIVE FUZZY SYSTEMS AND LEARNING

This section reviews the standard additive model (SAM) fuzzy system and how SAMs learn with and without supervision. The Appendix derives the ratio structure of the SAM and supervised learning laws that tune its parameters.

A. Additive Fuzzy Systems

A fuzzy system $F: R^n \to R^p$ stores $m$ rules of the word form “If $X = A_j$, then $Y = B_j$” or the patch form $A_j \times B_j$. The if-part fuzzy sets $A_j \subset R^n$ and then-part fuzzy sets $B_j \subset R^p$ have set functions $a_j: R^n \to [0, 1]$ and $b_j: R^p \to [0, 1]$. The system can use the joint set function $a_j$ [11] or some factored form such as $a_j(x) = a_j^1(x_1) \cdots a_j^n(x_n)$, or $a_j(x) = \min(a_j^1(x_1), \ldots, a_j^n(x_n))$, or any other conjunctive form for input vector $x = (x_1, \ldots, x_n) \in R^n$. An additive fuzzy system [17]–[20] sums the “fired” then-part sets.
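A scalar-output SAM can be sketched in a few lines. The excerpt does not show the ratio formula that the Appendix derives, so the sketch assumes the standard SAM form $F(x) = \sum_j w_j a_j(x) V_j c_j / \sum_j w_j a_j(x) V_j$, where $w_j$ is a rule weight and $V_j$ and $c_j$ are the volume and centroid of the then-part set $B_j$. The Gaussian if-part sets in product (factored) form and all function names are illustrative assumptions.

```python
import math


def gaussian_set(center, width):
    """Factored (product) Gaussian if-part set function a_j: R^n -> [0, 1]."""
    def a(x):
        # A product of scalar Gaussians equals exp of the summed exponents.
        return math.exp(-0.5 * sum(((xi - ci) / wi) ** 2
                                   for xi, ci, wi in zip(x, center, width)))
    return a


def sam_output(x, rules):
    """Assumed SAM ratio: sum(w a(x) V c) / sum(w a(x) V) over all rules.

    Each rule is a tuple (a_j, w_j, V_j, c_j) with a scalar centroid c_j.
    """
    numerator = 0.0
    denominator = 0.0
    for a_j, w_j, v_j, c_j in rules:
        fire = w_j * a_j(x) * v_j
        numerator += fire * c_j
        denominator += fire
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator
```

With two rules centered at 0 and 10 and centroids 1.0 and 3.0, an input midway between the centers fires both rules equally and the output is the average of the two centroids. An input at one center returns nearly that rule's centroid.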

Related articles

A neural fuzzy system for image motion estimation

Many methods for computing optical flow (image motion vector) have been proposed while others continue to appear. Block-matching methods are widely used because of their simplicity and easy implementation. The motion vector is uniquely defined, in block-matching methods, by the best fit of a small reference subblock from a previous image frame in a larger, search region from the present image frame....

Image Backlight Compensation Using Recurrent Functional Neural Fuzzy Networks Based on Modified Differential Evolution

In this study, an image backlight compensation method using adaptive luminance modification is proposed for efficiently obtaining clear images. The proposed method combines the fuzzy C-means clustering method, a recurrent functional neural fuzzy network (RFNFN), and a modified differential evolution. The proposed RFNFN is based on the two backlight factors that can accurately detect the compensat...

Gyroscope Random Drift Modeling, using Neural Networks, Fuzzy Neural and Traditional Time-series Methods

In this paper, statistical and time series models are used for determining the random drift of a Dynamically Tuned Gyroscope (DTG). This drift is compensated with an optimal predictive transfer function. Nonlinear neural-network and fuzzy-neural models are also investigated for prediction and compensation of the random drift. Finally, the different models are compared and their advantages a...

Digital Image Stabilization Using a Functional Neural Fuzzy Network

This study proposes a real-time video stabilization method to eliminate unwanted vibration, preserve the intended movement of camera, and improve the stability of the captured video sequence. The proposed method uses a functional neural fuzzy network to learn the characteristics of different vibrations and then choose the adequate compensation weight for two different methods to calculate the c...

Fuzzy Direct Torque-controlled Induction Motor Drives for Traction with Neural Compensation of Stator Resistance

In this chapter, a new method for stator resistance compensation in direct torque control (DTC) drives, based on neural networks, is presented. The estimation of electromagnetic torque and stator flux linkages using the measured stator voltages and currents is crucial to the success of DTC drives. The estimation is dependent only on one machine parameter, which is the stator resistance. Chang...

Journal:
  • IEEE Trans. Signal Processing

Volume 45, Issue 10

Pages -

Publication date: 1997