Mixed Precision Training

Authors

  • Paulius Micikevicius
  • Sharan Narang
  • Jonah Alben
  • Gregory Frederick Diamos
  • Erich Elsen
  • David Garcia
  • Boris Ginsburg
  • Michael Houston
  • Oleksii Kuchaiev
  • Ganesh Venkatesh
  • Hao Wu
Abstract

Increasing the size of a neural network typically improves accuracy but also increases the memory and compute requirements for training the model. We introduce a methodology for training deep neural networks using half-precision floating point numbers, without losing model accuracy or having to modify hyperparameters. This nearly halves memory requirements and, on recent GPUs, speeds up arithmetic. Weights, activations, and gradients are stored in IEEE half-precision format. Since this format has a narrower range than single-precision, we propose three techniques for preventing the loss of critical information. Firstly, we recommend maintaining a single-precision copy of weights that accumulates the gradients after each optimizer step (this copy is rounded to half-precision for the forward- and back-propagation). Secondly, we propose loss-scaling to preserve gradient values with small magnitudes. Thirdly, we use half-precision arithmetic that accumulates into single-precision outputs, which are converted to half-precision before storing to memory. We demonstrate that the proposed methodology works across a wide variety of tasks and modern large scale (exceeding 100 million parameters) model architectures, trained on large datasets.
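The three techniques named in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the update size of 1e-4, the gradient value of 1e-8, and the loss scale of 1024 are all assumptions chosen only to make the rounding effects visible.

```python
import numpy as np

def fp16_weight_update(weight, update):
    """Apply an update to a weight held only in FP16 (no master copy)."""
    return np.float16(np.float16(weight) + np.float16(update))

def fp32_master_update(weight, update):
    """Apply the update to an FP32 master copy; return it together with
    the FP16 rounding that the forward and backward passes would use."""
    master = np.float32(weight) + np.float32(update)
    return master, np.float16(master)

def scaled_fp16_gradient(grad, loss_scale):
    """Scale a gradient before the FP16 cast, then unscale it in FP32."""
    scaled = np.float16(np.float32(grad) * np.float32(loss_scale))
    return np.float32(scaled) / np.float32(loss_scale)

def accumulate(values, acc_dtype):
    """Sum values sequentially, rounding the accumulator to acc_dtype."""
    acc = acc_dtype(0)
    for v in values:
        acc = acc_dtype(acc + acc_dtype(v))
    return acc

# 1) Master weights: an update of 1e-4 (hypothetical) is below half the
#    FP16 spacing near 1.0, so the FP16-only weight never moves, while
#    the FP32 master copy keeps accumulating.
w16 = np.float16(1.0)
master = np.float32(1.0)
for _ in range(10):
    w16 = fp16_weight_update(w16, 1e-4)
    master, w16_from_master = fp32_master_update(master, 1e-4)

# 2) Loss scaling: a gradient of 1e-8 (hypothetical) flushes to zero in
#    FP16, but survives when scaled by 1024 before the cast.
unscaled = np.float16(1e-8)
recovered = scaled_fp16_gradient(1e-8, 1024.0)

# 3) Accumulation: adding 4096 ones in FP16 stalls at 2048, because
#    2049 is not representable; an FP32 accumulator reaches 4096.
ones = [np.float16(1.0)] * 4096
sum_fp16 = accumulate(ones, np.float16)
sum_fp32 = accumulate(ones, np.float32)

if __name__ == "__main__":
    print("FP16-only weight:", w16, "| from FP32 master:", w16_from_master)
    print("unscaled grad:", unscaled, "| recovered grad:", recovered)
    print("FP16 sum:", sum_fp16, "| FP32 sum:", sum_fp32)
```

In an actual training loop the scale factor would be applied to the loss before back-propagation rather than to individual gradients; scaling a single value here only isolates the underflow effect.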

Related resources

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only A...

Getting nowhere fast: trade-off between speed and precision in training to execute image-guided hand-tool movements

BACKGROUND The speed and precision with which objects are moved by hand or hand-tool interaction under image guidance depend on a specific type of visual and spatial sensorimotor learning. Novices have to learn to optimally control what their hands are doing in a real-world environment while looking at an image representation of the scene on a video monitor. Previous research has shown slower t...

A survey on the comparison between precision and traditional agriculture by budgeting method

The present study was conducted to compare precision and traditional agriculture using the budgeting technique. Its statistical population consisted of 210 experts in the Agricultural Jihad Organization of Qom province. The validity of the questionnaire used as the research tool was confirmed by professors, while its reliability was corroborated by Cronbach's alpha values in the range 0.78-0.94. According to the findings, ther...

Age-Related Decline of Wrist Position Sense and its Relationship to Specific Physical Training

Perception of limb and body positions is known as proprioception. Sensory feedback, especially from proprioceptive receptors, is essential for motor control. Aging is associated with a decline in position sense at proximal joints, but there is inconclusive evidence of distal joints being equally affected by aging. In addition, there is initial evidence that physical activity attenuates age-rela...

Mixed-precision training of deep neural networks using computational memory

Deep neural networks have revolutionized the field of machine learning by providing unprecedented human-like performance in solving many real-world problems such as image and speech recognition. Training of large DNNs, however, is a computationally intensive task, and this necessitates the development of novel computing architectures targeting this application. A computational memory unit where...

Journal:
  • CoRR

Volume abs/1710.03740

Published 2017