Patent · US Active

Loss-scaling for deep neural network training with reduced precision

US11842280B2 · kind B2 · utility

Cited by	0

References	0

Claims	32

Family size	0

Assignee

NVIDIA Corporation · US

Inventors

Jonah M. Alben · San Jose, US
Paulius Micikevicius · Santa Clara, US
Hao Wu · Santa Clara, US

Key dates

Filing date	May 4, 2018
Grant date	Dec 12, 2023
Priority date	—
Expiry date	Jan 18, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/09
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In training a deep neural network using reduced precision, gradient computation operates on larger values without affecting the rest of the training procedure. One technique trains the deep neural network to develop loss, scales the loss, computes gradients at a reduced precision, and reduces the magnitude of the computed gradients to compensate for scaling of the loss. In one example non-limiting arrangement, the training forward pass scales a loss value by some factor S and the weight update reduces the weight gradient contribution by 1/S. Several techniques can be used for selecting scaling factor S and adjusting the weight update.
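The scale-then-unscale idea in the abstract can be illustrated with a small NumPy sketch. It is not taken from the patent text: the function names (backward_fp16, weight_gradient, step_with_backoff), the toy chain of local derivatives, and the halve-on-overflow policy for choosing S are all assumptions made for illustration. The sketch shows a chain-rule product that underflows to zero in float16 when the loss is unscaled, but survives when the backward pass starts from S instead of 1 and the result is multiplied by 1/S in float32.

```python
import numpy as np

def backward_fp16(upstream, local_grads):
    """Chain-rule product carried out entirely in float16 (toy backward pass)."""
    with np.errstate(over="ignore"):
        g = np.float16(upstream)
        for d in local_grads:
            g = np.float16(g * np.float16(d))
    return g

def weight_gradient(local_grads, S=1.0):
    """Gradient of the loss, computed via the scaled loss S * loss.

    Because d(S * loss)/d(loss) = S, the backward pass is seeded with S.
    Returns None if the float16 computation overflowed (inf or NaN).
    """
    g16 = backward_fp16(S, local_grads)
    if not np.isfinite(g16):
        return None
    # Unscale in float32 so the weight update sees the true gradient.
    return np.float32(g16) / np.float32(S)

def step_with_backoff(local_grads, S):
    """Hypothetical policy for selecting S: halve it until nothing overflows."""
    while S >= 1.0:
        g = weight_gradient(local_grads, S)
        if g is not None:
            return g, S
        S /= 2.0
    raise FloatingPointError("gradient is not finite even without scaling")

# True gradient is about 1e-8, below the smallest positive float16 value.
grads = [1e-3, 1e-3, 1e-2]
unscaled = weight_gradient(grads, S=1.0)     # underflows to zero
scaled = weight_gradient(grads, S=1024.0)    # close to 1e-8 after unscaling
```

A larger S preserves more small gradients but raises the risk of overflow, which is why the sketch checks for non-finite values before applying 1/S. The backoff helper is only one simple heuristic; the abstract says only that several techniques can be used to select S.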

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.