Patent · US Active

Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization

US11270187B2 · kind B2 · utility

Cited by: 3
References: 0
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 7, 2018
Grant date: Mar 8, 2022
Priority date
Expiry date: Jan 7, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/09
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method is provided. The method includes selecting a neural network model, wherein the neural network model includes a plurality of layers, and wherein each of the plurality of layers includes weights and activations; modifying the neural network model by inserting a plurality of quantization layers within the neural network model; associating a cost function with the modified neural network model, wherein the cost function includes a first coefficient corresponding to a first regularization term, and wherein an initial value of the first coefficient is pre-defined; and training the modified neural network model to generate quantized weights for a layer by increasing the first coefficient until all weights are quantized and the first coefficient satisfies a pre-defined threshold, further including optimizing a weight scaling factor for the quantized weights and an activation scaling factor for quantized activations, and wherein the quantized weights are quantized using the optimized weight scaling factor.
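
The weight-side training loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a toy linear model in place of a neural network, a uniform quantizer, a mean-squared quantization error as the first regularization term, and a least-squares update for the weight scaling factor. Activation quantization and the inserted quantization layers are omitted. All names (quantize, quant_penalty, train) and all hyperparameter values are hypothetical.

```python
import numpy as np

def quantize(w, delta, bits=4):
    """Uniform quantizer: snap w to integer multiples of the scaling factor delta."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return delta * np.clip(np.round(w / delta), lo, hi)

def quant_penalty(w, delta, bits=4):
    """Assumed regularization term: mean squared distance from w to its quantized value."""
    return float(np.mean((w - quantize(w, delta, bits)) ** 2))

def train(x, y, bits=4, alpha=1e-3, alpha_max=1e3, growth=2.0,
          lr=0.05, steps=200, seed=0):
    """Minimize task_loss + alpha * quant_penalty while alpha grows to alpha_max."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=x.shape[1])
    delta = 0.1  # initial weight scaling factor (assumed)
    while True:
        # Shrink the step as alpha grows so the penalty pull stays stable
        # (a choice made for this sketch, not taken from the patent).
        step = lr / (1.0 + alpha)
        for _ in range(steps):
            q = quantize(w, delta, bits)
            grad_task = 2.0 * x.T @ (x @ w - y) / len(y)
            # Rounding has zero gradient almost everywhere, so q is held constant.
            grad_reg = 2.0 * (w - q) / w.size
            w = w - step * (grad_task + alpha * grad_reg)
            # Scaling-factor update: least-squares fit of delta to the
            # current integer codes k, i.e. argmin over delta of ||w - delta*k||^2.
            k = np.round(quantize(w, delta, bits) / delta)
            if k @ k > 0:
                delta = float(w @ k) / float(k @ k)
        if alpha >= alpha_max:  # coefficient has reached the pre-defined threshold
            break
        alpha *= growth  # increase the regularization coefficient
    # Final weights are quantized with the optimized scaling factor.
    return quantize(w, delta, bits), delta, alpha

# Toy usage: recover a 4-weight linear map with 4-bit weights.
rng = np.random.default_rng(1)
x = rng.normal(size=(64, 4))
y = x @ np.array([0.5, -0.25, 0.75, 0.0])
q_weights, scale, final_alpha = train(x, y)
print(q_weights, scale, quant_penalty(q_weights, scale))
```

As the coefficient grows, the penalty dominates the task loss and pulls the weights toward the quantization grid, so the final snap to quantized values changes them only slightly. The geometric growth schedule and the shrinking step size are choices made for this sketch.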

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.