Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization
US11270187B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 7, 2018 |
| Grant date | Mar 8, 2022 |
| Priority date | — |
| Expiry date | Jan 7, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/09
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method is provided. The method includes selecting a neural network model, wherein the neural network model includes a plurality of layers, and wherein each of the plurality of layers includes weights and activations; modifying the neural network model by inserting a plurality of quantization layers within the neural network model; associating a cost function with the modified neural network model, wherein the cost function includes a first coefficient corresponding to a first regularization term, and wherein an initial value of the first coefficient is pre-defined; and training the modified neural network model to generate quantized weights for a layer by increasing the first coefficient until all weights are quantized and the first coefficient satisfies a pre-defined threshold, further including optimizing a weight scaling factor for the quantized weights and an activation scaling factor for quantized activations, and wherein the quantized weights are quantized using the optimized weight scaling factor.
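The training loop described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: it assumes a toy linear model in place of a multi-layer network, a squared-distance regularizer as the first regularization term, a symmetric uniform quantizer, and a least-squares refit as the way to optimize the weight scaling factor. Activation quantization and the activation scaling factor are left out. All names (`quantize`, `train_quantized`, `alpha`, `delta`, `growth`, `tol`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def quantize(w, delta, bits):
    """Snap w to a symmetric uniform grid whose step is the scaling factor delta."""
    qmax = 2 ** (bits - 1) - 1
    codes = np.clip(np.round(w / delta), -qmax, qmax)
    return delta * codes, codes


def train_quantized(x, y, bits=4, alpha0=1e-2, alpha_max=100.0, growth=2.0,
                    lr=0.1, steps_per_stage=200, tol=1e-3, max_stages=100):
    """Assumed toy cost: MSE(x @ w, y) + alpha * sum((w - Q(w)) ** 2)."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.5, size=x.shape[1])
    delta = np.abs(w).max() / (2 ** (bits - 1) - 1)  # initial weight scaling factor
    alpha = alpha0                                   # pre-defined initial coefficient
    for _ in range(max_stages):
        # Shrink the step as alpha grows so the regularizer step stays stable.
        step = lr / (1.0 + 2.0 * alpha)
        for _ in range(steps_per_stage):
            wq, codes = quantize(w, delta, bits)
            grad_task = 2.0 * x.T @ (x @ w - y) / len(y)
            grad_reg = 2.0 * (w - wq)  # Q(w) is treated as a constant here
            w = w - step * (grad_task + alpha * grad_reg)
            # Refit the scaling factor: least-squares delta for the current codes.
            _, codes = quantize(w, delta, bits)
            if np.any(codes):
                delta = float(w @ codes / (codes @ codes))
        wq, codes = quantize(w, delta, bits)
        # Stop once every weight sits on the grid and alpha has reached its threshold.
        if np.max(np.abs(w - wq)) < tol and alpha >= alpha_max:
            break
        alpha *= growth  # otherwise increase the regularization coefficient
    return wq, codes, delta, alpha


# Usage on synthetic data (assumed for illustration).
data_rng = np.random.default_rng(1)
x = data_rng.normal(size=(256, 8))
y = x @ data_rng.normal(size=8)
wq, codes, delta, alpha = train_quantized(x, y)
print("integer codes:", codes)
print("scaling factor:", delta, "final coefficient:", alpha)
```

In this sketch a small coefficient lets the weights fit the data almost freely at first, and each increase pulls them closer to the quantization grid, so the final rounding step changes them very little. The returned weights are the quantized values computed with the refitted scaling factor.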
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.