Neural network layer processing with scaled quantization
US11544521B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Feb 25, 2019 |
| Grant date | Jan 3, 2023 |
| Priority date | — |
| Expiry date | Nov 5, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/0495
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Processors and methods for neural network processing are provided. A method includes receiving a subset of data corresponding to a layer of a neural network. The method further includes, prior to performing any matrix operations using the subset of the data, scaling the subset of the data by a scaling factor to generate a scaled subset of data. The method further includes quantizing the scaled subset of the data to generate a scaled and quantized subset of data. The method further includes performing the matrix operations using the scaled and quantized subset of the data to generate a subset of results of the matrix operations. The method further includes descaling the subset of the results of the matrix operations, by multiplying the subset of the results of the matrix operations with an inverse of the scaling factor, to generate a descaled subset of results of the matrix operations.
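The sequence described in the abstract (scale, quantize, matrix operation, descale) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names, the round-and-clamp quantizer to a signed integer grid, the 8-bit default, and the choice to scale only the input activations are all assumptions made for illustration.

```python
def scale(values, factor):
    """Multiply every element by the scaling factor (before any matrix op)."""
    return [[v * factor for v in row] for row in values]


def quantize(values, num_bits=8):
    """Assumed quantizer: round to nearest integer, clamp to a signed range."""
    lo, hi = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return [[max(lo, min(hi, round(v))) for v in row] for row in values]


def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def descale(results, factor):
    """Multiply the results by the inverse of the scaling factor."""
    inverse = 1.0 / factor
    return [[r * inverse for r in row] for row in results]


def layer_step(activations, weights, factor, num_bits=8):
    """Scale, quantize, multiply, then descale one subset of layer data."""
    scaled = scale(activations, factor)
    scaled_quantized = quantize(scaled, num_bits)
    results = matmul(scaled_quantized, weights)
    return descale(results, factor)


# Hypothetical example values, not taken from the patent.
activations = [[0.11, -0.52]]
weights = [[1.0, 2.0], [3.0, 4.0]]

with_scaling = layer_step(activations, weights, factor=100.0)
without_scaling = layer_step(activations, weights, factor=1.0)
```

With these example values, scaling by 100 before quantizing preserves the fractional part of the inputs, so the descaled result stays close to the unquantized product. With a factor of 1, the same inputs round to 0 and -1 before the multiply and the result is much coarser.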
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.