Patent · US Active

Neural network layer processing with scaled quantization

US11544521B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Feb 25, 2019
Grant date: Jan 3, 2023
Priority date
Expiry date: Nov 5, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/0495
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Processors and methods for neural network processing are provided. A method includes receiving a subset of data corresponding to a layer of a neural network. The method further includes prior to performing any matrix operations using the subset of the data, scaling the subset of the data by a scaling factor to generate a scaled subset of data. The method further includes quantizing the scaled subset of the data to generate a scaled and quantized subset of data. The method further includes performing the matrix operations using the scaled and quantized subset of the data to generate a subset of results of the matrix operations. The method further includes descaling the subset of the results of the matrix operations, by multiplying the subset of the results of the matrix operations with an inverse of the scaling factor, to generate a descaled subset of results of the matrix operations.
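
The four steps named in the abstract (scale, quantize, perform the matrix operation, descale by the inverse of the scaling factor) can be illustrated with a minimal Python sketch. This is not the claimed implementation: the abstract does not specify a number format, a scaling factor, or an API, so the signed 8-bit integer range, the factor of 64, the matrix-vector product, and every function name below are assumptions chosen only for illustration.

```python
def scale(values, factor):
    """Multiply each input value by the scaling factor."""
    return [v * factor for v in values]


def quantize(values, lo=-128, hi=127):
    """Round to the nearest integer and clamp to [lo, hi].

    The signed 8-bit range is an assumption; the abstract names no format.
    Python's round() resolves exact ties to the nearest even integer.
    """
    return [max(lo, min(hi, round(v))) for v in values]


def matvec(matrix, vector):
    """Matrix-vector product, standing in for the 'matrix operations'."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def descale(values, factor):
    """Multiply each result by the inverse of the scaling factor."""
    inverse = 1.0 / factor
    return [v * inverse for v in values]


def layer_forward(weights, activations, factor):
    """Scale and quantize before the matrix operation, descale after it."""
    scaled = scale(activations, factor)
    quantized = quantize(scaled)
    results = matvec(weights, quantized)
    return descale(results, factor)


# Hypothetical layer data: a 2x2 weight matrix and two activations.
weights = [[1, 2], [3, 4]]
activations = [0.5, -0.25]
out = layer_forward(weights, activations, factor=64.0)
print(out)
```

In this sketch the scaling factor spreads small fractional inputs across the integer range before rounding, so less information is lost in quantization, and the final multiplication by the inverse factor returns the results to the original magnitude. With the values above, the scaled inputs happen to be whole numbers, so the descaled output matches the unquantized product of the weights and activations.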

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.