Patent · US Active

Neural network layer processing with scaled quantization

US11544521B2 · kind B2 · utility

Cited by	0

References	0

Claims	20

Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventor

Daniel Lo · Bothell, US

Key dates

Filing date	Feb 25, 2019
Grant date	Jan 3, 2023
Priority date	—
Expiry date	Nov 5, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/0495
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Processors and methods for neural network processing are provided. A method includes receiving a subset of data corresponding to a layer of a neural network. The method further includes prior to performing any matrix operations using the subset of the data, scaling the subset of the data by a scaling factor to generate a scaled subset of data. The method further includes quantizing the scaled subset of the data to generate a scaled and quantized subset of data. The method further includes performing the matrix operations using the scaled and quantized subset of the data to generate a subset of results of the matrix operations. The method further includes descaling the subset of the results of the matrix operations, by multiplying the subset of the results of the matrix operations with an inverse of the scaling factor, to generate a descaled subset of results of the matrix operations.
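The sequence of steps in the abstract (scale, quantize, perform the matrix operation, descale by the inverse of the scaling factor) can be illustrated with a minimal pure-Python sketch. This is not the patented implementation: the function names, the round-to-nearest-integer quantizer, and the choice of a matrix-vector product as the "matrix operation" are all assumptions made for illustration only.

```python
def quantize(values):
    """Hypothetical quantizer: round each value to the nearest integer.

    The abstract does not specify a quantization scheme; integer rounding
    is an assumption used here only to make the effect visible.
    """
    return [float(round(v)) for v in values]


def matvec(matrix, vector):
    """Plain matrix-vector product, standing in for 'the matrix operations'."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def scaled_quantized_matvec(matrix, vector, scale):
    """Scale -> quantize -> matrix operation -> descale, in that order."""
    # 1. Scale the data before any matrix operation is performed.
    scaled = [x * scale for x in vector]
    # 2. Quantize the scaled data.
    scaled_q = quantize(scaled)
    # 3. Perform the matrix operation on the scaled and quantized data.
    results = matvec(matrix, scaled_q)
    # 4. Descale by multiplying the results with the inverse of the scale.
    inverse = 1.0 / scale
    return [r * inverse for r in results]


if __name__ == "__main__":
    weights = [[1.0, 2.0], [3.0, 4.0]]   # illustrative values only
    activations = [0.25, 0.5]

    exact = matvec(weights, activations)
    unscaled = matvec(weights, quantize(activations))
    scaled = scaled_quantized_matvec(weights, activations, scale=4.0)

    print("exact:              ", exact)
    print("quantized, no scale:", unscaled)
    print("scaled + quantized: ", scaled)
```

In this toy setting, quantizing the small input values directly rounds them to zero, whereas scaling them first lets them survive quantization; because the matrix-vector product is linear, multiplying the results by the inverse of the scaling factor restores the original magnitude. How a scaling factor is actually chosen is not described in the abstract, so the value 4.0 here is arbitrary.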

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.