Patent · US Active

Method for automatic hybrid quantization of deep artificial neural networks

US11763158B2 · kind B2 · utility

1Cited by
0References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateDec 4, 2020
Grant dateSep 19, 2023
Priority date
Expiry dateJun 1, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/045
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method includes, for each floating-point layer in a set of floating-point layers: calculating a set of input activations and a set of output activations of the floating-point layer; converting the floating-point layer to a low-bit-width layer; calculating a set of low-bit-width output activations based on the set of input activations; and calculating a per-layer deviation statistic of the low-bit-width layer. The method also includes ordering the set of low-bit-width layers based on the per-layer deviation statistic of each low-bit-width layer. The method additionally includes, while a loss-of-accuracy threshold exceeds the accuracy of the quantized network: converting a floating-point layer represented by the low-bit-width layer to a high-bit-width layer; replacing the low-bit-width layer with the high-bit-width layer in the quantized network; updating the accuracy of the quantized network; and, in response to the accuracy of the quantized network exceeding the loss-of-accuracy threshold, returning the quantized network.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.