Patent · US Active

Method for automatic hybrid quantization of deep artificial neural networks

US12205011B2 · kind B2 · utility

Cited by	0

References	1

Claims	20

Family size	0

Assignee

Deep Vision Inc. · US

Inventors

Wajahat Qadeer · Campbell, US
Rehan Hameed · Palo Alto, US
Satyanarayana Raju Uppalapati · Hyderabad, IN
Abhilash Bharath Ghanore · Hyderabad, IN
Kasanagottu Sai Ram · Hyderabad, IN

Key dates

Filing date	Aug 9, 2023
Grant date	Jan 21, 2025
Priority date	—
Expiry date	Aug 9, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/045
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes, for each floating-point layer in a set of floating-point layers: calculating a set of input activations and a set of output activations of the floating-point layer; converting the floating-point layer to a low-bit-width layer; calculating a set of low-bit-width output activations based on the set of input activations; and calculating a per-layer deviation statistic of the low-bit-width layer. The method also includes ordering the set of low-bit-width layers based on the per-layer deviation statistic of each low-bit-width layer. The method additionally includes, while a loss-of-accuracy threshold exceeds the accuracy of the quantized network: converting a floating-point layer represented by the low-bit-width layer to a high-bit-width layer; replacing the low-bit-width layer with the high-bit-width layer in the quantized network; updating the accuracy of the quantized network; and, in response to the accuracy of the quantized network exceeding the loss-of-accuracy threshold, returning the quantized network.
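The loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: the toy dense ReLU layers, symmetric uniform quantization, the 4-bit and 8-bit widths, mean squared error as the per-layer deviation statistic, label agreement as the accuracy measure, and all function names (`quantize`, `hybrid_quantize`, and so on), which are hypothetical.

```python
import random

def quantize(weights, bits):
    """Simulated symmetric uniform quantization of a weight matrix (assumed scheme)."""
    max_abs = max(abs(w) for row in weights for w in row) or 1.0
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in weights]

def forward_layer(weights, x):
    """Toy dense layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def run(layers, x):
    for weights in layers:
        x = forward_layer(weights, x)
    return x

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def accuracy(layers, inputs, labels):
    """Fraction of inputs whose predicted class matches the label."""
    hits = sum(argmax(run(layers, x)) == y for x, y in zip(inputs, labels))
    return hits / len(inputs)

def hybrid_quantize(float_layers, inputs, labels, threshold, low_bits=4, high_bits=8):
    # Pass 1: quantize every layer to the low bit width and record how far
    # its outputs deviate from the floating-point outputs on the same inputs.
    deviations, network = [], []
    acts = list(inputs)
    for index, weights in enumerate(float_layers):
        low = quantize(weights, low_bits)
        ref_out = [forward_layer(weights, a) for a in acts]
        low_out = [forward_layer(low, a) for a in acts]
        squared = sum((r - q) ** 2
                      for ro, lo in zip(ref_out, low_out)
                      for r, q in zip(ro, lo))
        deviations.append((squared / (len(ref_out) * len(ref_out[0])), index))
        network.append(low)
        acts = ref_out  # the next layer sees floating-point activations

    # Pass 2: order layers by deviation, largest first.
    order = [index for _, index in sorted(deviations, reverse=True)]

    # Pass 3: while accuracy is below the threshold, promote the next
    # most-deviating layer to the high bit width and re-measure.
    promoted = []
    acc = accuracy(network, inputs, labels)
    for index in order:
        if acc >= threshold:
            break
        network[index] = quantize(float_layers[index], high_bits)
        promoted.append(index)
        acc = accuracy(network, inputs, labels)
    return network, promoted, acc

# Usage on a random 3-layer toy network; labels come from the float model.
random.seed(0)
float_layers = [[[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
                for _ in range(3)]
inputs = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(32)]
labels = [argmax(run(float_layers, x)) for x in inputs]
network, promoted, acc = hybrid_quantize(float_layers, inputs, labels, threshold=0.95)
print("promoted layers:", promoted, "accuracy:", acc)
```

Promoting layers in order of deviation means the layers most sensitive to quantization are widened first, so the loop tends to stop with as many layers as possible left at the low bit width. If every layer has been promoted and the threshold is still not met, this sketch simply returns the all-high-bit-width network.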

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.