Patent · US Active

Adaptive quantization and mixed precision in a network

US11507823B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 13
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 10, 2019
Grant date: Nov 22, 2022
Priority date
Expiry date: Sep 22, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2207/4824
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method of adaptive quantization for a convolutional neural network includes at least one of receiving an acceptable model accuracy, determining a float value multiply accumulate for the layer based on a float value weight and a float value input, quantizing the float value weight at multiple weight quantization precisions, quantizing the float value input at multiple input quantization precisions, determining a multiply accumulate at multiple multiply accumulate quantization precisions based on the weight quantization precisions and the input quantization precisions, determining multiple quantization errors based on differences between the float value multiply accumulate and the multiply accumulates at the multiple multiply accumulate quantization precisions, and selecting one of the multiple weight quantization precisions, one of the multiple input quantization precisions, and one of the multiple multiply accumulate quantization precisions based on the predetermined acceptable model accuracy and the multiple quantization errors.
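
The steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. It makes several assumptions that do not come from the record: symmetric per-tensor uniform quantization, a single fully connected layer standing in for a convolutional layer, a mean absolute error tolerance (`max_error`) standing in for the acceptable model accuracy, and a "fewest total bits" rule for the final selection. The names `quantize`, `mac`, `select_precisions`, and `candidate_bits` are hypothetical.

```python
from itertools import product


def quantize(values, bits):
    """Round values to a signed `bits`-bit uniform grid, then map back to floats.

    Assumption: symmetric per-tensor scaling based on the largest magnitude.
    """
    levels = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return list(values)
    scale = peak / levels
    return [max(-levels, min(levels, round(v / scale))) * scale for v in values]


def mac(weight_rows, inputs):
    """Multiply accumulate: one dot product per row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weight_rows]


def select_precisions(weight_rows, inputs, max_error, candidate_bits=(4, 8, 16)):
    """Return ((weight_bits, input_bits, mac_bits) or None, errors per combination)."""
    reference = mac(weight_rows, inputs)  # float value multiply accumulate
    width = len(weight_rows[0])
    flat_weights = [w for row in weight_rows for w in row]

    errors = {}
    for w_bits, x_bits, m_bits in product(candidate_bits, repeat=3):
        q_flat = quantize(flat_weights, w_bits)
        q_rows = [q_flat[i:i + width] for i in range(0, len(q_flat), width)]
        q_inputs = quantize(inputs, x_bits)
        q_mac = quantize(mac(q_rows, q_inputs), m_bits)
        # Quantization error: mean absolute difference from the float reference.
        diffs = [abs(r - q) for r, q in zip(reference, q_mac)]
        errors[(w_bits, x_bits, m_bits)] = sum(diffs) / len(diffs)

    # Assumed selection rule: the cheapest combination within the tolerance.
    feasible = [combo for combo, err in errors.items() if err <= max_error]
    if not feasible:
        return None, errors
    best = min(feasible, key=lambda combo: (sum(combo), errors[combo]))
    return best, errors


if __name__ == "__main__":
    weights = [[0.5, -0.25], [0.125, 1.0]]
    activations = [1.0, 2.0]
    choice, table = select_precisions(weights, activations, max_error=0.05)
    print("selected (weight, input, MAC) bits:", choice)
```

Running the sketch layer by layer with different tolerances would yield a mixed-precision assignment, which is one plausible reading of the title. The actual selection criteria are defined by the patent's claims, not by this example.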

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.