Patent · US Active

Separate quantization method of forming combination of 4-bit and 8-bit data of neural network

US11531884B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 11
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 27, 2019
Grant date: Dec 20, 2022
Priority date
Expiry date: Oct 20, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/063
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A separate quantization method that forms a combination of 4-bit and 8-bit data in a neural network is disclosed. When a training data set and a validation data set are available, a calibration procedure determines a threshold for the activations of each of a plurality of layers of a neural network model, and the threshold in turn determines how many of the activations are quantized to 8 bits. During weight quantization, the weights of each layer are allocated to 4-bit weights and 8-bit weights according to a predetermined ratio, so that the neural network model has a reduced size and a combination of 4-bit and 8-bit weights.
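
The weight-allocation step described in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the abstract does not say which weights receive which bit width, so the sketch assumes that the largest-magnitude weights of a layer go to 8 bits and the rest to 4 bits, and it assumes symmetric uniform quantization with one scale per group. The names `quantize_symmetric`, `split_weights`, `quantize_layer`, and `ratio_8bit` are hypothetical.

```python
from typing import Dict, List, Tuple


def quantize_symmetric(values: List[float], bits: int) -> Tuple[List[int], float]:
    """Symmetric uniform quantization to signed integer codes (assumed scheme)."""
    if not values:
        return [], 1.0
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def split_weights(weights: List[float], ratio_8bit: float) -> Tuple[List[int], List[int]]:
    """Split weight indices into a 4-bit group and an 8-bit group.

    Assumption (not stated in the abstract): the largest-magnitude
    weights are the ones kept at 8 bits.
    """
    n8 = round(len(weights) * ratio_8bit)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return sorted(order[n8:]), sorted(order[:n8])


def quantize_layer(weights: List[float], ratio_8bit: float) -> Dict[str, object]:
    """Quantize one layer's weights into separate 4-bit and 8-bit groups."""
    idx4, idx8 = split_weights(weights, ratio_8bit)
    q4, scale4 = quantize_symmetric([weights[i] for i in idx4], bits=4)
    q8, scale8 = quantize_symmetric([weights[i] for i in idx8], bits=8)
    return {"idx4": idx4, "q4": q4, "scale4": scale4,
            "idx8": idx8, "q8": q8, "scale8": scale8}


def average_bits(ratio_8bit: float) -> float:
    """Average storage per weight, ignoring index and scale overhead."""
    return 8 * ratio_8bit + 4 * (1 - ratio_8bit)


# Illustrative layer: 25% of the weights are kept at 8 bits.
layer = [0.05, -0.9, 0.1, 0.7, -0.02, 0.3, -0.15, 0.01]
result = quantize_layer(layer, ratio_8bit=0.25)
```

Under these assumptions, a ratio of 0.25 gives an average of 5 bits per weight before any bookkeeping overhead, which is where the size reduction relative to an all-8-bit model would come from.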

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.