Patent · US Active

Separate quantization method of forming combination of 4-bit and 8-bit data of neural network

US11531884B2 · kind B2 · utility

Cited by	0

References	2

Claims	11

Family size	0

Assignee

National Chiao Tung University · TW

Inventors

Tien-Fu Chen · Hsinchu, TW
Chien-Chih Chen · Hsinchu, TW
Jing CHEN · Tianjin, CN

Key dates

Filing date	Sep 27, 2019
Grant date	Dec 20, 2022
Priority date	—
Expiry date	Oct 20, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/063
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A separate quantization method of forming a combination of 4-bit and 8-bit data of a neural network is disclosed. When a training data set and a validation data set exist, a calibration procedure is used to determine a threshold for the activations of each of a plurality of layers of a neural network model, which in turn determines how many of the activations are quantized to 8 bits. In the weight quantization process, the weights of each layer are allocated to 4-bit weights and 8-bit weights according to a predetermined ratio, so that the neural network model has a reduced size and a combination of 4-bit and 8-bit weights.
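
The weight allocation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract does not say which weights receive which bit width, so the sketch assumes that the smallest-magnitude fraction of a layer's weights goes to 4 bits and the remainder to 8 bits, and it assumes uniform symmetric quantization with one scale per group. The names `quantize_symmetric`, `split_quantize`, and `ratio_4bit` are hypothetical and chosen for illustration only.

```python
def quantize_symmetric(values, bits):
    """Uniform symmetric quantization to signed integer codes (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / qmax if peak > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def split_quantize(weights, ratio_4bit):
    """Quantize one layer's weights into a 4-bit group and an 8-bit group.

    Assumption: the smallest-magnitude `ratio_4bit` fraction of the weights
    is stored in 4 bits; the abstract only states that a predetermined
    ratio governs the allocation.
    """
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    n4 = int(len(weights) * ratio_4bit)
    idx4, idx8 = order[:n4], order[n4:]

    codes4, scale4 = quantize_symmetric([weights[i] for i in idx4], 4)
    codes8, scale8 = quantize_symmetric([weights[i] for i in idx8], 8)

    bits = [8] * len(weights)   # bit width chosen for each weight
    codes = [0] * len(weights)  # integer code for each weight
    for i, c in zip(idx4, codes4):
        bits[i], codes[i] = 4, c
    for i, c in zip(idx8, codes8):
        codes[i] = c
    return bits, codes, scale4, scale8


# Toy layer with eight weights; 75% of them are assigned to 4 bits.
layer = [0.02, -0.5, 0.1, 0.9, -0.05, 0.3, -0.01, 0.07]
bits, codes, scale4, scale8 = split_quantize(layer, ratio_4bit=0.75)
```

A weight is recovered approximately as `codes[i] * scale4` when `bits[i] == 4` and as `codes[i] * scale8` otherwise. Under the same assumptions, raising `ratio_4bit` shrinks the stored model at the cost of coarser weights.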

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.