Patent · US Active

Cursor-based adaptive quantization for deep neural networks

US12039427B2 · kind B2 · utility

Cited by	0

References	0

Claims	20

Family size	0

Assignees

BAIDU USA LLC · US
BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD. · CN

Inventors

Baopu Li · Santa Clara, US
Yanwen Fan · Beijing, CN
Zhiyu Cheng · Sunnyvale, US
Yingze Bao · Beijing, CN

Key dates

Filing date	Sep 24, 2019
Grant date	Jul 16, 2024
Priority date	—
Expiry date	Jun 1, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06F16/28
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. Presented herein are novel cursor-based adaptive quantization embodiments. In embodiments, a multi-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit width. In embodiments, the cursor-based DAS adaptively searches for a quantization bit width for each layer. The DAS process may be accelerated via an alternative approximate optimization process, which is designed for a mixed quantization scheme of a DNN model. In embodiments, a new loss function is used in the search process to simultaneously optimize the accuracy and the parameter size of the model. In a quantization step, the two integers closest to the cursor may be adopted as the bit widths used together to quantize the DNN, which reduces quantization noise and avoids the local convergence problem.
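
The quantization step and the combined search objective described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent claims. The symmetric uniform quantizer, the linear blending of the two quantized copies, the 32-bit size baseline, and the names uniform_quantize, cursor_quantize, search_loss, and lam are all assumptions made for illustration.

```python
import math


def uniform_quantize(weights, bits):
    """Symmetric uniform quantization to `bits` bits (an assumed scheme)."""
    if bits < 2:
        raise ValueError("this sketch assumes at least 2 bits")
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    scale = max_abs / levels              # step size between quantized values
    out = []
    for w in weights:
        q = max(-levels, min(levels, round(w / scale)))  # clamp to valid range
        out.append(q * scale)
    return out


def cursor_quantize(weights, cursor):
    """Quantize with the two integer bit widths nearest a continuous cursor.

    The two quantized copies are blended linearly by the cursor's distance
    to each integer (an assumed combination rule), so the result changes
    smoothly as the cursor moves between integers.
    """
    lower = math.floor(cursor)
    upper = lower + 1
    frac = cursor - lower                 # 0.0 at `lower`, approaching 1.0 at `upper`
    q_lower = uniform_quantize(weights, lower)
    q_upper = uniform_quantize(weights, upper)
    return [(1.0 - frac) * a + frac * b for a, b in zip(q_lower, q_upper)]


def search_loss(task_loss, cursors, layer_sizes, lam=0.1, full_bits=32):
    """Task loss plus a penalty on relative model size (a hypothetical form).

    `cursors[i]` is the bit-width cursor of layer i and `layer_sizes[i]` is
    its parameter count; the penalty is the size ratio versus `full_bits`.
    """
    quantized_bits = sum(c * n for c, n in zip(cursors, layer_sizes))
    baseline_bits = full_bits * sum(layer_sizes)
    return task_loss + lam * quantized_bits / baseline_bits


if __name__ == "__main__":
    layer = [0.5, -1.0, 0.25, 0.8]
    print(cursor_quantize(layer, 3.5))
    print(search_loss(1.0, [4.0, 8.0], [100, 300], lam=0.5))
```

Because the blended weights vary continuously with the cursor, a gradient-based search could adjust each layer's cursor together with the network weights. After the search, each cursor would be rounded to an integer bit width for deployment.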

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.