Patent · US Active

Adaptive quantization for execution of machine learning models

US11861467B2 · kind B2 · utility

Cited by: 1
References: 1
Claims: 42
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 5, 2020
Grant date: Jan 2, 2024
Priority date
Expiry date: Mar 20, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N5/04
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared, and it is determined that results of the second inferences are within a threshold performance level of results of the first inferences. Based on the determination, one or more subsequent inferences are performed using the machine learning model and the quantized weight information.
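
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the quantization scheme, the model, or how the threshold performance level is measured, so the sketch assumes symmetric uniform quantization, a toy linear model, and a relative maximum-error threshold. The names quantize_weights, linear_model, and choose_weights are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_weights(weights: Sequence[float], bits: int = 8) -> List[float]:
    """Assumed scheme: symmetric uniform quantization to signed `bits`-bit
    levels, returned in dequantized (float) form for easy comparison."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    levels = 2 ** (bits - 1) - 1                   # e.g. 127 for 8 bits
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def linear_model(weights: Sequence[float], x: Sequence[float]) -> float:
    """Toy stand-in for the machine learning model: a dot product."""
    return sum(w * xi for w, xi in zip(weights, x))


def choose_weights(
    weights: Sequence[float],
    inputs: Sequence[Sequence[float]],
    bits: int = 8,
    threshold: float = 0.01,
) -> Tuple[Sequence[float], bool]:
    """Run first inferences (received weights) and second inferences
    (quantized weights), compare them, and pick the weights to use for
    subsequent inferences."""
    quantized = quantize_weights(weights, bits)
    first = [linear_model(weights, x) for x in inputs]
    second = [linear_model(quantized, x) for x in inputs]

    # Assumed performance metric: worst-case output error, relative to the
    # largest magnitude among the first-inference outputs.
    max_err = max(abs(a - b) for a, b in zip(first, second))
    reference = max(abs(a) for a in first) or 1.0
    within_threshold = max_err / reference <= threshold

    return (quantized if within_threshold else weights), within_threshold


if __name__ == "__main__":
    w = [0.5, -1.25, 2.0]
    xs = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
    for b in (8, 2):
        chosen, ok = choose_weights(w, xs, bits=b)
        print(f"{b}-bit quantization accepted: {ok}; weights used: {chosen}")
```

In this sketch, rejecting the quantized weights simply falls back to the received weights. How the claimed method behaves when the threshold is not met is not stated in the abstract.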

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.