Patent · US Active

Sparse machine learning acceleration

US12254398B2 · kind B2 · utility

Cited by	0

References	8

Claims	19

Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Kun Xu · Sunol, US
Ron Diamant · Santa Clara, US
Patricio Kaplan · Palo Alto, US

Key dates

Filing date	Mar 30, 2021
Grant date	Mar 18, 2025
Priority date	—
Expiry date	Jan 18, 2044

Classification

Technology area (CPC H)	Electricity
CPC primary	H03M7/6005
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

To reduce the storage size of weight tensors and speed up loading of weight tensors from system memory, a compression technique can be employed to remove zero values from a weight tensor before storing the weight tensor in system memory. A sparsity threshold can be enforced to achieve a compression ratio target by forcing small weight values to zero during training. When the weight tensor is loaded from system memory, a direct memory access (DMA) engine with an in-line decompression unit can decompress the weight tensor on-the-fly. By performing the decompression in the DMA engine, expansion of the weight values back to the original weight tensor size can be carried out in parallel while other neural network computations are being performed by the processing unit.
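
The scheme described in the abstract can be illustrated with a minimal software sketch. The abstract does not specify the compressed format or the thresholding rule, so everything here is an assumption made for illustration: a boolean mask marking non-zero positions plus a packed list of non-zero values, and magnitude-based pruning to reach a target sparsity. The function names (enforce_sparsity, compress, decompress) are hypothetical, and the claimed decompression runs in DMA hardware, which this sketch only mimics in software.

```python
import math
from typing import List, Tuple


def enforce_sparsity(weights: List[float], target_sparsity: float) -> List[float]:
    """Force the smallest-magnitude weights to zero (assumed pruning rule).

    After this call, at least ceil(target_sparsity * len(weights)) entries are zero.
    """
    needed = math.ceil(target_sparsity * len(weights))
    # Indices ordered by magnitude, smallest first; existing zeros come first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:needed]:
        pruned[i] = 0.0
    return pruned


def compress(weights: List[float]) -> Tuple[List[bool], List[float]]:
    """Remove zero values, keeping a mask of where the non-zeros were.

    The mask-plus-values layout is an assumed format, not the patent's.
    """
    mask = [w != 0.0 for w in weights]
    values = [w for w in weights if w != 0.0]
    return mask, values


def decompress(mask: List[bool], values: List[float]) -> List[float]:
    """Expand back to the original tensor size, reinserting zeros."""
    it = iter(values)
    return [next(it) if keep else 0.0 for keep in mask]


# Flattened example weight tensor with two zeros already present.
weights = [0.9, -0.05, 0.0, 0.4, 0.01, -0.7, 0.02, 0.0]

# Require at least 50% zeros so the packed values shrink to half the entries.
pruned = enforce_sparsity(weights, target_sparsity=0.5)
mask, values = compress(pruned)
restored = decompress(mask, values)
```

In this sketch only four of the eight values need to be stored, along with a one-bit-per-entry mask, and decompress reproduces the pruned tensor exactly. The compression is lossless with respect to the pruned weights; the only information lost is the small values zeroed by the sparsity threshold.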

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.