Sparse machine learning acceleration
US12254398B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 30, 2021 |
| Grant date | Mar 18, 2025 |
| Priority date | — |
| Expiry date | Jan 18, 2044 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/6005
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
To reduce the storage size of weight tensors and speed up loading of weight tensors from system memory, a compression technique can be employed to remove zero values from a weight tensor before storing the weight tensor in system memory. A sparsity threshold can be enforced to achieve a compression ratio target by forcing small weight values to zero during training. When the weight tensor is loaded from system memory, a direct memory access (DMA) engine with an in-line decompression unit can decompress the weight tensor on-the-fly. By performing the decompression in the DMA engine, expansion of the weight values back to the original weight tensor size can be carried out in parallel while other neural network computations are being performed by the processing unit.
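The data flow described in the abstract can be illustrated with a minimal software sketch. The abstract does not specify the compressed format, so the layout below (a one-bit-per-element occupancy mask plus a packed list of nonzero values) is an assumption chosen for illustration, and the function names `prune_to_sparsity`, `compress`, and `decompress` are hypothetical. The sketch covers only the pruning and the zero-removal round trip; it does not model the DMA engine or the overlap of decompression with other computation.

```python
import math
from typing import List, Tuple


def prune_to_sparsity(weights: List[float], target_sparsity: float) -> List[float]:
    """Force the smallest-magnitude weights to zero.

    At least `target_sparsity` (a fraction in [0, 1]) of the returned
    values are zero; ties at the threshold are zeroed as well.
    """
    k = math.ceil(target_sparsity * len(weights))
    if k == 0:
        return list(weights)
    # Magnitude of the k-th smallest weight becomes the cut-off.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]


def compress(weights: List[float]) -> Tuple[bytes, List[float], int]:
    """Remove zeros: keep a bitmask of nonzero positions and the nonzero values.

    Assumed format (not taken from the patent): bit i of the mask,
    most significant bit first within each byte, is 1 when element i is nonzero.
    """
    mask = bytearray((len(weights) + 7) // 8)
    values = []
    for i, w in enumerate(weights):
        if w != 0:
            mask[i // 8] |= 0x80 >> (i % 8)
            values.append(w)
    return bytes(mask), values, len(weights)


def decompress(mask: bytes, values: List[float], length: int) -> List[float]:
    """Expand back to the original size by re-inserting zeros."""
    nonzero = iter(values)
    return [
        next(nonzero) if mask[i // 8] & (0x80 >> (i % 8)) else 0.0
        for i in range(length)
    ]


if __name__ == "__main__":
    flat_weights = [0.5, -0.01, 0.0, 2.0, 0.02, -1.5, 0.03, 0.0]
    pruned = prune_to_sparsity(flat_weights, target_sparsity=0.5)
    mask, values, length = compress(pruned)
    restored = decompress(mask, values, length)
    print(f"stored {len(values)} of {length} values; round trip ok: {restored == pruned}")
```

In this assumed format the stored size is one bit per element plus the nonzero values, so the achievable compression ratio depends directly on the sparsity enforced during training. In the arrangement the abstract describes, the expansion step corresponding to `decompress` is carried out by the in-line decompression unit of the DMA engine rather than in software.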
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.