Architecture for dense operations in machine learning inference engine
US10896045B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 19, 2018 |
| Grant date | Jan 19, 2021 |
| Priority date | — |
| Expiry date | Jan 5, 2039 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N20/20
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A processing unit of an inference engine for machine learning (ML) includes a first, a second, and a third register, and a matrix multiplication block. The first register receives a first stream of data associated with a first matrix data that is read only once. The second register receives a second stream of data associated with a second matrix data that is read only once. The matrix multiplication block performs a multiplication operation based on data from the first register and the second register resulting in an output matrix. A row associated with the first matrix is maintained while rows associated with the second matrix is fed to the matrix multiplication block to perform a multiplication operation. The process is repeated for each row of the first matrix. The third register receives the output matrix from the matrix multiplication block and stores the output matrix.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.