Low latency neural network model loading
US11182314B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 27, 2019 |
| Grant date | Nov 23, 2021 |
| Priority date | — |
| Expiry date | Jan 29, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2213/40
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An integrated circuit device implementing a neural network accelerator may have a peripheral bus interface to interface with a host memory, and neural network models can be loaded from the host memory onto the state buffer of the neural network accelerator for execution by the array of processing elements. The neural network accelerator may also have a memory interface to interface with a local memory. The local memory may store neural network models from the host memory, and the models can be loaded from the local memory into the state buffer with reduced latency as compared to loading from the host memory. In systems with multiple accelerators, the models in the local memory can also be shared amongst different accelerators.
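The two loading paths described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the class names, the latency figures, the model names, and the policy of copying a model into local memory on first use are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass, field

# Illustrative costs in arbitrary units (assumed values, not from the patent).
HOST_LOAD_COST = 100   # load over the peripheral bus from host memory
LOCAL_LOAD_COST = 10   # load over the memory interface from local memory


@dataclass
class LocalMemory:
    """Local memory that may be shared by several accelerators."""
    models: dict = field(default_factory=dict)


@dataclass
class Accelerator:
    """Hypothetical accelerator with a state buffer holding one model."""
    host_memory: dict
    local_memory: LocalMemory
    state_buffer: dict = field(default_factory=dict)

    def load_model(self, name: str) -> int:
        """Load a model into the state buffer and return the modeled cost."""
        if name in self.local_memory.models:
            # Fast path: the model is already staged in local memory.
            weights = self.local_memory.models[name]
            cost = LOCAL_LOAD_COST
        else:
            # Slow path: fetch from host memory, then keep a local copy
            # (this staging policy is an assumption of the sketch).
            weights = self.host_memory[name]
            self.local_memory.models[name] = weights
            cost = HOST_LOAD_COST
        self.state_buffer = {name: weights}
        return cost


# Two accelerators share one local memory; model names are placeholders.
host = {"resnet": b"\x01\x02", "bert": b"\x03\x04"}
shared = LocalMemory()
acc0 = Accelerator(host, shared)
acc1 = Accelerator(host, shared)

first = acc0.load_model("resnet")   # comes from host memory
second = acc1.load_model("resnet")  # comes from shared local memory
```

In this sketch the second accelerator pays the lower cost because the first one already staged the model in the shared local memory, which mirrors the sharing among accelerators that the abstract mentions.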
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.