Time-based memory allocation for neural network inference
US11610102B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Nov 27, 2019 |
| Grant date | Mar 21, 2023 |
| Priority date | — |
| Expiry date | Sep 25, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/04
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for time-based memory allocation for a neural network inference are disclosed. A description of a neural network comprising a plurality of operations to be executed across a set of accelerators is received. A plurality of interconnect times at a plurality of partition points within the neural network are calculated. Each of the plurality of interconnect times corresponds to a duration of time for transferring an output feature map from one of the set of accelerators to another of the set of accelerators to be used as an input feature map. A partitioning scheme that divides the plurality of operations into a set of subgraphs is determined based on the plurality of interconnect times. Each of the set of subgraphs is assigned to a different accelerator of the set of accelerators in accordance with the partitioning scheme.
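The flow described in the abstract (calculate an interconnect time at each partition point, pick a partitioning scheme from those times, assign each subgraph to its own accelerator) can be illustrated with a minimal sketch. It is not the patented method. The abstract does not say how interconnect times are calculated or which selection criterion is used, so the sketch assumes a linear chain of operations, models each interconnect time as output feature map size divided by a fixed link bandwidth, and picks the cut points whose transfer times sum to the smallest total. The names `interconnect_times`, `choose_partition`, `ofm_bytes`, and `bandwidth_bytes_per_s` are hypothetical.

```python
from itertools import combinations


def interconnect_times(ofm_bytes, bandwidth_bytes_per_s):
    """Transfer time at each candidate partition point.

    Partition point i sits between operation i and operation i + 1, so the
    last operation has no partition point after it.
    Assumption: time = output feature map size / link bandwidth.
    """
    return [size / bandwidth_bytes_per_s for size in ofm_bytes[:-1]]


def choose_partition(ofm_bytes, num_accelerators, bandwidth_bytes_per_s):
    """Split a linear chain of operations into one subgraph per accelerator.

    Assumption: the scheme that minimizes the summed interconnect time wins.
    Brute force over all cut combinations, which is fine for a short chain.
    """
    times = interconnect_times(ofm_bytes, bandwidth_bytes_per_s)
    num_cuts = num_accelerators - 1
    if num_cuts > len(times):
        raise ValueError("more accelerators than operations")

    best_cuts = min(
        combinations(range(len(times)), num_cuts),
        key=lambda cuts: sum(times[c] for c in cuts),
    )

    # Turn the chosen cut points into contiguous groups of operation indices.
    subgraphs, start = [], 0
    for cut in best_cuts:
        subgraphs.append(list(range(start, cut + 1)))
        start = cut + 1
    subgraphs.append(list(range(start, len(ofm_bytes))))

    # Each subgraph goes to a different accelerator (accelerator id -> ops).
    return dict(enumerate(subgraphs))


# Five operations with made-up output feature map sizes, three accelerators.
sizes = [4_000_000, 1_000_000, 8_000_000, 500_000, 2_000_000]
scheme = choose_partition(sizes, num_accelerators=3, bandwidth_bytes_per_s=1e9)
print(scheme)  # {0: [0, 1], 1: [2, 3], 2: [4]}
```

In this example the cuts land after operations 1 and 3, where the feature maps crossing the interconnect are smallest. A more realistic criterion would likely also weigh per-accelerator compute time and memory capacity, neither of which is modeled here.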
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.