Patent · US Active

Time-based memory allocation for neural network inference

US11610102B1 · kind B1 · utility

Cited by: 3
References: 1
Claims: 16
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 27, 2019
Grant date: Mar 21, 2023
Priority date
Expiry date: Sep 25, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/04
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques for time-based memory allocation for a neural network inference are disclosed. A description of a neural network comprising a plurality of operations to be executed across a set of accelerators is received. A plurality of interconnect times at a plurality of partition points within the neural network are calculated. Each of the plurality of interconnect times corresponds to a duration of time for transferring an output feature map from one of the set of accelerators to another of the set of accelerators to be used as an input feature map. A partitioning scheme that divides the plurality of operations into a set of subgraphs is determined based on the plurality of interconnect times. Each of the set of subgraphs is assigned to a different accelerator of the set of accelerators in accordance with the partitioning scheme.
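
The partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the patented method: it assumes a purely linear chain of operations, a single uniform interconnect bandwidth, and a simple greedy rule that cuts the chain at the partition points with the lowest transfer time. The names `Op`, `interconnect_times`, `partition`, and the parameter `bandwidth_bytes_per_s` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    name: str
    output_bytes: int  # size of the output feature map this op produces


def interconnect_times(ops: List[Op], bandwidth_bytes_per_s: float) -> List[float]:
    """Transfer time in seconds at each partition point.

    Partition point i sits between ops[i] and ops[i + 1]; cutting there
    means ops[i]'s output feature map must cross the interconnect.
    Assumption: one uniform bandwidth for every accelerator pair.
    """
    return [op.output_bytes / bandwidth_bytes_per_s for op in ops[:-1]]


def partition(ops: List[Op], num_accelerators: int,
              bandwidth_bytes_per_s: float) -> List[List[Op]]:
    """Split a linear chain of ops into contiguous subgraphs.

    Assumed greedy rule: cut at the partition points with the smallest
    interconnect time, one subgraph per accelerator.
    """
    times = interconnect_times(ops, bandwidth_bytes_per_s)
    num_cuts = min(num_accelerators - 1, len(times))
    # Cheapest partition points first; ties go to the earlier position.
    cheapest = sorted(range(len(times)), key=lambda i: (times[i], i))[:num_cuts]
    subgraphs, start = [], 0
    for cut in sorted(cheapest):
        subgraphs.append(ops[start:cut + 1])
        start = cut + 1
    subgraphs.append(ops[start:])
    return subgraphs


if __name__ == "__main__":
    chain = [Op("conv1", 800), Op("conv2", 200), Op("pool", 50),
             Op("fc1", 400), Op("fc2", 10)]
    for index, subgraph in enumerate(partition(chain, 3, 100.0)):
        print(f"accelerator {index}:", [op.name for op in subgraph])
```

With the example chain, the two cheapest partition points follow `conv2` and `pool`, so the three subgraphs are `conv1`+`conv2`, `pool`, and `fc1`+`fc2`. A realistic scheme would likely also weigh per-accelerator compute time and memory capacity, which this sketch ignores.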

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.