Placement of compute and memory for accelerated deep learning
US12204954B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Oct 29, 2020 |
| Grant date | Jan 21, 2025 |
| Priority date | — |
| Expiry date | Sep 28, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/01
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques in placement of compute and memory for accelerated deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element comprises a compute element to execute programmed instructions using the data and a router to route the wavelets. The routing is in accordance with virtual channel specifiers of the wavelets and controlled by routing configuration information of the router. A software stack determines placement of compute resources and memory resources based on a description of a neural network. The determined placement is used to configure the routers including usage of the respective colors. The determined placement is used to configure the compute elements including the respective programmed instructions each is configured to execute.
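The flow described in the abstract can be illustrated with a toy model. The sketch below is not the patented method: it assumes a one-dimensional row of processing elements, one layer per element, and one color per layer-to-layer edge. All names in it (`Wavelet`, `ProcessingElement`, `place`, `deliver`, and the `"E"` and `"RAMP"` port labels) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Wavelet:
    color: int      # virtual channel specifier carried by the wavelet
    payload: float  # the data being moved


@dataclass
class ProcessingElement:
    x: int
    # Routing configuration: color -> output ports.
    # "E" forwards to the east neighbor, "RAMP" hands off to the local compute element.
    routes: Dict[int, List[str]] = field(default_factory=dict)
    # Stand-in for the programmed instructions of the compute element.
    program: Optional[str] = None


def place(layers: List[str]) -> List[ProcessingElement]:
    """Toy placement: layer i runs on PE i; edge i -> i+1 uses color i."""
    pes = [ProcessingElement(x=i) for i in range(len(layers))]
    for i, name in enumerate(layers):
        pes[i].program = name
        if i + 1 < len(layers):
            color = i
            pes[i].routes[color] = ["E"]         # producer sends east
            pes[i + 1].routes[color] = ["RAMP"]  # consumer delivers locally
    return pes


def deliver(pes: List[ProcessingElement], src: int, w: Wavelet) -> Optional[int]:
    """Follow the routing configuration; return the index of the consuming PE."""
    x = src
    while 0 <= x < len(pes):
        ports = pes[x].routes.get(w.color, [])
        if "RAMP" in ports:
            return x
        if "E" in ports:
            x += 1
            continue
        return None  # no route configured for this color: wavelet is dropped
    return None


pes = place(["conv", "relu", "fc"])
print(deliver(pes, 0, Wavelet(color=0, payload=1.0)))
```

In this model the placement step produces both outputs named in the abstract: the per-router routing tables keyed by color, and the program assigned to each compute element.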
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.