Patent · US Active

Dataflow all-reduce for reconfigurable processor systems

US11237880B1 · kind B1 · utility

Cited by: 20
References: 6
Claims: 22
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 19, 2021
Grant date: Feb 1, 2022
Priority date
Expiry date: Jul 19, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/048
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Roughly described, a system for data parallel training of a neural network on multiple reconfigurable units configured by a host with dataflow pipelines to perform different steps in the training. CGRA units are configured to evaluate first and second sequential sections of neural network layers based on a respective subset of training data, and to back-propagate the error through the sections to calculate parameter gradients for the respective subset. Gradient synchronization and reduction are performed by one or more units having finer grain reconfigurability, such as an FPGA. The FPGA performs synchronization and reduction of the gradients for the second section while the CGRA units perform back-propagation through the first sequential section. Intermediate results are transmitted using a P2P message passing protocol layer. Execution of dataflow segments in the different units is triggered by receipt of data, rather than by a command from any host system.
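
The overlap described in the abstract can be illustrated with a small software analogy. The sketch below is not the patented hardware design: it assumes Python threads stand in for the CGRA units and for the FPGA, a queue stands in for the P2P message passing layer, and the reduction is a simple element-wise average. All names (`NUM_WORKERS`, `worker`, `reducer`, `run`) and the toy gradient values are hypothetical.

```python
import queue
import threading

NUM_WORKERS = 4  # hypothetical number of CGRA units, one data shard each


def reducer(inbox, results, num_workers):
    """Stand-in for the finer-grained unit (e.g. an FPGA).

    It is driven purely by message arrival: no host thread tells it
    when to start reducing a section.
    """
    pending = {}
    sections_done = 0
    while sections_done < 2:
        section, grad = inbox.get()  # blocks until a gradient message arrives
        pending.setdefault(section, []).append(grad)
        if len(pending[section]) == num_workers:
            grads = pending.pop(section)
            # Assumed reduction: element-wise average across workers.
            results[section] = [sum(col) / num_workers for col in zip(*grads)]
            sections_done += 1


def worker(rank, inbox):
    """Stand-in for one CGRA unit back-propagating over its own shard."""
    # Back-propagation reaches the second (later) section first.
    grad_section2 = [float(rank), float(rank) * 2.0]  # toy gradient values
    inbox.put(("section2", grad_section2))
    # The reducer may already be working on section 2 while this worker
    # continues back-propagating through the first section.
    grad_section1 = [float(rank) + 1.0]  # toy gradient value
    inbox.put(("section1", grad_section1))


def run():
    inbox = queue.Queue()
    results = {}
    red = threading.Thread(target=reducer, args=(inbox, results, NUM_WORKERS))
    red.start()
    workers = [
        threading.Thread(target=worker, args=(rank, inbox))
        for rank in range(NUM_WORKERS)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    red.join()
    return results


if __name__ == "__main__":
    print(run())
```

In this sketch the reducer begins averaging the second section's gradients as soon as the last worker's message for that section arrives, regardless of whether the workers have finished the first section. That mirrors, in a simplified way, the data-triggered execution and overlap the abstract describes.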

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.