Graphic processor unit topology-aware all-reduce operation
US10909651B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 8, 2018 |
| Grant date | Feb 2, 2021 |
| Priority date | — |
| Expiry date | Aug 8, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F7/76
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer-implemented topology-aware all-reduce method for an environment including a plurality of systems is provided. Each system of the systems includes a plurality of computing modules. The computer-implemented topology-aware all-reduce method according to aspects of the invention includes locally partitioning and scattering data slices among the computing modules of each system to produce local summation results. The local summation results are copied from the computing modules to corresponding host memories of the systems. A cross system all-reduce operation is executed among the systems to cause an exchange of the local summation results across the host memories and a determination of final summation partitions from the local summation results. The final summation partitions are copied from the host memories to the corresponding computing modules of each system. The final summation partitions are forwarded to all graphical processing units (GPUs) to cause a determination of final summation results therefrom.
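The data flow described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: plain Python lists stand in for GPU buffers and host memories, every system is assumed to have the same number of GPUs, the buffer length is assumed to divide evenly by that number, and the names `split`, `vsum`, and `topology_aware_all_reduce` are hypothetical helpers introduced here for illustration.

```python
from typing import List

Vector = List[float]


def split(vec: Vector, parts: int) -> List[Vector]:
    """Partition vec into `parts` contiguous, equally sized slices."""
    assert len(vec) % parts == 0, "assumption: length divides evenly"
    step = len(vec) // parts
    return [vec[i * step:(i + 1) * step] for i in range(parts)]


def vsum(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def topology_aware_all_reduce(systems: List[List[Vector]]) -> List[List[Vector]]:
    """Simulate the abstract's steps; systems[s][g] is the buffer of GPU g in system s."""
    gpus = len(systems[0])
    assert all(len(s) == gpus for s in systems), "assumption: equal GPU counts"

    # Steps 1 and 2: within each system, partition every buffer into one slice
    # per GPU and sum slice j on GPU j (local summation results), then copy
    # those results into that system's host memory.
    host_memories = []
    for gpu_buffers in systems:
        slices = [split(buf, gpus) for buf in gpu_buffers]
        local_sums = [vsum([slices[g][j] for g in range(gpus)]) for j in range(gpus)]
        host_memories.append([list(part) for part in local_sums])

    # Step 3: cross-system all-reduce over host memories. Partition j is summed
    # across systems, and every host ends up with the final summation partitions.
    final_partitions = [vsum([host[j] for host in host_memories]) for j in range(gpus)]
    host_memories = [[list(p) for p in final_partitions] for _ in systems]

    # Steps 4 and 5: copy final partition j back to GPU j, then forward the
    # partitions to all local GPUs so each one assembles the full result.
    results = []
    for host in host_memories:
        gpu_partitions = [list(host[j]) for j in range(gpus)]
        full = [x for part in gpu_partitions for x in part]
        results.append([list(full) for _ in range(gpus)])
    return results
```

For example, with two systems of two GPUs each holding `[1, 2, 3, 4]`, `[10, 20, 30, 40]`, `[100, 200, 300, 400]`, and `[1000, 2000, 3000, 4000]`, every GPU ends with `[1111, 2222, 3333, 4444]`. In this sketch only one partition per GPU crosses the system boundary, which is the point of reducing locally before the cross-system exchange.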
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.