Techniques for efficiently performing data reductions in parallel processing units
US11061741B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 16, 2019 |
| Grant date | Jul 13, 2021 |
| Priority date | — |
| Expiry date | Nov 26, 2039 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F2209/521
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Techniques are disclosed for reducing the latency associated with performing data reductions in a multithreaded processor. In response to a single instruction associated with a set of threads executing in the multithreaded processor, a warp reduction unit acquires register values stored in source registers, where each register value is associated with a different thread included in the set of threads. The warp reduction unit performs operation(s) on the register values to compute an aggregate value. The warp reduction unit stores the aggregate value in a destination register that is accessible to at least one of the threads in the set of threads. Because the data reduction is performed via a single instruction using hardware specialized for data reductions, the number of cycles required to perform the data reduction is decreased relative to prior-art techniques that are performed via multiple instructions using hardware that is not specialized for data reductions.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.