Counter-based delay of dependent thread group execution
US7526634B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Sep 27, 2006 |
| Grant date | Apr 28, 2009 |
| Priority date | — |
| Expiry date | May 14, 2027 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2209/548
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods for synchronizing processing work performed by threads, cooperative thread arrays (CTAs), or “sets” of CTAs. A central processing unit can load launch commands for a first set of CTAs and a second set of CTAs in a pushbuffer, and specify a dependency of the second set upon completion of execution of the first set. A parallel or graphics processor (GPU) can autonomously execute the first set of CTAs and delay execution of the second set of CTAs until the first set of CTAs is complete. In some embodiments the GPU may determine that a third set of CTAs is not dependent upon the first set, and may launch the third set of CTAs while the second set of CTAs is delayed. In this manner, the GPU may execute launch commands out of order with respect to the order of the launch commands in the pushbuffer.
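The launch behavior described in the abstract can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the patented hardware design: the names `LaunchCommand` and `run_pushbuffer` are hypothetical, each set is assumed to retire one CTA per simulated time step, and the per-set counter of outstanding CTAs is an assumption suggested by the title rather than a detail given in the abstract.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchCommand:
    name: str                         # hypothetical label for a set of CTAs
    num_ctas: int                     # number of CTAs in the set
    depends_on: Optional[str] = None  # set that must complete first, if any


def run_pushbuffer(pushbuffer):
    """Simulate launching CTA sets; return (launch_order, completion_order)."""
    remaining = {}         # assumed per-set counter of CTAs still executing
    finished = set()       # names of sets whose counter has reached zero
    launch_order = []
    completion_order = []
    pending = deque(pushbuffer)  # commands not yet launched, in pushbuffer order
    running = deque()            # names of sets currently executing

    while pending or running:
        # Launch every pending command whose dependency (if any) has completed;
        # commands that are still blocked stay delayed, in their original order.
        still_delayed = deque()
        for cmd in pending:
            if cmd.depends_on is None or cmd.depends_on in finished:
                remaining[cmd.name] = cmd.num_ctas
                running.append(cmd.name)
                launch_order.append(cmd.name)
            else:
                still_delayed.append(cmd)
        pending = still_delayed

        if not running:
            # Nothing is executing, so no delayed command can ever be released.
            raise ValueError("delayed commands depend on sets that never complete")

        # One simulated time step: every running set retires one CTA.
        for name in list(running):
            remaining[name] -= 1
            if remaining[name] <= 0:
                running.remove(name)
                finished.add(name)
                completion_order.append(name)

    return launch_order, completion_order


if __name__ == "__main__":
    commands = [
        LaunchCommand("first", num_ctas=4),
        LaunchCommand("second", num_ctas=2, depends_on="first"),
        LaunchCommand("third", num_ctas=3),
    ]
    launched, completed = run_pushbuffer(commands)
    print("launch order:", launched)
    print("completion order:", completed)
```

With the three commands above, the simulated launch order is first, third, second: the third set launches while the second is delayed, so commands execute out of order relative to their position in the pushbuffer.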
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.