Patent · US Active

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

US10223762B2 · kind B2 · utility

3Cited by
3References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMar 16, 2018
Grant dateMar 5, 2019
Priority date
Expiry dateMar 16, 2038

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N20/00
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.