Patent · US Active

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

US9972063B2 · kind B2 · utility

Cited by	8

References	2

Claims	20

Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Arash Ashari · Kirkland, US
Matthias Boehm · Neuendettelsau, DE
Keith W. Campbell · Ottawa, CA
Alexandre Evfimievski · San Jose, US
John D. Keenleyside · Pickering, CA
Berthold Reinwald · San Jose, US
Shirish Tatikonda · Santa Clara, US

Key dates

Filing date	Jul 30, 2015
Grant date	May 15, 2018
Priority date	—
Expiry date	Mar 28, 2036

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/00
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.
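The fusion and hierarchical aggregation ideas in the abstract can be illustrated with a small CPU-side sketch. It is not the patented GPU kernel. The pattern X^T (X y), the function names `unfused` and `fused`, and the `block_rows` parameter are assumptions chosen for illustration; the abstract does not name a specific computation. Estimation of kernel launch parameters is not modeled here.

```python
# Illustrative CPU analogue only; all names here are hypothetical.
# Assumed example pattern: w = X^T (X y), with X as a list of rows.

def unfused(X, y):
    """Two separate passes over X: first p = X y, then w = X^T p."""
    p = [sum(x_ij * y_j for x_ij, y_j in zip(row, y)) for row in X]
    w = [0.0] * len(y)
    for row, p_i in zip(X, p):
        for j, x_ij in enumerate(row):
            w[j] += x_ij * p_i
    return w

def fused(X, y, block_rows=2):
    """Single pass over X: each row is reused for both products right away
    (temporal locality). Results are first accumulated per block, then merged
    into the global vector, loosely mimicking aggregation from fast local
    memory up to GPU global memory."""
    n = len(y)
    w = [0.0] * n
    global_writes = 0  # stands in for atomic writes to global memory
    for start in range(0, len(X), block_rows):
        partial = [0.0] * n  # stands in for block-local storage
        for row in X[start:start + block_rows]:
            p_i = sum(x_ij * y_j for x_ij, y_j in zip(row, y))
            for j, x_ij in enumerate(row):
                partial[j] += x_ij * p_i
        for j in range(n):  # one merge per block instead of one per row
            w[j] += partial[j]
        global_writes += n
    return w, global_writes

X = [[1, 2], [3, 4], [5, 6]]
y = [1, 1]
print(unfused(X, y))
print(fused(X, y, block_rows=2))
```

In this sketch, larger blocks mean fewer merges into the shared result, which is the same trade-off the abstract describes as minimizing atomic writes to global memory.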

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.