Patent · US Active

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

US10223762B2 · kind B2 · utility

Cited by	3
References	3
Claims	20
Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Arash Ashari · Kirkland, US
Matthias Boehm · Neuendettelsau, DE
Keith W. Campbell · Ottawa, CA
Alexandre Evfimievski · San Jose, US
John D. Keenleyside · Pickering, CA
Berthold Reinwald · San Jose, US
Shirish Tatikonda · Santa Clara, US

Key dates

Filing date	Mar 16, 2018
Grant date	Mar 5, 2019
Priority date	—
Expiry date	Mar 16, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/00
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.
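
The two-level aggregation described in the abstract can be illustrated with a small CPU-side emulation. This is a sketch under stated assumptions, not the patented implementation: the abstract does not spell out the computation, so the example assumes a pattern of the form w = X^T (X y); the function names, the `rows_per_block` parameter, and the use of Python lists to stand in for GPU shared and global memory are all hypothetical. Each loop iteration plays the role of one thread block: it keeps a partial output vector in its own "shared" buffer, and the per-block partial vectors are then combined into a single "global" result.

```python
# Hypothetical CPU emulation of two-level (hierarchical) aggregation.
# "Shared memory" and "global memory" are plain Python lists here;
# the computed pattern w = X^T (X y) is an assumption for illustration.

def fused_xt_x_y(X, y, rows_per_block=2):
    """Compute w = X^T (X y) in a single pass over the rows of X."""
    n_rows, n_cols = len(X), len(X[0])
    w_global = [0.0] * n_cols          # stands in for GPU global memory

    # Each iteration plays the role of one thread block.
    for start in range(0, n_rows, rows_per_block):
        w_shared = [0.0] * n_cols      # stands in for per-block shared memory

        # Level 1: aggregate within the block into a partial output vector.
        for row in X[start:start + rows_per_block]:
            p = sum(x_ij * y_j for x_ij, y_j in zip(row, y))  # (X y)[i]
            for j, x_ij in enumerate(row):
                w_shared[j] += x_ij * p                       # row i's share of X^T p

        # Level 2: combine this block's partial vector into the global result
        # (on a GPU this step might be done with atomic additions).
        for j in range(n_cols):
            w_global[j] += w_shared[j]

    return w_global


def reference_xt_x_y(X, y):
    """Unfused reference: materialise p = X y first, then compute X^T p."""
    p = [sum(a * b for a, b in zip(row, y)) for row in X]
    return [sum(X[i][j] * p[i] for i in range(len(X))) for j in range(len(X[0]))]


if __name__ == "__main__":
    X = [[1, 2], [3, 4], [5, 6]]
    y = [1, 1]
    print(fused_xt_x_y(X, y))
    print(reference_xt_x_y(X, y))
```

The fused version never stores the intermediate vector X y in full; each row's contribution is folded into the block's partial output vector as soon as it is computed, which is the kind of traffic reduction that keeping partial results close to the compute units is meant to achieve. The result should not depend on how rows are grouped into blocks, which the reference function makes easy to check.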

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.