Patent · US Active

Apparatus and method for adaptable and efficient lane-wise tensor processing

US11379229B2 · kind B2 · utility

0Cited by
5References
18Claims
0Family size

Assignee

Inventors

Key dates

Filing dateAug 7, 2020
Grant dateJul 5, 2022
Priority date
Expiry dateAug 7, 2040

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06F12/0897
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.