Patent · US Active

Fully-fused neural network execution

US11935179B2 · kind B2 · utility

Cited by: 1
References: 2
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 15, 2023
Grant date: Mar 19, 2024
Priority date
Expiry date: Mar 15, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T2210/52
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading the inputs to, and writing the outputs from, the fully-connected neural network. The computational cost of a fully-connected neural network scales quadratically with its width, whereas its memory traffic scales linearly. Modern graphics processing units typically have much greater computational throughput than memory bandwidth, so for narrow fully-connected neural networks the linear memory traffic is the bottleneck. The key to improving the performance of the fully-connected neural network is to minimize traffic to slow “global” memory (off-chip memory and high-level caches) and to fully utilize fast on-chip memory (low-level caches, “shared” memory, and registers), which is achieved by the fully-fused approach. A real-time neural radiance caching technique for path-traced global illumination is implemented using the fully-fused neural network to cache scattered radiance components of global illumination.
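The scaling argument in the abstract can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from the patent. It assumes that every layer is a square width × width matrix multiply, that activations are stored in 2 bytes per element, and that weight traffic is negligible; the function names are hypothetical. Under those assumptions it compares the FLOPs per byte of global memory traffic when each layer round-trips its activations through global memory against the fused case, where only the network input is read and the network output is written.

```python
def layer_flops(width: int, batch: int) -> int:
    """FLOPs for one width x width layer (one multiply-add = 2 FLOPs)."""
    return 2 * width * width * batch


def activation_bytes(width: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to move one batch of width-sized activations once."""
    return width * batch * bytes_per_elem


def global_traffic(width: int, depth: int, batch: int, fused: bool,
                   bytes_per_elem: int = 2) -> int:
    """Global memory traffic in bytes (assumption: weight traffic ignored)."""
    one_pass = activation_bytes(width, batch, bytes_per_elem)
    if fused:
        # Fused: read the network input once, write the network output once.
        return 2 * one_pass
    # Unfused: every layer reads its input and writes its output.
    return depth * 2 * one_pass


def arithmetic_intensity(width: int, depth: int, batch: int, fused: bool,
                         bytes_per_elem: int = 2) -> float:
    """FLOPs performed per byte of global memory traffic."""
    flops = depth * layer_flops(width, batch)
    return flops / global_traffic(width, depth, batch, fused, bytes_per_elem)


if __name__ == "__main__":
    for width in (64, 128):
        unfused = arithmetic_intensity(width, depth=4, batch=1024, fused=False)
        fused = arithmetic_intensity(width, depth=4, batch=1024, fused=True)
        print(f"width={width}: unfused {unfused} FLOPs/byte, fused {fused} FLOPs/byte")
```

In this simplified model the unfused intensity is width / bytes_per_elem, so a narrow network performs few FLOPs per byte moved and stays limited by memory bandwidth, while fusing multiplies the intensity by the number of layers. Real hardware behavior also depends on caches, weight loads, and occupancy, none of which are modeled here.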

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.