Patent · US Active

Fully-fused neural network execution

US11935179B2 · kind B2 · utility

Cited by	1
References	2
Claims	21
Family size	0

Assignee

NVIDIA Corporation · US

Inventors

Thomas Müller · Herzogenbuchsee, CH
Nikolaus Binder · Berlin, DE
Fabrice Rousselle · Ostermundigen, CH
Jan Novak · Meilen, CH
Alexander Keller · Berlin, DE

Key dates

Filing date	Mar 15, 2023
Grant date	Mar 19, 2024
Priority date	—
Expiry date	Mar 15, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06T2210/52
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading and writing inputs to and outputs from the fully-connected neural network. The computational cost of a fully-connected neural network scales quadratically with its width, whereas its memory traffic scales linearly. Modern graphics processing units typically have much greater computational throughput than memory bandwidth, so for narrow, fully-connected neural networks, the linear memory traffic is the bottleneck. The key to improving performance of the fully-connected neural network is to minimize traffic to slow “global” memory (off-chip memory and high-level caches) and to fully utilize fast on-chip memory (low-level caches, “shared” memory, and registers), which is achieved by the fully-fused approach. A real-time neural radiance caching technique for path-traced global illumination is implemented using the fully-fused neural network for caching scattered radiance components of global illumination.
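The scaling argument in the abstract can be illustrated with a rough back-of-the-envelope count. The sketch below is not taken from the patent: the function name `mlp_costs`, its parameters, and the simplifications (square `width x width` layers, input and output as wide as the hidden layers, weight traffic ignored, 2-byte values) are all assumptions made for illustration.

```python
def mlp_costs(width, depth, batch, bytes_per_value=2):
    """Rough FLOP and global-memory byte counts for a square MLP.

    Assumptions (illustrative only): `depth` weight matrices of size
    width x width, input/output as wide as the hidden layers, weight
    reads ignored, and `bytes_per_value` bytes per activation.
    """
    # One multiply and one add per weight, per sample, per layer.
    flops = 2 * batch * width * width * depth

    # Unfused: every layer reads its input activations from global
    # memory and writes its output activations back.
    unfused_bytes = 2 * batch * width * depth * bytes_per_value

    # Fully fused: only the network input is read and only the network
    # output is written; intermediate activations stay on-chip.
    fused_bytes = 2 * batch * width * bytes_per_value

    return flops, unfused_bytes, fused_bytes


if __name__ == "__main__":
    for width in (64, 128, 256):
        flops, unfused, fused = mlp_costs(width, depth=4, batch=1024)
        # FLOPs per byte of global traffic; low values suggest a
        # memory-bound workload.
        print(width, flops / unfused, flops / fused)
```

Under these assumptions, doubling the width quadruples the arithmetic but only doubles the activation traffic, so the FLOPs-per-byte ratio of the unfused variant grows linearly with width and stays low for narrow networks. Fusing reduces the modeled global traffic by a factor equal to the number of layers.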

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.