Patent · US Active

Training giant neural networks using pipeline parallelism

US11232356B2 · kind B2 · utility

Cited by	1

References	0

Claims	20

Family size	0

Assignee

Google LLC · US

Inventors

Zhifeng Chen · Sunnyvale, US
Yanping Huang · Mountain View, US
Youlong Cheng · Mountain View, US
HyoukJoong Lee · Millbrae, US
Dehao Chen · Mountain View, US
Jiquan Ngiam · Mountain View, US

Key dates

Filing date	Aug 10, 2020
Grant date	Jan 25, 2022
Priority date	—
Expiry date	Aug 27, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/048
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.
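
The steps named in the abstract (composite layers, micro-batches, a forward pass over every micro-batch, then a backward pass over every micro-batch) can be illustrated with a toy sketch. This is not the patented implementation: it assumes scalar "layers" that multiply their input by a weight, a squared-error loss, and sequential execution in one process, so it does not model device placement or overlapped execution. The function names `split_into_micro_batches` and `micro_batched_gradients` are hypothetical and chosen only for illustration.

```python
def split_into_micro_batches(mini_batch, num_micro_batches):
    """Split a mini-batch (a list) into contiguous, near-equal micro-batches."""
    size, rem = divmod(len(mini_batch), num_micro_batches)
    micro_batches, start = [], 0
    for i in range(num_micro_batches):
        end = start + size + (1 if i < rem else 0)
        micro_batches.append(mini_batch[start:end])
        start = end
    return micro_batches


def micro_batched_gradients(composite_layers, mini_batch, num_micro_batches):
    """Toy model (an assumption): every layer computes y = w * x.

    composite_layers: list of lists of scalar weights, one inner list per
        composite layer, applied in sequence.
    mini_batch: list of (input, target) pairs.
    Returns gradients of 0.5 * sum((y - target)**2), in the same nested
    shape as composite_layers, accumulated over all micro-batches.
    """
    micro_batches = split_into_micro_batches(mini_batch, num_micro_batches)
    grads = [[0.0] * len(comp) for comp in composite_layers]

    # Forward pass: run every micro-batch through every composite layer,
    # saving the inputs of each layer for use in the backward pass.
    saved_inputs, outputs = [], []
    for mb in micro_batches:
        acts = [x for x, _ in mb]
        per_stage = []
        for comp in composite_layers:
            per_layer = []
            for w in comp:
                per_layer.append(acts)
                acts = [w * a for a in acts]
            per_stage.append(per_layer)
        saved_inputs.append(per_stage)
        outputs.append(acts)

    # Backward pass: walk the composite layers in reverse for every
    # micro-batch, accumulating weight gradients.
    for m, mb in enumerate(micro_batches):
        deltas = [y - t for y, (_, t) in zip(outputs[m], mb)]
        for k in reversed(range(len(composite_layers))):
            comp = composite_layers[k]
            for j in reversed(range(len(comp))):
                layer_inputs = saved_inputs[m][k][j]
                grads[k][j] += sum(d * a for d, a in zip(deltas, layer_inputs))
                deltas = [comp[j] * d for d in deltas]
    return grads


# Two composite layers of two layers each, one mini-batch of eight examples.
layers = [[0.5, 1.5], [2.0, -1.0]]
batch = [(float(i), 1.0) for i in range(8)]
grads_micro = micro_batched_gradients(layers, batch, num_micro_batches=4)
grads_full = micro_batched_gradients(layers, batch, num_micro_batches=1)
```

In this toy setting, `grads_micro` and `grads_full` agree up to floating-point rounding, because the loss is a sum over examples and the gradients from each micro-batch are simply added together.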

Source: USPTO / EPO open patent data (bibliographic data and citation counts).