Patent · US Active

System with hybrid communication strategy for large-scale distributed deep learning

US11106998B2 · kind B2 · utility

0Cited by
0References
16Claims
0Family size

Assignee

Inventors

Key dates

Filing dateNov 16, 2017
Grant dateAug 31, 2021
Priority date
Expiry dateJun 23, 2040

Classification

  • Technology area (CPC H)Electricity
  • CPC primaryH04L67/10
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A computer in a distributed computing system is disclosed. The computer includes: a graphics processing unit (GPU) memory; a central processing unit (CPU) memory comprising a Key-Value Store (KVS) module; an execution engine module configured to run a deep learning (DL) program to create a plurality of operator graph layers in the graphics processing unit memory; a client library module configured to create a GPU-CPU synchronization (GCS) module for each of the plurality of operator graph layers; a coordination service module configured to compute network cost of a first and a second communication scheme and select, based on the network cost, one of the first and second communication scheme for transmitting data associated with one of the plurality of operator graph layers from a corresponding GCS module.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.