Patent · US Active

Distributed training of reinforcement learning systems

US11507827B2 · kind B2 · utility

0Cited by
1References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateOct 14, 2019
Grant dateNov 22, 2022
Priority date
Expiry dateMar 1, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/098
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.