Patent · US Active

Reinforcement learning with auxiliary tasks

US10956820B2 · kind B2 · utility

Cited by	2
References	0
Claims	20
Family size	0

Assignee

DeepMind Technologies Limited · GB

Inventors

Volodymyr Mnih · Toronto, CA
Wojciech Czarnecki · London, GB
Maxwell Elliot Jaderberg · London, GB
Tom Schaul · London, GB
David Silver · Hitchin, GB
Koray Kavukcuoglu · London, GB

Key dates

Filing date	May 3, 2019
Grant date	Mar 23, 2021
Priority date	—
Expiry date	May 3, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/00
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. The method includes: training an action selection policy neural network, and during the training of the action selection neural network, training one or more auxiliary control neural networks and a reward prediction neural network. Each of the auxiliary control neural networks is configured to receive a respective intermediate output generated by the action selection policy neural network and generate a policy output for a corresponding auxiliary control task. The reward prediction neural network is configured to receive one or more intermediate outputs generated by the action selection policy neural network and generate a corresponding predicted reward. Training each of the auxiliary control neural networks and the reward prediction neural network comprises adjusting values of the respective auxiliary control parameters, reward prediction parameters, and the action selection policy network parameters.
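
The parameter sharing described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes a toy linear "trunk" standing in for the action selection policy neural network, three linear heads (policy, auxiliary control, reward prediction) that read the trunk's intermediate output, and a squared-error loss as a stand-in for the actual reinforcement learning losses. The names SharedModel, intermediate, predict, and sgd_step are hypothetical. The point it shows is that a gradient step on an auxiliary or reward prediction loss adjusts both that head's own parameters and the shared policy network parameters.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


class SharedModel:
    """Toy stand-in (assumption): one shared linear trunk feeding three linear heads."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)

        def init(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # Shared parameters, standing in for the action selection policy network.
        self.trunk = [init(n_in) for _ in range(n_hidden)]
        # One parameter vector per head; each head reads the trunk's output.
        self.heads = {
            "policy": init(n_hidden),
            "aux_control": init(n_hidden),
            "reward_pred": init(n_hidden),
        }

    def intermediate(self, x):
        """Intermediate output of the shared trunk for observation x."""
        return [dot(row, x) for row in self.trunk]

    def predict(self, head, x):
        """Scalar output of the named head on top of the intermediate output."""
        return dot(self.heads[head], self.intermediate(x))

    def sgd_step(self, head, x, target, lr=0.05):
        """One gradient step on 0.5 * (prediction - target)^2 for a single head.

        Updates the head's own parameters AND the shared trunk parameters.
        Returns the loss measured before the update.
        """
        h = self.intermediate(x)
        old_head = list(self.heads[head])
        err = dot(old_head, h) - target
        # d(loss)/d(head_j) = err * h_j
        self.heads[head] = [w - lr * err * hj for w, hj in zip(old_head, h)]
        # d(loss)/d(trunk_ji) = err * head_j * x_i
        self.trunk = [
            [w - lr * err * old_head[j] * xi for w, xi in zip(row, x)]
            for j, row in enumerate(self.trunk)
        ]
        return 0.5 * err * err


if __name__ == "__main__":
    model = SharedModel(n_in=3, n_hidden=4)
    obs = [1.0, 0.5, -0.2]
    # Hypothetical targets; a real system would derive these from environment rewards.
    for name, target in [("policy", 0.3), ("aux_control", -0.1), ("reward_pred", 2.0)]:
        loss = model.sgd_step(name, obs, target)
        print(name, "loss before update:", round(loss, 4))
```

In this sketch, every call to sgd_step changes model.trunk regardless of which head is being trained, which mirrors the abstract's statement that training the auxiliary and reward prediction networks also adjusts the action selection policy network parameters. A head that is not being trained keeps its own parameters, but its output can still shift because the trunk it reads from has moved.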

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.