Patent · US Active

Continual reinforcement learning with a multi-task agent

US12154029B2 · kind B2 · utility

Cited by	1
References	0
Claims	16
Family size	0

Assignee

DeepMind Technologies Limited · GB

Inventors

Tom Schaul · London, GB
Matteo Hessel · London, GB
Hado Philip van Hasselt · London, GB
Daniel J. Mankowitz · St Albans, GB

Key dates

Filing date	Feb 5, 2019
Grant date	Nov 26, 2024
Priority date	—
Expiry date	May 22, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/098
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method of training an action selection neural network for controlling an agent interacting with an environment to perform different tasks is described. The method includes obtaining a first trajectory of transitions generated while the agent was performing an episode of a first task from among multiple tasks; and training the action selection neural network on the first trajectory to adjust control policies for the multiple tasks. The training includes, for each transition in the first trajectory: generating respective policy outputs for the initial observation in the transition for each task in a subset of tasks that includes the first task and one other task; generating respective target policy outputs for each task using the reward in the transition; and determining an update to the current parameter values based on, for each task, a gradient of a loss between the policy output and the target policy output for the task.
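The per-transition update described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the abstract does not state: the policy output is taken to be a Q-value, the network is a linear head per task, the target is a one-step Q-learning target, the loss is squared error, and each transition is assumed to carry one reward per task. The names `q_values`, `train_on_trajectory`, `task_subset`, `lr`, and `gamma` are hypothetical.

```python
import numpy as np


def q_values(params, obs, task):
    """Assumed policy output: one Q-value per action for the given task.

    params has shape (num_tasks, num_actions, obs_dim), i.e. a linear head per task.
    """
    return params[task] @ obs


def train_on_trajectory(params, trajectory, task_subset, lr=0.1, gamma=0.9):
    """Update params from one trajectory, learning off-policy for each task in task_subset.

    Each transition is (obs, action, rewards, next_obs, done), where rewards[task]
    is the reward that transition yields under that task (an assumption of this sketch).
    """
    params = params.copy()
    for obs, action, rewards, next_obs, done in trajectory:
        grad = np.zeros_like(params)
        for task in task_subset:
            # Policy output for the initial observation of the transition.
            q = q_values(params, obs, task)[action]
            # Target policy output built from the reward in the transition.
            bootstrap = 0.0 if done else gamma * np.max(q_values(params, next_obs, task))
            target = rewards[task] + bootstrap
            # Gradient of 0.5 * (q - target)**2 with the target held constant.
            grad[task, action] += (q - target) * obs
        # One parameter update per transition, summed over the tasks in the subset.
        params -= lr * grad
    return params


# Usage: a single terminal transition collected while acting on task 0,
# used to update the heads for both task 0 and task 1.
init = np.zeros((2, 2, 3))
traj = [(np.array([1.0, 0.0, 0.0]), 1, [1.0, 0.0], np.array([0.0, 1.0, 0.0]), True)]
updated = train_on_trajectory(init, traj, task_subset=[0, 1])
```

The point of the sketch is that a trajectory generated under one task still produces a gradient for every task in the subset, so experience is shared across tasks rather than used only for the task that generated it.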

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.