Patent · US Active

Continual reinforcement learning with a multi-task agent

US12154029B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 16
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 5, 2019
Grant date: Nov 26, 2024
Priority date
Expiry date: May 22, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/098
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method of training an action selection neural network for controlling an agent interacting with an environment to perform different tasks is described. The method includes obtaining a first trajectory of transitions generated while the agent was performing an episode of a first task from multiple tasks; and training the action selection neural network on the first trajectory to adjust the control policies for the multiple tasks. The training includes, for each transition in the first trajectory: generating respective policy outputs for the initial observation in the transition for each task in a subset of tasks that includes the first task and one other task; generating respective target policy outputs for each task using the reward in the transition; and determining an update to the current parameter values based on, for each task, a gradient of a loss between the policy output and the target policy output for the task.
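The per-transition training loop in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network, the form of the target, or the loss, so everything concrete here is an assumption made for illustration: a linear Q-function with shared weights, task conditioning via a one-hot vector appended to the observation, a one-step Q-learning target, a squared-error loss, and a transition that stores one reward per task. The names `features`, `q_values`, and `multitask_update` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple

# Assumed transition layout: (observation, action, per-task rewards, next observation).
Transition = Tuple[List[float], int, List[float], List[float]]


def features(obs: Sequence[float], task: int, num_tasks: int) -> List[float]:
    """Append a one-hot task encoding to the observation (assumed conditioning scheme)."""
    one_hot = [1.0 if t == task else 0.0 for t in range(num_tasks)]
    return list(obs) + one_hot


def q_values(weights: List[List[float]], obs: Sequence[float],
             task: int, num_tasks: int) -> List[float]:
    """Policy output for one task: a Q-value per action from shared linear weights."""
    x = features(obs, task, num_tasks)
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def multitask_update(weights: List[List[float]], trajectory: List[Transition],
                     task_subset: Sequence[int], num_tasks: int,
                     gamma: float = 0.99, lr: float = 0.1) -> List[List[float]]:
    """One gradient step on a trajectory, summing losses over every task in the subset."""
    grads = [[0.0] * len(row) for row in weights]
    for obs, action, rewards, next_obs in trajectory:
        for task in task_subset:
            # Policy output for the initial observation of the transition.
            q = q_values(weights, obs, task, num_tasks)
            # Target policy output built from the reward (assumed one-step Q-learning).
            next_q = q_values(weights, next_obs, task, num_tasks)
            target = rewards[task] + gamma * max(next_q)
            # Gradient of 0.5 * (q[action] - target)**2, treating the target as a constant.
            td_error = q[action] - target
            for i, xi in enumerate(features(obs, task, num_tasks)):
                grads[action][i] += td_error * xi
    scale = lr / len(trajectory)
    return [[w - scale * g for w, g in zip(row, grow)]
            for row, grow in zip(weights, grads)]


# Usage: 2 tasks, 2 actions, 2-dimensional observations, one transition
# collected while acting on task 0 but used to update both tasks.
weights = [[0.0] * 4 for _ in range(2)]
trajectory = [([1.0, 0.0], 0, [1.0, 0.0], [0.0, 1.0])]
new_weights = multitask_update(weights, trajectory, task_subset=[0, 1], num_tasks=2)
```

Because the weights are shared and only the task encoding differs, a single trajectory gathered on the first task also contributes gradient for the other tasks in the subset, which is the property the abstract describes.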

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.