Patent · US Active

Task prioritized experience replay algorithm for reinforcement learning

US12277194B2 · kind B2 · utility

Cited by	0
References	0
Claims	18
Family size	0

Assignee

SONY GROUP CORPORATION · JP

Inventors

Varun Kompella · Aachen, DE
James MacGlashan · Luthers Corners, US
Peter R. Wurman · Acton, US
Peter Stone · Bradford, GB

Key dates

Filing date	Sep 29, 2020
Grant date	Apr 15, 2025
Priority date	—
Expiry date	Jun 26, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/00
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A task prioritized experience replay (TaPER) algorithm enables simultaneous off-policy learning of multiple reinforcement learning (RL) tasks. The algorithm can prioritize samples that were part of fixed-length episodes that led to the achievement of tasks, which enables the agent to learn task policies quickly by bootstrapping over its early successes. TaPER can also improve performance on all tasks simultaneously, a desirable characteristic for multi-task RL. Conventional experience replay (ER) algorithms are applied to single-task RL settings, require rewards to be binary or abundant, or require goals to be provided as a parameterized specification. TaPER imposes no such restrictions and supports arbitrary reward and task specifications.
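The prioritization idea in the abstract can be illustrated with a minimal sketch. This is not the claimed method: the class name `TaskPrioritizedBuffer`, the parameters `base_priority` and `success_bonus`, and the rule that priority grows linearly with the number of achieved tasks are all hypothetical choices made for illustration. The sketch only shows the general pattern of giving every transition in an episode a sampling weight that depends on whether that episode achieved any task.

```python
import random
from dataclasses import dataclass, field


@dataclass
class TaskPrioritizedBuffer:
    """Toy replay buffer that favors transitions from task-achieving episodes.

    All names and the priority formula are assumptions for illustration,
    not taken from the patent text.
    """
    base_priority: float = 1.0   # hypothetical weight for any stored transition
    success_bonus: float = 4.0   # hypothetical extra weight per achieved task
    transitions: list = field(default_factory=list)
    priorities: list = field(default_factory=list)

    def add_episode(self, episode, achieved_tasks):
        """Store every transition of one fixed-length episode.

        `achieved_tasks` is the set of task ids achieved during the episode;
        all transitions in the episode share the same priority.
        """
        priority = self.base_priority + self.success_bonus * len(achieved_tasks)
        for transition in episode:
            self.transitions.append(transition)
            self.priorities.append(priority)

    def sample(self, batch_size, rng=random):
        """Draw a batch with probability proportional to stored priority."""
        return rng.choices(self.transitions, weights=self.priorities, k=batch_size)


# Usage: one episode that achieved nothing, one that achieved task 0.
buffer = TaskPrioritizedBuffer()
buffer.add_episode(["fail_t0", "fail_t1"], achieved_tasks=set())
buffer.add_episode(["win_t0", "win_t1"], achieved_tasks={0})
batch = buffer.sample(batch_size=4, rng=random.Random(0))
```

Because the weights are stored per transition, the same buffer can serve several task policies trained off-policy from one shared stream of experience. A practical implementation would typically also bound the buffer size and correct for the sampling bias that prioritization introduces.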

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.