Patent · US Active

Data-efficient hierarchical reinforcement learning

US11992944B2 · kind B2 · utility

Cited by	0
References	4
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Honglak Lee · Mountain View, US
Shixiang Gu · Mountain View, US
Sergey Levine · Redmond, US

Key dates

Filing date	May 17, 2019
Grant date	May 28, 2024
Priority date	—
Expiry date	Apr 23, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/08
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
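
The re-labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the modified higher-level actions are chosen, so everything below is an assumption made for illustration: the higher-level action is treated as a goal state, the lower-level policy is a toy linear controller, and the modified goal is the candidate under which the current lower-level policy best reproduces the logged lower-level actions. The names `lower_level_action`, `score`, `relabel_goal`, and `gain` are hypothetical and do not come from the patent.

```python
import random


def lower_level_action(state, goal, gain):
    """Hypothetical lower-level policy: step toward the goal, scaled by gain."""
    return [gain * (g - s) for g, s in zip(goal, state)]


def score(goal, states, actions, gain):
    """Negative squared error between the logged lower-level actions and the
    actions the current lower-level policy would take under `goal`.
    Higher is better; 0.0 means the logged actions are reproduced exactly."""
    total = 0.0
    for s, a in zip(states, actions):
        predicted = lower_level_action(s, goal, gain)
        total -= sum((p - x) ** 2 for p, x in zip(predicted, a))
    return total


def relabel_goal(states, actions, original_goal, gain,
                 num_random=8, spread=0.5, seed=0):
    """Return a modified higher-level action for one logged segment.

    Assumed candidate set (not specified in the abstract): the original goal,
    the last state of the segment, and random perturbations of the original.
    """
    rng = random.Random(seed)
    candidates = [list(original_goal), list(states[-1])]
    for _ in range(num_random):
        candidates.append([g + rng.gauss(0.0, spread) for g in original_goal])
    # max() keeps the first best candidate, so the original goal wins ties.
    return max(candidates, key=lambda g: score(g, states, actions, gain))


# Experience logged with an older lower-level policy (gain 1.0).
states = [[0.0, 0.0], [1.0, 1.0], [1.5, 1.5]]
old_goal = [2.0, 2.0]
logged = [lower_level_action(s, old_goal, 1.0) for s in states]

# The lower-level policy has since changed (gain 0.5), so re-label the goal.
new_goal = relabel_goal(states, logged, old_goal, gain=0.5)
```

In this sketch the re-labeled goal would replace the original higher-level action in the stored transition before it is used for off-policy training of the higher-level policy. Keeping the original goal in the candidate set means the re-labeled action never explains the logged behavior worse than the original one did.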

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.