Patent · US Active

Training a policy model for a robotic task, using reinforcement learning and utilizing data that is based on episodes, of the robotic task, guided by an engineered policy

US12210943B2 · kind B2 · utility

Cited by	0
References	1
Claims	19
Family size	0

Assignee

Google LLC · US

Inventors

Adrian Li · San Francisco, US
Benjamin Holson · Sunnyvale, US
Alexander Herzog · San Jose, US
Mrinal Kalakrishnan · Palo Alto, US

Key dates

Filing date	Jan 29, 2021
Grant date	Jan 28, 2025
Priority date	—
Expiry date	Dec 2, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G05B2219/39298
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Implementations disclosed herein relate to utilizing at least one existing manually engineered policy, for a robotic task, in training an RL policy model that can be used to at least selectively replace a portion of the engineered policy. The RL policy model can be trained for replacing a portion of a robotic task and can be trained based on data from episodes of attempting performance of the robotic task, including episodes in which the portion is performed based on the engineered policy and/or other portion(s) are performed based on the engineered policy. Once trained, the RL policy model can be used, at least selectively and in lieu of utilization of the engineered policy, to perform the portion of the robotic task, while other portion(s) of the robotic task are performed utilizing the engineered policy and/or other similarly trained (but distinct) RL policy model(s).
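
The selective replacement described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the portion names ("approach", "grasp", "place"), the `run_episode` helper, the `use_rl` selection rule, and the stand-in policies are all hypothetical, and no actual RL training is performed. The sketch only shows how a task split into portions might dispatch each portion either to an engineered policy or to an RL policy model, and how episodes guided by the engineered policy could be logged as training data.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[float, ...]
Action = str
Policy = Callable[[State], Action]
Step = Tuple[str, str, Action]  # (portion, policy source, action)


def run_episode(
    portions: List[str],
    engineered: Dict[str, Policy],
    rl_policies: Dict[str, Policy],
    use_rl: Callable[[str, State], bool],
    state: State,
) -> List[Step]:
    """Perform each portion once, choosing the RL or the engineered policy."""
    log: List[Step] = []
    for portion in portions:
        rl = rl_policies.get(portion)
        if rl is not None and use_rl(portion, state):
            log.append((portion, "rl", rl(state)))
        else:
            log.append((portion, "engineered", engineered[portion](state)))
    return log


# Hypothetical three-portion task; every portion has an engineered policy.
portions = ["approach", "grasp", "place"]
engineered = {name: (lambda s, n=name: f"engineered_{n}") for name in portions}

# Data collection: no RL policy yet, so the engineered policy guides everything.
guided = run_episode(portions, engineered, {}, lambda p, s: True, (0.0,))

# Deployment: a stand-in "trained" RL policy replaces only the grasp portion,
# and only when a hypothetical selection rule allows it.
rl_policies = {"grasp": lambda s: "rl_grasp"}
mixed = run_episode(portions, engineered, rl_policies, lambda p, s: s[0] < 1.0, (0.0,))
fallback = run_episode(portions, engineered, rl_policies, lambda p, s: s[0] < 1.0, (2.0,))
```

In this sketch, `guided` stands for an episode whose steps could serve as training data, `mixed` uses the RL policy for the grasp portion only, and `fallback` shows the engineered policy still handling that portion when the assumed selection rule rejects the RL policy.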

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.