Training a policy model for a robotic task, using reinforcement learning and utilizing data that is based on episodes, of the robotic task, guided by an engineered policy
US12210943B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 29, 2021 |
| Grant date | Jan 28, 2025 |
| Priority date | — |
| Expiry date | Dec 2, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G05B2219/39298
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Implementations disclosed herein relate to utilizing at least one existing manually engineered policy, for a robotic task, in training an RL policy model that can be used to at least selectively replace a portion of the engineered policy. The RL policy model can be trained for replacing a portion of a robotic task and can be trained based on data from episodes of attempting performance of the robotic task, including episodes in which the portion is performed based on the engineered policy and/or other portion(s) are performed based on the engineered policy. Once trained, the RL policy model can be used, at least selectively and in lieu of utilization of the engineered policy, to perform the portion of the robotic task, while other portion(s) of the robotic task are performed utilizing the engineered policy and/or other similarly trained (but distinct) RL policy model(s).
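The episode structure described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the three-portion task split, the scalar state, the toy dynamics, the success threshold, and every name (`PORTIONS`, `engineered_policy`, `run_episode`, `training_data_for`, `use_rl_prob`) are assumptions made for illustration only. The sketch shows each portion being driven by either the engineered policy or an RL policy chosen selectively, and the collection of training data for one portion from all episodes, including those in which the engineered policy performed it.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = float
Action = float
Policy = Callable[[State], Action]

# Hypothetical decomposition of a robotic task into portions.
PORTIONS = ["approach", "grasp", "place"]


def engineered_policy(state: State) -> Action:
    """Stand-in for a manually engineered controller (assumed, not from the patent)."""
    return -0.5 * state


@dataclass
class Transition:
    portion: str
    state: State
    action: Action
    source: str  # "engineered" or "rl"


@dataclass
class Episode:
    transitions: List[Transition] = field(default_factory=list)
    reward: float = 0.0


def run_episode(rl_policies: Dict[str, Policy], use_rl_prob: float,
                rng: random.Random) -> Episode:
    """Run one episode, selectively swapping in an RL policy per portion."""
    episode = Episode()
    state = 1.0  # toy initial state
    for portion in PORTIONS:
        # Use the RL policy only if one exists for this portion and it is selected.
        use_rl = portion in rl_policies and rng.random() < use_rl_prob
        policy = rl_policies[portion] if use_rl else engineered_policy
        action = policy(state)
        source = "rl" if use_rl else "engineered"
        episode.transitions.append(Transition(portion, state, action, source))
        state = state + action  # toy dynamics, assumed for illustration
    # Sparse, episode-level success signal (threshold is an assumption).
    episode.reward = 1.0 if abs(state) < 0.2 else 0.0
    return episode


def training_data_for(portion: str,
                      episodes: List[Episode]) -> List[Tuple[State, Action, float]]:
    """Collect (state, action, episode reward) tuples for a single portion."""
    return [(t.state, t.action, ep.reward)
            for ep in episodes
            for t in ep.transitions
            if t.portion == portion]
```

With `use_rl_prob=0.0` every portion is performed by the engineered policy, yet the episode still yields data for the portion an RL policy model would later replace. Raising the probability shifts that portion to the RL policy while the remaining portions stay with the engineered policy.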
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.