Generalized reinforcement learning agent
US11526812B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Oct 1, 2019 |
| Grant date | Dec 13, 2022 |
| Priority date | — |
| Expiry date | May 12, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/092
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An apparatus has a memory storing a reinforcement learning policy with an optimization component and a data collection component. The apparatus has a regularization component which applies regularization selectively between the optimization component of the reinforcement learning policy and the data collection component of the reinforcement learning policy. A processor carries out a reinforcement learning process by: triggering execution of an agent according to the policy and with respect to a first task; observing values of variables comprising: an observation space of the agent, an action of the agent; and updating the policy using reinforcement learning according to the observed values and taking into account the regularization.
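The loop described in the abstract (execute the agent, observe observations and actions, update the policy with regularization applied selectively) can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the patented implementation. Everything specific in it is an assumption made for illustration: the tabular softmax policy, the Gaussian noise on logits standing in for "regularization", the REINFORCE-style update, the toy two-state task, and all names (`Policy`, `collect`, `optimize`, `train`, `regularize_collection`, `regularize_optimization`).

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Policy:
    """Hypothetical tabular softmax policy with an optional noise regularizer."""

    def __init__(self, n_states, n_actions, noise_scale=0.5):
        self.logits = [[0.0] * n_actions for _ in range(n_states)]
        self.noise_scale = noise_scale

    def probs(self, state, regularize, rng):
        logits = self.logits[state]
        if regularize:
            # Assumed regularizer: Gaussian noise injected into the logits.
            logits = [x + rng.gauss(0.0, self.noise_scale) for x in logits]
        return softmax(logits)


def collect(policy, n_steps, rng, regularize_collection=False):
    """Data collection component: run the agent, record (observation, action, reward)."""
    batch = []
    for _ in range(n_steps):
        state = rng.randrange(len(policy.logits))
        p = policy.probs(state, regularize_collection, rng)
        action = rng.choices(range(len(p)), weights=p)[0]
        # Toy task (assumption): reward 1 when the action index matches the state.
        reward = 1.0 if action == state % len(p) else 0.0
        batch.append((state, action, reward))
    return batch


def optimize(policy, batch, rng, lr=0.5, regularize_optimization=True):
    """Optimization component: REINFORCE-style update with a mean-reward baseline."""
    baseline = sum(r for _, _, r in batch) / len(batch)
    for state, action, reward in batch:
        p = policy.probs(state, regularize_optimization, rng)
        advantage = reward - baseline
        for a in range(len(p)):
            # Gradient of log-softmax with respect to logit a.
            grad = (1.0 if a == action else 0.0) - p[a]
            policy.logits[state][a] += lr * advantage * grad / len(batch)


def train(seed=0, iterations=200):
    rng = random.Random(seed)
    policy = Policy(n_states=2, n_actions=2)
    for _ in range(iterations):
        # Regularization is applied selectively: off while collecting data,
        # on while computing the policy update.
        batch = collect(policy, 32, rng, regularize_collection=False)
        optimize(policy, batch, rng, regularize_optimization=True)
    return policy
```

Calling `train()` returns the trained policy. The two boolean flags are the point of the sketch: they let the regularizer be switched on for one component and off for the other, which is how "selectively" is interpreted here.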
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.