Reinforcement learning through a double actor critic algorithm
US11816591B2 · kind B2 · utility
Assignees
Inventor
Key dates
| Event | Date |
| --- | --- |
| Filing date | Feb 25, 2020 |
| Grant date | Nov 14, 2023 |
| Priority date | — |
| Expiry date | Jun 17, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The Double Actor Critic (DAC) reinforcement-learning algorithm affords stable policy improvement and aggressive neural-net optimization without catastrophic overfitting of the policy. DAC trains models using an arbitrary history of data in both offline and online learning and can be used to smoothly improve on an existing policy learned or defined by some other means. Finally, DAC can optimize reinforcement learning problems with discrete and continuous action spaces.
Source: USPTO / EPO open patent data.