Control systems using deep reinforcement learning
US11062207B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 30, 2017 |
| Grant date | Jul 13, 2021 |
| Priority date | — |
| Expiry date | May 14, 2040 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N20/00
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Data indicative of a plurality of observations of an environment are received at a control system. Machine learning using deep reinforcement learning is applied to determine an action based on the observations. The deep reinforcement learning applies a convolutional neural network or a deep auto encoder to the observations and applies a training set to locate one or more regions having a higher reward. The action is applied to the environment. A reward token indicative of alignment between the action and a desired result is received. A policy parameter of the control system is updated based on the reward token. The updated policy parameter is applied to determine a subsequent action responsive to a subsequent observation.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.