Updating policy parameters under Markov decision process system environment
US8909571B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 21, 2013 |
| Grant date | Dec 9, 2014 |
| Priority date | — |
| Expiry date | May 21, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Embodiments relate to updating a parameter defining a policy under a Markov decision process system environment. An aspect includes updating the policy parameter stored in a storage section of a controller according to an update equation. The update equation includes a term for decreasing a weighted sum of expected hitting times over a first state (s) and a second state (s′) of a statistic on the number of steps required to make a first state transition from the first state (s) to the second state (s′).
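The abstract does not reproduce the update equation, so the following is only a minimal sketch of the kind of term it describes, not the patented method. It assumes a hypothetical two-state chain in which a single policy parameter `theta` sets the probability of switching state, computes expected hitting times by fixed-point iteration, and takes one finite-difference gradient step that decreases a weighted sum of those hitting times. The function names, the weights, the step size `alpha`, and the use of finite differences are all illustrative assumptions. The reward-gradient part of a full policy update is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def transition_matrix(theta):
    # Hypothetical toy model: in either state the policy switches to the
    # other state with probability p = sigmoid(theta), otherwise it stays.
    p = sigmoid(theta)
    return [[1.0 - p, p], [p, 1.0 - p]]


def hitting_times(P, iters=2000):
    # h[s][t] approximates the expected number of steps to first reach t
    # from s, using the fixed point h(s,t) = 1 + sum_j P[s][j] * h(j,t)
    # for s != t, with h(t,t) = 0.
    n = len(P)
    h = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        h = [
            [
                0.0 if s == t
                else 1.0 + sum(P[s][j] * h[j][t] for j in range(n))
                for t in range(n)
            ]
            for s in range(n)
        ]
    return h


def weighted_hitting_sum(theta, weights):
    # weights maps (s, t) pairs to non-negative weights (assumed values).
    h = hitting_times(transition_matrix(theta))
    return sum(w * h[s][t] for (s, t), w in weights.items())


def update_step(theta, weights, alpha=0.1, eps=1e-5):
    # Central finite-difference estimate of the gradient of the weighted
    # sum, followed by a descent step so that the sum decreases.
    grad = (
        weighted_hitting_sum(theta + eps, weights)
        - weighted_hitting_sum(theta - eps, weights)
    ) / (2.0 * eps)
    return theta - alpha * grad


WEIGHTS = {(0, 1): 1.0, (1, 0): 1.0}  # illustrative equal weighting
theta_before = 0.0
theta_after = update_step(theta_before, WEIGHTS)
```

In this toy chain the hitting time between the two states is `1 / sigmoid(theta)`, so the step raises `theta` and shortens the expected time to move between states. In a full update the same term would be combined with a reward-based gradient term.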
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.