Patent · US Active

Recording medium, reinforcement learning method, and reinforcement learning apparatus

US11645574B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 10
Family size: 0

Assignees

Inventors

Key dates

Filing date: Sep 13, 2018
Grant date: May 9, 2023
Priority date
Expiry date: Aug 25, 2040

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L43/08
  • WIPO field: Digital communication
  • WIPO sector: Electrical engineering

Abstract

A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function, which is represented in a quadratic form of inputs at times in the past relative to a present time and outputs at the present time and the times in the past, the first coefficients being estimated based on the inputs at the times in the past, the outputs at the present time and the times in the past, and costs or rewards that correspond to the inputs at the times in the past; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
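
The step from estimated coefficients to a control law can be illustrated with a minimal sketch. It is not taken from the patent text. It assumes an action-value form in which the quadratic function also includes the next input u_t (the abstract does not state this), a scalar input stored as the last entry of the stacked vector, and a symmetric coefficient matrix. The matrix F, the vector z, and the function names are hypothetical, and the estimation of F itself (for example by least squares over observed inputs, outputs, and costs) is omitted.

```python
def quadratic_value(F, v):
    """Return v^T F v for a square coefficient matrix F and a vector v."""
    n = len(v)
    return sum(v[i] * F[i][j] * v[j] for i in range(n) for j in range(n))


def control_gain(F):
    """Return the gain row K so that u = sum(K[i] * z[i]) minimizes
    [z, u]^T F [z, u] over u.

    Assumptions (illustrative, not from the patent): scalar input u stored
    as the LAST entry of the stacked vector, symmetric F, and F[-1][-1] > 0
    so that a minimum exists.
    """
    f_uu = F[-1][-1]
    if f_uu <= 0:
        raise ValueError("F_uu must be positive for a minimizing input")
    # Setting d/du (z^T F_zz z + 2 u F_uz z + F_uu u^2) = 0 gives
    # u = -(F_uz z) / F_uu, so each gain entry is -F_uz[i] / F_uu.
    return [-F[-1][i] / f_uu for i in range(len(F) - 1)]


# Hypothetical, already-estimated "first coefficients" for the stacked
# vector [y_t, y_{t-1}, u_{t-1}, u_t].
F = [
    [2.0, 0.5, 0.2, 0.4],
    [0.5, 1.0, 0.1, 0.2],
    [0.2, 0.1, 0.5, 0.1],
    [0.4, 0.2, 0.1, 1.0],
]

K = control_gain(F)   # "second coefficients" defining the control law
z = [1.0, -0.5, 0.2]  # present output, past output, past input
u = sum(k * zi for k, zi in zip(K, z))  # input value to apply next
```

In this sketch the gain row K plays the role of the second coefficients, and u is the input value determined after the first coefficients have been estimated. Evaluating quadratic_value(F, z + [u]) against nearby input values is a quick way to check that u is the minimizer under the stated assumptions.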

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.