Patent · US Active

Recording medium, reinforcement learning method, and reinforcement learning apparatus

US11645574B2 · kind B2 · utility

Cited by	0

References	1

Claims	10

Family size	0

Assignees

Inventors

Tomotake Sasaki · Kawasaki, JP
Eiji Uchibe · Kunigami, JP
Kenji Doya · Kyoto, JP
Hirokazu Anai · Kawasaki, JP
Hitoshi Yanami · Kawasaki, JP
Hidenao Iwane · Kawasaki, JP

Key dates

Filing date	Sep 13, 2018
Grant date	May 9, 2023
Priority date	—
Expiry date	Aug 25, 2040

Classification

Technology area (CPC H)	Electricity
CPC primary	H04L43/08
WIPO field	Digital communication
WIPO sector	Electrical engineering

Abstract

A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function, which is represented in a quadratic form of inputs at times in the past relative to a present time and outputs at the present time and the times in the past, the first coefficients being estimated based on the inputs at the times in the past, the outputs at the present time and the times in the past, and costs or rewards that correspond to the inputs at the times in the past; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
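The two steps named in the abstract (estimate the coefficients of a quadratic value function from recorded inputs, outputs, and costs, then derive control-law coefficients from it) can be illustrated with a small sketch. This is not the patented procedure: it assumes a least-squares fit of the Bellman equation, a scalar linear plant with hypothetical parameters `a`, `b`, and a quadratic form over `z = [y_t, u_t]` that includes the present input. In the setting the abstract describes, `z` would instead stack present and past outputs with past inputs. All function names are illustrative.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) such that phi(z) @ theta == z^T P z for symmetric P."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice
    return scale * z[i] * z[j]

def theta_to_matrix(theta, n):
    """Rebuild the symmetric coefficient matrix P from its upper triangle."""
    P = np.zeros((n, n))
    i, j = np.triu_indices(n)
    P[i, j] = theta
    P[j, i] = theta
    return P

def estimate_value_coefficients(Z, Z_next, costs, gamma):
    """Step 1 (assumed method): solve z^T P z - gamma * z'^T P z' = cost
    for P in the least-squares sense."""
    Phi = np.array([quad_features(z) - gamma * quad_features(zn)
                    for z, zn in zip(Z, Z_next)])
    theta, *_ = np.linalg.lstsq(Phi, costs, rcond=None)
    return theta_to_matrix(theta, Z.shape[1])

def control_law_coefficients(P, n_u):
    """Step 2: gain K of u = -K s that minimizes z^T P z over the
    trailing n_u entries of z = [s, u]."""
    P_us = P[-n_u:, :-n_u]
    P_uu = P[-n_u:, -n_u:]
    return np.linalg.solve(P_uu, P_us)

def demo(a=0.9, b=0.5, k=0.2, q=1.0, r=0.1, gamma=0.95, steps=200, seed=0):
    """Hypothetical scalar plant y' = a*y + b*u with cost q*y^2 + r*u^2."""
    rng = np.random.default_rng(seed)
    y = 1.0
    Z, Z_next, costs = [], [], []
    for _ in range(steps):
        u = -k * y + rng.normal(scale=0.5)    # current law plus exploration
        y_next = a * y + b * u
        Z.append([y, u])
        Z_next.append([y_next, -k * y_next])  # next input under the current law
        costs.append(q * y**2 + r * u**2)
        y = y_next
    P = estimate_value_coefficients(np.array(Z), np.array(Z_next),
                                    np.array(costs), gamma)
    K = control_law_coefficients(P, n_u=1)
    return P, K

if __name__ == "__main__":
    P, K = demo()
    print("estimated value coefficients:\n", P)
    print("control-law gain:", K)
```

Calling `demo()` returns the estimated coefficient matrix and the gain of the improved law; inputs at later times would then be computed as `u = -K @ s`. Repeating the two steps with the new gain gives a policy-iteration-style loop, which is again an assumption of this sketch rather than something the abstract states.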

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.