Patent · US Active

Reinforcement learning for concurrent actions

US11580378B2 · kind B2 · utility

Cited by	1

References	0

Claims	20

Family size	0

Assignee

Electronic Arts Inc. · US

Inventors

Jack Harmer · Stockholm, SE
Linus Gisslén · Stockholm, SE
Magnus Nordin · Gothenburg, SE
Jorge del Val Santos · Stockholm, SE

Key dates

Filing date	Nov 12, 2018
Grant date	Feb 14, 2023
Priority date	—
Expiry date	Jun 2, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G06N20/10
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A computer-implemented method comprises instantiating a policy function approximator. The policy function approximator is configured to calculate a plurality of estimated action probabilities in dependence on a given state of the environment. Each of the plurality of estimated action probabilities corresponds to a respective one of a plurality of discrete actions performable by the reinforcement learning agent within the environment. An initial plurality of estimated action probabilities in dependence on a first state of the environment are calculated. Two or more of the plurality of discrete actions are concurrently performed within the environment when the environment is in the first state. In response to the concurrent performance, a reward value is received. In response to the received reward value being greater than a baseline reward value, the policy function approximator is updated, such that it is configured to calculate an updated plurality of estimated action probabilities.
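
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name `LinearBernoulliPolicy`, the linear-plus-sigmoid model, the learning rate, and the gradient-style update rule are all assumptions made for illustration. The abstract only states that the approximator yields one estimated probability per discrete action, that two or more actions are performed concurrently, and that an update occurs when the reward exceeds a baseline. The direction of the update shown here (raising the probabilities of performed actions and lowering the others) is likewise an assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LinearBernoulliPolicy:
    """Hypothetical policy function approximator (illustrative only).

    Produces one independent probability per discrete action, so any
    subset of actions can be selected for concurrent performance.
    """

    def __init__(self, n_features, n_actions):
        # One weight row per action; zero weights give probability 0.5.
        self.weights = [[0.0] * n_features for _ in range(n_actions)]

    def action_probabilities(self, state):
        # Estimated action probabilities for the given state.
        return [
            sigmoid(sum(w * s for w, s in zip(row, state)))
            for row in self.weights
        ]

    def update(self, state, performed, reward, baseline, learning_rate=0.1):
        # Update only when the reward exceeds the baseline reward value.
        advantage = reward - baseline
        if advantage <= 0:
            return False
        probs = self.action_probabilities(state)
        for k, row in enumerate(self.weights):
            # Assumed rule: gradient of the Bernoulli log-likelihood with
            # respect to the logit, which is (performed - probability).
            grad_logit = performed[k] - probs[k]
            for j in range(len(row)):
                row[j] += learning_rate * advantage * grad_logit * state[j]
        return True


policy = LinearBernoulliPolicy(n_features=2, n_actions=3)
first_state = [1.0, 0.5]

# Initial plurality of estimated action probabilities for the first state.
initial_probs = policy.action_probabilities(first_state)

# Actions 0 and 1 are performed concurrently; action 2 is not.
performed = [1, 1, 0]

# Reward (1.0) exceeds the baseline (0.2), so the approximator is updated.
was_updated = policy.update(first_state, performed, reward=1.0, baseline=0.2)
updated_probs = policy.action_probabilities(first_state)
```

Under these assumptions, `updated_probs` is higher than `initial_probs` for the two performed actions and lower for the unperformed one. A call to `update` with a reward at or below the baseline leaves the weights unchanged.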

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.