Patent · US Active

Action selection by reinforcement learning and numerical optimization

US11551165B1 · kind B1 · utility

0 Cited by

2 References

19 Claims

0 Family size

Assignee

Latent Strategies LLC · US

Inventor

John Reynders · Newton, US

Key dates

Filing date	Apr 5, 2022
Grant date	Jan 10, 2023
Priority date	—
Expiry date	Apr 5, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/092
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises, at each of one or more time steps: generating a respective action score for each action in a set of possible actions, wherein the set of possible actions comprises: (i) a plurality of atomistic actions, and (ii) one or more optimization actions, wherein each optimization action is associated with a respective objective function that measures performance of the agent on a corresponding auxiliary task; selecting an action from the set of possible actions in accordance with the action scores, wherein the selected action is an optimization action; in response to selecting the optimization action, performing a numerical optimization to identify a sequence of one or more atomistic actions that are predicted to optimize the objective function.
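The selection loop described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the one-dimensional environment, the action names, the fixed scores standing in for a learned scoring model, the greedy highest-score selection rule, and the brute-force search standing in for the numerical optimization. The sketch only shows how an optimization action, once selected, expands into a sequence of atomistic actions.

```python
import itertools

# Hypothetical atomistic actions for a toy 1-D environment.
ATOMISTIC = ["left", "right", "stay"]


def step(state, action):
    """Toy transition: the state is an integer position on a line."""
    return state + {"left": -1, "right": 1, "stay": 0}[action]


def reach_goal_objective(state, plan, goal=3):
    """Hypothetical auxiliary-task objective: negative distance to the
    goal after executing the plan (higher is better)."""
    for action in plan:
        state = step(state, action)
    return -abs(goal - state)


# Each optimization action is associated with its own objective function.
OPTIMIZATION = {"opt_reach_goal": reach_goal_objective}


def action_scores(state):
    """Stand-in for a learned scoring model (assumption): fixed scores
    over atomistic and optimization actions alike."""
    scores = {action: 0.0 for action in ATOMISTIC}
    scores["opt_reach_goal"] = 1.0
    return scores


def optimize(objective, state, horizon=3):
    """Stand-in for the numerical optimization (assumption): exhaustive
    search over all atomistic sequences of a fixed length."""
    candidates = itertools.product(ATOMISTIC, repeat=horizon)
    return max(candidates, key=lambda plan: objective(state, plan))


def select_action(state):
    """Pick the highest-scoring action; if it is an optimization action,
    expand it into a sequence of atomistic actions."""
    scores = action_scores(state)
    chosen = max(scores, key=scores.get)
    if chosen in OPTIMIZATION:
        return chosen, list(optimize(OPTIMIZATION[chosen], state))
    return chosen, [chosen]


chosen, plan = select_action(0)
print(chosen, plan)
```

In a realistic setting the scores would come from a trained reinforcement learning model and the search would be replaced by a proper numerical optimizer; the exhaustive search is used here only because the toy action space is tiny.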

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.