Patent · US Active

Action selection by reinforcement learning and numerical optimization

US11551165B1 · kind B1 · utility

Cited by: 0
References: 2
Claims: 19
Family size: 0

Assignee

Inventor

Key dates

Filing date: Apr 5, 2022
Grant date: Jan 10, 2023
Priority date
Expiry date: Apr 5, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/092
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises, at each of one or more time steps: generating a respective action score for each action in a set of possible actions, wherein the set of possible actions comprises: (i) a plurality of atomistic actions, and (ii) one or more optimization actions, wherein each optimization action is associated with a respective objective function that measures performance of the agent on a corresponding auxiliary task; selecting an action from the set of possible actions in accordance with the action scores, wherein the selected action is an optimization action; in response to selecting the optimization action, performing a numerical optimization to identify a sequence of one or more atomistic actions that are predicted to optimize the objective function.
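
The control flow the abstract describes (score every action, select one, and expand a selected optimization action into a sequence of atomistic actions by numerical optimization) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the one-dimensional toy environment, the hand-written score table standing in for a learned scoring model, greedy selection, and finite-difference gradient descent as the numerical optimizer. The names `OptimizationAction`, `rollout`, `optimize_sequence`, `select_action`, and `step` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OptimizationAction:
    """Hypothetical optimization action tied to an auxiliary-task objective."""
    objective: Callable[[float], float]  # cost of a final state; lower is better
    horizon: int = 3                     # number of atomistic actions to plan


def rollout(state: float, moves: List[float]) -> float:
    # Assumed toy dynamics: each atomistic action adds its displacement.
    return state + sum(moves)


def optimize_sequence(opt: OptimizationAction, state: float,
                      steps: int = 200, lr: float = 0.1,
                      eps: float = 1e-4) -> List[float]:
    """Finite-difference gradient descent over a sequence of displacements."""
    moves = [0.0] * opt.horizon
    for _ in range(steps):
        base = opt.objective(rollout(state, moves))
        grad = []
        for i in range(len(moves)):
            bumped = moves.copy()
            bumped[i] += eps
            grad.append((opt.objective(rollout(state, bumped)) - base) / eps)
        # Gradient step, clipped to the assumed per-step range [-1, 1].
        moves = [max(-1.0, min(1.0, m - lr * g)) for m, g in zip(moves, grad)]
    return moves


def select_action(scores: Dict[str, float]) -> str:
    # Greedy selection (an assumption; sampling from the scores also fits).
    return max(scores, key=scores.get)


def step(state: float, scores: Dict[str, float],
         atomistic: Dict[str, float],
         optimization: Dict[str, OptimizationAction]) -> List[float]:
    """Return the atomistic actions to execute at this time step."""
    chosen = select_action(scores)
    if chosen in optimization:
        return optimize_sequence(optimization[chosen], state)
    return [atomistic[chosen]]


# Illustrative action set: two atomistic moves and one optimization action.
atomistic = {"left": -1.0, "right": 1.0}
optimization = {
    "reach_target": OptimizationAction(objective=lambda x: (x - 2.0) ** 2),
}
# Stand-in for the scores a learned model would produce.
scores = {"left": 0.1, "right": 0.3, "reach_target": 0.9}

plan = step(0.0, scores, atomistic, optimization)
print(plan, rollout(0.0, plan))
```

Here the optimization action has the highest score, so `step` returns a three-move plan whose displacements sum to roughly the target position. If an atomistic action had scored highest, `step` would return that single action unchanged.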

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.