Patent · US Active

Reinforcement learning using obfuscated environment models

US11144847B1 · kind B1 · utility

Cited by: 1
References: 3
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Apr 15, 2021
Grant date: Oct 12, 2021
Priority date
Expiry date: Apr 15, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/092
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection system used to select actions to be performed by an agent interacting with a target environment to perform a task in the target environment. In one aspect, a method comprises: obtaining a target environment model of the target environment; modifying the target environment model of the target environment to generate an obfuscated environment model of an obfuscated environment that represents an obfuscation of the target environment; obtaining, from each of a plurality of users, one or more obfuscated environment trajectories that represent interaction of the user with the obfuscated environment through the corresponding obfuscated environment simulation; mapping each of the obfuscated environment trajectories to a corresponding target environment trajectory; and training the action selection system on the target environment trajectories.
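
The steps in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: obfuscation is modeled as a secret relabeling (permutation) of discrete states and actions, a trajectory is a list of (state, action, reward) steps, and "training" is a simple count-based stand-in that picks the most frequent action per state. The function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Assumption: obfuscation is a secret permutation of state and action labels.
# The abstract does not specify how the target environment model is modified.

def make_obfuscation(num_states, num_actions, seed=0):
    """Return (state_map, action_map); state_map[s] is the obfuscated label of target state s."""
    rng = random.Random(seed)
    state_map = list(range(num_states))
    action_map = list(range(num_actions))
    rng.shuffle(state_map)
    rng.shuffle(action_map)
    return state_map, action_map

def invert(perm):
    """Invert a permutation given as a list (obfuscated label -> target label)."""
    inverse = [0] * len(perm)
    for target_label, obfuscated_label in enumerate(perm):
        inverse[obfuscated_label] = target_label
    return inverse

def map_to_target(obf_trajectory, state_map, action_map):
    """Map an obfuscated trajectory of (state, action, reward) steps back to target labels."""
    inv_states, inv_actions = invert(state_map), invert(action_map)
    return [(inv_states[s], inv_actions[a], r) for s, a, r in obf_trajectory]

def train_action_selection(target_trajectories):
    """Stand-in for training: choose the most frequent action observed in each state."""
    counts = defaultdict(Counter)
    for trajectory in target_trajectories:
        for s, a, _ in trajectory:
            counts[s][a] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

state_map, action_map = make_obfuscation(num_states=4, num_actions=2, seed=0)

# Simulated user behavior, written in target labels only so the demo can
# produce the obfuscated view a user would actually interact with.
target_steps = [(0, 1, 0.0), (1, 1, 0.0), (2, 0, 1.0)]
obfuscated_steps = [(state_map[s], action_map[a], r) for s, a, r in target_steps]

# Map the obfuscated trajectory back and train on the recovered target trajectory.
recovered = map_to_target(obfuscated_steps, state_map, action_map)
policy = train_action_selection([recovered])
```

In this sketch the users only ever see the relabeled states and actions, while the party holding `state_map` and `action_map` can recover target environment trajectories for training. A real obfuscation would likely be richer than a relabeling; the permutation is used only because it has an exact inverse that is easy to check.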

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.