Patent · US Active

Configuring a system which interacts with an environment

US11402808B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 10, 2020
Grant date: Aug 2, 2022
Priority date
Expiry date: Feb 26, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G05B2219/39289
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system is described for configuring another system, e.g., a robotics system. The other system interacts with an environment according to a deterministic policy by repeatedly obtaining, from a sensor, sensor data indicative of a state of the environment, determining a current action, and providing, to an actuator, actuator data causing the actuator to effect the current action in the environment. To configure the other system, the system optimizes a loss function based on an accumulated reward distribution with respect to a set of parameters of the policy. The accumulated reward distribution includes an action probability of an action of a previous interaction log being performed according to the current set of parameters. The action probability is approximated using a probability distribution defined by an action selected by the deterministic policy according to the current set of parameters.
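
The approximation described in the abstract can be illustrated with a minimal sketch. It is not the patented method. The linear policy, the isotropic Gaussian centered at the policy's action, the width `sigma`, the log layout, and the importance-weighted return are all assumptions made for illustration; the abstract only states that the action probability is approximated by a probability distribution defined by the action the deterministic policy selects.

```python
import numpy as np

def deterministic_policy(theta, state):
    # Hypothetical linear deterministic policy: action = theta @ state.
    return theta @ state

def approx_action_prob(theta, state, action, sigma=0.5):
    # Assumed form: isotropic Gaussian density centered at the action that
    # the deterministic policy selects under the current parameters theta.
    diff = action - deterministic_policy(theta, state)
    dim = diff.shape[0]
    norm = (2.0 * np.pi * sigma ** 2) ** (-dim / 2.0)
    return norm * np.exp(-0.5 * np.dot(diff, diff) / sigma ** 2)

def importance_weighted_return(theta, log, sigma=0.5):
    # log: list of (state, action, behavior_prob, reward) tuples taken from a
    # previous interaction log (hypothetical layout). The logged return is
    # reweighted by how likely the logged actions are under theta.
    weight, total = 1.0, 0.0
    for state, action, behavior_prob, reward in log:
        weight *= approx_action_prob(theta, state, action, sigma) / behavior_prob
        total += reward
    return weight * total

theta = np.eye(2)
state = np.array([1.0, 0.0])
p_near = approx_action_prob(theta, state, np.array([1.0, 0.0]))
p_far = approx_action_prob(theta, state, np.array([3.0, 0.0]))
log = [(state, np.array([1.0, 0.0]), 0.5, 1.0),
       (state, np.array([1.0, 0.0]), 0.5, 2.0)]
estimate = importance_weighted_return(theta, log)
```

In this sketch, a logged action close to what the current policy would select receives a higher approximate probability than a distant one, so the estimate stays differentiable in `theta` even though the policy itself is deterministic.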

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.