Patent · US Active

Training actor-critic algorithms in laboratory settings

US12423571B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 16
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 26, 2020
Grant date: Sep 23, 2025
Priority date
Expiry date: Jan 23, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N7/01
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Reinforcement learning methods can use actor-critic networks where (1) additional laboratory-only state information is used to train a policy that must act without this additional laboratory-only information in a production setting; and (2) complex resource-demanding policies are distilled into a less-demanding policy that can be more easily run in production with limited computational resources. The production actor network can be optimized using a frozen version of a large critic network, previously trained with a large actor network. Aspects of these methods can leverage actor-critic methods in which the critic network models the action value function, as opposed to the state value function.
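
The idea in the abstract can be illustrated with a toy sketch; it is not the patented implementation. It assumes a hypothetical one-dimensional problem in which the frozen critic is a hand-written action-value function Q(state, action) that sees both the production observation and a laboratory-only signal, while the production actor is a single-parameter policy that sees only the observation. All names (`frozen_critic_q`, `frozen_critic_dq_da`, `train_production_actor`, `W_OBS`, `W_EXTRA`) are illustrative assumptions.

```python
import random

# Hypothetical frozen critic parameters. In the setting the abstract describes,
# these would come from a large critic trained earlier alongside a large actor;
# here they are fixed constants and are never updated.
W_OBS, W_EXTRA = 2.0, 0.5


def frozen_critic_q(obs, lab_extra, action):
    """Action-value Q(state, action); the state includes lab-only information."""
    best_action = W_OBS * obs + W_EXTRA * lab_extra
    return -(action - best_action) ** 2


def frozen_critic_dq_da(obs, lab_extra, action):
    """Gradient of Q with respect to the action (critic weights stay fixed)."""
    best_action = W_OBS * obs + W_EXTRA * lab_extra
    return -2.0 * (action - best_action)


def train_production_actor(steps=5000, lr=0.01, seed=0):
    """Fit a small actor, action = theta * obs, by ascending the frozen critic."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        obs = rng.uniform(-1.0, 1.0)
        # Available only in the laboratory; the actor never receives it.
        lab_extra = rng.uniform(-1.0, 1.0)
        action = theta * obs
        # Chain rule: dQ/dtheta = dQ/da * da/dtheta, with da/dtheta = obs.
        grad_theta = frozen_critic_dq_da(obs, lab_extra, action) * obs
        theta += lr * grad_theta  # only the actor parameter changes
    return theta


if __name__ == "__main__":
    theta = train_production_actor()
    print("learned production actor weight:", theta)
    print("Q at one sample state:", frozen_critic_q(0.5, 0.0, theta * 0.5))
```

Because the critic models the action value rather than the state value, it can be differentiated with respect to the action, which is what lets the small actor be improved directly. In this toy setup the lab-only signal is drawn independently of the observation, so the learned weight should settle near `W_OBS`.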

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.