Patent · US Active

Training actor-critic algorithms in laboratory settings

US12423571B2 · kind B2 · utility

Cited by	0

References	1

Claims	16

Family size	0

Assignee

SONY GROUP CORPORATION · JP

Inventors

Piyush Khandelwal · Austin, US
James MacGlashan · Luthers Corners, US
Peter R. Wurman · Acton, US

Key dates

Filing date	Aug 26, 2020
Grant date	Sep 23, 2025
Priority date	—
Expiry date	Jan 23, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N7/01
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Reinforcement learning methods can use actor-critic networks where (1) additional laboratory-only state information is used to train a policy that must act without this additional laboratory-only information in a production setting; and (2) complex, resource-demanding policies are distilled into a less-demanding policy that can be run more easily in production with limited computational resources. The production actor network can be optimized using a frozen version of a large critic network that was previously trained with a large actor network. Aspects of these methods can leverage actor-critic methods in which the critic network models the action value function, as opposed to the state value function.
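
The idea of optimizing a small production actor against a frozen action-value critic can be illustrated with a minimal sketch. This is not the patented method. It is a toy example under stated assumptions: the critic is a hand-written stand-in Q(s, a) = -(a - (2*obs + lab_only))^2 rather than a trained network, the actor is a two-parameter linear policy, and the names `frozen_critic_grad`, `distill_actor`, `obs`, and `lab_only` are hypothetical. The critic sees the full laboratory state, while the actor sees only `obs`, and only the actor's parameters are updated.

```python
import random


def frozen_critic_grad(obs, lab_only, action):
    """Return dQ/da for the stand-in critic Q(s, a) = -(a - (2*obs + lab_only))**2.

    Assumption: this closed-form function stands in for a large, previously
    trained critic. It receives the full laboratory state (obs plus lab_only)
    and has no trainable parameters here, so it is frozen by construction.
    """
    best_action = 2.0 * obs + lab_only
    return -2.0 * (action - best_action)


def distill_actor(steps=5000, lr=0.02, seed=0):
    """Fit a small linear actor a = w*obs + b by gradient ascent on the frozen Q.

    The actor receives only obs, the information assumed to be available in
    production. The lab-only variable reaches it only through the critic's
    gradient signal.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        obs = rng.uniform(-1.0, 1.0)        # production-visible observation
        lab_only = rng.uniform(-0.5, 0.5)   # laboratory-only state, zero mean
        action = w * obs + b                # actor acts without lab_only
        dq_da = frozen_critic_grad(obs, lab_only, action)
        # Chain rule: dQ/dw = dQ/da * da/dw and dQ/db = dQ/da * da/db.
        w += lr * dq_da * obs
        b += lr * dq_da
    return w, b


w, b = distill_actor()
print(f"w={w:.2f}, b={b:.2f}")
```

Because the critic scores actions directly, the actor can be improved from sampled states alone, without new environment rollouts. This is one reason an action-value critic is convenient here. In this toy setup the actor cannot observe `lab_only`, so it settles near the action that is best on average over that hidden variable.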

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.