Patent · US Active

Enhanced reinforcement learning algorithms using future state prediction scaled reward values

US12019712B2 · kind B2 · utility

Cited by: 0
References: 6
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 2, 2021
Grant date: Jun 25, 2024
Priority date:
Expiry date: Jun 30, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N20/00
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In various embodiments, the present disclosure relates to systems and methods for enhanced reinforcement learning (RL) algorithms using future state prediction. In some embodiments, an offline emulator can be applied to generate samples, supporting continuous training of the system and fast-forward fabric saturation. The fabric accepts transactions, which allocate resources according to the transactions' needs and constraints, and contains one or more RL/AI models that learn continuously based on the current reward combined with reward scaling. By modeling the fabric and transactions in an emulator, it is possible to predict future states and calculate adjusted rewards with respect to the optimal criterion. A state generator is based on modeling past historical transactions, allowing a user to anticipate future state characteristics of the fabric. In some embodiments, online learning is based on adjusted rewards, which are more representative with respect to the objective function.
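
The abstract does not give a formula for the adjusted reward, so the following is only a minimal sketch of the general idea, not the patented method. It assumes a toy fabric with a single integer resource capacity, a state generator that resamples past transactions, and a scaling factor equal to the average predicted future utilization of the fabric. All names (`Transaction`, `FabricEmulator`, `generate_future`, `adjusted_reward`) and parameters (`horizon`, `rollouts`, `seed`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Transaction:
    demand: int  # resource units requested (hypothetical single-resource model)


class FabricEmulator:
    """Toy offline model of a fabric with a fixed resource capacity (assumption)."""

    def __init__(self, capacity, used=0):
        self.capacity = capacity
        self.used = used

    def copy(self):
        return FabricEmulator(self.capacity, self.used)

    def allocate(self, tx):
        """Admit tx if it fits; return the immediate reward (units served)."""
        if self.used + tx.demand <= self.capacity:
            self.used += tx.demand
            return float(tx.demand)
        return 0.0


def generate_future(history, horizon, rng):
    """State generator: resample past transactions to anticipate future ones."""
    return [rng.choice(history) for _ in range(horizon)]


def adjusted_reward(fabric, tx, history, horizon=5, rollouts=20, seed=0):
    """Scale the immediate reward by the predicted future utilization.

    The scaling rule is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    after = fabric.copy()           # leave the caller's fabric untouched
    reward = after.allocate(tx)     # current (unscaled) reward
    total = 0.0
    for _ in range(rollouts):
        sim = after.copy()          # fast-forward a copy of the fabric
        for future_tx in generate_future(history, horizon, rng):
            sim.allocate(future_tx)
        total += sim.used / sim.capacity
    scale = total / rollouts        # average predicted utilization, in [0, 1]
    return reward * scale


history = [Transaction(2), Transaction(3)]
fabric = FabricEmulator(capacity=10)
fits = adjusted_reward(fabric, Transaction(4), history)
too_big = adjusted_reward(fabric, Transaction(11), history)
```

In this sketch an action that leaves the fabric able to absorb more of the anticipated traffic keeps more of its immediate reward, while a rejected transaction earns nothing. An online learner would then train on the adjusted value instead of the raw one.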

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.