Enhanced reinforcement learning algorithms using future state prediction scaled reward values
US12019712B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Dec 2, 2021 |
| Grant date | Jun 25, 2024 |
| Priority date | — |
| Expiry date | Jun 30, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In various embodiments, the present disclosure relates to systems and methods for enhanced reinforcement learning (RL) algorithms using future state prediction. In some embodiments, an offline emulator can be applied to generate samples, supporting continuous training of the system and fast-forward fabric saturation. The fabric accepts transactions, which allocate resources according to each transaction's needs and constraints, and contains one or more RL/AI models that learn continuously from the current reward combined with reward scaling. By modeling the fabric and transactions in an emulator, it is possible to predict future states and calculate adjusted rewards with respect to the optimal criterion. A state generator models past transactions, allowing a user to anticipate future state characteristics of the fabric. In some embodiments, online learning is based on adjusted rewards, which are more representative of the objective function.
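The abstract does not state how the reward is scaled, so the following is only a minimal sketch of the general idea under stated assumptions: a fabric reduced to a single capacity counter, a state generator that resamples past transactions uniformly, a greedy offline emulator, and a scale factor equal to the ratio of predicted total value with and without the candidate transaction. All names (`Transaction`, `StateGenerator`, `emulate`, `scaled_reward`) and parameters are hypothetical and are not taken from the patent.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    demand: int    # resource units requested (hypothetical field)
    value: float   # immediate reward if admitted (hypothetical field)


class StateGenerator:
    """Samples plausible future transactions from past ones.

    Assumption: uniform resampling of the history stands in for
    whatever model of historical transactions is actually used.
    """

    def __init__(self, history, seed=0):
        self.history = list(history)
        self.rng = random.Random(seed)

    def sample(self, n):
        return [self.rng.choice(self.history) for _ in range(n)]


def emulate(free_capacity, future):
    """Toy offline emulator: admit each future transaction that fits
    and return the total value collected."""
    total = 0.0
    for tx in future:
        if tx.demand <= free_capacity:
            free_capacity -= tx.demand
            total += tx.value
    return total


def scaled_reward(free_capacity, tx, generator, horizon=20, rollouts=50):
    """Immediate reward for admitting tx, scaled by predicted future states.

    Assumption: the scale is the ratio of predicted total value when tx
    is admitted to predicted total value when it is rejected.
    """
    if tx.demand > free_capacity:
        return 0.0  # tx cannot be admitted, so it earns nothing
    with_tx = without_tx = 0.0
    for _ in range(rollouts):
        future = generator.sample(horizon)
        with_tx += tx.value + emulate(free_capacity - tx.demand, future)
        without_tx += emulate(free_capacity, future)
    scale = with_tx / without_tx if without_tx > 0 else 1.0
    return tx.value * scale


# Usage: a low-value transaction that crowds out a better one is penalized.
gen = StateGenerator([Transaction(demand=5, value=1.0)])
print(scaled_reward(10, Transaction(demand=5, value=0.5), gen))  # 0.375
```

In this toy setup the candidate earns 0.5 immediately but displaces a future transaction worth 1.0, so its reward is scaled down from 0.5 to 0.375. An online learner trained on the scaled value would see a signal closer to the long-run objective than the raw immediate reward.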
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.