Physical environment interaction with an equivariant policy
US12100198B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 8, 2020 |
| Grant date | Sep 24, 2024 |
| Priority date | — |
| Expiry date | Sep 7, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/047
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Some embodiments are directed to a computer-implemented method of interacting with a physical environment according to a policy. The policy determines multiple action probabilities of respective actions based on an observable state of the physical environment. The policy includes a neural network parameterized by a set of parameters. The neural network determines the action probabilities by determining a final layer input from an observable state and applying a final layer of the neural network to the final layer input. The final layer is applied by applying a linear combination of a set of equivariant base weight matrices to the final layer input. The base weight matrices are equivariant in the sense that, for a set of multiple predefined transformations of the final layer input, each transformation causes a corresponding predefined action permutation of the base weight matrix output for the final layer input.
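The final-layer construction described in the abstract can be illustrated with a small numerical sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: a two-element group (identity and a "flip"), a flip that reverses a 4-dimensional final layer input, two actions that the flip swaps, and base matrices obtained by averaging random matrices over the group. The names `symmetrize`, `final_layer`, `L_flip`, and `K_flip` are hypothetical.

```python
import numpy as np

def symmetrize(base, input_transforms, action_perms):
    """Average K_g^T @ base @ L_g over the group elements.

    The result W satisfies W @ L_g == K_g @ W for every g, provided the
    L_g and K_g are matching group representations (assumed here).
    """
    terms = [K.T @ base @ L for L, K in zip(input_transforms, action_perms)]
    return sum(terms) / len(terms)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def final_layer(x, coeffs, basis):
    """Apply a linear combination of base weight matrices, then softmax."""
    W = sum(c * B for c, B in zip(coeffs, basis))
    return softmax(W @ x)

rng = np.random.default_rng(0)
n_in, n_actions = 4, 2

# Assumed transformations: the flip reverses the input features and
# swaps the two actions. The identity element is listed first.
L_flip = np.eye(n_in)[::-1]
K_flip = np.eye(n_actions)[::-1]
input_transforms = [np.eye(n_in), L_flip]
action_perms = [np.eye(n_actions), K_flip]

# Three equivariant base weight matrices built from random matrices.
basis = [
    symmetrize(rng.normal(size=(n_actions, n_in)), input_transforms, action_perms)
    for _ in range(3)
]

# In a trained policy the coefficients would be the learned parameters;
# here they are random placeholders.
coeffs = rng.normal(size=len(basis))

x = rng.normal(size=n_in)  # stand-in for the final layer input
p = final_layer(x, coeffs, basis)
p_flipped = final_layer(L_flip @ x, coeffs, basis)

print("action probabilities:           ", p)
print("probabilities for flipped input:", p_flipped)
```

In this sketch, transforming the input with `L_flip` yields the original action probabilities permuted by `K_flip`, which is the equivariance property the abstract describes. Because only the coefficients of the linear combination are free, the property holds for any coefficient values, not just the ones drawn here.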
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.