Neural episodic control
US11720796B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 23, 2020 |
| Grant date | Aug 8, 2023 |
| Priority date | — |
| Expiry date | Apr 23, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N5/01
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method includes maintaining respective episodic memory data for each of multiple actions; receiving a current observation characterizing a current state of an environment being interacted with by an agent; processing the current observation using an embedding neural network in accordance with current values of parameters of the embedding neural network to generate a current key embedding for the current observation; for each action of the plurality of actions: determining the p nearest key embeddings in the episodic memory data for the action to the current key embedding according to a distance measure, and determining a Q value for the action from the return estimates mapped to by the p nearest key embeddings in the episodic memory data for the action; and selecting, using the Q values for the actions, an action from the multiple actions as the action to be performed by the agent.
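The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation. The names `embed`, `squared_distance`, `q_value`, and `select_action` are hypothetical, and so are the parameters `weights` and `delta`. A single tanh layer stands in for the embedding neural network, squared Euclidean distance stands in for the distance measure, the Q value is taken as an inverse-distance-weighted average of the p nearest return estimates, and the action is chosen greedily. The abstract does not fix any of these choices.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Episodic memory for one action: (key embedding, return estimate) pairs.
Memory = List[Tuple[Vector, float]]


def embed(observation: Vector, weights: Sequence[Vector]) -> List[float]:
    """Stand-in for the embedding neural network: one tanh layer (assumption)."""
    return [math.tanh(sum(w * x for w, x in zip(row, observation)))
            for row in weights]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, one possible distance measure (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def q_value(memory: Memory, query: Vector, p: int, delta: float = 1e-3) -> float:
    """Q value from the return estimates of the p stored keys nearest to query.

    The inverse-distance weighting is an assumption; the abstract only says
    the Q value is determined from the mapped return estimates.
    """
    nearest = sorted(memory, key=lambda entry: squared_distance(entry[0], query))[:p]
    if not nearest:
        return 0.0  # empty memory: fall back to a neutral estimate (assumption)
    kernel = [1.0 / (squared_distance(k, query) + delta) for k, _ in nearest]
    total = sum(kernel)
    return sum(w * ret for w, (_, ret) in zip(kernel, nearest)) / total


def select_action(memories: Dict[str, Memory], observation: Vector,
                  weights: Sequence[Vector], p: int) -> str:
    """Embed the observation, score every action, return the highest-Q action."""
    query = embed(observation, weights)
    q_values = {action: q_value(mem, query, p) for action, mem in memories.items()}
    # Greedy choice is an assumption; the abstract only requires that the
    # selection use the Q values.
    return max(q_values, key=lambda action: q_values[action])
```

Calling `select_action(memories, observation, weights, p)` with one memory per action returns the label of the action whose nearest stored keys carry the highest weighted return. An exploratory policy such as epsilon-greedy could replace the final `max` without changing the lookup.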
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.