Method for controlling air conditioning device based on delayed reward
US12188672B2 · kind B2 · utility
Assignees
Inventors
Key dates
| Filing date | Oct 23, 2023 |
| Grant date | Jan 7, 2025 |
| Priority date | — |
| Expiry date | Oct 23, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Thermal processes and apparatus
- WIPO sector: Mechanical engineering
Abstract
Disclosed is a method for controlling an air conditioning device, performed by at least one computing device. The method includes: determining a control action for the air conditioning device at a first time point by using a reinforcement learning agent; determining a reward for the control action at the first time point based on a reward delay time by using the reinforcement learning agent; and performing reinforcement learning related to the control of the air conditioning device based on the determined reward. The time point at which the reward delay time has elapsed from the first time point corresponds to a second time point, and the reward for the control action at the first time point is calculated while excluding situations that occur after the first time point and before the second time point.
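The delayed-reward idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class name `DelayedRewardBuffer`, the use of a single temperature reading as the state, the setpoint-distance reward, and the step-based delay are all assumptions made for illustration. The sketch holds each action until its reward delay has elapsed, then scores it using only the observation at the second time point, so readings in between do not affect the reward.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class Pending:
    step: int      # first time point, when the action was taken
    state: float   # temperature observed at the first time point (assumed state)
    action: float  # control action chosen at the first time point


class DelayedRewardBuffer:
    """Hypothetical helper: assigns each action a reward once its delay elapses."""

    def __init__(self, delay_steps: int, setpoint: float) -> None:
        if delay_steps < 1:
            raise ValueError("delay_steps must be at least 1")
        self.delay_steps = delay_steps
        self.setpoint = setpoint
        self.pending: Deque[Pending] = deque()

    def record(self, step: int, state: float, action: float) -> None:
        """Store an action whose reward is not yet known."""
        self.pending.append(Pending(step, state, action))

    def observe(self, step: int, state: float) -> List[Tuple[float, float, float, float]]:
        """Return (state, action, reward, later_state) for actions whose delay ends now.

        Assumes observe() is called once per step, so an action recorded at
        step t is released exactly at step t + delay_steps.
        """
        ready = []
        while self.pending and step - self.pending[0].step >= self.delay_steps:
            p = self.pending.popleft()
            # Assumed reward: negative distance from the setpoint, computed
            # only from the observation at the second time point.
            reward = -abs(state - self.setpoint)
            ready.append((p.state, p.action, reward, state))
        return ready


# Usage: the spike to 30.0 at step 2 falls between the first and second
# time points of the action taken at step 0, so it is not used in that reward.
buf = DelayedRewardBuffer(delay_steps=3, setpoint=22.0)
temps = [26.0, 25.0, 30.0, 23.0, 22.5]
out = []
for t, temp in enumerate(temps):
    out.extend(buf.observe(t, temp))
    buf.record(t, temp, action=-1.0)
```

The transitions collected in `out` could then be passed to whatever learning update the agent uses. That update is left out here because the abstract does not specify it.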
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.