Autoscaling GPU applications in Kubernetes based on GPU utilization
US12210909B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Event | Date |
| --- | --- |
| Filing date | Sep 29, 2021 |
| Grant date | Jan 28, 2025 |
| Priority date | — |
| Expiry date | Feb 23, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T1/20
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods, systems, and computer-readable storage media for executing, within the container orchestration system, the application using one or more resource units, each resource unit including an application container and an ancillary container, the ancillary container executing a set of GPU metric exporters, receiving, from the application and for each resource unit, a GPU metric including a GPU utilization associated with a respective resource unit, determining, for each resource unit, a custom GPU metric based on a respective GPU metric, the custom GPU metric aggregating values of the respective GPU metric over a time period, determining, by an autoscaler, an average GPU metric based on one or more custom GPU metrics, and selectively scaling, by the autoscaler, the application within the container orchestration system based on the average GPU metric by adjusting a number of resource units executing the application.
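The scaling flow in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not say how values are aggregated over the time period or how the average maps to a replica count. The sketch assumes a simple mean over each resource unit's sample window and a proportional rule in the style of the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × average / target)). The names `custom_gpu_metric`, `desired_replicas`, `target_utilization`, and `tolerance` are hypothetical.

```python
import math
from statistics import mean


def custom_gpu_metric(samples):
    """Aggregate one resource unit's GPU utilization samples (percent) over a
    time window. Assumption: a plain mean; the abstract only says 'aggregating'."""
    if not samples:
        raise ValueError("no GPU utilization samples in the window")
    return mean(samples)


def desired_replicas(per_unit_samples, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the number of resource units to run.

    per_unit_samples maps a resource-unit name (e.g. a pod) to its recent GPU
    utilization samples. The proportional rule and the tolerance band are
    assumptions borrowed from HPA-style autoscaling, not from the patent text.
    """
    current = len(per_unit_samples)
    if current == 0:
        raise ValueError("at least one resource unit is required")

    # One custom metric per resource unit, then the average across all units.
    custom_metrics = [custom_gpu_metric(s) for s in per_unit_samples.values()]
    average = mean(custom_metrics)

    ratio = average / target_utilization
    # "Selectively" scale: leave the replica count alone near the target.
    if abs(ratio - 1.0) <= tolerance:
        return current

    desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    samples = {"pod-a": [90, 80, 100], "pod-b": [70, 90, 80]}
    print(desired_replicas(samples, target_utilization=50))
```

With two resource units averaging 90% and 80% GPU utilization against a 50% target, the sketch asks for four units. A lightly loaded deployment is scaled down by the same rule, bounded by `min_replicas`.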
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.