Method, system, and computer program product for dynamically scheduling machine learning inference jobs with different quality of services on a shared infrastructure
US11562263B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 17, 2020 |
| Grant date | Jan 24, 2023 |
| Priority date | — |
| Expiry date | Apr 28, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2209/501
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, system, and computer program product for dynamically scheduling machine learning inference jobs receive or determine a plurality of performance profiles associated with a plurality of system resources, wherein each performance profile is associated with a machine learning model; receive a request for system resources for an inference job associated with the machine learning model; determine a system resource of the plurality of system resources for processing the inference job associated with the machine learning model, based on the plurality of performance profiles and a quality of service requirement associated with the inference job; assign the system resource to the inference job for processing the inference job; receive result data associated with processing of the inference job with the system resource; and update, based on the result data, a performance profile of the plurality of performance profiles associated with the system resource and the machine learning model.
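The loop described in the abstract (look up profiles, pick a resource that satisfies the quality of service requirement, assign it, then fold the observed result back into the profile) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `PerformanceProfile` and `Scheduler` classes, the use of mean latency in milliseconds as the only profiled metric, the running-mean update, and the selection policy (choose the slowest resource that still meets the latency bound, so faster resources stay free for stricter jobs). The abstract does not specify any of these details.

```python
from dataclasses import dataclass


@dataclass
class PerformanceProfile:
    """Latency estimate for one (resource, model) pair (assumed metric)."""
    mean_latency_ms: float
    samples: int = 1

    def update(self, observed_ms: float) -> None:
        # Incremental running mean over all observations so far.
        self.samples += 1
        self.mean_latency_ms += (observed_ms - self.mean_latency_ms) / self.samples


class Scheduler:
    """Hypothetical scheduler keyed by (resource name, model name)."""

    def __init__(self, profiles):
        self.profiles = profiles

    def assign(self, model, max_latency_ms):
        # Keep only resources whose profiled latency for this model
        # satisfies the job's QoS bound.
        candidates = [
            (profile.mean_latency_ms, resource)
            for (resource, profiled_model), profile in self.profiles.items()
            if profiled_model == model and profile.mean_latency_ms <= max_latency_ms
        ]
        if not candidates:
            return None  # no resource can meet the requirement
        # Assumed policy: pick the slowest resource that still qualifies.
        return max(candidates)[1]

    def report(self, resource, model, observed_ms):
        # Feed result data back into the matching performance profile.
        self.profiles[(resource, model)].update(observed_ms)


scheduler = Scheduler({
    ("gpu0", "resnet"): PerformanceProfile(10.0),
    ("cpu0", "resnet"): PerformanceProfile(80.0),
})
relaxed = scheduler.assign("resnet", max_latency_ms=100.0)  # "cpu0"
strict = scheduler.assign("resnet", max_latency_ms=50.0)    # "gpu0"
scheduler.report("cpu0", "resnet", observed_ms=120.0)       # cpu0 mean becomes 100.0
```

After the slow observation is reported, a job with a 90 ms bound would no longer be placed on `cpu0`, which is the sense in which the profile update makes the scheduling dynamic in this sketch.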
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.