Patent · US Active

Method, system, and computer program product for dynamically scheduling machine learning inference jobs with different quality of services on a shared infrastructure

US11562263B2 · kind B2 · utility

Cited by: 1
References: 5
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 17, 2020
Grant date: Jan 24, 2023
Priority date
Expiry date: Apr 28, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2209/501
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method, system, and computer program product for dynamically scheduling machine learning inference jobs receive or determine a plurality of performance profiles associated with a plurality of system resources, wherein each performance profile is associated with a machine learning model; receive a request for system resources for an inference job associated with the machine learning model; determine a system resource of the plurality of system resources for processing the inference job based on the plurality of performance profiles and a quality of service requirement associated with the inference job; assign the system resource to the inference job for processing the inference job; receive result data associated with processing of the inference job with the system resource; and update, based on the result data, a performance profile of the plurality of performance profiles associated with the system resource and the machine learning model.
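
The loop described in the abstract (profile lookup, QoS-based assignment, profile update from result data) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say what a performance profile contains, how a resource is chosen, or how the update is computed, so everything concrete here is an assumption made for illustration: the names `PerformanceProfile`, `Scheduler`, `assign`, and `record_result`, the use of latency as the only QoS metric, the "slowest resource that still meets the bound" selection policy, and the exponential moving average update.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class PerformanceProfile:
    """Hypothetical profile: expected latency of one model on one resource."""
    expected_latency_ms: float
    samples: int = 0


@dataclass
class Scheduler:
    """Illustrative scheduler; all policies here are assumptions."""
    # Maps (resource name, model name) to a profile.
    profiles: Dict[Tuple[str, str], PerformanceProfile] = field(default_factory=dict)
    # Assumed weight given to the newest measurement when updating a profile.
    smoothing: float = 0.5

    def assign(self, model: str, max_latency_ms: float) -> Optional[str]:
        """Return a resource whose profile meets the latency bound, or None."""
        candidates = [
            (profile.expected_latency_ms, resource)
            for (resource, profiled_model), profile in self.profiles.items()
            if profiled_model == model
            and profile.expected_latency_ms <= max_latency_ms
        ]
        if not candidates:
            return None
        # Assumed policy: take the slowest resource that still satisfies the
        # bound, leaving faster resources free for stricter jobs.
        return max(candidates)[1]

    def record_result(self, resource: str, model: str,
                      observed_latency_ms: float) -> None:
        """Fold an observed latency into the matching profile (moving average)."""
        profile = self.profiles[(resource, model)]
        a = self.smoothing
        profile.expected_latency_ms = (
            (1 - a) * profile.expected_latency_ms + a * observed_latency_ms
        )
        profile.samples += 1


# Usage with made-up resource and model names.
scheduler = Scheduler(profiles={
    ("gpu-0", "fraud-model"): PerformanceProfile(expected_latency_ms=10.0),
    ("cpu-0", "fraud-model"): PerformanceProfile(expected_latency_ms=40.0),
})
chosen = scheduler.assign("fraud-model", max_latency_ms=50.0)   # "cpu-0"
scheduler.record_result("cpu-0", "fraud-model", observed_latency_ms=80.0)
rechosen = scheduler.assign("fraud-model", max_latency_ms=50.0)  # "gpu-0"
```

In this sketch the second request lands on a different resource because the observed result pushed the `cpu-0` estimate to 60 ms, above the 50 ms bound. That feedback from result data into later assignments is the part that corresponds to the abstract's final step.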

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.