Patent · US Active

System on chip parallel computing of ML services and applications for partitioning resources based on the inference times

US12399750B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 14
Family size: 0

Assignees

Inventors

Key dates

Filing date: Jun 10, 2022
Grant date: Aug 26, 2025
Priority date
Expiry date: Jun 10, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F9/5066
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system obtains a performance profile corresponding to the times taken to perform an inferencing by a machine learning (ML) model using different numbers of processing resources from a plurality of processing resources. The system determines one or more groupings of processing resources from the plurality of processing resources, where each grouping includes one or more partitions. The system calculates performance speeds corresponding to each grouping based on the performance profile. The system determines the grouping having the best performance speed among the calculated performance speeds. The system partitions the processing resources based on the determined grouping to perform the inferencing.
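
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made for illustration only: the performance profile is modeled as a mapping from a number of processing resources to a measured inference time, a grouping is modeled as a split of the total resource count into partition sizes, and "performance speed" is taken to be aggregate throughput (inferences per unit time, summed over partitions running in parallel). The function names and the sample timings are hypothetical.

```python
from typing import Dict, Iterator, Optional, Tuple

# Hypothetical profile: resources used -> inference time (ms). Values are made up.
Profile = Dict[int, float]


def groupings(total: int, max_part: Optional[int] = None) -> Iterator[Tuple[int, ...]]:
    """Yield every split of `total` resources into partition sizes (non-increasing)."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    for first in range(min(total, max_part), 0, -1):
        for rest in groupings(total - first, first):
            yield (first,) + rest


def performance_speed(grouping: Tuple[int, ...], profile: Profile) -> float:
    """Assumed metric: summed throughput of all partitions running in parallel."""
    return sum(1.0 / profile[size] for size in grouping)


def best_grouping(total: int, profile: Profile) -> Tuple[int, ...]:
    """Return the grouping with the highest assumed performance speed."""
    # Only consider groupings whose partition sizes were actually profiled.
    candidates = [g for g in groupings(total) if all(size in profile for size in g)]
    if not candidates:
        raise ValueError("no grouping can be scored with this profile")
    return max(candidates, key=lambda g: performance_speed(g, profile))


if __name__ == "__main__":
    sample_profile = {1: 10.0, 2: 4.0, 4: 3.0}  # illustrative timings only
    print(best_grouping(4, sample_profile))
```

With the sample timings above, four resources split as (2, 2) give the highest summed throughput under this assumed metric. A different definition of performance speed, such as worst-case latency, could select a different grouping.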

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.