System on chip parallel computing of ML services and applications for partitioning resources based on the inference times
US12399750B2 · kind B2 · utility
Assignees
Inventors
Key dates
| Filing date | Jun 10, 2022 |
| Grant date | Aug 26, 2025 |
| Priority date | — |
| Expiry date | Jun 10, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F9/5066
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system obtains a performance profile corresponding to the times taken by a machine learning (ML) model to perform inferencing using different numbers of processing resources from a plurality of processing resources. The system determines one or more groupings of processing resources from the plurality of processing resources, where each grouping includes one or more partitions. The system calculates a performance speed for each grouping based on the performance profile. The system determines the grouping having the best performance speed among the calculated performance speeds. The system partitions the processing resources based on the determined grouping to perform the inferencing.
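The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the performance profile is a mapping from a number of processing resources to a measured inference time, that a grouping is any split of the resources into partition sizes, and that the performance speed of a grouping is the combined throughput of its partitions (inferences per unit time). The abstract does not define that metric, so it is an assumption here. The names `integer_partitions`, `grouping_speed`, and `best_grouping`, and the sample timings, are hypothetical.

```python
from typing import Dict, Iterator, Optional, Tuple

Grouping = Tuple[int, ...]  # partition sizes, e.g. (2, 2) = two partitions of 2 resources


def integer_partitions(total: int, max_part: Optional[int] = None) -> Iterator[Grouping]:
    """Yield every split of `total` resources into non-increasing partition sizes."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    for size in range(min(total, max_part), 0, -1):
        for rest in integer_partitions(total - size, size):
            yield (size,) + rest


def grouping_speed(grouping: Grouping, profile: Dict[int, float]) -> float:
    """Assumed metric: summed throughput of all partitions running in parallel."""
    return sum(1.0 / profile[size] for size in grouping)


def best_grouping(profile: Dict[int, float], total: int) -> Grouping:
    """Return the grouping with the highest assumed performance speed."""
    # Only consider groupings whose partition sizes were actually profiled.
    candidates = [
        g for g in integer_partitions(total) if all(size in profile for size in g)
    ]
    return max(candidates, key=lambda g: grouping_speed(g, profile))


# Hypothetical profile: inference time (ms) when the model runs on n resources.
profile = {1: 10.0, 2: 4.0, 3: 3.5, 4: 3.0}
chosen = best_grouping(profile, total=4)
print(chosen, grouping_speed(chosen, profile))
```

With these made-up timings, two partitions of two resources each give the highest combined throughput, so the resources would be split that way. A different metric, such as the lowest latency for a single request, could select a different grouping.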
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.