Patent · US Active

System on chip parallel computing of ML services and applications for partitioning resources based on the inference times

US12399750B2 · kind B2 · utility

0 Cited by

2 References

14 Claims

0 Family size

Assignees

BAIDU USA LLC · US
BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD. · CN

Inventors

Haofeng Kou · San Ramon, US
Davy Huang · Sunnyvale, US
Manjiang Zhang · Sunnyvale, US
Xing Li · Webster, US
Lei Wang · Hangzhou City, CN
Huimeng ZHENG · Beijing, CN
Zhen Chen · Beijing, CN
RUICHANG CHENG · Beijing, CN

Key dates

Filing date	Jun 10, 2022
Grant date	Aug 26, 2025
Priority date	—
Expiry date	Jun 10, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06F9/5066
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A system obtains a performance profile corresponding to times taken to perform an inferencing by a machine learning (ML) model using a different number of processing resources from a plurality of processing resources. The system determines one or more groupings of processing resources from the plurality of processing resources, where each grouping includes one or more partitions. The system calculates performance speeds corresponding to each grouping based on the performance profile. The system determines a grouping having a best performance speed from the calculated performance speeds. The system partitions the processing resources based on the determined grouping to perform the inferencing.
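
The steps in the abstract (profile, enumerate groupings, score each grouping, pick the best) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names, the choice to enumerate groupings as integer partitions of the resource count, the definition of "performance speed" as summed throughput (inferences per second across partitions), and the made-up profile values.

```python
from typing import Dict, Iterator, Optional, Tuple


def integer_partitions(n: int, max_part: Optional[int] = None) -> Iterator[Tuple[int, ...]]:
    """Yield every split of n resources into partitions, sizes in non-increasing order."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for size in range(min(n, max_part), 0, -1):
        for rest in integer_partitions(n - size, size):
            yield (size,) + rest


def grouping_speed(grouping: Tuple[int, ...], profile: Dict[int, float]) -> float:
    """Assumed metric: total inferences per second when each partition runs one model copy.

    profile[k] is the measured time (seconds) for one inference on k resources.
    """
    return sum(1.0 / profile[size] for size in grouping)


def best_grouping(total: int, profile: Dict[int, float]) -> Tuple[int, ...]:
    """Return the grouping with the highest assumed performance speed."""
    return max(integer_partitions(total), key=lambda g: grouping_speed(g, profile))


# Hypothetical profile for 4 resources: seconds per inference on 1, 2, 3, 4 resources.
profile = {1: 1.0, 2: 0.4, 3: 0.35, 4: 0.3}
print(best_grouping(4, profile))  # (2, 2)
```

With these illustrative numbers, two partitions of two resources each give the highest summed throughput (5 inferences per second), ahead of one partition of four or four partitions of one. A different metric, such as lowest latency for a single request, would generally select a different grouping.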

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.