Patent · US Active

Method and system for performing multi-device based inference for large language model

US12242564B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 12
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 11, 2024
Grant date: Mar 4, 2025
Priority date
Expiry date: Jun 11, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F17/16
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Provided are a method and a system for performing multi-device-based inference for a large language model. The system may include a plurality of devices mapped to the partitions into which a large language model (LLM) is divided according to an intra-layer parallelism method. Each of the plurality of devices may be implemented to synchronize data by sharing a sub-result of a matrix multiplication on the data with another device of the plurality of devices while that matrix multiplication is still being performed.
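
The idea in the abstract can be illustrated with a small simulation. The sketch below is not taken from the patent claims. It assumes one common form of intra-layer parallelism, in which the inner dimension of a matrix product X @ W is split across devices, and it assumes that a "sub-result" is one tile of output columns. The names split_inner, parallel_matmul, and tile are hypothetical, and the devices are simulated sequentially in a single process.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain reference matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_inner(x: Matrix, w: Matrix, num_devices: int) -> List[Tuple[Matrix, Matrix]]:
    """Assumed partitioning: split the inner dimension K of X (M x K) @ W (K x N)."""
    k = len(w)
    bounds = [k * d // num_devices for d in range(num_devices + 1)]
    shards = []
    for lo, hi in zip(bounds, bounds[1:]):
        x_part = [row[lo:hi] for row in x]  # this device's columns of X
        w_part = w[lo:hi]                   # this device's rows of W
        shards.append((x_part, w_part))
    return shards


def parallel_matmul(x: Matrix, w: Matrix, num_devices: int, tile: int) -> Matrix:
    """Simulate devices that share each output tile as soon as it is computed.

    Each device owns one shard and produces only a partial sum of the output.
    Instead of waiting for its full partial product, a device adds every
    finished tile of columns into the shared result before starting the next
    tile, so synchronization is interleaved with the multiplication.
    """
    m, n = len(x), len(w[0])
    shards = split_inner(x, w, num_devices)
    result = [[0.0] * n for _ in range(m)]
    for col0 in range(0, n, tile):
        for x_part, w_part in shards:  # one iteration per simulated device
            w_tile = [row[col0:col0 + tile] for row in w_part]
            sub = matmul(x_part, w_tile)  # sub-result for this tile only
            for i, sub_row in enumerate(sub):
                for j, value in enumerate(sub_row):
                    result[i][col0 + j] += value  # share/accumulate immediately
    return result


x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0, 2], [0, 1, 0], [3, 0, 1], [0, 2, 0]]
print(parallel_matmul(x, w, num_devices=2, tile=2))
# prints [[10.0, 10.0, 5.0], [26.0, 22.0, 17.0]]
```

In a real multi-device system the accumulation step would be a collective communication between devices, overlapped with the computation of the next tile. The loop above only shows the ordering of compute and share steps under the stated assumptions.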

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.