Patent · US Active

Fault management in a distributed computer system

US11966292B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 27, 2022
Grant date: Apr 23, 2024
Priority date
Expiry date: Oct 22, 2042

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L67/1029
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In some examples, a distributed computer system includes a plurality of computer nodes, where the plurality of computer nodes include respective programs to cooperate to perform a workload. A first computer node includes a communication proxy between the program of the first computer node and a communication library that supports communications between the program of the first computer node and the programs of other computer nodes of the plurality of computer nodes, and a fault management service to monitor a health of the other computer nodes, and in response to a detection of a fault of a second computer node of the plurality of computer nodes, relaunch the communication proxy. The relaunched communication proxy selects, from a plurality of states, a common state to which the programs are to roll back.
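
The abstract does not say how the relaunched communication proxy picks the common state. The following is only an illustrative sketch, not the patented method. It assumes each surviving node reports the integer identifiers of the states (checkpoints) it still holds, and that the common state is the newest identifier held by every node. The function name select_common_state and the node labels are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def select_common_state(states_by_node: Dict[str, Iterable[int]]) -> Optional[int]:
    """Return the newest state id held by every node, or None if there is none.

    Assumption (not from the patent text): state ids are integers that
    increase over time, so the largest shared id is the most recent state.
    """
    common: Optional[Set[int]] = None
    for state_ids in states_by_node.values():
        held = set(state_ids)
        # Intersect with what earlier nodes reported.
        common = held if common is None else common & held
    if not common:
        # No nodes reported, or no state is shared by all of them.
        return None
    return max(common)


if __name__ == "__main__":
    # Hypothetical report from three surviving nodes after a fault.
    reported = {
        "node-a": [1, 2, 3],
        "node-b": [1, 2],
        "node-c": [2, 3],
    }
    print(select_common_state(reported))
```

Under these assumptions the programs on all three nodes would roll back to state 2, the only state newer than 1 that every node still holds. A real system might instead compare timestamps or application-defined epochs; the abstract leaves that open.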

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.