Fault management in a distributed computer system
US11966292B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 27, 2022 |
| Grant date | Apr 23, 2024 |
| Priority date | — |
| Expiry date | Oct 22, 2042 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L67/1029
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In some examples, a distributed computer system includes a plurality of computer nodes, where the plurality of computer nodes include respective programs to cooperate to perform a workload. A first computer node includes a communication proxy between the program of the first computer node and a communication library that supports communications between the program of the first computer node and the programs of other computer nodes of the plurality of computer nodes, and a fault management service to monitor a health of the other computer nodes, and in response to a detection of a fault of a second computer node of the plurality of computer nodes, relaunch the communication proxy. The relaunched communication proxy selects, from a plurality of states, a common state to which the programs are to roll back.
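The abstract does not say how the relaunched communication proxy picks the common rollback state. As a minimal illustrative sketch only, assume each program keeps a set of integer checkpoint identifiers and that the proxy chooses the newest identifier held by every participating node. The function name `select_common_state`, the node names, and the integer checkpoint scheme are all hypothetical and are not taken from the patent.

```python
from typing import Dict, Iterable, Optional, Set


def select_common_state(checkpoints: Dict[str, Iterable[int]]) -> Optional[int]:
    """Return the newest checkpoint id held by every node, or None if there is none.

    Assumption (not from the patent): a "state" is an integer checkpoint id,
    and a larger id means a more recent checkpoint.
    """
    common: Optional[Set[int]] = None
    for node_ids in checkpoints.values():
        ids = set(node_ids)
        # Intersect progressively so only ids present on every node survive.
        common = ids if common is None else common & ids
    return max(common) if common else None


if __name__ == "__main__":
    # Hypothetical checkpoint inventory reported by each node's program.
    saved = {
        "node-a": [1, 2, 3],
        "node-b": [1, 2],
        "node-c": [2, 3],
    }
    print(select_common_state(saved))  # prints 2
```

Under these assumptions, checkpoint 2 is the only state available on all three nodes, so every program would roll back to it. A real system would also have to decide whether the faulty node's checkpoints take part in the selection, which the abstract leaves open.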
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.