Method, distributed system and computer program for failure recovery
US8880931B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Dec 24, 2010 |
| Grant date | Nov 4, 2014 |
| Priority date | — |
| Expiry date | Jul 3, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F11/2097
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A distributed system includes: nodes, each having a memory, that run distributed processes and perform checkpointing to create checkpoint data for each process; a selection unit that selects spare nodes for future failure recovery for each process; an allocation unit that allocates and transmits the checkpoint data to the spare nodes so that the spare nodes store the checkpoint data before a failure occurs; and a recovery unit that either selects one item of checkpoint data for recovery and activates it to run the process on the spare node, or, when no stored checkpoint data is suitable for recovery, partitions the existing stored checkpoint data so that the partitions as a whole make up complete checkpoint data, transmits the partitions from the spare nodes to a new node, and reorganizes the partitions into complete data that is activated to run the process on the new node.
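The two recovery paths described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `allocate`, `partition`, and `recover`, the `suitable` flag, and the equal-size byte partitioning are all assumptions made for illustration, and nodes are modeled as in-memory objects rather than networked machines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a node with memory for checkpoint data."""
    name: str
    suitable: bool = True  # assumption: can this node host the recovered process?
    store: Dict[str, bytes] = field(default_factory=dict)  # process id -> checkpoint


def allocate(checkpoint: bytes, pid: str, spares: List[Node]) -> None:
    """Copy the checkpoint to every selected spare node before any failure."""
    for node in spares:
        node.store[pid] = checkpoint


def partition(data: bytes, parts: int) -> List[bytes]:
    """Split data into `parts` contiguous pieces that concatenate back to data."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(parts)]


def recover(pid: str, spares: List[Node], new_node: Node) -> Tuple[str, bytes]:
    """Return (name of node that runs the process, checkpoint it restarts from)."""
    holders = [n for n in spares if pid in n.store]
    if not holders:
        raise RuntimeError(f"no stored checkpoint for process {pid}")

    # Path 1: a spare holding the checkpoint is suitable, so activate it there.
    for node in holders:
        if node.suitable:
            return node.name, node.store[pid]

    # Path 2: no stored copy is suitable. Each holder sends only one partition,
    # and the new node reorganizes the partitions into the complete checkpoint.
    pieces = [partition(n.store[pid], len(holders))[i] for i, n in enumerate(holders)]
    new_node.store[pid] = b"".join(pieces)
    return new_node.name, new_node.store[pid]
```

In this sketch the second path spreads the transfer across the spare nodes, so each one sends roughly 1/k of the checkpoint when k spares hold a copy; whether the patent partitions the data this way is not stated in the abstract.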
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.