Suicide among well-mannered cluster nodes experiencing heartbeat failure
US6460149B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 11, 2000 |
| Grant date | Oct 1, 2002 |
| Priority date | — |
| Expiry date | Apr 11, 2020 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F11/3055
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods for re-configuring a cluster computer system of two or more nodes when the cluster experiences communications failure. First and second nodes of the cluster have respective channel controllers. A SCSI channel and the controllers communicatively connect the multiple nodes. When a node becomes aware of a possible communications failure, the node attempts to determine the authenticity of the failure and responds according to the determined authenticity.

According to one method, a first node detects heartbeat node-to-node communications failure on the channel and then tests a physical drive on the channel. If the testing is successful, the node kills the other node. If the testing is unsuccessful, the first node commits suicide.

In one embodiment, the coupling includes multiple channels communicatively coupling the first and second nodes, with the first node selecting one of the channels for node-to-node communications. In this environment, choosing a physical drive involves testing node-to-node communications on another of the channels if no physical drive is online on the channel (and terminating the re-configuring method). If a drive is available, the first node uses the first…
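
The single-channel method in the abstract (detect heartbeat failure, test a physical drive on the same channel, then kill the peer or commit suicide) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Action`, `respond_to_heartbeat_failure`, `heartbeat_ok`, and `test_drive` are hypothetical, and the drive test is modeled as a simple callable rather than a real SCSI command. The multi-channel embodiment is left out because the abstract is truncated before it is fully described.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    """Possible responses of the first node (hypothetical names)."""
    NONE = "none"            # heartbeat is healthy, nothing to do
    KILL_PEER = "kill_peer"  # channel works, so the other node is at fault
    SUICIDE = "suicide"      # channel access failed, so this node is at fault


def respond_to_heartbeat_failure(
    heartbeat_ok: bool,
    test_drive: Callable[[], bool],
) -> Action:
    """Decide how the first node reacts, following the abstract's one method.

    heartbeat_ok: whether node-to-node heartbeat on the channel succeeded.
    test_drive:   assumed stand-in for testing a physical drive on the same
                  channel; returns True if the test succeeds.
    """
    if heartbeat_ok:
        return Action.NONE
    # Heartbeat failed: check whether this node can still reach a drive
    # on the channel to judge whether the failure is on its own side.
    if test_drive():
        return Action.KILL_PEER
    return Action.SUICIDE


if __name__ == "__main__":
    # Drive test succeeds, so the first node kills the other node.
    print(respond_to_heartbeat_failure(False, lambda: True))
    # Drive test fails, so the first node removes itself from the cluster.
    print(respond_to_heartbeat_failure(False, lambda: False))
```

The drive test acts as a tie-breaker: a node that can still reach a shared drive treats the channel as working and the silent peer as failed, while a node that cannot reach the drive assumes its own connection is the problem.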
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.