Scalable clusterwide de-duplication
US9727273B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 18, 2016 |
| Grant date | Aug 8, 2017 |
| Priority date | — |
| Expiry date | Feb 18, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2201/815
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method are provided for minimizing duplicate data transfer in a clustered storage system having compute nodes in a compute plane coupled to data nodes in a data plane. The method may include generating a hash key relating to content of a virtual disk associated with a compute node. During a data replication phase, the method may detect duplicate data stored in respective storage units of the compute node and the data node using the hash key. Further, the method may eliminate redundant data transfers through the use of an index and mapping scheme, where only non-duplicate data is transferred along with a set of logical block addresses associated with duplicate data from the replicating compute node to the data node. During a data recovery phase, the method may transfer duplicate data from a peer compute node or from a virtual machine to the requesting compute node, eliminating excess data transfer.
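The replication phase described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the 4 KiB block size, the use of SHA-256 as the hash key, and the names `hash_key`, `plan_replication`, and `data_node_index` are all assumptions made for illustration. The sketch only shows the general idea of sending block content once for non-duplicate data and sending just logical block addresses (LBAs) for data the data node already holds.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


def hash_key(block: bytes) -> str:
    """Content-derived key for a block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def plan_replication(vdisk_blocks, data_node_index):
    """Split a virtual disk into data to send and references to send.

    vdisk_blocks:    dict mapping LBA -> block bytes on the compute node
    data_node_index: set of hash keys already stored on the data node
    Returns (new_blocks, duplicate_refs):
      new_blocks     maps hash key -> (block bytes, [LBAs]) sent with content
      duplicate_refs maps hash key -> [LBAs] sent without any content
    """
    new_blocks = {}
    duplicate_refs = {}
    for lba, block in sorted(vdisk_blocks.items()):
        key = hash_key(block)
        if key in data_node_index:
            # Data node already has this content: send only the LBA.
            duplicate_refs.setdefault(key, []).append(lba)
        elif key in new_blocks:
            # Repeated within this transfer: content is sent once.
            new_blocks[key][1].append(lba)
        else:
            new_blocks[key] = (block, [lba])
    return new_blocks, duplicate_refs


# Hypothetical usage: five LBAs, three distinct contents, one already indexed.
a, b, c = b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"C" * BLOCK_SIZE
vdisk = {0: a, 1: b, 2: a, 3: c, 4: b}
index = {hash_key(a)}
new_blocks, duplicate_refs = plan_replication(vdisk, index)
payload_bytes = sum(len(blk) for blk, _ in new_blocks.values())
```

In this hypothetical run, only two blocks of content cross the network instead of five; the remaining LBAs travel as references keyed by hash. A recovery phase could apply the same lookup in reverse, fetching content by hash key from whichever peer holds it.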
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.