Patent · US Active

Scalable clusterwide de-duplication

US9727273B1 · kind B1 · utility

Cited by: 17
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 18, 2016
Grant date: Aug 8, 2017
Priority date
Expiry date: Feb 18, 2036

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2201/815
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method are provided for minimizing duplicate data transfer in a clustered storage system having compute nodes in a compute plane coupled to data nodes in a data plane. The method may include generating a hash key relating to content of a virtual disk associated with a compute node. During a data replication phase, the method may use the hash key to detect duplicate data stored in the respective storage units of the compute node and the data node. Further, the method may eliminate redundant data transfers through the use of an index and mapping scheme, in which only non-duplicate data is transferred from the replicating compute node to the data node, along with a set of logical block addresses associated with the duplicate data. During a data recovery phase, the method may transfer duplicate data from a peer compute node or from a virtual machine to the requesting compute node, eliminating excess data transfer.
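
The replication phase described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes fixed-size blocks, SHA-256 as the hash key, and plain in-memory dictionaries standing in for the data node's index and mapping scheme; the abstract specifies none of these. All function and variable names (hash_key, plan_replication, apply_replication, read_block) are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


def hash_key(block: bytes) -> str:
    """Content-derived key for a block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def plan_replication(disk, remote_index):
    """Decide what the compute node must send to the data node.

    disk:         {lba: block bytes} for the virtual disk being replicated
    remote_index: set of hash keys the data node already stores
    Returns (new_blocks, duplicate_refs):
      new_blocks     -> [(lba, key, data)] full payloads for unseen content
      duplicate_refs -> [(lba, key)] address-only references for duplicates
    """
    new_blocks, duplicate_refs = [], []
    sending = set()  # keys already queued in this batch
    for lba in sorted(disk):
        key = hash_key(disk[lba])
        if key in remote_index or key in sending:
            duplicate_refs.append((lba, key))  # no payload transferred
        else:
            new_blocks.append((lba, key, disk[lba]))
            sending.add(key)
    return new_blocks, duplicate_refs


def apply_replication(store, lba_map, new_blocks, duplicate_refs):
    """Data-node side: store new content once and map every LBA to its key."""
    for lba, key, data in new_blocks:
        store[key] = data
        lba_map[lba] = key
    for lba, key in duplicate_refs:
        lba_map[lba] = key


def read_block(store, lba_map, lba):
    """Resolve an LBA through the mapping to the stored content."""
    return store[lba_map[lba]]


if __name__ == "__main__":
    a, b = b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE
    disk = {0: a, 1: b, 2: a}      # LBA 2 repeats the content of LBA 0
    store = {hash_key(b): b}       # data node already holds block "B"
    lba_map = {}
    new_blocks, dup_refs = plan_replication(disk, set(store))
    apply_replication(store, lba_map, new_blocks, dup_refs)
    print(len(new_blocks), "payload(s) sent,", len(dup_refs), "address-only reference(s)")
```

In this sketch only one of the three blocks crosses the wire as a payload; the other two are replicated as logical block addresses paired with a hash key, which is the saving the abstract attributes to the index and mapping scheme.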

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.