Patent · US Active

Optimized deduplication based on backup frequency in a distributed data storage system

US11513708B2 · kind B2 · utility

Cited by: 4
References: 176
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 20, 2021
Grant date: Nov 29, 2022
Priority date
Expiry date: Jan 20, 2041

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L67/1097
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Disclosed deduplication techniques at a distributed data storage system guarantee that space reclamation will not affect deduplicated data integrity even without perfect synchronization between components. By understanding certain “behavioral” characteristics and schedule cadences of backup operations that generate backup copies received at the distributed data storage system, data blocks that are not re-written by subsequent backup copies are pro-actively aged, while promoting continued retention of data blocks that are re-written. An expiry scheme operates with block-level granularity. Each unique deduplicated data block is given an expiry timeframe based on the block's arrival time at the distributed data storage system (i.e., when a backup copy supplies the block) and further based on backup frequencies of the various virtual disks referencing a unique system-wide identifier of the block, which is based on the block's hash value. Communications between components are kept to an as-needed basis. Cloud-based and multi-cloud configurations are disclosed.
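The block-level expiry scheme described in the abstract can be illustrated with a minimal sketch. This is not the patented method: the abstract gives no formula, so the rule used here (expiry = most recent arrival time plus a fixed number of the longest backup interval among referencing virtual disks) is an assumption made for illustration. The names `DedupIndex`, `BlockRecord`, and `RETENTION_INTERVALS`, and the use of SHA-256 as the block hash, are likewise hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical safety factor (not from the patent): retain a block for this
# many backup intervals after it was last written.
RETENTION_INTERVALS = 2


@dataclass
class BlockRecord:
    last_arrival: datetime
    # Maps virtual disk name -> backup interval of that virtual disk.
    referrers: dict = field(default_factory=dict)

    def expiry(self) -> datetime:
        # Assumed rule: the least frequently backed-up referrer sets the pace.
        longest = max(self.referrers.values())
        return self.last_arrival + RETENTION_INTERVALS * longest


class DedupIndex:
    def __init__(self):
        self.blocks = {}  # block id (hash of content) -> BlockRecord

    def write(self, data: bytes, vdisk: str, interval: timedelta,
              now: datetime) -> str:
        """Record a block supplied by a backup copy; return its identifier."""
        block_id = hashlib.sha256(data).hexdigest()
        rec = self.blocks.get(block_id)
        if rec is None:
            rec = self.blocks[block_id] = BlockRecord(last_arrival=now)
        else:
            # A re-write by a later backup refreshes the arrival time,
            # which pushes the expiry out and keeps the block retained.
            rec.last_arrival = now
        rec.referrers[vdisk] = interval
        return block_id

    def reclaim(self, now: datetime) -> list:
        """Delete and return the ids of blocks whose expiry has passed."""
        expired = [b for b, r in self.blocks.items() if r.expiry() <= now]
        for b in expired:
            del self.blocks[b]
        return expired


t0 = datetime(2021, 1, 20)
index = DedupIndex()
block_a = index.write(b"A", "vd-daily", timedelta(days=1), t0)
block_b = index.write(b"B", "vd-daily", timedelta(days=1), t0)
# Block A is written again a day later by a weekly-backed-up virtual disk.
index.write(b"A", "vd-weekly", timedelta(days=7), t0 + timedelta(days=1))
reclaimed = index.reclaim(t0 + timedelta(days=3))
```

In this sketch, block B is never re-written and ages out after two daily intervals, while block A stays retained because a later backup re-wrote it and a weekly virtual disk now references it. Reclamation depends only on per-block timestamps and intervals, so it needs no reference counting across components, which is one plausible reading of how the disclosed scheme keeps communication "as-needed".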

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.