Patent · US Active

Scalable post-process deduplication

US9946724B1 · kind B1 · utility

Cited by: 17
References: 6
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 31, 2014
Grant date: Apr 17, 2018
Priority date
Expiry date: Oct 29, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/1748
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Implementations are provided herein for data deduplication and, more particularly, for post-process data deduplication on a large scale-out storage system. Multiple techniques and implementations are disclosed that offer greater efficiency, higher performance, and more stability when performing post-process data deduplication at large scale. Disclosed implementations are based on a process for data deduplication involving four main phases: enumeration, commonality, sharing, and update. Multi-level hashing can be used to identify candidates for deduplication during the enumeration phase, providing a more efficient use of compute resources. In addition, datasets can be phase rotated through the post-process deduplication steps, providing a more controllable deduplication environment as well as a more efficient use of resources.
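
The multi-level hashing idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the block size, the choice of Adler-32 as the cheap first-level hash and SHA-256 as the stronger second-level hash, and the names split_blocks and enumerate_candidates are all assumptions made for illustration. The sketch only shows the general principle that a cheap hash can narrow the set of blocks before a more expensive hash is computed.

```python
import hashlib
import zlib
from collections import defaultdict

BLOCK_SIZE = 8  # assumption: tiny block size, chosen only to keep the demo readable


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def enumerate_candidates(blocks: list) -> list:
    """Return groups of block indices that match at both hash levels.

    Level 1 (assumed: Adler-32) is cheap and computed for every block.
    Level 2 (assumed: SHA-256) is computed only for blocks whose level-1
    hash collided with at least one other block.
    """
    # Level 1: bucket every block by its cheap hash.
    level1 = defaultdict(list)
    for idx, block in enumerate(blocks):
        level1[zlib.adler32(block)].append(idx)

    candidates = []
    for indices in level1.values():
        if len(indices) < 2:
            continue  # unique cheap hash: cannot be a duplicate, skip level 2
        # Level 2: re-bucket the survivors by the stronger hash.
        level2 = defaultdict(list)
        for idx in indices:
            level2[hashlib.sha256(blocks[idx]).hexdigest()].append(idx)
        candidates.extend(group for group in level2.values() if len(group) > 1)
    return candidates


if __name__ == "__main__":
    data = b"AAAAAAAA" + b"BBBBBBBB" + b"AAAAAAAA" + b"CCCCCCCC"
    print(enumerate_candidates(split_blocks(data)))
```

In this sketch the expensive hash runs only on blocks that already share a cheap hash with another block, which is one way a multi-level scheme could save compute during candidate enumeration. The later phases (commonality, sharing, update) are not modeled here.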

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.