Scalable post-process deduplication
US9946724B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 31, 2014 |
| Grant date | Apr 17, 2018 |
| Priority date | — |
| Expiry date | Oct 29, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/1748
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Implementations are provided herein for data deduplication and, more particularly, for post-process data deduplication on a large scale-out storage system. Multiple techniques and implementations are disclosed that offer greater efficiency, higher performance, and more stability when performing post-process data deduplication at large scale. The disclosed implementations are based on a deduplication process with four main phases: enumeration, commonality, sharing, and update. Multi-level hashing can be used to identify candidates for deduplication during the enumeration phase, making more efficient use of compute resources. In addition, datasets can be phase rotated through the post-process deduplication steps, providing a more controllable deduplication environment and a more efficient use of resources.
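The multi-level hashing idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the hash functions, the number of levels, or the block size, so everything below is an assumption chosen for illustration. The sketch assumes two levels, a cheap CRC32 checksum followed by SHA-256, and the names `BLOCK_SIZE`, `split_blocks`, and `find_candidates` are hypothetical. The strong hash is computed only for blocks whose cheap checksum matches another block, which is one way the approach can save compute.

```python
import hashlib
import zlib
from collections import defaultdict

BLOCK_SIZE = 8192  # hypothetical block size; the abstract does not state one


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def find_candidates(blocks):
    """Return sorted groups of block indices that are deduplication candidates.

    Level 1 (assumed): a cheap CRC32 checksum buckets every block.
    Level 2 (assumed): SHA-256 is computed only inside buckets that hold
    more than one block, so unique blocks never pay for the strong hash.
    """
    # Level 1: bucket all blocks by the inexpensive checksum.
    by_weak = defaultdict(list)
    for idx, block in enumerate(blocks):
        by_weak[zlib.crc32(block)].append(idx)

    groups = []
    for indices in by_weak.values():
        if len(indices) < 2:
            continue  # no other block shares this checksum: not a candidate
        # Level 2: confirm matches with the stronger, costlier hash.
        by_strong = defaultdict(list)
        for idx in indices:
            digest = hashlib.sha256(blocks[idx]).hexdigest()
            by_strong[digest].append(idx)
        groups.extend(g for g in by_strong.values() if len(g) > 1)
    return sorted(groups)
```

In this sketch, calling `find_candidates([b"aaaa", b"bbbb", b"aaaa"])` groups indices 0 and 2 as candidates. A production system would likely also compare block contents byte for byte before sharing storage, since a matching hash alone only marks a block as a candidate.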
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.