Deduplication using sub-chunk fingerprints
US10135462B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 13, 2012 |
| Grant date | Nov 20, 2018 |
| Priority date | — |
| Expiry date | Oct 18, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F2201/83
- WIPO field: Basic communication processes
- WIPO sector: Electrical engineering
Abstract
A computer-implemented method and system for deduplicating sub-chunks in a data storage system selects a data chunk to deduplicate and generates a sketch for the selected data chunk. The sketch is used to search for a similar data chunk. A set of fingerprints corresponding to sub-chunks of the similar data chunk is loaded. The set of fingerprints for the similar data chunk is compared to a set of fingerprints of the selected data chunk, and the selected data chunk is encoded as a set of references to identical sub-chunks of the similar data chunk and at least one unmatched sub-chunk.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.