System and method for data deduplication for disk storage subsystems
US9678688B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Jul 14, 2011 |
| Grant date | Jun 13, 2017 |
| Priority date | — |
| Expiry date | Sep 14, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/583
- WIPO field: Audio-visual technology
- WIPO sector: Electrical engineering
Abstract
A method for data deduplication includes the following steps. First, an original data set is segmented into a plurality of data segments. Next, the data in each data segment is transformed into a transformed data representation that has a band-type structure comprising a plurality of bands. Next, a first set of bands is selected, grouped together, and stored with the original data set; the first set includes non-identical transformed data for each data segment. Next, a second set of bands is selected and grouped together; the second set includes identical transformed data for each data segment. Next, a hash function is applied to the transformed data of the second set of bands, generating transformed data segments indexed by hash function indices. Finally, the hash function indices and the transformed data representation of one representative data segment are stored in a deduplication database.
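The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name the transform, the band selection rule, or the hash function, so everything concrete here is an assumption. The sketch uses a stand-in two-band transform (pairwise averages as a "low" band, pairwise differences as a "high" band), treats the low band as the second set (the candidate for identical data), and uses SHA-256 as the hash function. The names `segment`, `transform`, `deduplicate`, and `SEGMENT_SIZE` are hypothetical, and the input length is assumed to be a multiple of an even segment size.

```python
import hashlib

SEGMENT_SIZE = 8  # hypothetical segment length in bytes (assumed even)


def segment(data, size=SEGMENT_SIZE):
    """Split the original data set into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def transform(seg):
    """Stand-in two-band transform (an assumption, not the patent's).

    "low" holds pairwise averages, "high" holds pairwise differences.
    """
    pairs = list(zip(seg[0::2], seg[1::2]))
    return {
        "low": tuple((a + b) >> 1 for a, b in pairs),
        "high": tuple(a - b for a, b in pairs),
    }


def deduplicate(data, size=SEGMENT_SIZE):
    """Return (dedup_db, records).

    dedup_db maps a hash index to one representative low band.
    records holds, per segment, its hash index and its own high band,
    which is the non-identical data kept alongside the original set.
    """
    dedup_db = {}
    records = []
    for seg in segment(data, size):
        bands = transform(seg)
        # Hash the band expected to be identical across similar segments.
        index = hashlib.sha256(bytes(bands["low"])).hexdigest()
        # Only the first segment seen with this index is stored.
        dedup_db.setdefault(index, bands["low"])
        records.append((index, bands["high"]))
    return dedup_db, records


if __name__ == "__main__":
    # Segments 1 and 2 share a low band but differ in the high band.
    sample = bytes([10, 12, 20, 22, 12, 10, 22, 20, 1, 1, 1, 1])
    db, recs = deduplicate(sample, size=4)
    print(len(db), "stored low bands for", len(recs), "segments")
```

In this sketch, segments that differ only in the high band share a single entry in the deduplication database, while each segment keeps its own high band. A real system would also need an inverse transform to reconstruct the original data, which is omitted here.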
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.