Tunable data fingerprinting for optimizing data deduplication
US8620877B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Apr 30, 2008 |
| Grant date | Dec 31, 2013 |
| Priority date | — |
| Expiry date | Jan 12, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F11/2094
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present invention provides a method and system of performing de-duplication for at least one computer file in a computer system. In an exemplary embodiment, the method and system include (1) tuning a rolling-hash algorithm for the de-duplication, (2) chunking the data in the file into chunks of data by using the tuned algorithm, (3) producing a content identifier for each of the chunks, and (4) processing the chunks that are unique, the content identifier for each of the chunks that are unique, and references to the chunks that are unique. In an exemplary embodiment, the computer system includes a de-duplication-enabled data store. In an exemplary embodiment, the computer system includes (a) a transferor computer system that is configured to transfer the file to a de-duplication-enabled computer system and (b) the de-duplication-enabled computer system.
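The four steps named in the abstract (tune, chunk, identify, process unique chunks) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the rolling hash is tuned, so the parameters here (`window`, `mask_bits`, `min_size`, `max_size`) are assumed tuning knobs. The polynomial rolling hash, the SHA-256 content identifier, and the function names `chunk_data`, `deduplicate`, and `restore` are likewise illustrative choices, and the de-duplication-enabled data store is modeled as a plain dictionary.

```python
import hashlib


def chunk_data(data: bytes, window: int = 16, mask_bits: int = 6,
               min_size: int = 32, max_size: int = 1024):
    """Yield content-defined chunks of `data`.

    Assumed tuning knobs (not taken from the patent): `window` is the
    rolling-hash width in bytes, `mask_bits` sets the expected chunk size
    (about 2**mask_bits bytes beyond `min_size`), and `min_size` and
    `max_size` bound each chunk.
    """
    base, mod = 257, (1 << 31) - 1
    mask = (1 << mask_bits) - 1
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window
    start, h = 0, 0
    for i, byte in enumerate(data):
        pos = i - start  # offset of this byte inside the current chunk
        if pos >= window:
            h = (h - data[i - window] * top) % mod  # drop the oldest byte
        h = (h * base + byte) % mod                 # add the newest byte
        size = pos + 1
        if (size >= min_size and (h & mask) == mask) or size >= max_size:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]  # trailing partial chunk


def deduplicate(data: bytes, store: dict, **tuning) -> list:
    """Store only unique chunks; return the list of chunk identifiers."""
    recipe = []
    for chunk in chunk_data(data, **tuning):
        cid = hashlib.sha256(chunk).hexdigest()  # content identifier
        store.setdefault(cid, chunk)             # keep a chunk only once
        recipe.append(cid)                       # reference to the chunk
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild a file from its chunk references."""
    return b"".join(store[cid] for cid in recipe)


if __name__ == "__main__":
    # Deterministic pseudo-random test data built from SHA-256 digests.
    payload = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
    store = {}
    first = deduplicate(payload, store)
    unique_after_first = len(store)
    second = deduplicate(payload, store)  # same file again: nothing new stored
    assert first == second and len(store) == unique_after_first
    assert restore(first, store) == payload
```

In this sketch, lowering `mask_bits` produces smaller chunks, which tends to find more duplicate data at the cost of more identifiers and references to track. Raising it does the opposite. That trade-off is one plausible reading of what "tuning" adjusts.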
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.