Data deduplication for streaming sequential data storage applications
US8407193B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 27, 2010 |
| Grant date | Mar 26, 2013 |
| Priority date | — |
| Expiry date | Feb 21, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/174
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Data deduplication compression in a streaming storage application is provided. The disclosed deduplication process produces a deduplication archive that can be stored to, and extracted from, a streaming storage medium. One implementation involves compressing fully sequential data stored in a data repository to a sequential streaming storage by: splitting the fully sequential data into data blocks; hashing the content of each data block and comparing each hash to an in-memory lookup table for a match, the in-memory lookup table storing all hashes that have been encountered during the compression of the fully sequential data; for each data block without a hash match, adding the data block as a new data block for compression of the fully sequential data; and encoding duplicate data blocks into data segments using the in-memory lookup table.
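The steps named in the abstract (split, hash, look up, store or encode) can be illustrated with a minimal Python sketch. This is not the patented implementation: the fixed block size, the choice of SHA-256, the `("new", block)` / `("ref", index)` segment layout, and the function names `split_blocks`, `deduplicate`, and `extract` are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator, List, Tuple, Union

# Assumption: fixed-size blocks; the abstract does not state a block size.
BLOCK_SIZE = 4096

# Hypothetical segment layout: ("new", raw block) or ("ref", index of an earlier block).
Segment = Tuple[str, Union[bytes, int]]


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Split sequential data into consecutive blocks of at most block_size bytes."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE) -> List[Segment]:
    """Encode data as a sequence of new blocks and back-references."""
    # In-memory lookup table: hash -> index of the unique block first seen with it.
    seen = {}
    segments: List[Segment] = []
    for block in split_blocks(data, block_size):
        digest = hashlib.sha256(block).digest()  # assumption: SHA-256 as the hash
        if digest in seen:
            # Duplicate block: emit a reference instead of the content.
            segments.append(("ref", seen[digest]))
        else:
            # No hash match: record the hash and store the block as new.
            seen[digest] = len(seen)
            segments.append(("new", block))
    return segments


def extract(segments: List[Segment]) -> bytes:
    """Rebuild the original data in a single forward pass over the segments."""
    unique_blocks: List[bytes] = []
    out = bytearray()
    for kind, payload in segments:
        if kind == "new":
            unique_blocks.append(payload)
            out += payload
        else:
            out += unique_blocks[payload]
    return bytes(out)


if __name__ == "__main__":
    sample = b"A" * 8 + b"B" * 8 + b"A" * 8
    archive = deduplicate(sample, block_size=8)
    print(archive)
    print(extract(archive) == sample)
```

Because every reference in this sketch points to a block that appeared earlier in the stream, extraction needs only one sequential read, which is the property a streaming medium requires. The sketch keeps all unique blocks in memory during extraction, which is a simplification and not something the abstract specifies.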
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.