Patent · US Active

Data deduplication dictionary system

US8250325B2 · kind B2 · utility

Cited by	41
References	2
Claims	18
Family size	0

Assignee

Oracle International Corporation · US

Inventors

Jon Mark Holdman · Wheat Ridge, US
Robert M. Raymond · Boulder, US
Atiq Ahamad · Superior, US
John R. Kostraba, Jr. · Thornton, US
Carl T. Madison, Jr. · Windsor, US

Key dates

Filing date	Apr 1, 2010
Grant date	Aug 21, 2012
Priority date	—
Expiry date	Jan 21, 2031

Classification

Technology area (CPC G)	Physics
CPC primary	G06F11/1453
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A data deduplication method uses a small hash digest dictionary in fast-access memory. The method includes receiving customer data, dividing the data into smaller chunks, and assigning a hash value to each chunk. For each chunk, the method performs a lookup for a duplicate chunk by accessing the small dictionary in memory with the chunk's hash value. When no entry is found, the small dictionary is updated to include the hash value, so that the dictionary fills with the earliest received data. When an entry is found, the entry's hash value is compared with the lookup value and, if they match, reference data is returned and an entry counter is incremented. If they do not match, additional accesses are attempted, such as with additional indexes calculated using the hash value. Collisions may trigger an entry replacement, such that some initially entered entries are replaced when they are determined not to be the most frequently repeating values, for example based on their counter values.
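
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the fixed-size chunking, the use of SHA-256, the slot count, the way additional indexes are derived from the digest, and the eviction threshold are all assumptions chosen for illustration, and the names `SmallDictionary`, `probe_indexes`, and `deduplicate` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096   # assumption: fixed-size chunks
NUM_SLOTS = 1024    # assumption: capacity of the small in-memory dictionary
MAX_PROBES = 3      # assumption: number of indexes derived from one digest
EVICT_BELOW = 2     # assumption: entries with fewer hits than this may be replaced


def chunk(data, size=CHUNK_SIZE):
    """Divide the incoming data into smaller chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def probe_indexes(digest, num_slots, max_probes):
    """Derive several slot indexes from successive 4-byte windows of the digest."""
    return [int.from_bytes(digest[4 * k:4 * k + 4], "big") % num_slots
            for k in range(max_probes)]


class SmallDictionary:
    def __init__(self, num_slots=NUM_SLOTS, max_probes=MAX_PROBES):
        # Each occupied slot holds [digest, reference, hit_counter].
        self.slots = [None] * num_slots
        self.max_probes = max_probes

    def lookup_or_insert(self, digest, ref):
        """Return the stored reference for a duplicate, or None on a miss."""
        indexes = probe_indexes(digest, len(self.slots), self.max_probes)
        for i in indexes:
            entry = self.slots[i]
            if entry is None:
                # No entry: fill the dictionary with the earliest received data.
                self.slots[i] = [digest, ref, 0]
                return None
            if entry[0] == digest:
                # Match: count the hit and return the reference data.
                entry[2] += 1
                return entry[1]
            # Mismatch: try the next index calculated from the hash value.
        # Collision on every probed slot: replace the least-repeated entry,
        # but only if its counter shows it is rarely repeated (assumed policy).
        victim = min(indexes, key=lambda i: self.slots[i][2])
        if self.slots[victim][2] < EVICT_BELOW:
            self.slots[victim] = [digest, ref, 0]
        return None


def deduplicate(data, dictionary):
    """Return (unique chunks stored, one reference per input chunk)."""
    store, recipe = [], []
    for c in chunk(data):
        digest = hashlib.sha256(c).digest()
        ref = dictionary.lookup_or_insert(digest, len(store))
        if ref is None:
            # Miss in the small dictionary: store the chunk as new data.
            ref = len(store)
            store.append(c)
        recipe.append(ref)
    return store, recipe


store, recipe = deduplicate(b"A" * 4096 + b"B" * 4096 + b"A" * 4096,
                            SmallDictionary())
print(len(store), recipe)
```

In this sketch a miss in the small dictionary simply stores the chunk again; a real system would presumably fall back to a larger, slower index before concluding that the chunk is new.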

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.