Patent · US Active

System and method for eliminating duplicate data by generating data fingerprints using adaptive fixed-length windows

US8180740B1 · kind B1 · utility

Cited by: 61
References: 4
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 12, 2009
Grant date: May 15, 2012
Priority date
Expiry date: Jul 23, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2201/83
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system for generating data fingerprints is used to de-duplicate a data set having a high level of redundancy. A fingerprint generator generates a data fingerprint based on a data window. Each byte of the data set is added to the fingerprint generator and used to detect an anchor within the received data. If no anchor is detected, the system continues receiving bytes until a predefined window size is reached. When the window size is reached, the system records a data fingerprint based on the data window and resets the window size. If an anchor is detected, the system extends the window size such that the window ends a specified length after the location of the anchor. If the extended window is greater than a maximum size, the system ignores the anchor. The generated fingerprints are compared to a fingerprint database. The data set is then de-duplicated by replacing matching data segments with references to corresponding stored data segments.
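
Illustrative sketch (not taken from the patent text): the windowing steps in the abstract can be approximated in Python as below. Everything named here is an assumption made for illustration: the function names, the default sizes, the anchor test (a small polynomial hash over the last k bytes), the use of SHA-256 as the fingerprint, and the dictionary standing in for the fingerprint database. The abstract does not specify any of these. The sketch also assumes an anchor may only lengthen the current window, never shorten it, which is one reading of "extends the window size".

```python
import hashlib
from collections import deque


def default_anchor(recent, mask=0x3F):
    """Assumed anchor test: hash of the trailing bytes has its low bits clear."""
    h = 0
    for b in recent:
        h = (h * 31 + b) & 0xFFFFFFFF
    return len(recent) == recent.maxlen and (h & mask) == 0


def split_windows(data, window=64, after_anchor=16, max_window=128,
                  is_anchor=default_anchor, k=4):
    """Split data into windows that stay fixed-length unless an anchor extends them."""
    segments = []
    recent = deque(maxlen=k)      # trailing bytes fed to the anchor test
    start, target = 0, window     # target = current window size
    for i, byte in enumerate(data):
        recent.append(byte)
        length = i - start + 1
        if is_anchor(recent):
            # End the window a fixed distance after the anchor.
            extended = length + after_anchor
            if extended <= max_window:
                target = max(target, extended)
            # Otherwise the anchor is ignored (window would exceed the maximum).
        if length >= target:
            segments.append(data[start:i + 1])
            start, target = i + 1, window   # reset the window size
    if start < len(data):
        segments.append(data[start:])       # trailing partial window
    return segments


def deduplicate(data, store, **kwargs):
    """Return fingerprint references; store each distinct segment only once."""
    refs = []
    for seg in split_windows(data, **kwargs):
        fp = hashlib.sha256(seg).hexdigest()   # assumed fingerprint function
        store.setdefault(fp, seg)              # keep the first copy only
        refs.append(fp)
    return refs


def restore(refs, store):
    """Rebuild the original bytes from the references."""
    return b"".join(store[fp] for fp in refs)
```

With a hypothetical anchor test that fires on the byte 0xFF, calling split_windows(b"a" * 10 + b"\xff" + b"a" * 100, window=16, after_anchor=8, max_window=32, is_anchor=lambda r: r[-1] == 0xFF) yields a first window of 19 bytes (the anchor is the 11th byte, plus 8) followed by windows of the default 16 bytes. Lowering max_window below 19 would cause the same anchor to be ignored.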

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.