Patent · US Active

Data deduplication utilizing extent ID database

US9659047B2 · kind B2 · utility

Cited by: 24
References: 31
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 3, 2014
Grant date: May 23, 2017
Priority date
Expiry date: Mar 12, 2035

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F3/0641
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An extent map (EMAP) database may include one or more extent map entries configured to map extent IDs to PVBNs. Each extent ID may be apportioned into a most significant bit (MSB) portion, i.e., checksum bits, and a least significant bit (LSB) portion, i.e., duplicate bits. A hash may be applied to the data of the extent to calculate the checksum bits, which illustratively represent a fingerprint of the data. The duplicate bits may be configured to denote any reoccurrence of the checksum bits in the EMAP database, i.e., whether there is an existing extent with potentially identical data in a volume of the aggregate. Each extent map entry may be inserted on a node having one or more key/value pairs, wherein the key is the extent ID and the value is the PVBN. The EMAP database may be scanned and utilized to perform data deduplication.
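
The extent ID layout and the scan described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The bit widths (48 checksum bits, 16 duplicate bits), the use of SHA-256 as the hash, the in-memory dictionaries, and all names (EmapDatabase, insert, scan_for_duplicates, blocks) are assumptions made for illustration only. The abstract does not specify any of them.

```python
import hashlib

# Assumed widths; the abstract only says the extent ID has an MSB
# (checksum) portion and an LSB (duplicate) portion.
CHECKSUM_BITS = 48
DUPLICATE_BITS = 16


def checksum_bits(data: bytes) -> int:
    """Fingerprint of the extent data (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[: CHECKSUM_BITS // 8], "big")


def make_extent_id(checksum: int, dup: int) -> int:
    """Pack checksum bits into the MSB portion, duplicate bits into the LSB."""
    return (checksum << DUPLICATE_BITS) | dup


def split_extent_id(extent_id: int) -> tuple[int, int]:
    """Return (checksum bits, duplicate bits) for an extent ID."""
    return extent_id >> DUPLICATE_BITS, extent_id & ((1 << DUPLICATE_BITS) - 1)


class EmapDatabase:
    """Hypothetical in-memory stand-in for the EMAP key/value store."""

    def __init__(self) -> None:
        self.entries: dict[int, int] = {}   # key: extent ID, value: PVBN
        self.blocks: dict[int, bytes] = {}  # PVBN -> data (stand-in for disk)

    def insert(self, data: bytes, pvbn: int) -> int:
        checksum = checksum_bits(data)
        dup = 0
        # A nonzero duplicate value records that these checksum bits
        # already occur in the database.
        while make_extent_id(checksum, dup) in self.entries:
            dup += 1
            if dup >= 1 << DUPLICATE_BITS:
                raise OverflowError("duplicate bits exhausted")
        extent_id = make_extent_id(checksum, dup)
        self.entries[extent_id] = pvbn
        self.blocks[pvbn] = data
        return extent_id

    def scan_for_duplicates(self) -> list[tuple[int, int]]:
        """Return (duplicate PVBN, retained PVBN) pairs with identical data."""
        pairs = []
        kept: dict[int, list[int]] = {}  # checksum -> PVBNs with distinct data
        # Sorting by extent ID places entries that share checksum bits
        # next to each other, ordered by their duplicate bits.
        for extent_id in sorted(self.entries):
            checksum, _dup = split_extent_id(extent_id)
            pvbn = self.entries[extent_id]
            for candidate in kept.setdefault(checksum, []):
                # Matching checksums only mean "potentially identical",
                # so the data itself is compared before deduplicating.
                if self.blocks[candidate] == self.blocks[pvbn]:
                    pairs.append((pvbn, candidate))
                    break
            else:
                kept[checksum].append(pvbn)
        return pairs
```

In this sketch, inserting the same data at two PVBNs yields two extent IDs that share checksum bits and differ only in their duplicate bits. The scan then reports the second PVBN as a candidate to be remapped onto the first.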

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.