Data deduplication utilizing extent ID database
US9659047B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 3, 2014 |
| Grant date | May 23, 2017 |
| Priority date | — |
| Expiry date | Mar 12, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F3/0641
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An extent map (EMAP) database may include one or more extent map entries configured to map extent IDs to PVBNs. Each extent ID may be apportioned into a most significant bit (MSB) portion, i.e., checksum bits, and a least significant bit (LSB) portion, i.e., duplicate bits. A hash may be applied to the data of the extent to calculate the checksum bits, which illustratively represent a fingerprint of the data. The duplicate bits may be configured to denote any reoccurrence of the checksum bits in the EMAP database, i.e., whether there is an existing extent with potentially identical data in a volume of the aggregate. Each extent map entry may be inserted on a node having one or more key/value pairs, wherein the key is the extent ID and the value is the PVBN. The EMAP database may be scanned and utilized to perform data deduplication.
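The extent ID layout described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The bit widths (48 checksum bits, 16 duplicate bits), the truncated SHA-256 hash, the in-memory dictionaries, and every name here (`EmapDatabase`, `checksum_of`, `make_extent_id`, `split_extent_id`, `scan_duplicates`) are assumptions made for illustration; the abstract specifies none of them. PVBNs are treated as opaque integers.

```python
import hashlib

CHECKSUM_BITS = 48   # assumed width of the MSB (checksum) portion
DUPLICATE_BITS = 16  # assumed width of the LSB (duplicate) portion


def checksum_of(data: bytes) -> int:
    """Fingerprint the extent data. Truncated SHA-256 is an assumption."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[: CHECKSUM_BITS // 8], "big")


def make_extent_id(checksum: int, duplicate: int) -> int:
    """Pack checksum bits (MSB) and duplicate bits (LSB) into one extent ID."""
    return (checksum << DUPLICATE_BITS) | duplicate


def split_extent_id(extent_id: int):
    """Return (checksum bits, duplicate bits) of an extent ID."""
    return extent_id >> DUPLICATE_BITS, extent_id & ((1 << DUPLICATE_BITS) - 1)


class EmapDatabase:
    """Toy EMAP: key = extent ID, value = PVBN (hypothetical in-memory model)."""

    def __init__(self, checksum_fn=checksum_of):
        self.checksum_fn = checksum_fn
        self.entries = {}   # extent ID -> PVBN
        self.blocks = {}    # PVBN -> data, standing in for physical storage
        self.next_pvbn = 0

    def insert(self, data: bytes) -> int:
        """Store data under the first unused duplicate value for its checksum."""
        checksum = self.checksum_fn(data)
        for duplicate in range(1 << DUPLICATE_BITS):
            extent_id = make_extent_id(checksum, duplicate)
            if extent_id not in self.entries:
                pvbn = self.next_pvbn
                self.next_pvbn += 1
                self.blocks[pvbn] = data
                self.entries[extent_id] = pvbn
                return extent_id
        raise RuntimeError("duplicate bits exhausted for this checksum")

    def scan_duplicates(self):
        """List (earlier_id, later_id) pairs whose data is byte-identical."""
        pairs = []
        for extent_id in sorted(self.entries):
            checksum, duplicate = split_extent_id(extent_id)
            # Nonzero duplicate bits mean the checksum reoccurs, so the data
            # is only potentially identical and must be compared.
            for earlier in range(duplicate):
                other = make_extent_id(checksum, earlier)
                if other in self.entries and (
                    self.blocks[self.entries[other]]
                    == self.blocks[self.entries[extent_id]]
                ):
                    pairs.append((other, extent_id))
                    break
        return pairs
```

In this sketch, inserting the same 4 KiB buffer twice yields two extent IDs with equal checksum bits and duplicate bits 0 and 1, and `scan_duplicates()` reports that pair as a deduplication candidate. Two different buffers that happen to share a checksum also receive consecutive duplicate values, but the byte comparison keeps them apart. What happens to a confirmed duplicate (for example, remapping it to a shared PVBN) is left out because the abstract does not describe it.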
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.