Systems and methods for data backup using data binning and deduplication
US10678654B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 25, 2017 |
| Grant date | Jun 9, 2020 |
| Priority date | — |
| Expiry date | Jul 31, 2038 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F2201/84
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Disclosed are methods and systems for performing data backup which implement data binning using log-structured merge (LSM) trees during deduplication. An exemplary method includes: calculating a reduced hash value (RHV) associated with each of a plurality of data blocks; partitioning the plurality of reduced hash values into groups; selecting a representative hash value for each group; determining whether the representative hash value occurs in a first LSM tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the RHVs in the representative hash value's group into volatile memory; comparing each of the RHVs to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, which references a data block in a segment store.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.