Patent · US Active

Systems and methods for data backup using data binning and deduplication

US10678654B2 · kind B2 · utility

Cited by: 9
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 25, 2017
Grant date: Jun 9, 2020
Priority date
Expiry date: Jul 31, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2201/84
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Disclosed are methods and systems for performing data backup which implement data binning using log-structured merge (LSM) trees during deduplication. An exemplary method includes: calculating a reduced hash value (RHV) associated with each of a plurality of data blocks; partitioning the plurality of reduced hash values into groups; selecting a representative hash value for each group; determining whether the representative hash value occurs in a first LSM tree, the first LSM tree stored in a volatile memory; and when the representative hash value occurs in the first LSM tree: loading the RHVs in the representative hash value's group into volatile memory; comparing each of the RHVs to one or more hash values in a second LSM tree to identify a matching hash value; and writing a segment identifier (ID) corresponding to the matching hash value in an archive, which references a data block in a segment store.
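
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two LSM trees are stood in for by a plain set and dict, the reduced hash is assumed to be a truncated SHA-256 digest, groups are assumed to be fixed-size runs of consecutive blocks, and the representative is assumed to be the smallest RHV in each group. The abstract does not say what happens when no match is found, so the fallback of storing the block as new is also an assumption. All function and parameter names are hypothetical.

```python
import hashlib


def reduced_hash(block: bytes, n_bytes: int = 8) -> bytes:
    """Assumed RHV: the first n_bytes of the block's SHA-256 digest."""
    return hashlib.sha256(block).digest()[:n_bytes]


def partition(items, group_size):
    """Assumed grouping: fixed-size runs of consecutive items."""
    return [items[i:i + group_size] for i in range(0, len(items), group_size)]


def backup(blocks, first_tree, second_tree, segment_store, group_size=4):
    """Return an archive: one segment ID per input block.

    first_tree    -- set of representative hashes (stand-in for the in-memory LSM tree)
    second_tree   -- dict mapping RHV -> segment ID (stand-in for the second LSM tree)
    segment_store -- list of stored blocks; the list index serves as the segment ID
    """
    archive = []
    pairs = [(reduced_hash(b), b) for b in blocks]
    for group in partition(pairs, group_size):
        # Assumed selection rule: the smallest RHV represents the group.
        representative = min(rhv for rhv, _ in group)
        bin_hit = representative in first_tree
        for rhv, block in group:
            # The second tree is consulted only when the representative matched.
            seg_id = second_tree.get(rhv) if bin_hit else None
            if seg_id is None:
                # Assumed fallback (not stated in the abstract): store as a new block.
                seg_id = len(segment_store)
                segment_store.append(block)
                second_tree[rhv] = seg_id
            archive.append(seg_id)
        first_tree.add(representative)
    return archive


if __name__ == "__main__":
    first, second, store = set(), {}, []
    data = [b"a", b"b", b"c", b"d"]
    print(backup(data, first, second, store))  # first pass stores every block
    print(backup(data, first, second, store))  # second pass reuses segment IDs
    print(len(store))
```

In this sketch, checking only the representative keeps the per-group lookup cheap, at the cost of missing duplicates whose group representative has not been seen before.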

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.