Patent · US Active

Methods for optimized variable-size deduplication using two stage content-defined chunking and devices thereof

US10866928B2 · kind B2 · utility

Cited by: 0
References: 7
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 14, 2019
Grant date: Dec 15, 2020
Priority date
Expiry date: Jul 28, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/152
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, non-transitory machine readable media, and computing devices that compare a hash value to a predefined value for sliding windows in parallel for segments partitioned from an input data stream. A bit array is parsed according to minimum and maximum chunk sizes to identify chunk boundaries for the input data stream. The bit array is populated based on a result of the comparison and portions of the bit array are parsed in parallel. Unique chunks of the input data stream defined by the chunk boundaries are stored in a storage device. Accordingly, this technology utilizes parallel processing in two stages. In a first stage, rolling window based hashing is performed concurrently to identify potential chunk boundaries. In a second stage, actual chunk boundaries are selected based on minimum and maximum chunk size constraints. This technology advantageously facilitates significant deduplication ratio improvement as well as improved parallel chunking performance.
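
The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name and constant here (WINDOW, MASK, TARGET, BASE, MOD, the polynomial rolling hash, the SHA-256 chunk index, and the in-memory dict standing in for a storage device) is an assumption chosen for illustration. The bit array holds one byte per flag for readability, and stage 2 is run sequentially here, whereas the abstract says portions of the bit array are parsed in parallel. CPython threads also do not run this pure-Python hashing truly concurrently; the executor only shows how the segments are split.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

WINDOW = 16            # sliding-window width in bytes (assumed)
MASK = 0x3F            # compare the low 6 bits of the hash (assumed)
TARGET = 0             # the "predefined value" to compare against (assumed)
BASE = 257             # polynomial rolling-hash base (assumed)
MOD = (1 << 61) - 1    # rolling-hash modulus (assumed)


def mark_candidates(data, start, end):
    """Stage 1 for one segment: positions p in [start, end) whose
    window data[p-WINDOW+1 .. p] hashes to the target value."""
    hits = []
    top = pow(BASE, WINDOW - 1, MOD)
    lo = max(start - WINDOW + 1, 0)  # warm up on bytes before the segment
    h = 0
    for i in range(lo, end):
        if i - lo >= WINDOW:
            # Drop the byte that just left the window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + data[i]) % MOD
        if i >= start and i >= WINDOW - 1 and (h & MASK) == TARGET:
            hits.append(i)
    return hits


def candidate_bits(data, segment_size):
    """Populate the bit array by hashing all segments via a thread pool."""
    bits = bytearray(len(data))  # one byte per flag, for readability
    ranges = [(s, min(s + segment_size, len(data)))
              for s in range(0, len(data), segment_size)]
    with ThreadPoolExecutor() as pool:
        for hits in pool.map(lambda r: mark_candidates(data, *r), ranges):
            for p in hits:
                bits[p] = 1
    return bits


def select_boundaries(bits, min_size, max_size):
    """Stage 2: choose actual chunk ends (exclusive offsets) from the
    candidates, honoring the minimum and maximum chunk sizes."""
    cuts, start, n = [], 0, len(bits)
    while start < n:
        limit = min(start + max_size, n)
        cut = limit  # forced cut at max size or at end of data
        for p in range(start + min_size - 1, limit):
            if bits[p]:
                cut = p + 1
                break
        cuts.append(cut)
        start = cut
    return cuts


def deduplicate(data, segment_size=4096, min_size=32, max_size=256):
    """Chunk the data and keep each unique chunk once, keyed by SHA-256."""
    bits = candidate_bits(data, segment_size)
    store, recipe, start = {}, [], 0
    for cut in select_boundaries(bits, min_size, max_size):
        chunk = data[start:cut]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first copy
        recipe.append(digest)            # order needed to rebuild the stream
        start = cut
    return store, recipe


if __name__ == "__main__":
    payload = b"example input stream " * 500
    store, recipe = deduplicate(payload)
    print(len(recipe), "chunks,", len(store), "unique")
```

Because each window hash depends only on the bytes inside that window, the candidate positions do not change with how the stream is split into segments, which is what allows the first stage to run per segment. The size constraints are applied only in the second stage, so a candidate that falls too close to the previous boundary is skipped rather than lost from the bit array.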

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.