Patent · US Active

Content aware chunking for achieving an improved chunk size distribution

US8918375B2 · kind B2 · utility

Cited by: 6
References: 7
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 31, 2011
Grant date: Dec 23, 2014
Priority date
Expiry date: Aug 31, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/1752
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The subject disclosure is directed towards partitioning a file into chunks that satisfy a chunk size restriction, such as maximum and minimum chunk sizes, using a sliding window. For file positions within the chunk size restriction, a signature representative of a window fingerprint is compared with a target pattern, with a chunk boundary candidate identified if matched. Other signatures and patterns are then checked to determine a highest ranking signature (corresponding to a lowest numbered Rule) to associate with that chunk boundary candidate, or set an actual boundary if the highest ranked signature is matched. If the maximum chunk size is reached without matching the highest ranked signature, the chunking mechanism regresses to set the boundary based on the candidate with the next highest ranked signature (if no candidates, the boundary is set at the maximum). Also described is setting chunk boundaries based upon pattern detection (e.g., runs of zeros).
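
The regression behavior summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The polynomial rolling hash, the window length, the all-ones target pattern, the nested bit masks used as ranked rules, and the choice to keep the latest candidate per rule are all assumptions made for illustration, and the function name chunk_boundaries is hypothetical. The pattern-detection case (e.g., runs of zeros) is not covered.

```python
from typing import List, Optional

WINDOW = 16          # sliding-window length in bytes (assumed value)
BASE = 257           # polynomial rolling-hash base (assumed value)
MOD = (1 << 31) - 1  # rolling-hash modulus (assumed value)


def chunk_boundaries(data: bytes, min_size: int, max_size: int,
                     top_bits: int = 8, num_rules: int = 3) -> List[int]:
    """Return the exclusive end offset of each chunk of `data`."""
    if not (0 < min_size <= max_size) or not (1 <= num_rules <= top_bits):
        raise ValueError("invalid chunking parameters")
    # Rule 1 (index 0) tests the most bits. Each later rule tests one bit
    # fewer, so it is lower ranked and matches more often (assumed scheme).
    masks = [(1 << (top_bits - r)) - 1 for r in range(num_rules)]
    drop = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    n = len(data)
    boundaries: List[int] = []
    start = 0
    while start < n:
        limit = min(start + max_size, n)
        # Latest boundary candidate seen for each lower-ranked rule.
        candidates: List[Optional[int]] = [None] * num_rules
        cut: Optional[int] = None
        h = 0
        for pos in range(start, limit):
            # Slide the window: drop the oldest byte, append the newest.
            if pos - start >= WINDOW:
                h = (h - data[pos - WINDOW] * drop) % MOD
            h = (h * BASE + data[pos]) % MOD
            if pos + 1 - start < min_size:
                continue  # below the minimum chunk size: no boundary allowed
            # The first matching mask is the highest-ranked matching rule.
            for rank, mask in enumerate(masks):
                if h & mask == mask:  # target pattern: all ones (assumed)
                    if rank == 0:
                        cut = pos + 1  # highest rank: actual boundary
                    else:
                        candidates[rank] = pos + 1
                    break
            if cut is not None:
                break
        if cut is None:
            if limit == n:
                cut = n  # end of input closes the final chunk
            else:
                # Maximum size reached: regress to the best-ranked candidate,
                # or cut at the maximum if there is no candidate at all.
                cut = next((c for c in candidates if c is not None), limit)
        boundaries.append(cut)
        start = cut
    return boundaries
```

In this sketch every chunk except possibly the last falls between min_size and max_size. Input that never matches any rule, such as a long run of zero bytes under the assumed hash, is simply cut at max_size each time, which is the situation the regression and pattern-detection steps in the abstract are meant to improve on.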

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.