Patent · US Active

Extensible pipeline for data deduplication

US8380681B2 · kind B2 · utility

Cited by: 30
References: 6
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 16, 2010
Grant date: Feb 19, 2013
Priority date
Expiry date: Feb 10, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/13
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The subject disclosure is directed towards data deduplication (optimization) performed by phases/modules of a modular data deduplication pipeline. At each phase, the pipeline allows modules to be replaced, selected or extended, e.g., different algorithms can be used for chunking or compression based upon the type of data being processed. The pipeline facilitates secure data processing, batch processing, and parallel processing. The pipeline is tunable based upon feedback, e.g., by selecting modules to increase deduplication quality, performance and/or throughput. Also described is selecting, filtering, ranking, sorting and/or grouping the files to deduplicate, e.g., based upon properties and/or statistical properties of the files and/or a file dataset and/or internal or external feedback.
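
The modular idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `DedupPipeline`, `fixed_size_chunker`, `line_chunker`, and `select_chunker`, the 4096-byte chunk size, and the use of SHA-256 and zlib are all assumptions chosen for illustration. It shows only how chunking and compression phases might be swapped per data type while the deduplication step stays the same.

```python
import hashlib
import zlib
from typing import Callable, Dict, List

# Hypothetical phase signatures (illustrative names, not taken from the patent).
Chunker = Callable[[bytes], List[bytes]]
Compressor = Callable[[bytes], bytes]


def fixed_size_chunker(size: int) -> Chunker:
    """Build a chunker that splits data into pieces of at most `size` bytes."""
    def chunk(data: bytes) -> List[bytes]:
        return [data[i:i + size] for i in range(0, len(data), size)]
    return chunk


def line_chunker(data: bytes) -> List[bytes]:
    """Split on line boundaries, keeping each line terminator with its chunk."""
    return data.splitlines(keepends=True)


def select_chunker(filename: str) -> Chunker:
    """Pick a chunking module by file type (an assumed, simplistic policy)."""
    if filename.endswith((".txt", ".log")):
        return line_chunker
    return fixed_size_chunker(4096)


class DedupPipeline:
    """Chunk -> hash -> store-if-new, with replaceable chunk/compress modules."""

    def __init__(self, chunker: Chunker, compressor: Compressor = zlib.compress):
        self.chunker = chunker
        self.compressor = compressor
        self.store: Dict[str, bytes] = {}  # digest -> compressed chunk

    def process(self, data: bytes) -> List[str]:
        """Return the list of chunk digests needed to rebuild `data`."""
        recipe = []
        for chunk in self.chunker(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.store:  # only previously unseen chunks are stored
                self.store[digest] = self.compressor(chunk)
            recipe.append(digest)
        return recipe


pipeline = DedupPipeline(fixed_size_chunker(4))
recipe = pipeline.process(b"abcdabcdabcdxyz")
print(len(recipe), len(pipeline.store))
```

Here the input splits into four chunks, three of which are identical, so the store holds two entries. Passing a different chunker or compressor to the constructor changes one phase without touching the others, which is the kind of substitution the abstract describes.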

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.