Patent · US · Expired

Detecting duplicate and near-duplicate files

US6658423B1 · kind B1 · utility

Cited by: 549
References: 6
Claims: 38
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 24, 2001
Grant date: Dec 2, 2003
Priority date
Expiry date: Jan 6, 2022

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Improved duplicate and near-duplicate detection techniques may assign a number of fingerprints to a given document by (i) extracting parts from the document, (ii) assigning the extracted parts to one or more of a predetermined number of lists, and (iii) generating a fingerprint from each of the populated lists. Two documents may be considered to be near-duplicates if any one of their fingerprints matches.
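
The three steps in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the abstract does not say what a "part" is, how parts are assigned to lists, or which hash is used. Here parts are assumed to be lowercased whitespace-separated words, each part is assigned to exactly one list by a hash modulo the list count, and fingerprints are truncated SHA-1 digests. The names `stable_hash`, `fingerprints`, `near_duplicates`, and `num_lists` are hypothetical.

```python
import hashlib
from typing import List, Set, Tuple


def stable_hash(text: str) -> int:
    """64-bit hash that is stable across runs (built-in hash() is salted)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprints(document: str, num_lists: int = 4) -> Set[Tuple[int, int]]:
    """Return one (list index, fingerprint) pair per populated list."""
    # (i) Extract parts. Assumption: parts are lowercased words.
    parts = document.lower().split()

    # (ii) Assign each part to one of a predetermined number of lists.
    # Assumption: the list is chosen by hashing the part itself.
    lists: List[List[str]] = [[] for _ in range(num_lists)]
    for part in parts:
        lists[stable_hash(part) % num_lists].append(part)

    # (iii) Generate a fingerprint from each populated list.
    result: Set[Tuple[int, int]] = set()
    for index, members in enumerate(lists):
        if members:  # empty lists produce no fingerprint
            result.add((index, stable_hash(" ".join(members))))
    return result


def near_duplicates(doc_a: str, doc_b: str, num_lists: int = 4) -> bool:
    """Two documents are near-duplicates if any one fingerprint matches."""
    prints_a = fingerprints(doc_a, num_lists)
    prints_b = fingerprints(doc_b, num_lists)
    return not prints_a.isdisjoint(prints_b)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog near the river bank"
    edited = original.replace("lazy", "sleepy")
    print(near_duplicates(original, original))
    print(len(fingerprints(original) & fingerprints(edited)))
```

In this sketch, replacing a single word changes at most two lists (the one the old word left and the one the new word joined), so the fingerprints of every other populated list still match. That is why a match on any one fingerprint can tolerate small edits, whereas a single whole-document hash cannot.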

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.