Patent · US · Expired

Method for duplicate detection and suppression

US7603370B2 · kind B2 · utility

Cited by: 12
References: 8
Claims: 15
Family size: 0

Assignee

Inventor

Key dates

Filing date: Mar 22, 2004
Grant date: Oct 13, 2009
Priority date
Expiry date: Jun 16, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method detects similar objects in a collection of such objects by modification of a previous method in such a way that per-object memory requirements are reduced while false detections are avoided approximately as well as in the previous method. The modification includes (i) combining k samples of features into s supersamples, the value of k being reduced from the corresponding value used in the previous method; (ii) recording each supersample to b bits of precision, the value of b being reduced from the corresponding value used in the previous method; and (iii) requiring l matching supersamples in order to conclude that the two objects are sufficiently similar, the value of l being greater than the corresponding value required in the previous method. One application of the invention is in association with a web search engine query service to determine clusters of query results that are near-duplicate documents.
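The roles of s, k, b and l in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how features are extracted or hashed, so the word shingling, the seeded BLAKE2b hashes, the min-hash sampling and the default values (s=4, k=2, b=16, l=2) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import hashlib
from typing import List, Set


def shingles(text: str, w: int = 3) -> Set[str]:
    """Features of a document: overlapping w-word shingles (an assumption)."""
    words = text.lower().split()
    if len(words) < w:
        return {" ".join(words)}
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}


def seeded_hash(seed: int, data: str) -> int:
    """64-bit hash of data; each seed acts as a separate hash function."""
    digest = hashlib.blake2b(
        data.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def sketch(text: str, s: int = 4, k: int = 2, b: int = 16) -> List[int]:
    """Build s supersamples, each combining k samples, kept to b bits."""
    feats = shingles(text)
    mask = (1 << b) - 1
    supersamples = []
    for i in range(s):
        # k min-hash samples, each taken under its own seeded hash function.
        samples = [min(seeded_hash(i * k + j, f) for f in feats) for j in range(k)]
        # Combine the k samples into one supersample, then truncate to b bits.
        combined = seeded_hash(10_000 + i, ",".join(map(str, samples)))
        supersamples.append(combined & mask)
    return supersamples


def similar(sketch_a: List[int], sketch_b: List[int], l: int = 2) -> bool:
    """Declare two objects similar when at least l supersamples match."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches >= l


doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
doc_b = "completely unrelated sentence about compilers parsers and type systems today"
print(similar(sketch(doc_a), sketch(doc_a)))  # identical documents always match
print(similar(sketch(doc_a), sketch(doc_b)))
```

In this sketch each document costs s × b bits of storage. Lowering k or b makes a single supersample more likely to match by chance, and raising l is what compensates for that, which is the trade-off the abstract describes.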

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.