Patent · US Expired

Detecting query-specific duplicate documents

US7779002B1 · kind B1 · utility

Cited by: 61
References: 9
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 24, 2003
Grant date: Aug 17, 2010
Priority date
Expiry date: May 19, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99937
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described. Before comparing two documents for similarity, the content of these documents may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as “snippets”) is extracted from the documents and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.
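
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how snippets are extracted or how similarity is scored, so the fixed token window around query terms, the Jaccard measure, the 0.8 threshold, and all function names below are assumptions chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_snippet(doc, query, window=3):
    """Return the tokens within `window` positions of any query term.

    Assumption: a "snippet" is approximated as a fixed-size token window
    around each query-term occurrence.
    """
    tokens = tokenize(doc)
    terms = set(tokenize(query))
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in terms:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            keep.update(range(start, end))
    return [tokens[i] for i in sorted(keep)]


def jaccard(a, b):
    """Jaccard similarity of two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0  # neither document has query-relevant text
    return len(sa & sb) / len(sa | sb)


def are_query_duplicates(doc_a, doc_b, query, window=3, threshold=0.8):
    """Compare only the query-relevant snippets, not the whole documents."""
    snippet_a = extract_snippet(doc_a, query, window)
    snippet_b = extract_snippet(doc_b, query, window)
    return jaccard(snippet_a, snippet_b) >= threshold


doc_a = ("Site A news header. The quick brown fox jumps over the lazy dog "
         "today. Footer for site A with links.")
doc_b = ("Completely different banner text here. The quick brown fox jumps "
         "over the lazy dog today. Other unrelated trailing material.")

print(are_query_duplicates(doc_a, doc_b, "fox"))
print(jaccard(tokenize(doc_a), tokenize(doc_b)))
```

In this example the two pages share the passage that matches the query "fox" but differ in their surrounding text, so the snippet comparison flags them as duplicates for that query even though a whole-document comparison scores them well below the threshold.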

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.