Patent · US Active

Optimizing the performance of duplicate identification by content

US7617195B2 · kind B2 · utility

Cited by: 13
References: 16
Claims: 23
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 28, 2007
Grant date: Nov 10, 2009
Priority date
Expiry date: Apr 8, 2028

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99936
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In accordance with the disclosure, there is provided a method for identifying duplicate documents, comprising drafting a first document and creating a near unique representative string (NRS) based on the document content. The method further comprises searching for other documents with the same NRS and selectively assigning a duplicate group identification (DGI) to the first document: the DGI is unique if no NRS matches are found; otherwise it is the same as the DGI of the associated duplicate document whose NRS matches. The method further comprises placing the DGI into the metadata of the first document and, on user demand, recalling a list of duplicates of a particular document by searching the metadata and selecting documents that have the same DGI.
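
The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how the near unique representative string is computed, so the sketch assumes a SHA-256 digest of whitespace- and case-normalized content. The names `Document`, `DuplicateIndex`, `make_nrs`, `add`, and `duplicates_of` are hypothetical, as are the in-memory index and the integer group identifiers.

```python
import hashlib
import itertools
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    content: str
    metadata: dict = field(default_factory=dict)


class DuplicateIndex:
    """Hypothetical in-memory index mapping an NRS to a duplicate group ID."""

    def __init__(self):
        self._dgi_by_nrs = {}                 # NRS -> DGI
        self._docs = []                       # all documents seen so far
        self._next_dgi = itertools.count(1)   # source of fresh, unique DGIs

    @staticmethod
    def make_nrs(content: str) -> str:
        # Assumption: the NRS is a hash of normalized content. The abstract
        # only requires a "near unique representative string".
        normalized = " ".join(content.split()).lower()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def add(self, doc: Document) -> int:
        nrs = self.make_nrs(doc.content)
        dgi = self._dgi_by_nrs.get(nrs)       # search for a matching NRS
        if dgi is None:
            # No match found: assign a new, unique DGI.
            dgi = next(self._next_dgi)
            self._dgi_by_nrs[nrs] = dgi
        # Match or not, record the DGI in the document's metadata.
        doc.metadata["dgi"] = dgi
        self._docs.append(doc)
        return dgi

    def duplicates_of(self, doc: Document) -> list:
        # Recall duplicates by comparing the DGI stored in metadata,
        # without recomputing or comparing document content.
        dgi = doc.metadata["dgi"]
        return [d for d in self._docs
                if d is not doc and d.metadata.get("dgi") == dgi]
```

In this sketch the content comparison happens once, when a document is added. Later duplicate lookups read only the stored DGI, which appears to be the performance benefit the title refers to.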

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.