Optimizing the performance of duplicate identification by content
US7617195B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 28, 2007 |
| Grant date | Nov 10, 2009 |
| Priority date | — |
| Expiry date | Apr 8, 2028 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99936
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In accordance with the disclosure, there is provided a method for identifying duplicate documents comprising drafting a first document and creating a near unique representative string (NRS) based on the document content. The method further comprises searching for other documents with the same NRS and selectively assigning a duplicate group identification (DGI) to the first document: the DGI is unique if no NRS matches are found; otherwise it is the same as the DGI of the matching duplicate document. The method further comprises placing the DGI into the metadata of the first document and, on user demand, recalling a list of duplicates of a particular document by searching the metadata and selecting documents with the same DGI.
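The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how the near unique representative string is computed, so the sketch assumes a SHA-256 digest of whitespace- and case-normalized content. The names `Document`, `near_unique_string`, `DuplicateIndex`, and the `"dgi"` metadata key are hypothetical and chosen only for illustration.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class Document:
    content: str
    metadata: dict = field(default_factory=dict)


def near_unique_string(content: str) -> str:
    # Assumption: the abstract does not define the NRS; a SHA-256 digest of
    # normalized text stands in for it here.
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateIndex:
    """Hypothetical in-memory store that assigns duplicate group IDs."""

    def __init__(self):
        self._dgi_by_nrs = {}  # NRS -> duplicate group identification
        self._docs = []        # every document seen so far

    def add(self, doc: Document) -> str:
        nrs = near_unique_string(doc.content)
        # Search for an earlier document with the same NRS.
        dgi = self._dgi_by_nrs.get(nrs)
        if dgi is None:
            # No match: assign a new, unique DGI.
            dgi = uuid.uuid4().hex
            self._dgi_by_nrs[nrs] = dgi
        # Place the DGI into the document's metadata.
        doc.metadata["dgi"] = dgi
        self._docs.append(doc)
        return dgi

    def duplicates_of(self, doc: Document) -> list:
        # Recall duplicates on demand by scanning metadata for the same DGI.
        dgi = doc.metadata.get("dgi")
        return [d for d in self._docs
                if d is not doc and d.metadata.get("dgi") == dgi]
```

In this sketch, adding `Document("Hello  World")` and then `Document("hello world")` to the same `DuplicateIndex` gives both documents the same DGI, because they normalize to identical text. Recall only compares the stored DGI, so no content is re-hashed at query time.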
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.