Patent · US Active

Similar document detection and electronic discovery

US9208219B2 · kind B2 · utility

Cited by: 8
References: 0
Claims: 24
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 8, 2013
Grant date: Dec 8, 2015
Priority date
Expiry date: Jan 4, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/9535
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are disclosed for performing duplicate document analyses to identify textually identical or similar documents, which may be electronic documents stored within an electronic discovery platform. A process is described which includes representing each of the documents, including a target document, as a relatively large n-tuple vector and also as a relatively small m-tuple vector, performing a series of one-dimensional searches on the set of m-tuple vectors to identify a set of documents which are near-duplicates of the target document, and then filtering the set of near-duplicate documents based upon the distance of their n-tuple vectors from that of the target document.
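
The two-stage process described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method. The abstract does not say how the vectors are built or which distance is used, so every detail here is an assumption: the n-tuple is a hashed word-count vector of 64 components, the m-tuple sums blocks of those components into 4 values, each one-dimensional search is a range query on a sorted coordinate, and the final filter uses L1 (Manhattan) distance. The names `n_tuple`, `m_tuple`, and `find_near_duplicates` are hypothetical.

```python
import bisect
import zlib
from collections import Counter

N = 64  # assumed size of the large n-tuple
M = 4   # assumed size of the small m-tuple (N must be divisible by M)


def n_tuple(text):
    """Large vector: word counts hashed into N buckets (an assumption)."""
    vec = [0] * N
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode("utf-8")) % N] += count
    return vec


def m_tuple(big):
    """Small vector: sum each block of N // M consecutive components."""
    step = N // M
    return [sum(big[i:i + step]) for i in range(0, N, step)]


def l1(a, b):
    """L1 (Manhattan) distance, assumed here as the document distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_near_duplicates(docs, target_id, threshold):
    """Return ids of documents within `threshold` of the target document."""
    big = {doc_id: n_tuple(text) for doc_id, text in docs.items()}
    small = {doc_id: m_tuple(vec) for doc_id, vec in big.items()}

    # Stage 1: one range search per m-tuple coordinate, intersecting results.
    candidates = set(docs) - {target_id}
    for dim in range(M):
        index = sorted((small[doc_id][dim], doc_id) for doc_id in docs)
        keys = [key for key, _ in index]
        center = small[target_id][dim]
        lo = bisect.bisect_left(keys, center - threshold)
        hi = bisect.bisect_right(keys, center + threshold)
        candidates &= {doc_id for _, doc_id in index[lo:hi]}

    # Stage 2: filter the survivors by distance between the n-tuple vectors.
    return sorted(
        doc_id for doc_id in candidates
        if l1(big[doc_id], big[target_id]) <= threshold
    )
```

Under these assumptions each m-tuple component is a sum of n-tuple components, so two documents whose n-tuples lie within an L1 distance of `threshold` differ by at most `threshold` in every m-tuple coordinate. The coarse searches therefore never discard a document that the final filter would accept. A production system would build the sorted coordinate indexes once rather than per query.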

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.