Patent · US · Active

Efficient computation of document similarity

US7610281B2 · kind B2 · utility

Cited by: 11
References: 1
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 29, 2006
Grant date: Oct 27, 2009
Priority date
Expiry date: Jul 18, 2027

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, methodologies, media, and other embodiments associated with efficiently computing document similarity are described. One exemplary system embodiment includes logic to produce a gram from a string and logic to identify candidate documents based on identifying matches between query grams and document grams stored in an inverted index that relates grams to documents. The example system may also include logic to selectively partially reconstruct a candidate document from entries in the inverted index and logic to compute an edit distance between a string associated with a query and a string associated with the partially reconstructed candidate document. The example system may also include a signal logic configured to provide a signal corresponding to the edit distance.
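
The pipeline the abstract describes (grams, inverted index, candidate selection, partial reconstruction, edit distance) might be sketched roughly as follows. This is an illustrative reading only, not the patented implementation: the gram length of 3, the positional postings, the `min_shared` threshold, and all function names are assumptions introduced here, and the edit distance shown is plain Levenshtein distance.

```python
from collections import defaultdict

N = 3  # gram length; an assumption, the abstract does not fix a size


def grams(text, n=N):
    """Yield (position, gram) pairs for every length-n substring."""
    for i in range(len(text) - n + 1):
        yield i, text[i:i + n]


def build_index(docs, n=N):
    """Inverted index: gram -> list of (doc_id, position) postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, gram in grams(text, n):
            index[gram].append((doc_id, pos))
    return index


def candidates(index, query, min_shared=2, n=N):
    """Doc ids sharing at least min_shared distinct grams with the query."""
    shared = defaultdict(set)
    for _, gram in grams(query, n):
        for doc_id, _ in index.get(gram, ()):
            shared[doc_id].add(gram)
    return sorted(d for d, g in shared.items() if len(g) >= min_shared)


def reconstruct(index, doc_id, start, end):
    """Rebuild characters [start, end) of a document from postings alone.

    Scans the whole index for simplicity; a real system would presumably
    keep per-document postings to avoid this.
    """
    chars = {}
    for gram, postings in index.items():
        for d, pos in postings:
            if d == doc_id and pos < end and pos + len(gram) > start:
                for k, ch in enumerate(gram):
                    if start <= pos + k < end:
                        chars[pos + k] = ch
    return "".join(chars[i] for i in sorted(chars))


def edit_distance(a, b):
    """Levenshtein distance via a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    docs = {"d1": "efficient document similarity",
            "d2": "inverted index of grams"}
    index = build_index(docs)
    for doc_id in candidates(index, "documnet similarity"):
        fragment = reconstruct(index, doc_id, 10, 18)
        print(doc_id, fragment, edit_distance("documnet", fragment))
```

In this sketch the candidate document is never read directly: the fragment compared against the query string is rebuilt only from index entries, which is one way to interpret "partially reconstruct a candidate document from entries in the inverted index".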

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.