Efficient computation of document similarity
US7610281B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 29, 2006 |
| Grant date | Oct 27, 2009 |
| Priority date | — |
| Expiry date | Jul 18, 2027 |
Classification
- Technology area (CPC Y): Emerging Cross-Sectional Technologies
- CPC primary: Y10S707/99943
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems, methodologies, media, and other embodiments associated with efficiently computing document similarity are described. One exemplary system embodiment includes logic to produce a gram from a string and logic to identify candidate documents based on identifying matches between query grams and document grams stored in an inverted index that relates grams to documents. The example system may also include logic to selectively partially reconstruct a candidate document from entries in the inverted index and logic to compute an edit distance between a string associated with a query and a string associated with the partially reconstructed candidate document. The example system may also include a signal logic configured to provide a signal corresponding to the edit distance.
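The pipeline the abstract outlines (grams, inverted index lookup, partial reconstruction, edit distance) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes character trigrams, an index that stores each gram's position in the document, a hypothetical `min_shared` threshold for candidate selection, and a placeholder character for positions that cannot be recovered from the index. None of these details come from the abstract, and all function names are illustrative.

```python
from collections import defaultdict


def grams(text, n=3):
    """Return (position, gram) pairs for every character n-gram in text."""
    return [(i, text[i:i + n]) for i in range(len(text) - n + 1)]


def build_index(docs, n=3):
    """Build an inverted index mapping gram -> list of (doc_id, position).

    Storing positions is an assumption; it is what makes partial
    reconstruction from index entries possible in this sketch.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, gram in grams(text, n):
            index[gram].append((doc_id, pos))
    return index


def candidates(index, query, n=3, min_shared=2):
    """Return doc ids sharing at least min_shared distinct grams with query."""
    counts = defaultdict(int)
    for gram in {g for _, g in grams(query, n)}:
        for doc_id in {d for d, _ in index.get(gram, [])}:
            counts[doc_id] += 1
    return {d for d, c in counts.items() if c >= min_shared}


def partial_reconstruct(index, doc_id, query, n=3, fill="\0"):
    """Rebuild only the span of a document covered by grams shared with query.

    Positions inside the span that no shared gram covers are filled with
    the placeholder character `fill` (an assumption of this sketch).
    """
    chars = {}
    for gram in {g for _, g in grams(query, n)}:
        for d, pos in index.get(gram, []):
            if d == doc_id:
                for k, ch in enumerate(gram):
                    chars[pos + k] = ch
    if not chars:
        return ""
    lo, hi = min(chars), max(chars)
    return "".join(chars.get(i, fill) for i in range(lo, hi + 1))


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def rank(docs, query, n=3):
    """Score each candidate by edit distance to its partial reconstruction."""
    index = build_index(docs, n)
    scored = [(edit_distance(query, partial_reconstruct(index, d, query, n)), d)
              for d in candidates(index, query, n)]
    return sorted(scored)
```

Calling `rank({"a": "efficient document similarity", "b": "banana bread recipe"}, "document similarty")` would score only document `a`, since `b` shares no trigram with the query. The point of the design is that the edit distance runs on a short string rebuilt from index entries rather than on the full stored document.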
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.