Method for computing similarity between text spans using factored word sequence kernels
US8077984B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 4, 2008 |
| Grant date | Dec 13, 2011 |
| Priority date | — |
| Expiry date | Oct 14, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/274
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer implemented method and an apparatus for comparing spans of text are disclosed. The method includes computing a similarity measure between a first sequence of symbols representing a first text span and a second sequence of symbols representing a second text span as a function of the occurrences of optionally noncontiguous subsequences of symbols shared by the two sequences of symbols. Each of the symbols comprises at least one consecutive word and is defined according to a set of linguistic factors. Pairs of symbols in the first and second sequences that form a shared subsequence of symbols are each matched according to at least one of the factors.
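The similarity measure described in the abstract can be illustrated with a gap-weighted subsequence kernel in which symbols are compared factor by factor. The sketch below is not the patented implementation. It assumes each symbol is a tuple of factor values (for example surface form, lemma, part of speech), that two symbols match with a strength equal to the sum of the weights of the factors on which they agree, and that gaps are penalised by a decay parameter `lam`. The names `symbol_match`, `factored_kernel`, `weights`, and `lam` are hypothetical and chosen for illustration only.

```python
from typing import Sequence, Tuple

# Assumption: a symbol is a tuple with one value per linguistic factor,
# e.g. (surface form, lemma, part-of-speech tag).
Symbol = Tuple[str, ...]


def symbol_match(a: Symbol, b: Symbol, weights: Sequence[float]) -> float:
    """Sum the weights of the factors on which the two symbols agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)


def factored_kernel(s: Sequence[Symbol], t: Sequence[Symbol], n: int,
                    lam: float, weights: Sequence[float]) -> float:
    """Gap-weighted similarity over shared subsequences of length n.

    Every pair of length-n index subsequences (one in s, one in t)
    contributes the product of its symbol match scores, multiplied by
    lam raised to the sum of the two spans covered, so noncontiguous
    subsequences count for less than contiguous ones when lam < 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ls, lt = len(s), len(t)
    # Precompute the soft match score for every pair of symbols.
    m = [[symbol_match(a, b, weights) for b in t] for a in s]
    # kp[a][b] is the auxiliary quantity for the prefixes s[:a] and t[:b];
    # at level 0 it is 1 everywhere.
    kp = [[1.0] * (lt + 1) for _ in range(ls + 1)]
    for _ in range(1, n):
        nxt = [[0.0] * (lt + 1) for _ in range(ls + 1)]
        for a in range(1, ls + 1):
            for b in range(1, lt + 1):
                # Either s[a-1] is skipped (one more gap position) ...
                total = lam * nxt[a - 1][b]
                # ... or it is matched against some t[j-1] with j <= b.
                for j in range(1, b + 1):
                    if m[a - 1][j - 1]:
                        total += (m[a - 1][j - 1] * kp[a - 1][j - 1]
                                  * lam ** (b - j + 2))
                nxt[a][b] = total
        kp = nxt
    # Close each subsequence with a final matched pair of symbols.
    return sum(m[a][j] * kp[a][j] * lam ** 2
               for a in range(ls) for j in range(lt))


# Two spans that share no surface forms but agree on lemma and POS.
span_a = [("cats", "cat", "NOUN"), ("sleep", "sleep", "VERB")]
span_b = [("cat", "cat", "NOUN"), ("sleeps", "sleep", "VERB")]
surface_only = factored_kernel(span_a, span_b, 2, 0.5, (1.0, 0.0, 0.0))
lemma_only = factored_kernel(span_a, span_b, 2, 0.5, (0.0, 1.0, 0.0))
```

With surface-only weights the two example spans share no subsequence, whereas weighting the lemma factor lets "cats sleep" and "cat sleeps" match. This is the kind of behaviour the factor-based matching in the abstract is meant to allow. The inner loop over `j` makes this sketch cubic in the span lengths; it favours readability over speed.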
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.