Patent · US Active

Method for computing similarity between text spans using factored word sequence kernels

US8077984B2 · kind B2 · utility

Cited by: 13
References: 4
Claims: 24
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 4, 2008
Grant date: Dec 13, 2011
Priority date:
Expiry date: Oct 14, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/274
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer-implemented method and an apparatus for comparing spans of text are disclosed. The method includes computing a similarity measure between a first sequence of symbols representing a first text span and a second sequence of symbols representing a second text span, as a function of the occurrences of optionally noncontiguous subsequences of symbols shared by the two sequences. Each symbol comprises at least one consecutive word and is defined according to a set of linguistic factors. Pairs of symbols in the first and second sequences that form a shared subsequence are each matched according to at least one of the factors.
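The idea in the abstract can be illustrated with a small sketch. It is not the patented method itself: the abstract gives no formula, so the code below assumes the familiar gap-weighted subsequence kernel and assumes that two symbols are compared by a weighted sum of per-factor agreements. The function names (`symbol_match`, `factored_kernel`), the three factors (surface form, lemma, part of speech), the weights, and the `decay` parameter are all hypothetical choices made for illustration.

```python
from itertools import combinations

def symbol_match(a, b, weights):
    """Soft match between two symbols (tuples of factors).

    Assumption: similarity is the sum of the weights of the factors
    on which the two symbols agree; the abstract only says symbols are
    matched "according to at least one of the factors".
    """
    return sum(w for w, fa, fb in zip(weights, a, b) if fa == fb)

def factored_kernel(s, t, n, decay, weights):
    """Brute-force similarity over shared subsequences of length n.

    Enumerates every pair of (possibly noncontiguous) index tuples of
    length n in s and t. Each pair contributes the product of its
    symbol matches, discounted by decay ** (span in s + span in t),
    so subsequences with gaps count less. Exponential cost: intended
    for short illustrative inputs only.
    """
    total = 0.0
    for i in combinations(range(len(s)), n):
        for j in combinations(range(len(t)), n):
            prod = 1.0
            for a, b in zip(i, j):
                prod *= symbol_match(s[a], t[b], weights)
                if prod == 0.0:
                    break  # one non-matching pair zeroes the whole term
            if prod:
                span_s = i[-1] - i[0] + 1
                span_t = j[-1] - j[0] + 1
                total += prod * decay ** (span_s + span_t)
    return total

# Hypothetical example: symbols are (surface form, lemma, part of speech).
WEIGHTS = (0.5, 0.25, 0.25)
span_a = [("cats", "cat", "N"), ("sleep", "sleep", "V")]
span_b = [("cat", "cat", "N"), ("sleeps", "sleep", "V")]

if __name__ == "__main__":
    print(factored_kernel(span_a, span_b, n=2, decay=0.5, weights=WEIGHTS))
```

In this example the two spans share no surface forms, so a kernel that matched on surface form alone would score them as zero. Matching on lemma and part of speech still yields a nonzero similarity, which is the kind of effect the factored representation is meant to capture.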

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.