Patent · US Active

Phrase matching for document classification

US8401842B1 · kind B1 · utility

Cited by: 36
References: 0
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 11, 2008
Grant date: Mar 19, 2013
Priority date
Expiry date: Jan 18, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/205
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Phrase matching processes for matching phrases comprising a plurality of keywords in document text construct hit lists of the keywords in a document text, and operate on the keywords in either phrase order or without regard to the order of occurrence of the keywords in the phrase. The processes form sorted sets of all keywords, and compare occurrences of the keywords in the sorted sets to a predefined proximity constraint. For unordered phrases, the proximity constraint defines a maximum span between keywords in the highest and lowest positions in the sorted set as MaxSpan=p(k−1), where p is a proximity and k is the number of keywords in the phrase. For ordered phrases, the distances between successive phrase keywords in phrase order must be less than or equal to the proximity p.
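
The two proximity tests described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names (build_hit_lists, match_unordered, match_ordered) are hypothetical, and the sketch assumes whitespace-tokenized text, keyword positions counted in tokens, distinct keywords within a phrase, and strictly increasing positions for ordered phrases. The sliding-window search over the merged sorted hits is one possible way to apply the MaxSpan = p(k−1) constraint; the claims may specify a different procedure.

```python
from collections import defaultdict


def build_hit_lists(tokens, keywords):
    """Map each keyword to the sorted token positions where it occurs."""
    hits = {kw: [] for kw in keywords}
    for pos, tok in enumerate(tokens):
        if tok in hits:
            hits[tok].append(pos)
    return hits


def match_unordered(hits, keywords, p):
    """True if every keyword occurs within a span of at most p * (k - 1)."""
    k = len(keywords)  # assumes the keywords are distinct
    max_span = p * (k - 1)
    # Sorted set of all keyword occurrences: (position, keyword).
    merged = sorted((pos, kw) for kw in keywords for pos in hits[kw])
    counts = defaultdict(int)
    covered = 0  # number of distinct keywords inside the current window
    left = 0
    for pos_r, kw_r in merged:
        counts[kw_r] += 1
        if counts[kw_r] == 1:
            covered += 1
        # Shrink the window from the left while it still holds all keywords.
        while covered == k:
            pos_l, kw_l = merged[left]
            if pos_r - pos_l <= max_span:
                return True
            counts[kw_l] -= 1
            if counts[kw_l] == 0:
                covered -= 1
            left += 1
    return False


def match_ordered(hits, keywords, p):
    """True if the keywords occur in phrase order with successive gaps <= p."""
    # Positions of the current keyword that can end a valid chain so far.
    reachable = list(hits[keywords[0]])
    for kw in keywords[1:]:
        reachable = [
            pos for pos in hits[kw]
            if any(0 < pos - prev <= p for prev in reachable)
        ]
        if not reachable:
            return False
    return bool(reachable)


tokens = "the quick brown fox jumps over the lazy dog".split()
hits = build_hit_lists(tokens, ["quick", "fox", "over", "the", "dog"])
print(match_unordered(hits, ["fox", "quick"], 2))  # span 2 vs. MaxSpan 2
print(match_ordered(hits, ["fox", "quick"], 2))    # wrong order in the text
```

In this sample, "fox" (position 3) and "quick" (position 1) satisfy the unordered constraint with p = 2, since the span of 2 does not exceed MaxSpan = 2(2−1) = 2, but they fail the ordered test because "quick" precedes "fox" in the text.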

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.