Patent · US Active

Statistical stemming

US8554543B2 · kind B2 · utility

17Cited by
12References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateDec 10, 2012
Grant dateOct 8, 2013
Priority date
Expiry dateDec 10, 2032

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06F40/268
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating suffix rewriting rules. A method includes obtaining a plurality of canonical suffix-rewriting rules each associated with one or more words, generating a suffix tree from the words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of final suffix-rewriting rules from the nodes in the minimum colored subset. Another method includes receiving applicable and non-applicable words for a suffix-rewriting rule, generating a suffix tree from the applicable words and the non-applicable words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of suffix-rewriting rules, wherein each rule corresponds to a node in the minimum colored subset with a valid status.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.