Behavior-driven multilingual stemming
US8793120B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 28, 2010 |
| Grant date | Jul 29, 2014 |
| Priority date | — |
| Expiry date | Feb 17, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/268
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
User behavior data can be used with language-specific rule sets to generate stemming databases useful for such tasks as indexing and search query processing. The terms contained in user queries, as well as user behavior with respect to those queries or results returned for those queries, can be analyzed to determine a relative measure (e.g., relative frequency) of various forms of those terms. When generating a stemming database, language-specific rule sets can be used to determine appropriate stemming rules, and where more than one potential rule is identified the user behavior data can be used to select what is likely the appropriate rule, at least for the respective environment. Whitelists or other such components can be used to handle specific or irregular forms that do not follow the general rules or otherwise are exceptions that might not otherwise be processed correctly.
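The selection step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the suffix rules, the whitelist entries, the query counts, and the function names (`candidate_stems`, `choose_stem`, `build_stemming_database`) are all hypothetical, and raw query frequency stands in for whatever behavioral measure a real system would use.

```python
from collections import Counter

# Hypothetical English suffix rules: (suffix, replacement). Illustrative only.
RULES = [("ies", "y"), ("es", ""), ("s", "")]

# Hypothetical whitelist for irregular forms that the rules would mishandle.
WHITELIST = {"mice": "mouse", "geese": "goose"}


def candidate_stems(term, rules):
    """Return every distinct stem produced by a rule whose suffix matches."""
    stems = []
    for suffix, replacement in rules:
        if term.endswith(suffix) and len(term) > len(suffix):
            stem = term[: len(term) - len(suffix)] + replacement
            if stem not in stems:
                stems.append(stem)
    return stems


def choose_stem(term, rules, query_counts, whitelist):
    """Pick a stem: whitelist first, then the candidate users query most."""
    if term in whitelist:
        return whitelist[term]
    candidates = candidate_stems(term, rules)
    if not candidates:
        return term
    # Ties go to the earlier rule, because max() keeps the first maximum.
    best = max(candidates, key=lambda s: query_counts.get(s, 0))
    # With no behavioral evidence for any candidate, leave the term unstemmed.
    if query_counts.get(best, 0) == 0:
        return term
    return best


def build_stemming_database(terms, rules, query_counts, whitelist):
    """Map each term to its chosen stem."""
    return {t: choose_stem(t, rules, query_counts, whitelist) for t in terms}


# Assumed query-log frequencies; a real system would derive these from logs.
query_counts = Counter({"watch": 120, "battery": 75, "shoe": 200, "bus": 90})
terms = ["watches", "batteries", "shoes", "bus", "mice"]
database = build_stemming_database(terms, RULES, query_counts, WHITELIST)
```

In this sketch, "watches" matches both the "es" and "s" rules, giving the candidates "watch" and "watche"; the query counts favor "watch". The term "bus" matches only the "s" rule, but its candidate "bu" has no supporting queries, so the term is left unchanged.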
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.