Topic word generation method and system
US8335787B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 7, 2008 |
| Grant date | Dec 18, 2012 |
| Priority date | — |
| Expiry date | Nov 7, 2028 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/355
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method of, and system for, extracting topic words from a collection of documents spanning multiple, and potentially a very large number of, domains. Documents are selected and ranked based on similarity with at least one seed word, which defines a topic. Seed words may be entered directly by a user or provided by another application. Keywords are extracted from documents determined to be a sufficiently good match to the topic and may be displayed to the user or used as input to word prediction or word analysis and display software. Documents are determined to be a sufficiently good match to the topic using an iterative algorithm that starts with the best match and then selects documents containing keywords sufficiently similar to those of the previously selected documents.
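The iterative selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract names neither a similarity measure nor a stopping rule, so the cosine similarity over raw term counts, the `threshold` and `top_n` parameters, and the names `tokenize`, `cosine`, and `topic_words` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens; the abstract does not define tokenization.
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (an assumed similarity measure).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_words(documents, seed_words, threshold=0.2, top_n=5):
    # One term-count vector per document, plus one for the seed words.
    vectors = [Counter(tokenize(d)) for d in documents]
    seed = Counter(w.lower() for w in seed_words)

    # Rank documents by similarity to the seed words, best match first.
    ranked = sorted(range(len(documents)),
                    key=lambda i: cosine(vectors[i], seed), reverse=True)
    if not ranked or cosine(vectors[ranked[0]], seed) == 0.0:
        return [], []  # no document matches the topic at all

    # Start from the best match, then repeatedly add documents whose terms
    # are similar enough to the documents selected so far.
    selected = [ranked[0]]
    profile = Counter(vectors[ranked[0]])
    changed = True
    while changed:
        changed = False
        for i in ranked:
            if i not in selected and cosine(vectors[i], profile) >= threshold:
                selected.append(i)
                profile.update(vectors[i])
                changed = True

    # Keywords: the most frequent terms of the selected documents, seeds excluded.
    keywords = [w for w, _ in profile.most_common() if w not in seed][:top_n]
    return selected, keywords


docs = [
    "football goalkeeper penalty football",
    "goalkeeper striker penalty stadium",
    "stock market interest rates",
]
selected, keywords = topic_words(docs, ["football"])
print(selected, keywords)
```

In this toy collection the second document never mentions the seed word but is still selected, because it shares terms with the best match. That is the effect the abstract attributes to the iterative step. A practical version would likely add stop-word filtering and term weighting, neither of which the abstract specifies.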
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.