Word disambiguation apparatus and methods
US5541836A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Dec 30, 1991 |
| Grant date | Jul 30, 1996 |
| Priority date | — |
| Expiry date | Dec 30, 2011 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/45
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Apparatus and methods for determining whether a word/sense pair is proper for a context. Wide contexts (100 words) are employed for both training and testing, and testing is done by adding the weights of vocabulary words from the context. The weights are determined by Bayesian techniques which interpolate between the probability of occurrence of a vocabulary word in a conditional sample of the training text and the probability of its occurrence in the entire training text. A further improvement in testing takes advantage of the fact that a word is generally used in only a single sense in a single discourse. Also disclosed are automated training techniques including training on bilingual bodies of text and training using categories from Roget's Thesaurus.
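The scoring scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several parts are assumptions made for illustration: the fixed interpolation factor `lam` (the abstract says the interpolation is determined by Bayesian techniques, which this sketch does not reproduce), the use of a log ratio as the per-word weight, the two-sense setup labelled "A" and "B", the toy training contexts, and all function names.

```python
import math
from collections import Counter


def interpolated_prob(word, local_counts, global_counts, lam):
    """Mix the word's relative frequency in a sense-specific (conditional)
    sample with its relative frequency in the whole training text.
    ASSUMPTION: lam is a fixed constant with 0 <= lam < 1."""
    local_total = sum(local_counts.values())
    global_total = sum(global_counts.values())
    p_local = local_counts[word] / local_total if local_total else 0.0
    p_global = global_counts[word] / global_total
    return lam * p_local + (1.0 - lam) * p_global


def train_weights(contexts_a, contexts_b, lam=0.5):
    """Return one weight per vocabulary word: positive favours sense A,
    negative favours sense B. ASSUMPTION: weight is a log probability ratio."""
    counts_a = Counter(w for ctx in contexts_a for w in ctx)
    counts_b = Counter(w for ctx in contexts_b for w in ctx)
    global_counts = counts_a + counts_b
    weights = {}
    for word in global_counts:
        p_a = interpolated_prob(word, counts_a, global_counts, lam)
        p_b = interpolated_prob(word, counts_b, global_counts, lam)
        # Both are > 0 because lam < 1 and the word occurs in the full text.
        weights[word] = math.log(p_a / p_b)
    return weights


def score_context(context, weights):
    """Testing step: add the weights of the vocabulary words in the context.
    Words outside the vocabulary contribute nothing."""
    return sum(weights.get(w, 0.0) for w in context)


def label_discourse(contexts, weights):
    """Apply the one-sense-per-discourse idea: pool the scores of every
    occurrence in a discourse and give all of them the same label."""
    total = sum(score_context(ctx, weights) for ctx in contexts)
    label = "A" if total > 0 else "B"
    return [label] * len(contexts)


# Hypothetical toy training data for two senses of one ambiguous word.
sense_a_contexts = [["tax", "import", "customs"], ["tax", "trade", "customs"]]
sense_b_contexts = [["moral", "obligation", "citizen"],
                    ["obligation", "responsibility", "citizen"]]

weights = train_weights(sense_a_contexts, sense_b_contexts)
single_score = score_context(["customs", "tax", "unseen"], weights)
discourse_labels = label_discourse(
    [["customs", "tax"], ["citizen", "import"]], weights
)
```

In this sketch a context with mixed evidence, such as the second one in `discourse_labels`, takes the label supported by the discourse as a whole. In practice the contexts would be the wide 100-word windows the abstract mentions, not three-word lists.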
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.