Patent · US · Expired

Word disambiguation apparatus and methods

US5541836A · kind A · utility

Cited by: 188
References: 14
Claims: 37
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 30, 1991
Grant date: Jul 30, 1996
Priority date:
Expiry date: Dec 30, 2011

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/45
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Apparatus and methods for determining whether a word/sense pair is proper for a context. Wide contexts (100 words) are employed for both training and testing, and testing is done by adding the weights of vocabulary words from the context. The weights are determined by Bayesian techniques which interpolate between the probability of occurrence of a vocabulary word in a conditional sample of the training text and the probability of its occurrence in the entire training text. A further improvement in testing takes advantage of the fact that a word is generally used in only a single sense in a single discourse. Also disclosed are automated training techniques including training on bilingual bodies of text and training using categories from Roget's Thesaurus.
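
The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the fixed interpolation constant `lam`, the use of log probabilities as weights, and all function names and toy data are assumptions made for illustration. The abstract states only that the weights come from Bayesian techniques that interpolate between the two probabilities, that testing adds the weights of context words, and that a word tends to keep one sense within a discourse.

```python
import math
from collections import Counter


def train(labeled_contexts):
    """Count words overall and per sense from (sense, context_words) pairs."""
    global_counts = Counter()
    sense_counts = {}
    for sense, words in labeled_contexts:
        global_counts.update(words)
        sense_counts.setdefault(sense, Counter()).update(words)
    return global_counts, sense_counts


def word_weight(word, sense, global_counts, sense_counts, lam=0.5):
    """Log of an interpolated probability (hypothetical weighting scheme).

    p_local  : relative frequency in the conditional sample for this sense
    p_global : relative frequency in the entire training text
    lam      : fixed mixing constant, an assumption standing in for the
               Bayesian estimate the abstract refers to
    """
    local = sense_counts[sense]
    p_local = local[word] / sum(local.values())
    p_global = global_counts[word] / sum(global_counts.values())
    return math.log(lam * p_local + (1 - lam) * p_global)


def score_context(context, global_counts, sense_counts, lam=0.5):
    """Add the weights of the vocabulary words found in one context."""
    # Words never seen in training carry no weight and are skipped.
    vocab_words = [w for w in context if w in global_counts]
    return {
        sense: sum(
            word_weight(w, sense, global_counts, sense_counts, lam)
            for w in vocab_words
        )
        for sense in sense_counts
    }


def best_sense(context, global_counts, sense_counts, lam=0.5):
    """Pick the sense with the highest summed weight for one context."""
    scores = score_context(context, global_counts, sense_counts, lam)
    return max(scores, key=scores.get)


def best_sense_for_discourse(contexts, global_counts, sense_counts, lam=0.5):
    """Pool scores over every occurrence in a discourse and pick one sense."""
    totals = {sense: 0.0 for sense in sense_counts}
    for context in contexts:
        scores = score_context(context, global_counts, sense_counts, lam)
        for sense, value in scores.items():
            totals[sense] += value
    return max(totals, key=totals.get)


# Toy training data for two senses of "bank" (invented for this example).
training = [
    ("river", ["water", "shore", "fish", "water"]),
    ("money", ["loan", "deposit", "cash", "loan"]),
]
global_counts, sense_counts = train(training)

single = best_sense(["the", "water", "fish"], global_counts, sense_counts)
pooled = best_sense_for_discourse(
    [["water", "loan"], ["shore", "fish"], ["water"]],
    global_counts,
    sense_counts,
)
```

Mixing in the global probability keeps a word that never appeared with a given sense from contributing a zero probability, which would otherwise make the log weight undefined. In the pooled call, the first context is ambiguous on its own ("water" and "loan" pull in opposite directions), and the other occurrences in the same discourse settle the choice.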

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.