Patent · US Active

Systems and methods for determining lexical associations among words in a corpus

US9519634B2 · kind B2 · utility

Cited by: 2
References: 0
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 1, 2015
Grant date: Dec 13, 2016
Priority date
Expiry date: Jun 1, 2035

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/216
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are provided for identifying one or more target words of a corpus that have a lexical relationship to a plurality of provided cue words. The cue words and statistical lexical information derived from a corpus of documents are analyzed to determine candidate words that have a lexical association with the cue words. The statistical information includes numerical values indicative of probabilities of word pairs appearing together as adjacent words in a well-formed text or appearing together within a paragraph of a well-formed text. For each candidate word, a statistical association score between the candidate word and each of the cue words is determined. An aggregate score for each of the candidate words is determined based on the statistical association scores. One or more of the candidate words are selected to be the one or more target words based on the aggregate scores.
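
The pipeline described in the abstract (co-occurrence statistics, a per-cue association score, an aggregate score, selection) can be illustrated with a minimal sketch. It is not the patented method. The abstract does not name the association measure or the aggregation rule, so the sketch assumes pointwise mutual information (PMI) over paragraph-level co-occurrence and a plain sum across cue words. It ignores the adjacent-word statistics the abstract also mentions. All function names (build_cooccurrence, pmi, rank_targets) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(paragraphs):
    """Count the paragraphs containing each word and each unordered word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for paragraph in paragraphs:
        # Deduplicate and sort so each pair has one canonical key per paragraph.
        words = sorted(set(paragraph.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts, len(paragraphs)


def pmi(a, b, word_counts, pair_counts, n):
    """PMI of two words from paragraph counts.

    Assumption: a pair that never co-occurs scores 0.0 instead of -infinity.
    """
    joint = pair_counts.get(tuple(sorted((a, b))), 0)
    if joint == 0:
        return 0.0
    return math.log2(joint * n / (word_counts[a] * word_counts[b]))


def rank_targets(cues, paragraphs, top_k=3):
    """Rank candidate words by summed PMI with the cue words (assumed aggregate)."""
    word_counts, pair_counts, n = build_cooccurrence(paragraphs)
    cue_set = set(cues)
    # Candidates: non-cue words sharing at least one paragraph with some cue.
    candidates = {
        w
        for pair in pair_counts
        for w in pair
        if w not in cue_set and (pair[0] in cue_set or pair[1] in cue_set)
    }
    scores = {
        c: sum(pmi(c, cue, word_counts, pair_counts, n) for cue in cues)
        for c in candidates
    }
    # Highest aggregate score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling rank_targets(["car", "wheel"], paragraphs) on a small list of paragraph strings returns (word, score) pairs ordered by aggregate score. A real system would derive the statistics from a large corpus and would likely need smoothing or frequency thresholds, since raw PMI overweights rare words.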

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.