Patent · US Active

Word detection

US8463598B2 · kind B2 · utility

Cited by: 4
References: 13
Claims: 16
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 28, 2011
Grant date: Jun 11, 2013
Priority date
Expiry date: Jan 28, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/53
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer program products, in which data from web documents are partitioned into a training corpus and a development corpus are provided. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
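
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not define the candidate words, the partitioning rule, or the uncertainty values, so everything concrete here is an assumption made for illustration. The sketch assumes a candidate is a pair of adjacent tokens, that every fourth document goes to the development corpus, and that the uncertainty value is an entropy-like term (development-corpus probability times the log of a training-corpus probability). The function names partition, probabilities, and find_new_words are hypothetical.

```python
import math
from collections import Counter


def partition(documents, dev_every=4):
    """Split tokenized documents into training and development corpora.

    Assumption: every `dev_every`-th document goes to the development corpus.
    """
    train = [d for i, d in enumerate(documents) if i % dev_every != 0]
    dev = [d for i, d in enumerate(documents) if i % dev_every == 0]
    return train, dev


def probabilities(corpus):
    """Relative frequencies of single tokens and of adjacent token pairs."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_uni = {w: c / n_uni for w, c in unigrams.items()}
    p_bi = {b: c / n_bi for b, c in bigrams.items()}
    return p_uni, p_bi


def find_new_words(documents, dev_every=4):
    """Flag adjacent pairs that look more like one word than two.

    Assumed comparison: an entropy-like value for the pair treated as a
    unit versus the same value for its parts treated as independent.
    """
    train, dev = partition(documents, dev_every)
    train_uni, train_bi = probabilities(train)
    _, dev_bi = probabilities(dev)

    new_words = []
    for pair, p_dev in dev_bi.items():
        if pair not in train_bi:
            continue  # no training estimate for this candidate
        a, b = pair
        as_unit = p_dev * math.log(train_bi[pair])
        as_parts = p_dev * (math.log(train_uni[a]) + math.log(train_uni[b]))
        if as_unit > as_parts:
            new_words.append(pair)
    return new_words
```

Under these assumptions a pair is flagged when its training probability as a unit exceeds the product of its parts' training probabilities, with the development corpus supplying the weight. A practical system would also need smoothing and a way to handle candidates that never appear in the training corpus, which this sketch simply skips.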

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.