Patent · US Active

Word detection

US7917355B2 · kind B2 · utility

Cited by: 67
References: 8
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 23, 2007
Grant date: Mar 29, 2011
Priority date
Expiry date: Jan 26, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/53
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer program products, are provided in which data from web documents are partitioned into a training corpus and a development corpus. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
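
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not define the partitioning rule, the form of the uncertainty values, or the decision threshold, so every one of those is an assumption here. The sketch assumes whitespace-tokenized text, treats adjacent word pairs as candidate words, and uses a log-probability score weighted by the development-corpus probability as a stand-in uncertainty value. The names `partition`, `probabilities`, `find_new_words`, and `dev_every` are hypothetical.

```python
import math
from collections import Counter


def partition(docs, dev_every=4):
    """Assumed split rule: every dev_every-th document goes to the development corpus."""
    train, dev = [], []
    for i, doc in enumerate(docs):
        (dev if i % dev_every == dev_every - 1 else train).append(doc)
    return train, dev


def probabilities(corpus):
    """Relative frequencies of single words and of adjacent word pairs."""
    unigrams, pairs = Counter(), Counter()
    for doc in corpus:
        tokens = doc.split()
        unigrams.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_pair = sum(pairs.values())
    p_uni = {w: c / n_uni for w, c in unigrams.items()} if n_uni else {}
    p_pair = {p: c / n_pair for p, c in pairs.items()} if n_pair else {}
    return p_uni, p_pair


def find_new_words(docs, dev_every=4):
    """Flag candidate pairs whose assumed uncertainty value beats that of their parts."""
    train, dev = partition(docs, dev_every)
    train_uni, train_pair = probabilities(train)   # first word probabilities
    _, dev_pair = probabilities(dev)               # second word probabilities
    new_words = []
    for pair, p_dev in dev_pair.items():
        a, b = pair
        # Skip candidates that the training corpus cannot score.
        if pair not in train_pair or a not in train_uni or b not in train_uni:
            continue
        # Assumed uncertainty values: development probability times training log-probability.
        candidate_value = p_dev * math.log(train_pair[pair])
        constituent_value = p_dev * (math.log(train_uni[a]) + math.log(train_uni[b]))
        if candidate_value > constituent_value:
            new_words.append(pair)
    return sorted(new_words)


docs = [
    "new york is big",
    "i love new york",
    "new york never sleeps",
    "new york is loud",
]
result = find_new_words(docs)
```

On a corpus this small nearly every repeated pair passes the comparison, so the example only shows the data flow (partition, two sets of probabilities, comparison). A realistic run would need a large corpus and a tuned threshold.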

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.