Patent · US Active

Unsupervised lexicon acquisition from speech and text

US8065149B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 12
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 6, 2008
Grant date: Nov 22, 2011
Priority date
Expiry date: Sep 22, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/063
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques are provided for acquiring, from an input text and an input speech, a pair consisting of a character string and its pronunciation that should be recognized as a word. A system according to the present invention: selects, from the input text, plural candidate character strings that are candidates to be recognized as words; generates plural pronunciation candidates for the selected candidate character strings; generates frequency data by combining data in which the generated pronunciation candidates are respectively associated with the character strings; generates recognition data in which character strings respectively indicating plural words contained in the input speech are associated with pronunciations; and selects and outputs each combination contained in the recognition data, out of the combinations each consisting of one of the candidate character strings and one of the pronunciation candidates.
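
The selection flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the repeated-substring heuristic for choosing candidates, and the per-character `readings` table are all assumptions made for illustration. The frequency-data and speech-recognition steps are not modeled: the recognition data is taken as a ready-made list of (character string, pronunciation) pairs.

```python
from collections import Counter
from itertools import product


def select_candidates(text, max_len=3, min_count=2):
    """Toy heuristic (an assumption): treat substrings that repeat as word candidates."""
    counts = Counter(
        text[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s for s, c in counts.items() if c >= min_count}


def pronunciation_candidates(string, readings):
    """Enumerate pronunciations by combining the readings of each character.

    `readings` is a hypothetical table mapping a character to its possible readings.
    A character with no known reading yields no candidates for the whole string.
    """
    per_char = [readings.get(ch, []) for ch in string]
    return {"".join(parts) for parts in product(*per_char)}


def acquire_lexicon(text, readings, recognition_data):
    """Keep the (string, pronunciation) candidates that also occur in the recognition data."""
    candidate_pairs = {
        (s, p)
        for s in select_candidates(text)
        for p in pronunciation_candidates(s, readings)
    }
    return candidate_pairs & set(recognition_data)


# Hypothetical inputs: a short text, a reading table, and recognizer output.
text = "abcabc"
readings = {"a": ["ei", "a"], "b": ["bi"], "c": ["si", "k"]}
recognition_data = [("abc", "eibisi"), ("ca", "kei"), ("ab", "abi"), ("xyz", "zzz")]

result = acquire_lexicon(text, readings, recognition_data)
print(sorted(result))  # [('ab', 'abi'), ('abc', 'eibisi')]
```

In this sketch the pair ("ca", "kei") is dropped even though it appears in the recognition data, because "ca" occurs only once in the text and so is never selected as a candidate. Only pairs supported by both the text side and the speech side are output.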

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.