Unsupervised lexicon acquisition from speech and text
US8065149B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 6, 2008 |
| Grant date | Nov 22, 2011 |
| Priority date | — |
| Expiry date | Sep 22, 2030 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG10L15/063
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Techniques for acquiring, from an input text and an input speech, a set of a character string and a pronunciation thereof which should be recognized as a word. A system according to the present invention: selects, from an input text, plural candidate character strings which are candidates to be recognized as a word; generates plural pronunciation candidates of the selected candidate character strings; generates frequency data by combining data in which the generated pronunciation candidates are respectively associated with the character strings; generates recognition data in which character strings respectively indicating plural words contained in the input speech are associated with pronunciations; and selects and outputs a combination contained in the recognition data, out of combinations each consisting of one of the candidate character strings and one of the pronunciation candidates.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.