Single-count backing-off method of determining N-gram language model values
US5745876A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | May 2, 1996 |
| Grant date | Apr 28, 1998 |
| Priority date | — |
| Expiry date | May 2, 2016 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/197
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
For the recognition of coherently spoken speech with a large vocabulary, language model values that take into account the probability of word sequences are considered at word transitions. Prior to recognition, these language model values are derived from training speech signals. If the amount of training data is kept within sensible limits, not all word sequences will actually occur, so the language model values for, for example, an N-gram language model must be determined from word sequences of N-1 words that actually occur. In accordance with the invention, either these reduced word sequences from each different complete word sequence are counted only once, irrespective of how often the complete word sequence actually occurs, or only reduced training sequences that occur exactly once in the training data are taken into account.
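The counting rule described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `reduced_counts`, the `mode` values, and the toy word sequence are all hypothetical, and the sketch assumes that a reduced sequence is obtained by dropping the first (oldest) word of a complete N-gram. The `"singleton"` mode is one possible reading of the abstract's second alternative, in which a reduced sequence is counted only when its complete sequence occurs exactly once.

```python
from collections import Counter


def reduced_counts(tokens, n=2, mode="distinct"):
    """Count reduced (n-1)-word sequences derived from complete n-word sequences.

    mode="raw":       conventional counting, each occurrence contributes.
    mode="distinct":  each different complete sequence contributes once.
    mode="singleton": only complete sequences seen exactly once contribute.
    """
    if mode not in ("raw", "distinct", "singleton"):
        raise ValueError(f"unknown mode: {mode}")

    # Frequency of every complete n-gram in the training text.
    full = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    reduced = Counter()
    for ngram, freq in full.items():
        tail = ngram[1:]  # assumption: drop the oldest word of the n-gram
        if mode == "raw":
            reduced[tail] += freq
        elif mode == "distinct":
            reduced[tail] += 1
        elif freq == 1:  # mode == "singleton"
            reduced[tail] += 1
    return reduced


def normalize(counts):
    """Turn counts into relative frequencies (no smoothing applied)."""
    total = sum(counts.values())
    return {seq: c / total for seq, c in counts.items()}


# Hypothetical toy training text.
tokens = "san francisco is far san francisco is big new york is big".split()

raw = reduced_counts(tokens, mode="raw")
distinct = reduced_counts(tokens, mode="distinct")
single = reduced_counts(tokens, mode="singleton")

print(raw[("francisco",)], distinct[("francisco",)], single[("francisco",)])
```

In this toy text, "francisco" follows only "san", so the single-count rule gives it less weight in the backing-off distribution than raw counting does, while "is", which follows two different words, keeps a count of two.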
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.