Patent · US Expired

Single-count backing-off method of determining N-gram language model values

US5745876A · kind A · utility

Cited by: 8
References: 5
Claims: 3
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 2, 1996
Grant date: Apr 28, 1998
Priority date
Expiry date: May 2, 2016

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/197
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

For the recognition of coherently spoken speech with a large vocabulary, language model values that take into account the probability of word sequences are considered at word transitions. Prior to recognition, these language model values are derived from training speech signals. If the amount of training data is kept within sensible limits, not all word sequences will actually occur, so the language model values for, for example, an N-gram language model must be determined from word sequences of N-1 words that actually occur. In accordance with the invention, either these reduced word sequences from each different complete word sequence are counted only once, irrespective of the actual frequency of occurrence of the complete word sequence, or only reduced training sequences that occur exactly once in the training data are taken into account.
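
The counting rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `backoff_counts`, the choice to form the reduced sequence by dropping the oldest word, and the reading of the second alternative as "use only complete sequences seen exactly once" are all assumptions made for illustration; the abstract does not fix these details.

```python
from collections import Counter


def backoff_counts(tokens, n=2):
    """Return three count tables for the reduced (n-1)-word sequences.

    Hypothetical helper, not taken from the patent text.
    """
    # Frequency of every complete n-word sequence in the training data.
    complete = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

    conventional = Counter()    # reduced sequence weighted by full frequency
    single_count = Counter()    # each distinct complete sequence counts once
    singleton_only = Counter()  # only complete sequences seen exactly once

    for ngram, freq in complete.items():
        reduced = ngram[1:]  # assumption: drop the oldest word
        conventional[reduced] += freq
        single_count[reduced] += 1
        if freq == 1:
            singleton_only[reduced] += 1

    return conventional, single_count, singleton_only


tokens = "a b a b c b".split()
conventional, single_count, singleton_only = backoff_counts(tokens, n=2)
```

In this toy corpus the bigram "a b" occurs twice and "c b" once, so the reduced sequence "b" gets a conventional count of 3, a single count of 2, and a singleton-only count of 1. Dividing each table by its total would give one possible backing-off distribution; the normalization and discounting steps are outside what the abstract describes.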

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.