Patent · US · Expired

Apparatus and method for forming a filtered inflected language model for automatic speech recognition

US6073091A · kind A · utility

86 Cited by
5 References
27 Claims
0 Family size

Assignee

Inventors

Key dates

Filing date: Aug 6, 1997
Grant date: Jun 6, 2000
Priority date
Expiry date: Aug 6, 2017

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/197
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method of forming a language model for a language having a selected vocabulary of word forms comprises: (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence; (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of frequency ranges; (c) respectively assigning maps to the subsets; (d) filtering a textual corpus using the maps assigned to the subsets in order to generate indexed integers; (e) determining n-gram statistics for the indexed integers; and (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model.
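
Steps (a) through (f) of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and every concrete choice in it is an assumption made for illustration: each word form gets a single integer rank rather than an integer vector, the subsets are fixed-size frequency bands, the first band's map is the identity while each later band's map collapses its word forms onto one shared index, n is fixed at 2, and the probabilities are unsmoothed maximum-likelihood estimates. The names build_filtered_bigram_model, band_sizes, and filter_word are hypothetical.

```python
from collections import Counter


def build_filtered_bigram_model(sentences, band_sizes=(2,)):
    """Illustrative sketch of steps (a)-(f); all design choices are assumptions."""
    # (a) Map word forms to integers by frequency: rank 0 is the most frequent.
    counts = Counter(word for sent in sentences for word in sent)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))  # ties broken alphabetically
    rank = {word: i for i, word in enumerate(ranked)}

    # (b) Partition the ranks into bands, in descending order of frequency.
    bounds, start = [], 0
    for size in band_sizes:
        bounds.append((start, start + size))
        start += size
    bounds.append((start, len(ranked)))  # final band holds all rarer word forms

    # (c) Assign a map to each band (assumed here): band 0 keeps each rank,
    #     every later band collapses onto the rank of its first member.
    def filter_word(word):
        r = rank[word]
        for band, (lo, hi) in enumerate(bounds):
            if lo <= r < hi:
                return r if band == 0 else lo
        raise KeyError(word)

    # (d) Filter the corpus into sequences of indexed integers.
    filtered = [[filter_word(w) for w in sent] for sent in sentences]

    # (e) Collect bigram counts and counts of each left-hand context.
    bigrams, contexts = Counter(), Counter()
    for seq in filtered:
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1

    # (f) Maximum-likelihood estimate P(b | a) = count(a, b) / count(a).
    probs = {(a, b): n / contexts[a] for (a, b), n in bigrams.items()}
    return probs, filter_word


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
probs, filter_word = build_filtered_bigram_model(corpus, band_sizes=(3,))
```

With band_sizes=(3,), the three most frequent word forms in this toy corpus keep their own indices, while the two rarest share one index, so the model has fewer distinct n-gram contexts to estimate. A practical system would add smoothing and would choose the bands and maps to suit the language.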

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.