Patent · US Expired

System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies

US6928404B1 · kind B1 · utility

Cited by	14

References	20

Claims	7

Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Ponani Gopalakrishnan · New Delhi, IN
Dimitri Kanevsky · Ossining, US
Michael D. Monkowski · New Windsor, US
Jan Sedivy · Praha, CZ

Key dates

Filing date	Mar 17, 1999
Grant date	Aug 9, 2005
Priority date	—
Expiry date	Mar 17, 2019

Classification

Technology area (CPC Y)	Emerging Cross-Sectional Technologies
CPC primary	Y10S707/99942
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems and methods are provided for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. One such method includes: partitioning the language vocabulary V into subsets of word forms based on the frequencies of occurrence of the respective word forms; in at least one of the subsets, splitting word forms having frequencies less than a threshold to thereby generate word form components; and generating a language component vocabulary VC including word forms and word form components. The resulting language component vocabulary is used to generate a language model that can be efficiently implemented for real-time automatic speech recognition applications for languages with large vocabularies.
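
The three steps named in the abstract (partition by frequency, split rare word forms, collect the result into VC) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how a word form is split, so `naive_split` (a fixed-length stem plus a "+"-marked ending) is a hypothetical stand-in, as are the function names, the `boundaries` parameter, the choice to split only within the lowest-frequency subset, and the toy counts.

```python
def partition_by_frequency(freqs, boundaries):
    """Partition word forms into subsets using descending frequency boundaries.

    Subset 0 holds the most frequent word forms; the last subset holds the rarest.
    """
    subsets = [[] for _ in range(len(boundaries) + 1)]
    for word, count in freqs.items():
        # The subset index is the number of boundaries the count falls below.
        index = sum(count < b for b in boundaries)
        subsets[index].append(word)
    return subsets


def naive_split(word, stem_len=5):
    """Hypothetical splitter (an assumption, not from the patent text):
    a fixed-length stem plus an ending marked with a leading '+'."""
    if len(word) <= stem_len:
        return [word]
    return [word[:stem_len], "+" + word[stem_len:]]


def build_component_vocabulary(freqs, boundaries, threshold, split=naive_split):
    """Build VC: whole word forms plus components of rare word forms."""
    subsets = partition_by_frequency(freqs, boundaries)
    last = len(subsets) - 1
    vc = set()
    for index, subset in enumerate(subsets):
        for word in subset:
            # Assumption: only the lowest-frequency subset is eligible for
            # splitting, and only word forms below the threshold are split.
            if index == last and freqs[word] < threshold:
                vc.update(split(word))
            else:
                vc.add(word)
    return vc


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    freqs = {"the": 1000, "house": 120, "houses": 40,
             "housekeeper": 3, "housework": 2}
    vc = build_component_vocabulary(freqs, boundaries=[100, 10], threshold=5)
    print(sorted(vc))  # ['+keeper', '+work', 'house', 'houses', 'the']
```

In this toy run the two rare word forms share the component "house", so VC ends up with five entries, the same number as V. On a large inflected vocabulary, where many rare forms share stems and endings, the same sharing is what would make VC smaller than V.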

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.