Patent · US Active

Method and apparatus for building a language model

US9396724B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 14, 2014
Grant date: Jul 19, 2016
Priority date
Expiry date: Sep 29, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/197
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
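
The template-mining step in the abstract (mining each category's corpus according to its class vocabulary to obtain high-frequency language templates) can be illustrated with a minimal sketch. This is not the patented procedure, which the abstract does not specify in detail. It assumes a template is a sentence in which every class-vocabulary word has been replaced by a slot marker, and that "high-frequency" means the template occurs at least `min_count` times. The function name `mine_templates`, the `<CLASS>` marker, the threshold, and the toy corpus are all hypothetical.

```python
from collections import Counter


def mine_templates(corpus, class_vocab, slot="<CLASS>", min_count=2):
    """Return templates that occur at least `min_count` times.

    Assumption (not taken from the patent text): a template is a
    whitespace-tokenized, lowercased sentence in which each word found
    in `class_vocab` is replaced by `slot`.
    """
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        templated = [slot if tok in class_vocab else tok for tok in tokens]
        # Only sentences that mention a class word yield a template.
        if slot in templated:
            counts[" ".join(templated)] += 1
    return {tpl: n for tpl, n in counts.items() if n >= min_count}


# Toy corpus for a hypothetical "music" category.
corpus = [
    "play jazz please",
    "play rock please",
    "play blues please",
    "i like rock",
    "turn it up",
]
class_vocab = {"jazz", "rock", "blues"}

templates = mine_templates(corpus, class_vocab)
print(templates)
```

In this toy run only "play <CLASS> please" passes the threshold: "i like <CLASS>" occurs once, and "turn it up" contains no class word. A template-based language model for the category would then be trained on the surviving templates.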

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.